Interpol Hashing
Requirements:
- Operating system(s): all
- Any restrictions to use by non-academics: none
Download (R)
Please cite:
TBA
License:
MIT License
Usage:
The zip file contains three files:
- README.txt
- InterpolHashing.R
- Interpol_1.3.2.tar.gz
IMPORTANT
1) Make sure that you install Interpol 1.3.2 from the bundled Interpol_1.3.2.tar.gz:
install.packages("Interpol_1.3.2.tar.gz", repos = NULL, type = "source")
2) Install seqinr from CRAN:
install.packages("seqinr")
3) Download a FASTA file to use as the database, e.g., Swiss-Prot from https://www.uniprot.org/help/downloads
4) Source the InterpolHashing.R file:
source("InterpolHashing.R")
5) Create the Interpol Hashing database with the following commands:
database <- seqinr::read.fasta("uniprot_sprot.fasta.gz", as.string = TRUE, forceDNAtolower = FALSE)
database <- createDatabase(database)
database <- encodeDatabase(database, length_factor = 300)
6) Now you can search for sequences. Define the variable "query" as a protein sequence, e.g.,
query <- "ALGATIIAGASLTFKILDEV"
getSequence(query, database, percentage = 0.01)
The result should look like this:
identifier            sequence              length  score     avgScore  p
sp|P58689|21DD_HETMG  ALAGTIIAGASLTFKILDEV  20      2.549747  13.21368  1.831873e-218
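
Optional pre-flight checks: the steps above depend on a specific Interpol version and on "query" being a protein sequence, so a small check before searching may save time. The following is only a sketch in base R. The helpers has_package() and is_protein_sequence() are hypothetical and not part of InterpolHashing.R, and the restriction to the 20 standard one-letter amino acid codes is an assumption about what the tool accepts.

```r
# Hypothetical helper (not part of InterpolHashing.R): TRUE if `pkg` is
# installed and, when `version` is given, has exactly that version.
has_package <- function(pkg, version = NULL) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(FALSE)
  }
  if (is.null(version)) {
    return(TRUE)
  }
  utils::packageVersion(pkg) == version
}

# Hypothetical helper: TRUE if `query` is a single non-empty string made up
# only of the 20 standard one-letter amino acid codes (upper case).
# Assumption: other symbols (e.g. X, B, Z, U) are not accepted by the tool.
is_protein_sequence <- function(query) {
  is.character(query) &&
    length(query) == 1L &&
    !is.na(query) &&
    grepl("^[ACDEFGHIKLMNPQRSTVWY]+$", query)
}

query <- "ALGATIIAGASLTFKILDEV"
stopifnot(is_protein_sequence(query))

# Report missing prerequisites instead of failing later inside the search.
if (!has_package("Interpol", "1.3.2")) {
  message("Interpol 1.3.2 not found; install it from Interpol_1.3.2.tar.gz.")
}
if (!has_package("seqinr")) {
  message("seqinr not found; install it from CRAN.")
}
```

If both checks pass without messages, continue with getSequence(query, database, percentage = 0.01) as in step 6.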