Help  
Virus Family:
A virus is a small, infectious, obligate intracellular parasite, capable of replicating itself in a host cell. Virions are formed by de novo assembly from newly synthesized components: the genome and a number of copies of at least one viral protein (capsid, or coat protein). Then the virions exit the cell and enter new cells, thereby beginning a new infectious cycle. A viral genome consists of either single-stranded or double-stranded DNA or RNA in either linear or circular form, and can comprise one or more segments.
The viruses in PEPTIBASE are grouped according to the structure of their genetic material. The first stage of selection is the choice of a virus family. One cannot choose a virus without first choosing its family.

Virus:
Once a family has been chosen, one can choose the virus of interest. The viruses included in PEPTIBASE are all the viruses from the NCBI database at http://www.ncbi.nlm.nih.gov/genomes/VIRUSES/viruses.html . We currently present a single sequence from each virus. One cannot choose a virus before selecting the appropriate family.

Protein:
Once a virus/carrier has been chosen, one can choose the protein(s) of interest. The proteins included in PEPTIBASE are all the proteins from the appropriate Entrez records. One cannot choose a protein before selecting the appropriate host. The protein choice is a multiple choice: one can simultaneously select several proteins, up to the full list of proteins from the host.

MHC-I allele:
The alleles used are all the alleles for which the BIMAS algorithm provides coefficient tables. These include 9 HLA-A alleles, 20 HLA-B alleles, 4 HLA-C alleles and 6 mouse MHC alleles. The allele names follow the latest nomenclature in the NCBI dbMHC. HLA B*4001 (formerly HLA B*60) and HLA B*40 are treated as different alleles for historical reasons, although HLA B*4001 currently belongs to the HLA B*40 family.

Cutoffs:
All the algorithms used in PEPTIBASE provide a score for each nonamer. A higher score represents a higher probability that the nonamer does indeed pass the corresponding processing stage (cleavage, TAP, MHC binding). A cutoff is applied at each processing stage to select only peptides that are successfully processed and presented.
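
As a rough illustration of how per-stage cutoffs combine, the sketch below keeps a nonamer only if its score reaches the cutoff at every stage. The stage names, the peptides, the scores and the cutoff values are all made up for the example and are not PEPTIBASE defaults; treating a score equal to the cutoff as passing is also an assumption.

```python
# Hypothetical per-stage cutoffs; the values are illustrative only.
CUTOFFS = {"cleavage": 0.5, "tap": -1.0, "mhc": 2.0}


def passes_all_stages(scores, cutoffs=CUTOFFS):
    """Return True if the nonamer's score reaches the cutoff at every stage."""
    return all(scores[stage] >= cutoff for stage, cutoff in cutoffs.items())


def select_epitopes(candidates, cutoffs=CUTOFFS):
    """Keep only the nonamers whose scores pass every processing stage."""
    return [peptide for peptide, scores in candidates.items()
            if passes_all_stages(scores, cutoffs)]


# Made-up scores for two made-up nonamers.
candidates = {
    "LLDVTAAVK": {"cleavage": 0.9, "tap": 0.3, "mhc": 5.1},
    "GGGGGGGGG": {"cleavage": 0.7, "tap": -2.4, "mhc": 6.0},
}
selected = select_epitopes(candidates)
```

In this toy input the second peptide is rejected because its TAP score falls below the TAP cutoff, even though its other two scores pass.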

Cleavage Score:
We have developed a score representing the probability that a peptide is produced by the proteasome. Given a peptide and its two flanking regions, FN-P1P2...Pn-FC, we compute:
S(peptide)=S1(FN)+S2(P1)+S3(P2)+...+S3(P8)+S4(P9)+S5(FC)
FN and FC are the N- and C-terminal flanking regions, while P1-P9 are the residues within the peptide. A peptide with a high score S has a high probability of being produced, while a low score predicts a low probability of production.
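
The additive form of the score can be sketched as follows. This is only an illustration of the formula above: the five tables hold made-up values (the real tables are not given on this page), residues missing from a table are assumed to contribute 0.0, and each flanking region is assumed to be a single amino acid.

```python
# Toy score tables S1..S5, one per term class in the formula above.
# Each maps a one-letter amino acid code to a made-up contribution.
S1 = {"K": 0.2, "R": 0.1}     # N-terminal flanking residue FN
S2 = {"A": 0.3, "S": 0.1}     # first peptide residue P1
S3 = {"L": 0.05, "P": -0.2}   # internal residues P2..P8 (shared table)
S4 = {"L": 0.6, "V": 0.4}     # last peptide residue P9
S5 = {"G": 0.2, "D": -0.3}    # C-terminal flanking residue FC


def cleavage_score(fn, peptide, fc):
    """Sum the per-position contributions for FN-P1...P9-FC."""
    if len(peptide) != 9:
        raise ValueError("expected a nonamer (9 residues)")
    score = S1.get(fn, 0.0)
    score += S2.get(peptide[0], 0.0)
    score += sum(S3.get(aa, 0.0) for aa in peptide[1:8])  # P2..P8
    score += S4.get(peptide[8], 0.0)
    score += S5.get(fc, 0.0)
    return score


# Made-up example: flanking residues K and G around a toy nonamer.
example = cleavage_score("K", "ALLLLLLLL", "G")
```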

TAP Score:
The TAP algorithm is a weight matrix algorithm. The matrix is composed of 20 rows (one row for each amino acid) and 9 columns (one for each position in the nonamer). The algorithm's output is a score: the sum of the matrix components corresponding to the amino acid found at each position of the peptide. The algorithm was developed by Peters et al. (see related links).
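
A minimal sketch of weight matrix scoring is given below. The matrix entries are arbitrary placeholders generated by a formula and do not reproduce the published coefficients; only the 20 x 9 shape and the summation over positions come from the description above.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Placeholder 20 x 9 matrix: one row per amino acid, one column per position.
# The entries are arbitrary and are NOT the coefficients of the real algorithm.
MATRIX = {
    aa: [0.1 * (row % 5) - 0.01 * col for col in range(9)]
    for row, aa in enumerate(AMINO_ACIDS)
}


def tap_score(peptide, matrix=MATRIX):
    """Sum the matrix entry selected by each residue and its position."""
    if len(peptide) != 9:
        raise ValueError("expected a nonamer (9 residues)")
    return sum(matrix[aa][pos] for pos, aa in enumerate(peptide))


# Score of a toy nonamer under the placeholder matrix.
toy_score = tap_score("AAAAAAAAA")
```

Storing the matrix as a dictionary keyed by amino acid keeps the lookup readable; a residue outside the 20 standard amino acids raises a KeyError instead of being scored silently.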

MHC-I score:
The MHC-I prediction is based on the BIMAS algorithm (see related links). A different algorithm was developed for each allele. All the algorithms are weight matrix algorithms. We do not use the cutoff values proposed by the BIMAS algorithm, but provide our own cutoffs.

Conservation score:
Conservation is currently not implemented.

Return to default values
Two validation sets were developed: a neutral validation set composed of 5,000 random peptides, and a positive validation set composed of all naturally processed peptides in six different databases (SYFPEITHI, MHCPEP, AntiJen, MHCBN, HLA Ligand/Motif database and MPID; see related links). For each prediction method and HLA allele, we computed the score of each peptide in the two sets. We then found the cutoff maximizing both the fraction of peptides from the positive validation set with supra-cutoff scores and the number of peptides from the random validation set with infra-cutoff scores. The resulting cutoff produces low levels of type I and type II errors for the large majority of HLA alleles using the Parker MHC-I prediction algorithm. Pushing this button will set the cutoff values to these computed optimal values.
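
One possible way to compute such a cutoff is sketched below. How the two criteria are combined is not specified here, so the sketch assumes a simple sum of two fractions (positives scoring at or above the cutoff, random peptides scoring below it) and tries every observed score as a candidate cutoff. The function name and the score lists are hypothetical.

```python
def best_cutoff(positive_scores, random_scores):
    """Return the candidate cutoff that maximizes the assumed objective.

    Objective (an assumption): fraction of positive peptides scoring at or
    above the cutoff plus fraction of random peptides scoring below it.
    Ties are resolved in favor of the lowest cutoff.
    """
    candidates = sorted(set(positive_scores) | set(random_scores))
    best, best_value = None, -1.0
    for cutoff in candidates:
        above = sum(s >= cutoff for s in positive_scores) / len(positive_scores)
        below = sum(s < cutoff for s in random_scores) / len(random_scores)
        if above + below > best_value:
            best, best_value = cutoff, above + below
    return best


# Made-up scores for a tiny positive set and a tiny random set.
positive = [2.0, 3.0, 4.0, 5.0]
random_set = [0.5, 1.0, 1.5, 2.5]
cutoff = best_cutoff(positive, random_set)
```

With these toy lists, a cutoff of 2.0 keeps every positive peptide and rejects three of the four random ones.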

Output options:
--------------------
These parameters define the format of the desired output.
With Flankings: The proteasomal cleavage is determined according to the two flanking amino acids. The resulting epitopes can be presented either with or without the flanking positions.
Protein: Show or hide the protein name(s).
Position in Protein: Show or hide the position of the epitope in the protein(s).
Cleavage Score: Show or hide the cleavage prediction algorithm score.
TAP Score: Show or hide the TAP prediction algorithm score.
MHC-I Score: Show or hide the BIMAS MHC-I binding prediction algorithm score.
Conservation Score: Currently not implemented.

Open in this window/new window/send by email
The results are presented as a list of epitopes. The epitope list can be presented either in the current window or in a new window in order to allow for a new query. The resulting list is often long, and handling a huge HTML file can be cumbersome. We therefore offer the option of sending the results as an attachment by email. The user should then provide a name for the file and a valid email address.