Figure 1. Average SIR score of HIV-1 genes for the most frequent alleles in the human population. The ordering of the genes by average SIR score is conserved across these alleles, as indicated by the systematic trend.
Figure 2. Difference between the SIR score in serotyped hosts and the average SIR score. For regulatory genes, HIV in hosts carrying a given HLA allele has, on average, a lower SIR score for that allele. The opposite holds for virion-associated genes.
We explored the evolution of the Gag protein in the human population using a phylogenetic analysis. A total of 465 amino acid sequences of the Gag protein from the LANL database were used to build an unrooted phylogenetic tree by maximum parsimony, as implemented in the PHYLIP program protpars (http://evolution.genetics.washington.edu/phylip.html).
To define the root, we added two SIVcpz sequences. The program output contained the sequences at all nodes of the tree, but only the root node contained a full sequence. The remaining sequences contained either a '.', which indicated the same amino acid as in the node below it on the tree, or a '?', which indicated that two amino acids were equally probable. We completed the sequence at each node, assigning the parent node's amino acid in the '?' case. The first sequence was completed using its leaf child. The SIR score was then computed for all of the completed sequences. The level of each node was defined as its distance from the root node.
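The completion step described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in the study: the function name `complete_sequences`, the dictionary-based tree representation, and the toy sequences are all hypothetical. It assumes that "the node below it on the tree" refers to the parent node (the one closer to the root), that the root sequence is already fully specified, and that the level of a node is the number of edges separating it from the root.

```python
from collections import deque


def complete_sequences(parent, raw, root):
    """Fill in '.' and '?' placeholders and compute node levels.

    parent: dict mapping each non-root node to its parent node (assumed layout)
    raw:    dict mapping each node to its sequence as printed, with
            '.' = same residue as the parent and '?' = ambiguous residue
    root:   name of the root node, whose sequence is assumed complete
    """
    # Build a parent -> children lookup so the tree can be walked top-down.
    children = {}
    for node, par in parent.items():
        children.setdefault(par, []).append(node)

    full = {root: raw[root]}   # completed sequences
    level = {root: 0}          # distance from the root, in edges
    queue = deque([root])

    # Breadth-first traversal: a parent is always completed before its children.
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            # Both '.' and '?' are resolved by copying the parent's residue.
            full[child] = "".join(
                p if c in ".?" else c
                for c, p in zip(raw[child], full[node])
            )
            level[child] = level[node] + 1
            queue.append(child)
    return full, level


# Toy example with made-up six-residue sequences.
parent = {"n2": "n1", "leafA": "n2", "leafB": "n1"}
raw = {"n1": "MGARAS", "n2": "..K.?.", "leafA": ".A....", "leafB": "....S."}
full, level = complete_sequences(parent, raw, "n1")
print(full["leafA"], level["leafA"])  # MAKRAS 2
```

Walking the tree from the root outward guarantees that every parent sequence is complete before it is copied into its children, so each node is visited exactly once. The completed sequences in `full` could then be passed to the SIR scoring step, and `level` used to group the scores by distance from the root.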