Immune repertoire analysis
Maintaining long-term memory in the adaptive repertoire requires balancing three demands: completeness of coverage, the need to maintain large clones against common pathogens, and parsimonious storage of memory in a finite number of memory cells.
B and T cells, the key players in the adaptive immune system, are typically activated when their receptors contact antigen. These receptors are diversified through a sequence of mechanisms that maximizes diversity, enabling a potential response to every presented peptide. The heavy and light chain genes of the B-cell receptor and the beta and alpha chain genes of the T-cell receptor, each pair encoding a heterodimer, undergo imprecise V(D)J segment rearrangement together with templated and non-templated nucleotide additions and deletions. Immunoglobulin chains diversify further through somatic hypermutation, the stepwise incorporation of single-nucleotide substitutions into the V gene, which underpins much of antibody diversity and affinity maturation. This immense theoretical combinatorial diversity poses a major challenge for immunology.
Until recently, immune receptor repertoire studies were limited by the low throughput of classical sequencing. Now, with the rapid advance of next-generation sequencing (NGS), we can perform large-scale BCR and TCR repertoire studies with a throughput of over a million sequences per host. However, analyzing these data sets requires novel mathematical methods.
We aim to develop mathematical tools to answer the following fundamental questions:
A) What elements shape the T-cell repertoire in the thymus and its transition to the periphery?
B) What drives the initial B-cell repertoire formation in the bone marrow?
C) What characterizes the extreme clonal amplification in B-cell lymphoma?
D) How can we estimate the repertoire size?
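To make question D concrete, here is a minimal sketch of one classical approach to estimating repertoire size from a sequenced sample: the bias-corrected Chao1 richness estimator, which extrapolates from the number of clonotypes seen exactly once and exactly twice. It is offered as an illustration under stated assumptions, not as the method developed in this project. The function name `chao1` and the input format (one clonotype label per sequenced read) are hypothetical choices made for this example.

```python
from collections import Counter


def chao1(clonotypes):
    """Bias-corrected Chao1 lower-bound estimate of clonotype richness.

    `clonotypes` is an iterable with one clonotype label per sequenced read
    (an assumed input format, chosen for illustration only).
    """
    counts = Counter(clonotypes)           # clonotype -> number of reads
    abundance = Counter(counts.values())   # k -> number of clonotypes seen k times
    s_obs = len(counts)                    # clonotypes observed at least once
    f1 = abundance.get(1, 0)               # singletons
    f2 = abundance.get(2, 0)               # doubletons
    # Bias-corrected form: stays defined even when there are no doubletons.
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


# Toy sample: five observed clonotypes, three singletons, one doubleton.
sample = ["A", "A", "A", "B", "B", "C", "D", "E"]
estimate = chao1(sample)
```

Chao1 gives only a lower bound on richness, and it assumes that reads are sampled independently from the repertoire. Clonal expansion, amplification bias and sequencing errors all violate that assumption in practice, which is one reason repertoire size estimation calls for more careful mathematical treatment.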