Abstract
Parasites are ubiquitous in nature and impose a heavy fitness cost on their hosts. Host populations undergo adaptive genetic changes to resist the parasite, while the parasite population adapts to exploit the host. Such reciprocal, adaptive genetic changes are called host-parasite coevolution. Local adaptation is said to have occurred when a population is better adapted to its own environment than to a remote environment, and host-parasite coevolution can result in the host’s local adaptation to its parasite environment. When different environments exert different selective pressures on populations with connected geographic ranges, these populations can adapt locally and diverge from one another, eventually leading to reproductive isolation or ecological speciation. For a trait to play a role in this process, it must influence the carrier’s ecological interactions, as do traits such as salinity tolerance and gill rakers. Such a trait may additionally have a pleiotropic effect on mate choice, in which case it is sometimes called a “magic trait”. One excellent model for the genomic basis of ecological speciation under parasite-mediated selection, and a candidate magic trait, is the Major Histocompatibility Complex (MHC) genomic region, which contains the MHC genes. The MHC gene products are cell-surface proteins that bind small peptides, called epitopes, and present them to other parts of the immune system, thus helping to distinguish between self and non-self antigens and forming a major part of vertebrate adaptive immunity. MHC genes are divided into three classes: Class-I, Class-II, and Class-III. In different parasite environments, populations face different selective pressures, which causes them to adapt to their own environments and diverge from each other. Here, we study the MHC Class-II genomic region in the three-spined stickleback (Gasterosteus aculeatus), a bony fish (teleost) species native to the Northern Hemisphere.
Pairs of stickleback populations can be thought of as lying along a speciation continuum, from a panmictic population, through partial and reversible isolation, to irreversible ecological speciation. This gives us the opportunity to tease apart the factors that move population pairs towards either end of this continuum. However, the genomic basis of traits that contribute to reproductive isolation still needs extensive study. The purpose of this thesis was to fill these gaps in our knowledge, leading to a deeper understanding of the genomics of speciation. The primary objective was to create a multi-layered understanding of the MHC region in the genome of the three-spined stickleback. The first layer was the identification and characterization of this region in the context of the stickleback reference genome. We found that the MHC Class-II region of the three-spined stickleback is located at one telomeric end of Chromosome VII and is split into two parts: a Classical region 0.3 Mb from the end, and a Non-Classical region 2.4 Mb away, closer to the centromere. Many MHC Class-II antigen presentation pathway genes were found on other chromosomes, such as Chr III and Chr IV. The next part of this work was to characterize the enormous amount of allelic and haplotypic variation at the MHC Class-IIB loci. For this, we annotated the genes of the stickleback MHC Class-II region using novel genomic sequences of new haplotypes with different numbers of functional MHC loci, obtained from BAC (Bacterial Artificial Chromosome) library sequence data. After annotating functional genes, we turned our attention to the regulatory elements in this region, which tell us about differences in regulation between various MHC alleles. Only one confirmed immune system gene, Complement Factor 1, was found in the vicinity of the Classical MHC Class-II loci.
The annotation allowed us to define bounding genes for both the Classical and Non-Classical MHC Class-II regions, between which lie functionally similar DNA sequences. These sequences were extracted and compared in the next section. Previous studies of MHC Class-II sequence variants across different stickleback populations had revealed that the number of functional MHC alleles that are inherited together varies between haplotypes, indicating potential Copy Number Variation (CNV). This led to a more detailed study of synteny in this region, which identified blocks of recombination that account for differences in structure between haplotypes. The annotation and synteny work led to the creation of the stickleback MHC Class-II region map, showing genes, promoters, and collinear blocks in three different haplotypes. However, we still wanted to understand the process by which these differences were created, so we studied the transposable elements found in this region. We found transposable elements that accounted for indels and, possibly, for the number and arrangement of MHC loci. This gave us a model for the origin of the extensive CNV in this region. Finally, we turned our attention to population-level variation in the stickleback MHC Class-II region. Using a previously generated whole-genome-sequencing dataset, we studied sequence variation in the assembled region, with other genomic regions serving as controls. We performed tests for selection and neutrality, which gave us insight into the evolutionary history of the stickleback species with respect to parasite-mediated selection. The three-spined stickleback has undergone recent, repeated adaptive radiations. In this process, sticklebacks must have encountered many novel parasite environments, and fast adaptation was critical. Because of the genomic organization of the MHC Class-II regions, sticklebacks were able to generate novel haplotypes from old ones quickly, and they show extensive allelic and copy number variation.
This would allow populations in different environments to have distinct allele pools, which could be one of the reasons why sticklebacks were able to colonize different environments so successfully. As such, the organization of the stickleback MHC Class-II genomic region would be a factor that could move populations quickly along the speciation continuum.