Abstract
In this thesis, an approach to calling single nucleotide polymorphisms (SNPs)
in genomes of diploid individuals is developed based on estimates of the global per-base
error rate and heterozygosity. This new method is implemented along with an alternative
calling strategy based on per-site base error rates. The implemented procedures are evaluated
on simulated sequencing data and on real sequencing data from inbred mice, and compared
with established SNP calling tools. While the newly developed method works well on simulated
data, the results on real data do not match those of the established tools. According
to population genetic theory, inbred mouse genomes should be completely homozygous,
but this is not observed in practice. Because mutations in critical genes are potentially
lethal, the hypothesis that SNPs in inbred mice are located in such genes is investigated. The
Overrepresentation Enrichment Analysis performed on sequencing data of the inbred mouse
strain C57BL/6NJ provides some support for the hypothesis but remains inconclusive.