Evaluation of Population-Based Haplotype Phasing Algorithms

Sethi, Riccha

Thesis

MPS-Authors

Sethi, Riccha
International Max Planck Research School, MPI for Informatics, Max Planck Society;
Computational Biology and Applied Algorithmics, MPI for Informatics, Max Planck Society;

Citation

Sethi, R. (2016). Evaluation of Population-Based Haplotype Phasing Algorithms. Master Thesis, Universität des Saarlandes, Saarbrücken.

Cite as: https://hdl.handle.net/11858/00-001M-0000-002C-41DA-7

Abstract

Knowing the correct order of alleles along haplotypes has many applications in genome-wide association studies (GWAS) and population genetics. A considerable number of computational and statistical algorithms have been developed for haplotype phasing. Historically, these algorithms were compared on simulated population data with sparse markers, modeled on genotype data from the HapMap project. With advances in next-generation sequencing (NGS) and its falling cost, thousands of individuals from around the world have now been sequenced in the 1000 Genomes Project. This has produced genotype data for individuals of different ethnicities, with much denser genetic variation. Here, we develop a scalable approach for assessing state-of-the-art population-based haplotype phasing algorithms on benchmark data generated by simulating the population (unrelated and related individuals), the NGS pipeline, and genotype calling. For phase inference in unrelated individuals, the most accurate algorithm was MVNCall (v1), while in related individuals the DuoHMM approach of Shapeit (v2) had the lowest switch error rate, 0.298% (with true genotype likelihoods). We also conducted a comprehensive assessment of algorithms for imputing missing genotypes in the population using a reference panel. On this task, Impute2 (v2.3.2) and Beagle (v4.1) both performed competitively across the imputation scenarios, with genotype concordance rates above 99%. However, Impute2 was better at imputing genotypes with a minor allele frequency below 0.025 in the reference panel.
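The two accuracy measures named in the abstract, switch error rate and genotype concordance rate, can be illustrated with a minimal Python sketch. This is not the thesis's evaluation code: the function names, the tuple representation of a phased genotype, and the choice to skip sites whose genotype was called incorrectly are all assumptions made here for illustration, following the commonly used definitions of these metrics.

```python
from typing import List, Sequence, Tuple

# Hypothetical representation: one phased genotype per site,
# as (allele on haplotype 1, allele on haplotype 2).
Genotype = Tuple[int, int]


def switch_error_rate(truth: Sequence[Genotype],
                      inferred: Sequence[Genotype]) -> float:
    """Fraction of consecutive heterozygous site pairs whose relative
    phase differs between the true and the inferred haplotypes."""
    if len(truth) != len(inferred):
        raise ValueError("truth and inferred must cover the same sites")

    # For each informative site, record whether the inferred phase has the
    # same orientation as the truth (True) or is flipped (False).
    orientations: List[bool] = []
    for t, p in zip(truth, inferred):
        if t[0] == t[1]:
            continue  # homozygous sites carry no phase information
        if sorted(t) != sorted(p):
            continue  # assumption: sites with a wrong genotype are skipped
        orientations.append(tuple(t) == tuple(p))

    if len(orientations) < 2:
        return 0.0  # no pair of heterozygous sites to compare

    # A switch error occurs wherever the orientation changes between
    # two consecutive heterozygous sites.
    switches = sum(a != b for a, b in zip(orientations, orientations[1:]))
    return switches / (len(orientations) - 1)


def genotype_concordance(truth: Sequence[Genotype],
                         imputed: Sequence[Genotype]) -> float:
    """Fraction of sites whose imputed genotype equals the true genotype,
    ignoring phase."""
    if len(truth) != len(imputed) or len(truth) == 0:
        raise ValueError("need two non-empty genotype lists of equal length")
    matches = sum(sorted(t) == sorted(p) for t, p in zip(truth, imputed))
    return matches / len(truth)


if __name__ == "__main__":
    true_gt = [(0, 1), (0, 1), (1, 0), (0, 0), (0, 1)]
    phased = [(0, 1), (1, 0), (0, 1), (0, 0), (0, 1)]
    imputed = [(0, 1), (1, 1), (0, 1), (0, 0), (1, 0)]
    print(switch_error_rate(true_gt, phased))
    print(genotype_concordance(true_gt, imputed))
```

In this toy input, the four heterozygous sites give three consecutive pairs, two of which change orientation, so the switch error rate is 2/3. Four of the five imputed genotypes match the truth when phase is ignored, so the concordance is 0.8.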