De novo and haplotype assembly of polyploid genomes

Moeinzadeh, Mohammadhossein

doi:10.17169/refubium-2712

Moeinzadeh, M. (2018). De novo and haplotype assembly of polyploid genomes. PhD Thesis. doi:10.17169/refubium-2712.

Item is Released

Basic

Item Permalink: https://hdl.handle.net/21.11116/0000-0003-7462-C
Version Permalink: https://hdl.handle.net/21.11116/0000-000F-13E6-8

Genre: Thesis

Creators

Creators:
Moeinzadeh, Mohammadhossein (1, 2), Author
Vingron, Martin (1), Referee

Affiliations:
(1) Transcriptional Regulation (Martin Vingron), Dept. of Computational Molecular Biology (Head: Martin Vingron), Max Planck Institute for Molecular Genetics, Max Planck Society, ou_1479639
(2) Fachbereich Mathematik und Informatik der Freien Universität Berlin, ou_persistent22

Content

Free keywords: Haplotype reconstruction; polyploid genomes; genome assembly

Abstract: In this thesis, we focus on the problem of reconstructing haplotypes for polyploid genomes and on the use of the called haplotypes in the de novo assembly of these genomes. We approach this topic by exploring short-read sequence data from the highly heterozygous hexaploid sweet potato genome. First, we investigate the role of heterozygosity and ploidy in reconstructing haplotypes from short reads. In short, higher heterozygosity provides a higher number of reads that are useful for reconstructing haplotypes, whereas polyploidy makes it harder to assemble reads into longer sequences; we call this the problem of Ambiguity of Merging fragments. We address this problem and show that short reads can be assembled into haplotypes with high accuracy. To this end, we propose a new algorithm, called Ranbow, and evaluate its performance on real and simulated datasets from the tetraploid Capsella bursa-pastoris (shepherd's purse) and hexaploid Ipomoea batatas (sweet potato) genomes. We show that our method achieves higher accuracy and longer assembled haplotypes than other methods. Next, we present the de novo assembly pipeline for the sweet potato genome, which uses the computed haplotypes to improve the genome assembly. This novel approach, called haplo-scaffolders, uses the assembled haplotypes to rescue a set of potential connections that were hidden by the differences between the true haplotypes and the reference sequence. These connections are obtained by mapping the reads to the haplotypes and transferring the connection information to the reference level. This process can be repeated, updating the scaffold set each time, to further improve the genome assembly. We show that this strategy substantially improves the N50 and the maximum scaffold length of the assembled sweet potato genome.
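
The abstract reports improvements in N50 and maximum scaffold length. As a reminder of what these metrics measure, the following is a minimal sketch of the standard N50 definition: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembly length. The function name `n50` and the scaffold lengths are hypothetical illustrations and are not taken from the thesis.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)  # longest scaffolds first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that brings the running sum to at least
        # half of the total assembly length defines the N50.
        if running >= half_total:
            return length


# Hypothetical scaffold lengths in base pairs (illustration only).
before = [100, 200, 300, 400, 500]
# Same total length after two hypothetical joins: 100+200 and 300+400.
after = [300, 700, 500]

print("N50 before:", n50(before), "max:", max(before))
print("N50 after: ", n50(after), "max:", max(after))
```

In this toy case, joining scaffolds leaves the total length unchanged but raises both the N50 and the maximum scaffold length, which is the kind of improvement that scaffolding aims for.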

Details

Language(s): eng - English

Dates: Accepted: 2018; Published Online: 2019-07-01

Publication Status: Published online

Pages: vii, 157 pp.

Publishing info: -

Table of Contents: -

Rev. Type: -

Identifiers: DOI: 10.17169/refubium-2712
URI: https://refubium.fu-berlin.de/handle/fub188/24952

Degree: PhD
