  Reference-guided assembly of four diverse Arabidopsis thaliana genomes

Schneeberger, K., Ossowski, S., Ott, F., Klein, J., Wang, X., Lanz, C., et al. (2011). Reference-guided assembly of four diverse Arabidopsis thaliana genomes. Proceedings of the National Academy of Sciences of the United States of America, 108(25), 10249-10254. doi:10.1073/pnas.1107739108.

Creators

 Creators:
Schneeberger, K1, Author           
Ossowski, S1, Author           
Ott, F1, Author           
Klein, JD, Author
Wang, X1, Author           
Lanz, C1, Author           
Smith, LM1, Author           
Cao, J1, Author           
Fitz, J1, Author           
Warthmann, N1, Author           
Henz, SR1, Author           
Huson, DH, Author           
Weigel, D1, Author           
Affiliations:
1 Department Molecular Biology, Max Planck Institute for Developmental Biology, Max Planck Society

Content

 Abstract: We present whole-genome assemblies of four divergent Arabidopsis thaliana strains that complement the 125-Mb reference genome sequence released a decade ago. Using a newly developed reference-guided approach, we assembled large contigs from 9 to 42 Gb of Illumina short-read data from the Landsberg erecta (Ler-1), C24, Bur-0, and Kro-0 strains, which have been sequenced as part of the 1,001 Genomes Project for this species. Using alignments against the reference sequence, we first reduced the complexity of the de novo assembly and later integrated reads without similarity to the reference sequence. As an example, half of the noncentromeric C24 genome was covered by scaffolds that are longer than 260 kb, with a maximum of 2.2 Mb. Moreover, over 96% of the reference genome was covered by the reference-guided assembly, compared with only 87% with a complete de novo assembly. Comparisons with 2 Mb of dideoxy sequence reveal that the per-base error rate of the reference-guided assemblies was below 1 in 10,000. Our assemblies provide a detailed, genomewide picture of large-scale differences between A. thaliana individuals, most of which are difficult to access with alignment-consensus methods only. We demonstrate their practical relevance in studying the expression differences of polymorphic genes and show how the analysis of sRNA sequencing data can lead to erroneous conclusions if aligned against the reference genome alone. Genome assemblies, raw reads, and further information are accessible through http://1001genomes.org/projects/assemblies.html.

Details

 Dates: 2011-06
 Publication Status: Issued
 Rev. Type: Peer
 Identifiers: DOI: 10.1073/pnas.1107739108
PMID: 21646520

Source 1

Title: Proceedings of the National Academy of Sciences of the United States of America
Source Genre: Journal
Publ. Info: WASHINGTON : NATL ACAD SCIENCES
Volume / Issue: 108 (25)
Start / End Page: 10249 - 10254
Identifier: ISSN: 0027-8424