Abstract
The medicinal plant Madagascar periwinkle, Catharanthus roseus (L.) G. Don, produces hundreds of biologically active monoterpene-derived indole alkaloid (MIA) metabolites and is the sole source of the potent, expensive anti-cancer compounds vinblastine and vincristine. Access to a genome sequence would enable insights into the biochemistry, control, and evolution of genes responsible for MIA biosynthesis. However, generation of a near-complete, scaffolded genome is prohibitive for small research communities due to the expense, time, and expertise required. In this study, we generated a genome assembly for C. roseus that provides a near-comprehensive representation of the genic space and reveals the genomic context of key points within the MIA biosynthetic pathway, including physically clustered genes, tandem gene duplication, expression sub-functionalization, and putative neo-functionalization. The genome sequence also facilitated high-resolution co-expression analyses that revealed three distinct clusters of co-expression within the components of the MIA pathway. Coordinated biosynthesis of precursors and intermediates throughout the pathway appears to be a feature of vinblastine/vincristine biosynthesis. The C. roseus genome also revealed localization of enzyme-rich genic regions and transporters near known biosynthetic enzymes, highlighting how even a draft genome sequence can empower the study of high-value specialized metabolites.
Significance Statement
It is now possible to generate, within a single research lab, a plant genome sequence that provides a near-complete representation of genic regions. Here, we report on the genome assembly of the medicinal plant Catharanthus roseus, a producer of many specialized metabolites, including several anti-cancer compounds. We show how the draft genome sequence can facilitate identification, discovery, and an improved understanding of the genetic repertoire involved in specialized metabolism.