Abstract
Background: The systematic analysis of large amounts of comparable plant trait data can support investigations
into phylogenetics and ecological adaptation, with broad applications in evolutionary biology, agriculture,
conservation, and the functioning of ecosystems. Floras, i.e., books collecting the information on all known plant
species found within a region, are a potentially rich source of such plant trait data. Floras describe plant traits with a
focus on morphology and other traits relevant for species identification, and also record other characteristics of plant
species, such as ecological affinities, distribution, economic value, health applications, and traditional uses.
However, a key limitation in systematically analyzing information in Floras is the lack of a standardized vocabulary for
the described traits as well as the difficulties in extracting structured information from free text.
Results: We have developed the Flora Phenotype Ontology (FLOPO), an ontology for describing traits of plant
species found in Floras. We used the Plant Ontology (PO) and the Phenotype And Trait Ontology (PATO) to extract
entity-quality relationships from digitized taxon descriptions in Floras, and used a formal ontological approach based
on phenotype description patterns and automated reasoning to generate the FLOPO. The resulting ontology consists
of 25,407 classes and is based on the PO and PATO. The classified ontology closely follows the structure of the Plant
Ontology in that the primary axis of classification is the observed plant anatomical structure, and more specific traits
are then classified based on parthood and subclass relations between anatomical structures as well as subclass
relations between phenotypic qualities.
Conclusions: The FLOPO is primarily intended as a framework for computationally integrating plant traits
across all species and higher taxa of flowering plants. Importantly, it is not intended to replace
established vocabularies or ontologies, but rather to serve as an overarching framework within which different
application- and domain-specific ontologies, thesauri, and vocabularies of phenotypes observed in flowering plants can be integrated.
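
The entity-quality approach summarized in the Results can be illustrated with a minimal Python sketch. It pairs an anatomical entity with a phenotypic quality, renders the pair as a class expression, and orders two such classes using subclass and parthood links between entities and subclass links between qualities. Everything here is an assumption made for illustration: the toy hierarchies and their labels are not actual PO or PATO content, the class-expression pattern in `eq_class` is a hypothetical rendering rather than the axiom pattern confirmed by this abstract, and the `subsumes` check is a simplified stand-in for automated reasoning with an OWL reasoner.

```python
# Toy hierarchies (child -> parent). Labels are illustrative only and are
# NOT taken from the real Plant Ontology (PO) or PATO.
ENTITY_IS_A = {"leaf": "phyllome", "petal": "phyllome"}
ENTITY_PART_OF = {"leaf lamina": "leaf"}
QUALITY_IS_A = {"green": "color", "serrate": "shape"}


def ancestors(term, *parent_maps):
    """Return the term plus everything reachable by following the given
    child -> parent maps transitively."""
    seen = {term}
    stack = [term]
    while stack:
        current = stack.pop()
        for parent_map in parent_maps:
            parent = parent_map.get(current)
            if parent is not None and parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def eq_class(entity, quality):
    """Render an entity-quality pair as a class expression.
    The pattern used here is an assumption, not the documented FLOPO axiom."""
    return f"'has part' some ({entity} and 'has quality' some {quality})"


def subsumes(general, specific):
    """Simplified classification rule: (E2, Q2) is treated as more general
    than (E1, Q1) when E2 is reachable from E1 via subclass or parthood
    links and Q2 is reachable from Q1 via subclass links."""
    general_entity, general_quality = general
    specific_entity, specific_quality = specific
    entity_ok = general_entity in ancestors(
        specific_entity, ENTITY_IS_A, ENTITY_PART_OF
    )
    quality_ok = general_quality in ancestors(specific_quality, QUALITY_IS_A)
    return entity_ok and quality_ok


if __name__ == "__main__":
    print(eq_class("leaf lamina", "green"))
    print(subsumes(("leaf", "color"), ("leaf lamina", "green")))
    print(subsumes(("petal", "color"), ("leaf", "green")))
```

In this sketch the anatomical entity acts as the primary axis of classification, mirroring the description in the Results: a trait of the leaf lamina is placed under the corresponding trait of the leaf, while traits of unrelated structures stay separate.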