Journal Article

Recommending plant taxa for supporting on-site species identification

MPS-Authors

Wäldchen, Jana
Flora Incognita, Dr. Jana Wäldchen, Department Biogeochemical Integration, Prof. Dr. M. Reichstein, Max Planck Institute for Biogeochemistry, Max Planck Society

Rzanny, Michael
Flora Incognita, Dr. Jana Wäldchen, Department Biogeochemical Integration, Prof. Dr. M. Reichstein, Max Planck Institute for Biogeochemistry, Max Planck Society

Fulltext (public)

BGC2856.pdf
(Publisher version), 2MB

Supplementary Material (public)
There is no public supplementary material available
Citation

Wittich, H. C., Seeland, M., Wäldchen, J., Rzanny, M., & Mäder, P. (2018). Recommending plant taxa for supporting on-site species identification. BMC Bioinformatics, 19: 190. doi:10.1186/s12859-018-2201-7.


Cite as: https://hdl.handle.net/21.11116/0000-0001-6450-4
Abstract
Background
Predicting a list of plant taxa most likely to be observed at a given geographical location and time is useful for many scenarios in biodiversity informatics. Since efficient plant species identification is impeded mainly by the large number of possible candidate species, providing a shortlist of likely candidates can significantly expedite the task. Whereas species distribution models rely heavily on geo-referenced occurrence data, such information remains largely unused in plant taxa identification tools.
Results

In this paper, we study the feasibility of computing a ranked shortlist of plant taxa likely to be encountered by an observer in the field. We use the territory of Germany as a case study, with a total of 7.62M records of freely available plant presence-absence data and occurrence records for 2.7k plant taxa. We systematically study the achievable recommendation quality based on two types of source data: binary presence-absence data and individual occurrence records. Furthermore, we study strategies for aggregating records into a taxa recommendation based on the location and date of an observation.
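As a rough illustration of what aggregating records into a location- and date-based recommendation could look like, the following Python sketch ranks taxa by the number of occurrence records in the observer's grid cell within a window of months, and appends taxa that are only known from presence-absence data. The grid-cell identifiers, the one-month window, the record layout, and the function name `recommend_taxa` are all assumptions made for this example; the abstract does not specify the aggregation strategies the authors actually evaluated.

```python
from collections import Counter

def recommend_taxa(records, cell, month, presence=None, month_window=1):
    """Rank taxa for a grid cell and month (hypothetical aggregation scheme).

    records  : iterable of (taxon, cell_id, month) occurrence records
    presence : optional dict mapping cell_id -> set of taxa marked present
    """
    counts = Counter()
    for taxon, rec_cell, rec_month in records:
        # Circular month distance, so December and January are neighbours.
        diff = abs(rec_month - month) % 12
        gap = min(diff, 12 - diff)
        if rec_cell == cell and gap <= month_window:
            counts[taxon] += 1
    # Taxa known only from presence-absence data enter the list with count 0.
    for taxon in (presence or {}).get(cell, ()):
        counts.setdefault(taxon, 0)
    # Most frequently recorded taxa first; ties broken alphabetically.
    return [t for t, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]
```

Ranking by raw counts is only one possible choice; a real system might normalize for sampling effort or use a wider spatial neighbourhood.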
Conclusion

We evaluate recommendations using 28k geo-referenced and taxa-labeled plant images hosted on the Flickr website as an independent test dataset. Relying on location information from presence-absence data alone results in an average recall of 82%. However, we find that occurrence records are complementary to presence-absence data, and using both in combination yields a considerably higher recall of 96% along with improved ranking metrics. Ultimately, by reducing the list of candidate taxa by an average of 62%, a spatio-temporal prior can substantially expedite the overall identification problem.
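The two headline figures, recall and average reduction of the candidate list, can be computed along the following lines. This is a minimal sketch under stated assumptions: `evaluate_shortlists`, the `recommend` callback, and the observation layout are hypothetical names chosen for illustration, and recall is taken here to mean the fraction of test observations whose true taxon appears in the recommended shortlist.

```python
def evaluate_shortlists(observations, recommend, n_taxa_total):
    """Return (recall, mean list reduction) over labeled test observations.

    observations : list of (true_taxon, context) pairs, where context is
                   whatever the recommender needs (e.g. location and date)
    recommend    : function mapping a context to a shortlist of taxa
    n_taxa_total : size of the full candidate list without any prior
    """
    hits = 0
    reductions = []
    for true_taxon, context in observations:
        shortlist = recommend(context)
        # Count a hit when the true taxon survives in the shortlist.
        hits += true_taxon in shortlist
        # Fraction of the full candidate list removed by the shortlist.
        reductions.append(1 - len(shortlist) / n_taxa_total)
    n = len(observations)
    return hits / n, sum(reductions) / n
```

The two quantities trade off against each other: a shorter list removes more candidates but risks dropping the true taxon, which is why they are reported together.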