  Unsupervised literature mining approaches for extracting relationships pertaining to habitats and reproductive conditions of plant species

Gabud, R., Lapitan, P., Mariano, V., Mendoza, E. R., Pampolina, N., Clariño, M. A. A., et al. (2024). Unsupervised literature mining approaches for extracting relationships pertaining to habitats and reproductive conditions of plant species. Frontiers in Artificial Intelligence, 7: 1371411. doi:10.3389/frai.2024.1371411.

Creators

 Creators:
Gabud, R., Author
Lapitan, P., Author
Mariano, V., Author
Mendoza, Eduardo R. [1], Author
Pampolina, N., Author
Clariño, M. A. A., Author
Batista-Navarro, R., Author
Affiliations:
[1] Oesterhelt, Dieter / Membrane Biochemistry, Max Planck Institute of Biochemistry, Max Planck Society, ou_1565164

Content

Free keywords: relation extraction; information extraction; unsupervised methods; rule-based methods; transformer models; biodiversity; forests; Computer Science
 Abstract:
Introduction: Fine-grained, descriptive information on the habitats and reproductive conditions of plant species is crucial in forest restoration and rehabilitation efforts. Precise timing of fruit collection and knowledge of species' habitat preferences and reproductive status are necessary, especially for tropical plant species that have short-lived recalcitrant seeds and for those that exhibit complex reproductive patterns, e.g., species with supra-annual mass flowering events that may occur at irregular intervals. Understanding plant regeneration when planning for effective reforestation can be aided by providing access to structured information, e.g., in knowledge bases, that spans years if not decades and covers a wide range of geographic locations. The content of such a resource can be enriched with literature-derived information on species' time-sensitive reproductive conditions and location-specific habitats.
Methods: We sought to develop unsupervised approaches to extract relationships pertaining to habitats and their locations, and to reproductive conditions of plant species and their corresponding temporal information. First, we handcrafted rules for a traditional rule-based pattern matching approach. We then developed a relation extraction approach building upon transformer models, namely the Text-to-Text Transfer Transformer (T5), casting the relation extraction problem as a question answering and natural language inference task. Finally, we propose a novel unsupervised hybrid approach that combines our rule-based and transformer-based approaches.
Results: Evaluation of our hybrid approach on an annotated corpus of biodiversity-focused documents demonstrated an improvement of up to 15 percentage points in recall and the best performance over solely rule-based and solely transformer-based methods, with F1-scores ranging from 89.61% to 96.75% for reproductive condition - temporal expression relations, and from 85.39% to 89.90% for habitat - geographic location relations. Our work shows that even without training models on any domain-specific labeled dataset, we are able to extract relationships between biodiversity concepts from literature with satisfactory performance.
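
As a rough illustration of the hybrid idea described in the abstract, the sketch below pairs a single handcrafted regular-expression rule with a pluggable model-based extractor and merges their outputs. It is not the authors' implementation: the rule, the function names (`rule_based`, `hybrid`), and the choice to combine the two extractors by set union are assumptions made here for illustration, since the abstract does not say how the approaches are combined. The model-based extractor is a stand-in callable; in practice it might wrap a T5 question answering prompt.

```python
import re
from typing import Callable, Set, Tuple

# (reproductive condition, temporal expression); an illustrative type only
Relation = Tuple[str, str]

MONTH = (r"(?:January|February|March|April|May|June|July|August|"
         r"September|October|November|December)")

# Hypothetical handcrafted rule (not taken from the paper): a condition word,
# then "in/during/from <Month>[ to <Month>]" within the same clause.
RULE = re.compile(
    rf"\b(flowering|fruiting|flowers|fruits)\b[^.;]*?"
    rf"\b(?:in|during|from)\s+({MONTH}(?:\s+to\s+{MONTH})?)",
    re.IGNORECASE,
)


def rule_based(sentence: str) -> Set[Relation]:
    """Return relations matched by the single illustrative rule."""
    return {(m.group(1).lower(), m.group(2)) for m in RULE.finditer(sentence)}


def hybrid(sentence: str,
           model_based: Callable[[str], Set[Relation]]) -> Set[Relation]:
    """Assumed combination strategy: union of rule and model outputs."""
    return rule_based(sentence) | model_based(sentence)


if __name__ == "__main__":
    # Stand-in for a transformer-based extractor; returns nothing here.
    no_model = lambda s: set()
    print(hybrid("Flowering occurs from March to May.", no_model))
```

Taking the union can only add relations to what either extractor finds alone, which is one way a combined system could raise recall; whether it also preserves precision depends on the quality of each component.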

Details

Language(s): eng - English
 Dates: 2024-05-23
 Publication Status: Issued
 Pages: -
 Publishing info: -
 Table of Contents: -
 Rev. Type: -
 Identifiers: Other: WOS:001241781100001
DOI: 10.3389/frai.2024.1371411
 Degree: -

Source 1

Title: Frontiers in Artificial Intelligence
Source Genre: Journal
 Creator(s):
Affiliations:
Publ. Info: Lausanne, Switzerland : Frontiers Research Foundation
Pages: -
Volume / Issue: 7
Sequence Number: 1371411
Start / End Page: -
Identifier: ISSN: 2624-8212
CoNE: https://pure.mpg.de/cone/journals/resource/2624-8212