CorDeep [Web Application]

Valleriani, Matteo; Büttner, Jochen; Martinetz, Julius; El-Hajj, Hassan


Released

Software

MPG Authors

Valleriani, Matteo
Department Structural Changes in Systems of Knowledge, Max Planck Institute for the History of Science, Max Planck Society;

Büttner, Jochen
Department Structural Changes in Systems of Knowledge, Max Planck Institute for the History of Science, Max Planck Society;

Martinetz, Julius
Department Structural Changes in Systems of Knowledge, Max Planck Institute for the History of Science, Max Planck Society;

El-Hajj, Hassan
Department Structural Changes in Systems of Knowledge, Max Planck Institute for the History of Science, Max Planck Society;

External Resources

https://cordeep.mpiwg-berlin.mpg.de/
(any full text)

Full Texts (freely accessible)

No freely accessible full texts are available in PuRe.

Supplementary Material (freely accessible)

No freely accessible supplementary materials are available.

Citation

Valleriani, M., Büttner, J., Martinetz, J., & El-Hajj, H. (2022). CorDeep [Web Application].

Citation link: https://hdl.handle.net/21.11116/0000-000B-547F-9

Abstract

CorDeep is a machine-learning-based web application that extracts visual elements from historical sources and classifies pages that contain numerical and alphanumerical tables. It locates visual elements and classifies them into the following categories: “Content Illustrations,” “Initials,” “Decorations,” and “Printer's Marks.” CorDeep is trained on the Sphaera corpus, a collection of 359 early modern treatises comprising about 78,000 pages, 30,000 visual elements, and 10,000 pages containing tables. The collection consists of early modern textbooks on geocentric cosmology (https://sphaera.mpiwg-berlin.mpg.de). The visual elements were manually annotated with bounding boxes and semantic labels, whereas the pages with tables were identified semiautomatically by an incrementally improved model supervised by a human expert. CorDeep reaches an average precision of up to 98% in detecting visual elements and an accuracy of 94% in classifying pages that contain tables. These values may vary depending on the style, content, and quality of the input images.
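
As a rough illustration of how a predicted bounding box is usually compared with a manual annotation when detection quality is measured, the following Python sketch computes intersection over union (IoU) and applies a match threshold. This is not CorDeep's code: the `Box` layout, the function names, the sample coordinates, and the 0.5 threshold are all assumptions made for illustration, and the record does not state which evaluation protocol underlies the reported average precision.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box in pixel coordinates (hypothetical layout)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    label: str


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes; 0.0 when they do not overlap."""
    inter_w = min(a.x_max, b.x_max) - max(a.x_min, b.x_min)
    inter_h = min(a.y_max, b.y_max) - max(a.y_min, b.y_min)
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    area_a = (a.x_max - a.x_min) * (a.y_max - a.y_min)
    area_b = (b.x_max - b.x_min) * (b.y_max - b.y_min)
    return inter / (area_a + area_b - inter)


def is_true_positive(pred: Box, truth: Box, threshold: float = 0.5) -> bool:
    """Count a prediction as correct if the labels agree and IoU meets the
    threshold. The 0.5 default is an assumed, commonly used value."""
    return pred.label == truth.label and iou(pred, truth) >= threshold


# Made-up example: an annotated initial and a slightly smaller prediction.
annotated = Box(100, 100, 300, 300, "Initials")
predicted = Box(110, 110, 300, 300, "Initials")
print(round(iou(predicted, annotated), 4))
print(is_true_positive(predicted, annotated))
```

Average precision is then typically obtained by ranking all predictions by confidence and summarizing precision over recall, with each prediction judged by a test of this kind.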