Record

This record has been discarded!

Discarded

Preprint

Managing historical linguistic data for computational phylogenetics and computer-assisted language comparison

MPG Authors

/persons/resource/persons96313

Forkel, Robert
Linguistic and Cultural Evolution, Max Planck Institute for the Science of Human History, Max Planck Society;

/persons/resource/persons138255

Gray, Russell D.
Linguistic and Cultural Evolution, Max Planck Institute for the Science of Human History, Max Planck Society;

/persons/resource/persons185771

Greenhill, Simon J.
Linguistic and Cultural Evolution, Max Planck Institute for the Science of Human History, Max Planck Society;

/persons/resource/persons201886

List, Johann-Mattis
CALC, Max Planck Institute for the Science of Human History, Max Planck Society;

/persons/resource/persons222944

Rzymski, Christoph
Linguistic and Cultural Evolution, Max Planck Institute for the Science of Human History, Max Planck Society;

/persons/resource/persons220957

Tresoldi, Tiago
CALC, Max Planck Institute for the Science of Human History, Max Planck Society;

Citation

Forkel, R., Gray, R. D., Greenhill, S. J., List, J.-M., Rzymski, C., & Tresoldi, T. (2019). Managing historical linguistic data for computational phylogenetics and computer-assisted language comparison. Humanities Commons. doi:10.17613/pwva-kz72.

Abstract

The popularisation of computer-based methods in comparative linguistics has raised awareness of the problems caused by limited data sustainability and of the need for proper data management. In this use-case and its accompanying tutorial, we present principles of data management as applied to computational phylogenetics and computer-assisted language comparison, and showcase the solutions we recommend. Instead of enumerating the many ways to code and use linguistic data in a phylogenetic analysis, we illustrate our suggestions for phylogenetic data management with a workflow based on a concrete analysis of a published dataset. We explore the information, file formats, processes, and software involved, and show how to collect and store cross-linguistic information, how to guarantee that datasets are cross-linguistically comparable, how to store intermediate and final results of the analyses, and how to share data in a reusable form by relying on the tools and principles of the CLDF initiative.