
Discarded

Preprint

Managing historical linguistic data for computational phylogenetics and computer-assisted language comparison

MPS-Authors
Forkel, Robert
Linguistic and Cultural Evolution, Max Planck Institute for the Science of Human History, Max Planck Society;

Gray, Russell D.
Linguistic and Cultural Evolution, Max Planck Institute for the Science of Human History, Max Planck Society;

Greenhill, Simon J.
Linguistic and Cultural Evolution, Max Planck Institute for the Science of Human History, Max Planck Society;

List, Johann-Mattis
CALC, Max Planck Institute for the Science of Human History, Max Planck Society;

Rzymski, Christoph
Linguistic and Cultural Evolution, Max Planck Institute for the Science of Human History, Max Planck Society;

Tresoldi, Tiago
CALC, Max Planck Institute for the Science of Human History, Max Planck Society;

Citation

Forkel, R., Gray, R. D., Greenhill, S. J., List, J.-M., Rzymski, C., & Tresoldi, T. (2019). Managing historical linguistic data for computational phylogenetics and computer-assisted language comparison. Humanities Commons. doi:10.17613/pwva-kz72.


Abstract
The popularisation of computer-based methods in comparative linguistics has led to a greater awareness of issues resulting from limited data sustainability and a lack of proper data management. In this use case and its accompanying tutorial, we present principles of data management as applied to computational phylogenetics and computer-assisted language comparison, and showcase the solutions we recommend. Rather than enumerating the many ways to code and use linguistic data in a phylogenetic analysis, we illustrate our suggestions for phylogenetic data management with a workflow based on a concrete analysis. Using a published dataset, we show how data should be managed and explore the information, file formats, processes, and software involved. We explain and show how to collect and store cross-linguistic information, how to guarantee that datasets are cross-linguistically comparable, how to store intermediate and final results of the analyses, and how to share data in a reusable form by relying on the tools and principles of the Cross-Linguistic Data Formats (CLDF) initiative.