Free keywords:
-
Abstract:
The popularisation of computer-based methods in comparative linguistics has led to a greater awareness of the issues resulting from limited data sustainability and of the need for proper data management. In this use-case and its accompanying tutorial, we present principles of data management as applied to computational phylogenetics and computer-assisted language comparison, showcasing the solutions we recommend. Rather than enumerating the many ways to code and use linguistic data in a phylogenetic analysis, we illustrate our suggestions for phylogenetic data management with a workflow based on a concrete analysis of a published dataset. We explore the information, file formats, processes, and software involved, and we explain and show how to collect and store cross-linguistic information, how to guarantee that datasets are cross-linguistically comparable, how to store intermediate and final results of the analyses, and how to share data in a reusable form by relying on the tools and principles of the Cross-Linguistic Data Formats (CLDF) initiative.