  Making genealogical language classifications available for phylogenetic analysis: Newick trees, unified identifiers, and branch length

Dediu, D. (2018). Making genealogical language classifications available for phylogenetic analysis: Newick trees, unified identifiers, and branch length. Language Dynamics and Change, 8(1), 1-21. doi:10.1163/22105832-00801001.

Files

Dediu_2018_Making_genealogical.pdf (Publisher version), 767KB
Name:
Dediu_2018_Making_genealogical.pdf
Description:
-
OA-Status:
Visibility:
Public
MIME-Type / Checksum:
application/pdf / [MD5]
Technical Metadata:
Copyright Date:
-
Copyright Info:
-
License:
-

Locators

Locator:
https://github.com/ddediu/lgfam-newick (Supplementary material)
Description:
-
OA-Status:

Creators

Creators:
Dediu, Dan [1], Author
Affiliations:
[1] Language and Genetics Department, MPI for Psycholinguistics, Max Planck Society, ou_792549

Content

Free keywords: Branch length; Language family; Newick; Phylogenetics
 Abstract: One of the best-known types of non-independence between languages is caused by genealogical relationships due to descent from a common ancestor. These can be represented by (more or less resolved and controversial) language family trees. In theory, one can argue that language families should be built through the strict application of the comparative method of historical linguistics, but in practice this is not always the case, and there are several proposed classifications of languages into language families, each with its own advantages and disadvantages. A major stumbling block shared by most of them is that they are relatively difficult to use with computational methods, and in particular with phylogenetics. This is due to their lack of standardization, coupled with the general non-availability of branch length information, which encapsulates the amount of evolution taking place on the family tree. In this paper I introduce a method (and its implementation in R) that converts the language classifications provided by four widely-used databases (Ethnologue, WALS, AUTOTYP and Glottolog) into the de facto Newick standard generally used in phylogenetics, aligns the four most used conventions for unique identifiers of linguistic entities (ISO 639-3, WALS, AUTOTYP and Glottocode), and adds branch length information from a variety of sources (the tree's own topology, an externally given numeric constant, or a distance matrix). The R scripts, input data and resulting Newick trees are available under liberal open-source licenses in a GitHub repository (https://github.com/ddediu/lgfam-newick), to encourage and promote the use of phylogenetic methods to investigate linguistic diversity and its temporal dynamics.
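To make the conversion described in the abstract concrete, the following is a minimal Python sketch of serializing a nested classification into a Newick string with a constant branch length. It is not the author's R implementation from the repository; the function name `to_newick`, the `(name, children)` tuple layout, and the placeholder labels (`Family`, `BranchA`, `lang1`, ...) are all assumptions made for illustration. It also assumes that labels contain no characters that are special in Newick (parentheses, commas, colons, semicolons).

```python
def to_newick(node, branch_length=None):
    """Serialize a (name, children) tuple tree into a Newick fragment.

    Hypothetical helper: every node, including the root, receives the same
    constant branch length when one is given.
    """
    name, children = node
    label = name
    if children:
        # Internal node: children go inside parentheses, followed by the node's own name.
        inner = ",".join(to_newick(child, branch_length) for child in children)
        label = f"({inner}){name}"
    if branch_length is not None:
        label += f":{branch_length}"
    return label


# Placeholder classification, not taken from any of the four databases.
family = ("Family", [("BranchA", [("lang1", []), ("lang2", [])]), ("lang3", [])])

# A complete Newick tree ends with a semicolon.
newick = to_newick(family, branch_length=1) + ";"
print(newick)  # ((lang1:1,lang2:1)BranchA:1,lang3:1)Family:1;
```

Calling `to_newick(family)` without a branch length yields the bare topology, which roughly corresponds to a classification before any branch length information has been added.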

Details

Language(s): eng - English
 Dates: 2017, 2018
 Publication Status: Published online
 Pages: -
 Publishing info: -
 Table of Contents: -
 Rev. Type: Peer
 Identifiers: DOI: 10.1163/22105832-00801001
 Degree: -

Source 1

Title: Language Dynamics and Change
Source Genre: Journal
 Creator(s):
Affiliations:
Publ. Info: Leiden : Brill
Pages: -
Volume / Issue: 8 (1)
Sequence Number: -
Start / End Page: 1 - 21
Identifier: -