Signal or Noise? Investigating the impact of highly variable sites on phylogenetic inference

Klaere, S., Geldschlaeger, O., Basten, J., & Shepherd, D. (2017). Signal or Noise? Investigating the impact of highly variable sites on phylogenetic inference. In 21st New Zealand Phylogenomics Meeting (Waiheke 2017) (p. 13).

Tools for selecting the best model from a given set of models are abundant in phylogenetics. However, as has been pointed out repeatedly since the 1980s, an essential step in statistical inference, assessing how well the model fits the data, remains underappreciated. There have been few attempts at introducing general tests of fit, with mixed success and little to no implementation. Residual diagnostic tools, on the other hand, have been studied, primarily in the framework of tree-of-life inference, where the common ancestor is so old that some sites have accumulated so many substitutions that they might support an alternative history. This field has produced multiple indices to assess the noise level of a site, and thus its variability. The usual process is then to remove noisy sites until the inferred topology remains stable. There seems to be strong disagreement about which approach is most suitable for the questions asked, and about how to provide an automated, quantitative framework for de-noising. Here, we will use a mixture of proposed indices to explore their relatedness, visualise the signal level in an alignment, and provide a few conclusions and challenges.
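To make the idea of a per-site noise index concrete, the following is a minimal sketch only. It uses Shannon entropy of each alignment column as a stand-in variability score, which is an assumption made for illustration and not necessarily one of the indices examined in this work. The function names (`site_entropy`, `site_scores`, `drop_noisiest`) and the toy alignment are hypothetical. The sketch covers scoring and removing columns; re-inferring the tree after each removal and checking whether the topology has stabilised would require a separate phylogenetic inference tool and is not shown.

```python
import math
from collections import Counter


def site_entropy(column):
    """Shannon entropy (in bits) of one alignment column.

    Assumption for this sketch: only unambiguous nucleotides count;
    gaps and ambiguity codes are ignored.
    """
    states = [c for c in column.upper() if c in "ACGT"]
    if not states:
        return 0.0
    counts = Counter(states)
    n = len(states)
    return -sum((k / n) * math.log2(k / n) for k in counts.values())


def site_scores(alignment):
    """Return one entropy score per column of an alignment.

    The alignment is given as a list of equal-length strings.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same length")
    return [
        site_entropy("".join(seq[i] for seq in alignment))
        for i in range(length)
    ]


def drop_noisiest(alignment, n_remove):
    """Remove the n_remove highest-scoring columns, keeping column order."""
    scores = site_scores(alignment)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    removed = set(ranked[:n_remove])
    return [
        "".join(c for i, c in enumerate(seq) if i not in removed)
        for seq in alignment
    ]


# Hypothetical toy alignment: four taxa, five sites.
toy = ["ACGTA", "ACGAA", "ACCCA", "ATTGA"]
scores = site_scores(toy)
reduced = drop_noisiest(toy, 1)
```

In a de-noising workflow of the kind described above, `drop_noisiest` would be called with increasing `n_remove`, with a tree inferred from each reduced alignment, until the topology no longer changes. Different indices will generally rank sites differently, which is one reason comparing them is of interest.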