Released

Journal Article

Association Plots: Visualizing associations in high-dimensional correspondence analysis biplots

MPS-Authors
Gralinska, Elzbieta
Gene regulation (Martin Vingron), Dept. of Computational Molecular Biology (Head: Martin Vingron), Max Planck Institute for Molecular Genetics, Max Planck Society;

Vingron, Martin
Gene regulation (Martin Vingron), Dept. of Computational Molecular Biology (Head: Martin Vingron), Max Planck Institute for Molecular Genetics, Max Planck Society;

External Resource
No external resources are shared
Fulltext (public)
Supplementary Material (public)
There is no public supplementary material available
Citation

Gralinska, E., & Vingron, M. (2020). Association Plots: Visualizing associations in high-dimensional correspondence analysis biplots. bioRxiv (Preprint Server). doi:10.1101/2020.10.23.352096.


Cite as: http://hdl.handle.net/21.11116/0000-0007-8241-C
Abstract
In molecular biology, as in many other fields of science, data often come in the form of matrices or contingency tables with many measurements (rows) for a set of variables (columns). Projection methods such as Principal Component Analysis or Correspondence Analysis can provide an overview of such data, but when the matrix is very large, the loss of information upon projection into two or three dimensions may be dramatic. However, when the set of variables can be grouped into clusters, this opens up a new angle on the data. We focus on the question of which measurements are associated with a cluster and distinguish it from other clusters. Correspondence Analysis employs a geometry geared towards answering this question. We exploit this feature to introduce Association Plots for visualizing cluster-specific measurements in complex data. Association Plots are two-dimensional, independent of the size of the data matrix or cluster, and depict the measurements associated with a cluster of variables. We demonstrate our method first on a small data set and then on a genomic example comprising more than 10,000 conditions. We show that Association Plots can clearly highlight those measurements that characterize a cluster of variables.