Journal Article

Improved visualization of high-dimensional data using the distance-of-distance transformation

MPS-Authors

Vinck, Martin
Ernst Strüngmann Institute (ESI) for Neuroscience in Cooperation with Max Planck Society, Max Planck Society;
Vinck Lab, Ernst Strüngmann Institute (ESI) for Neuroscience in Cooperation with Max Planck Society, Max Planck Society;

Fulltext (public)

Liu_2022_ImprovedVisualization.pdf
(Publisher version), 4MB

Citation

Liu, J., & Vinck, M. (2022). Improved visualization of high-dimensional data using the distance-of-distance transformation. PLoS Computational Biology, 18(2): e1010764. doi:10.1371/journal.pcbi.1010764.


Cite as: https://hdl.handle.net/21.11116/0000-000C-85B0-6
Abstract
Author summary
Biological datasets are often high-dimensional, e.g. the genetic profiles of cells or the firing patterns of neural populations. Dimensionality reduction methods like t-SNE are commonly used to represent such high-dimensional data in a low-dimensional embedding space. The resulting visualization helps to identify underlying clustering patterns and sheds light on the information hidden within the data. We show that when the data contain scattered noise points, the clustering patterns tend to be heavily distorted in the embedding. We further show that applying a distance-of-distance (DoD) transformation to the dissimilarity matrix between data points effectively removes the influence of this scattering noise. This neighborhood-based transformation is most effective when the dimensionality of the dataset is high. We show that this technique improves low-dimensional embeddings for several high-dimensional datasets, such as the convolutional neural network representation of natural images or the neuronal population representation of visual stimuli.
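
To make the idea of a distance-of-distance transformation concrete, the sketch below recomputes distances on the rows of a dissimilarity matrix, so that each point is described by its distance profile to all other points. This is an illustration under stated assumptions and not the authors' implementation: the abstract does not give the exact definition, so plain Euclidean distance between full rows is assumed here, and the neighborhood restriction mentioned in the abstract is not modeled. The function names and the toy data are hypothetical.

```python
import math
import random


def pairwise_distances(points):
    """Euclidean distance matrix between the rows of `points`."""
    return [[math.dist(p, q) for q in points] for p in points]


def distance_of_distance(dist):
    """Re-apply the distance computation to the rows of a distance matrix.

    Row i of `dist` is treated as a new feature vector for point i,
    namely its distance profile to every point in the dataset.
    Assumption: Euclidean distance between full rows; the published
    definition may differ (e.g. by restricting to a neighborhood).
    """
    return pairwise_distances(dist)


random.seed(0)
dim = 50  # toy stand-in for a high-dimensional feature space

# Toy data: two tight clusters plus a few widely scattered noise points.
cluster_a = [[random.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(5)]
cluster_b = [[random.gauss(1.0, 0.1) for _ in range(dim)] for _ in range(5)]
noise = [[random.uniform(-3.0, 3.0) for _ in range(dim)] for _ in range(3)]
points = cluster_a + cluster_b + noise

dist = pairwise_distances(points)   # ordinary dissimilarity matrix
dod = distance_of_distance(dist)    # transformed dissimilarity matrix
```

The transformed matrix `dod` has the same shape as `dist` and could be handed to an embedding method that accepts precomputed dissimilarities in place of the original matrix.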