Thesis

Finding Reusable Modules Using Sparse Matrix Decompositions

MPS-Authors

Mireles Chavez, Victor
IMPRS for Biology and Computation (Anne-Dominique Gindrat), Dept. of Computational Molecular Biology (Head: Martin Vingron), Max Planck Institute for Molecular Genetics, Max Planck Society;
Fachbereich Mathematik und Informatik der Freien Universität Berlin;

Citation

Mireles Chavez, V. (2022). Finding Reusable Modules Using Sparse Matrix Decompositions. PhD Thesis. doi:10.17169/refubium-34702.


Cite as: https://hdl.handle.net/21.11116/0000-000E-83AA-E
Abstract
Biological systems are often described as being composed of a set of semi-independent modules, each of which can be ascribed its own function, evolutionary history, developmental origin, or a combination thereof. One commonly accepted property of such modules is that they are redeployed across different conditions, so that sets of elements that have been jointly subject to evolutionary processes are repurposed. This property of being composed of reusable modules has been suggested as a hallmark of biological systems, and its significance, both in an evolutionary setting and from a purely epistemic point of view, has long been debated in the literature.

In this work, a formalization of the notion of module reusability is provided, along with an algorithm and a series of measurements that can be used to study it. The final objective is to provide a concise and mathematically expressible vocabulary with which to express statements about reusability, along with the mathematical and computational tools to assert their validity. For this purpose, references in the literature to the reusable nature of biological modules are organized, and a common minimum description is proposed.

In brief, systems are represented by a presence-absence matrix, whose columns represent conditions and whose rows represent elements. Using a stochastic gradient descent algorithm, this matrix is decomposed into a product of two matrices: one represents the composition of the modules, and the other represents the usage of these modules across different conditions. The decomposition is such that the resulting modules are maximally reusable, so upper bounds can be estimated for many properties related to reusability. Analytical results are provided that help describe the space of decompositions of a system, and that relate the problem at hand to other, related problems studied with the use of matrices.
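
The decomposition described above can be illustrated with a minimal sketch. It is not the algorithm of the thesis: the squared-error loss, the L1 penalty used to encourage sparsity, the clipping of entries to [0, 1], and all names and parameter values (factorize, n_modules, lr, l1) are assumptions made here for illustration only, and the sketch does not implement the thesis's criterion of maximal reusability. It only shows the general shape of the approach: a presence-absence matrix X (elements in rows, conditions in columns) is approximated by a product W H, where W describes module composition and H describes module usage, and both factors are fitted by stochastic gradient descent over individual matrix entries.

```python
import numpy as np


def factorize(X, n_modules, n_epochs=200, lr=0.05, l1=0.01, seed=0):
    """Approximate X (elements x conditions) by W @ H using plain SGD.

    W (elements x modules) plays the role of module composition,
    H (modules x conditions) the role of module usage.
    Loss, penalty and clipping are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n_elements, n_conditions = X.shape
    W = rng.uniform(0.0, 0.5, size=(n_elements, n_modules))
    H = rng.uniform(0.0, 0.5, size=(n_modules, n_conditions))

    for _ in range(n_epochs):
        # Visit every matrix entry once per epoch, in random order.
        for k in rng.permutation(n_elements * n_conditions):
            i, j = divmod(int(k), n_conditions)
            err = W[i] @ H[:, j] - X[i, j]
            # Gradients of 0.5 * err**2 plus an L1 penalty on both factors,
            # computed before either factor is updated.
            grad_w = err * H[:, j] + l1
            grad_h = err * W[i] + l1
            # Gradient step, then projection back onto [0, 1].
            W[i] = np.clip(W[i] - lr * grad_w, 0.0, 1.0)
            H[:, j] = np.clip(H[:, j] - lr * grad_h, 0.0, 1.0)
    return W, H


# Toy system: 5 elements, 4 conditions, built from two planted modules.
W_true = np.array([[1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)
H_true = np.array([[1, 1, 0, 1], [0, 1, 1, 1]], dtype=float)
X = W_true @ H_true  # entries are 0 or 1 because the modules do not overlap

W, H = factorize(X, n_modules=2)
mse = float(np.mean((W @ H - X) ** 2))
print("reconstruction error:", mse)
```

In this toy example a module that is used in several columns of H counts as reused across conditions; how reusability is actually measured and maximized is defined in the thesis itself and is not reproduced here.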

Example applications of this framework are provided in this work, for both synthetic and real biological data. The conclusion of these experiments is that the amount of module reusability observed in a system depends on the reusability of individual elements. Furthermore, it is suggested that biological systems exhibit modules which are not particularly reusable when compared to randomly generated systems. Finally, the results presented here suggest that a feature specific to biological systems is the distribution of such reusabilities, with a large number of condition-specific and constitutive modules being present.
