Abstract:
In vertebrate genomes, non-repeated sequence regions are associated with genetic
functions. The uniqueness of a sequence region can be measured by its match
complexity. A sliding-window analysis tracks how match complexity changes along a
genome sequence, yielding a complexity value for each subsequence (window) of the
genome. The greater the proportion of unique segments in a subsequence, the higher its
match complexity. Repetitive subsequences therefore have a low match complexity, and
unique subsequences a high match complexity.
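The sliding-window idea can be illustrated with a minimal sketch. It does not compute the match complexity itself, whose exact definition is not given here. Instead it uses a simple stand-in: the fraction of k-mers in each window that occur exactly once in the whole sequence. The function name `unique_kmer_fraction_windows` and the parameters `window`, `step` and `k` are assumptions made for this illustration.

```python
from collections import Counter


def unique_kmer_fraction_windows(seq, window, step, k):
    """Return (window_start, fraction) pairs for a sliding window over seq.

    The fraction is the share of the window's k-mers that occur exactly
    once in the whole sequence. This is a simple proxy for uniqueness,
    NOT the match complexity used in the thesis.
    """
    if k < 1 or window < k or step < 1:
        raise ValueError("require k >= 1, window >= k and step >= 1")
    seq = seq.upper()
    # Count every k-mer once across the full sequence.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    results = []
    for start in range(0, len(seq) - window + 1, step):
        # k-mers that lie completely inside the current window.
        kmers = [seq[i:i + k] for i in range(start, start + window - k + 1)]
        unique = sum(1 for kmer in kmers if counts[kmer] == 1)
        results.append((start, unique / len(kmers)))
    return results


if __name__ == "__main__":
    # Toy sequence: a repetitive half followed by a non-repetitive half.
    toy = "ACACACACACAC" + "GATTCGGCTAAG"
    for start, fraction in unique_kmer_fraction_windows(toy, window=12, step=12, k=4):
        print(start, fraction)
```

In this toy input the repetitive first window scores low and the non-repetitive second window scores high, mirroring the qualitative behaviour described above. Real analyses would use window sizes in the kilobase range, such as the roughly 10 kilobases mentioned below.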
This thesis examines the relationship between complexity and function in vertebrate
genomes. Here I show that regions with high match complexity in the genomes of frog,
chicken, cow, mouse, human, rat, chimpanzee and zebrafish contain up to 38 times more
developmental genes than expected. This extends previous findings for the mouse and
human genomes to a representative group of vertebrates. For small window
sizes of around 10 kilobases, high-complexity regions of mammals contain an above-average
number of genes. In vertebrates, long regions of high complexity contain the Ttn gene
and genes of the Hox clusters. High-complexity regions of chicken, mouse, human, rat
and zebrafish are enriched for genes involved in developmental processes.
The observed developmental processes mainly contribute to embryonic development or
the formation of anatomical structures.
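A statement such as "38 times more developmental genes than expected" is a fold enrichment: the observed count divided by the count expected by chance. The sketch below assumes the expectation is the genome-wide proportion of developmental genes applied to the genes in high-complexity regions. The thesis may compute its expectation differently, and the function name and the numbers in the usage example are hypothetical.

```python
def fold_enrichment(hits_in_regions, genes_in_regions, hits_in_genome, genes_in_genome):
    """Observed / expected count of annotated genes in a set of regions.

    Assumption: the expected count is the genome-wide share of annotated
    genes multiplied by the number of genes in the regions.
    """
    if genes_in_genome <= 0 or hits_in_genome <= 0 or genes_in_regions <= 0:
        raise ValueError("counts must be positive")
    expected = genes_in_regions * hits_in_genome / genes_in_genome
    return hits_in_regions / expected


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the arithmetic:
    # 30 of 100 genes in the regions are developmental, against
    # 500 of 20000 genes genome-wide.
    print(fold_enrichment(30, 100, 500, 20000))
```

With these hypothetical counts the expected number is 2.5 genes, so observing 30 gives a 12-fold enrichment.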
The results demonstrate a significant correlation between high complexity of genomic
sequences and developmental functions in vertebrates. They thus broaden our understanding
of the relationship between nucleic acid sequences and biological function.
Complexity measurements can be used as a starting point for functional analysis of
vertebrate genomes. In particular, high-complexity regions without a known function are
promising candidates for future experimental studies.