Deutsch
 
Hilfe Datenschutzhinweis Impressum
  DetailsucheBrowse

Datensatz

DATENSATZ AKTIONENEXPORT
  Self-analysis of repeat proteins reveals evolutionarily conserved patterns

Merski, M., Młynarczyk, K., Ludwiczak, J., Skrzeczkowski, J., Dunin-Horkawicz, S., & Górna, M. (2020). Self-analysis of repeat proteins reveals evolutionarily conserved patterns. BMC Bioinformatics, 21(1): 179. doi:10.1186/s12859-020-3493-y.

Item is

Basisdaten

einblenden: ausblenden:
Genre: Zeitschriftenartikel

Externe Referenzen

einblenden:

Urheber

einblenden:
ausblenden:
 Urheber:
Merski, M, Autor
Młynarczyk, K, Autor
Ludwiczak, J, Autor                 
Skrzeczkowski, J, Autor
Dunin-Horkawicz, S1, Autor                 
Górna, MW, Autor
Affiliations:
1External Organizations, ou_persistent22              

Inhalt

einblenden:
ausblenden:
Schlagwörter: -
 Zusammenfassung: Background: Protein repeats can confound sequence analyses because the repetitiveness of their amino acid sequences lead to difficulties in identifying whether similar repeats are due to convergent or divergent evolution. We noted that the patterns derived from traditional "dot plot" protein sequence self-similarity analysis tended to be conserved in sets of related repeat proteins and this conservation could be quantitated using a Jaccard metric.
Results: Comparison of these dot plots obviated the issues due to sequence similarity for analysis of repeat proteins. A high Jaccard similarity score was suggestive of a conserved relationship between closely related repeat proteins. The dot plot patterns decayed quickly in the absence of selective pressure with an expected loss of 50% of Jaccard similarity due to a loss of 8.2% sequence identity. To perform method testing, we assembled a standard set of 79 repeat proteins representing all the subgroups in RepeatsDB. Comparison of known repeat and non-repeat proteins from the PDB suggested that the information content in dot plots could be used to identify repeat proteins from pure sequence with no requirement for structural information. Analysis of the UniRef90 database suggested that 16.9% of all known proteins could be classified as repeat proteins. These 13.3 million putative repeat protein chains were clustered and a significant amount (82.9%) of clusters containing between 5 and 200 members were of a single functional type.
Conclusions: Dot plot analysis of repeat proteins attempts to obviate issues that arise due to the sequence degeneracy of repeat proteins. These results show that this kind of analysis can efficiently be applied to analyze repeat proteins on a large scale.

Details

einblenden:
ausblenden:
Sprache(n):
 Datum: 2020-05
 Publikationsstatus: Online veröffentlicht
 Seiten: -
 Ort, Verlag, Ausgabe: -
 Inhaltsverzeichnis: -
 Art der Begutachtung: -
 Identifikatoren: DOI: 10.1186/s12859-020-3493-y
PMID: 32381046
 Art des Abschluß: -

Veranstaltung

einblenden:

Entscheidung

einblenden:

Projektinformation

einblenden:

Quelle 1

einblenden:
ausblenden:
Titel: BMC Bioinformatics
Genre der Quelle: Zeitschrift
 Urheber:
Affiliations:
Ort, Verlag, Ausgabe: BioMed Central
Seiten: 17 Band / Heft: 21 (1) Artikelnummer: 179 Start- / Endseite: - Identifikator: ISSN: 1471-2105
CoNE: https://pure.mpg.de/cone/journals/resource/111000136905000