Uniclust databases of clustered and deeply annotated protein sequences and alignments.

Mirdita, M.; von den Driesch, L.; Galiez, C.; Martin, M. J.; Söding, J.; Steinegger, M.

doi:10.1093/nar/gkw1081

Mirdita, M., von den Driesch, L., Galiez, C., Martin, M. J., Söding, J., & Steinegger, M. (2017). Uniclust databases of clustered and deeply annotated protein sequences and alignments. Nucleic Acids Research, 45(D1), D170-D176. doi:10.1093/nar/gkw1081.

Item is Released

Basic

Item Permalink: https://hdl.handle.net/11858/00-001M-0000-002C-1521-2
Version Permalink: https://hdl.handle.net/21.11116/0000-0001-3B85-7

Genre: Journal Article

Files

2368324.pdf (Publisher version), 1015KB

File Permalink:
https://hdl.handle.net/11858/00-001M-0000-002C-6487-7

Name:
2368324.pdf

Description:
-

OA-Status:

Visibility:
Public

MIME-Type / Checksum:
application/pdf / [MD5]

Copyright Date:
-

Copyright Info:
-

License:
http://creativecommons.org/licenses/by/4.0/

2368324_Suppl_1.zip (Supplementary material), 24KB

File Permalink:
https://hdl.handle.net/11858/00-001M-0000-002C-6488-5

Name:
2368324_Suppl_1.zip

Description:
-

OA-Status:

Visibility:
Public

MIME-Type / Checksum:
application/zip / [MD5]

Copyright Date:
-

Copyright Info:
-

License:
-

2368324_Suppl_2.pdf (Supplementary material), 23KB

File Permalink:
https://hdl.handle.net/11858/00-001M-0000-002C-6489-3

Name:
2368324_Suppl_2.pdf

Description:
-

OA-Status:

Visibility:
Public

MIME-Type / Checksum:
application/pdf / [MD5]

Copyright Date:
-

Copyright Info:
-

License:
-

Creators

Creators:
Mirdita, M.¹, Author
von den Driesch, L., Author
Galiez, C., Author
Martin, M. J., Author
Söding, J.¹, Author
Steinegger, M.¹, Author

Affiliations:
¹ Research Group of Computational Biology, MPI for Biophysical Chemistry, Max Planck Society, ou_1933286

Content

Free keywords: -

Abstract: We present three clustered protein sequence databases, Uniclust90, Uniclust50, Uniclust30 and three databases of multiple sequence alignments (MSAs), Uniboost10, Uniboost20 and Uniboost30, as a resource for protein sequence analysis, function prediction and sequence searches. The Uniclust databases cluster UniProtKB sequences at the level of 90%, 50% and 30% pairwise sequence identity. Uniclust90 and Uniclust50 clusters showed better consistency of functional annotation than those of UniRef90 and UniRef50, owing to an optimised clustering pipeline that runs with our MMseqs2 software for fast and sensitive protein sequence searching and clustering. Uniclust sequences are annotated with matches to Pfam, SCOP domains, and proteins in the PDB, using our HHblits homology detection tool. Due to its high sensitivity, Uniclust contains 17% more Pfam domain annotations than UniProt. Uniboost MSAs of three diversities are built by enriching the Uniclust30 MSAs with local sequence matches from MMseqs2 profile searches through Uniclust30. All databases can be downloaded from the Uniclust server at uniclust.mmseqs.com. Users can search clusters by keywords and explore their MSAs, taxonomic representation, and annotations. Uniclust is updated every two months with the new UniProt release.
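To illustrate what clustering "at the level of 90%, 50% and 30% pairwise sequence identity" means, the following is a minimal sketch of greedy threshold clustering on toy data. It is not the Uniclust or MMseqs2 pipeline: the function names (`identity`, `greedy_cluster`), the toy sequences, and the simplification that sequences are already aligned to equal length are all assumptions made for illustration.

```python
from typing import Dict, List


def identity(a: str, b: str) -> float:
    """Fraction of identical positions between two equal-length sequences.

    Assumption: sequences are pre-aligned and gap-free. Real tools align
    the sequences first; this toy version only compares position by position.
    """
    if len(a) != len(b):
        raise ValueError("toy identity needs equal-length sequences")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_cluster(seqs: Dict[str, str], min_identity: float) -> Dict[str, List[str]]:
    """Assign each sequence to the first representative it matches.

    A sequence that matches no existing representative at or above
    min_identity becomes the representative of a new cluster.
    """
    clusters: Dict[str, List[str]] = {}
    for name, seq in seqs.items():
        for rep in clusters:
            if identity(seqs[rep], seq) >= min_identity:
                clusters[rep].append(name)
                break
        else:
            clusters[name] = [name]
    return clusters


# Hypothetical toy sequences, not taken from UniProtKB.
toy = {
    "seqA": "MKTAYIAKQR",
    "seqB": "MKTAYIAKQK",  # 9 of 10 positions identical to seqA
    "seqC": "MKTAYLSKNK",  # 6 of 10 positions identical to seqA
    "seqD": "GGGGGGGGGG",  # shares no position with the others
}

for threshold in (0.9, 0.5, 0.3):
    print(threshold, greedy_cluster(toy, threshold))
```

Lowering the threshold merges more sequences into fewer, broader clusters, which is the relationship between the 90%, 50% and 30% levels described in the abstract.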

Details

Language(s): eng - English

Dates: Published Online: 2016-11-29, Date issued: 2017-01-04

Publication Status: Issued

Pages: -

Publishing info: -

Table of Contents: -

Rev. Type: Peer

Identifiers: DOI: 10.1093/nar/gkw1081

Degree: -

Source 1

Title: Nucleic Acids Research

Source Genre: Journal

Creator(s):

Affiliations:

Publ. Info: -

Pages: -
Volume / Issue: 45 (D1)
Sequence Number: -
Start / End Page: D170 - D176
Identifier: -