  A simple refined DNA minimizer operator enables twofold faster computation

Pan, C., & Reinert, K. (2024). A simple refined DNA minimizer operator enables twofold faster computation. Bioinformatics, btae045. doi:10.1093/bioinformatics/btae045.


Files

Bioinformatics_Pan_Reinert_2024.pdf (Publisher version), 856KB
Name:
Bioinformatics_Pan_Reinert_2024.pdf
Description:
Accepted manuscript.
OA-Status:
Not specified
Visibility:
Public
MIME-Type / Checksum:
application/pdf / [MD5]
Technical Metadata:
Copyright Date:
-
Copyright Info:
© The Author(s) 2024

Locators


Creators

 Creators:
Pan, Chenxu, Author
Reinert, Knut (1), Author
Affiliations:
(1) Efficient Algorithms for Omics Data (Knut Reinert), Max Planck Fellow Group, Max Planck Institute for Molecular Genetics, Max Planck Society, ou_2385698

Content

Free keywords: -
 Abstract: Motivation: The minimizer concept is a data structure for sequence sketching. The standard canonical minimizer selects a subset of k-mers from a given DNA sequence by comparing the forward and reverse k-mers in a window simultaneously, according to a predefined selection scheme. It is widely employed in sequence analysis tasks such as read mapping and assembly. k-mer density, k-mer repetitiveness (e.g. k-mer bias), and computational efficiency are three critical measurements for minimizer selection schemes. Although trade-offs exist among minimizer variants, high-performance minimizer algorithms are always required to be generic, effective, and efficient.

Results: We propose a simple minimizer operator as a refinement of the standard canonical minimizer. It takes only a few operations to compute, yet it can improve the k-mer repetitiveness, especially for the lexicographic order. It also applies to other selection schemes based on total orders (e.g. random orders). Moreover, it is computationally efficient, and its density is close to that of the standard minimizer. The refined minimizer may benefit high-performance applications such as binning and read mapping.
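
For orientation, the standard canonical minimizer described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the refined operator proposed in the paper: it uses the lexicographic order, breaks ties by taking the leftmost k-mer in a window, and measures density as the fraction of k-mer positions selected. The function names (`reverse_complement`, `canonical_kmer`, `canonical_minimizers`, `density`) are hypothetical and do not come from the paper or its software.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string over A, C, G, T."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def canonical_kmer(kmer: str) -> str:
    """Pick the lexicographically smaller of a k-mer and its reverse complement."""
    return min(kmer, reverse_complement(kmer))


def canonical_minimizers(seq: str, k: int, w: int) -> list[tuple[int, str]]:
    """Select one canonical k-mer per window of w consecutive k-mers.

    Assumptions of this sketch: lexicographic order, leftmost tie-breaking.
    Returns (position, canonical k-mer) pairs, one per distinct position.
    """
    kmers = [canonical_kmer(seq[i:i + k]) for i in range(len(seq) - k + 1)]
    selected: list[tuple[int, str]] = []
    for start in range(len(kmers) - w + 1):
        # min() returns the first minimal index, i.e. the leftmost smallest k-mer.
        pos = min(range(start, start + w), key=lambda j: kmers[j])
        # Consecutive windows often share a minimizer; record each position once.
        if not selected or selected[-1][0] != pos:
            selected.append((pos, kmers[pos]))
    return selected


def density(seq: str, k: int, w: int) -> float:
    """Fraction of k-mer positions that are selected as minimizers."""
    return len(canonical_minimizers(seq, k, w)) / (len(seq) - k + 1)


if __name__ == "__main__":
    print(canonical_minimizers("ACGGTA", k=3, w=2))
    print(density("ACGGTA", k=3, w=2))
```

Because each k-mer is compared together with its reverse complement, the set of selected k-mer strings is the same whichever strand of the sequence is read, which is the property that makes the canonical form useful for read mapping and assembly.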

Details

Language(s): eng - English
 Dates: 2024-01-22 / 2024-01-25
 Publication Status: Published online
 Pages: -
 Publishing info: -
 Table of Contents: -
 Rev. Type: -
 Identifiers: DOI: 10.1093/bioinformatics/btae045
PMID: 38269626
 Degree: -

Event


Legal Case


Project information


Source 1

Title: Bioinformatics
Source Genre: Journal
 Creator(s):
Affiliations:
Publ. Info: Oxford : Oxford University Press
Pages: -
Volume / Issue: -
Sequence Number: btae045
Start / End Page: -
Identifier: ISSN: 1367-4803
CoNE: https://pure.mpg.de/cone/journals/resource/954926969991