Abstract:
Motivation: During microarray production, several thousands of oligonucleotides (short DNA sequences) are synthesized in parallel, one nucleotide at a time. We are interested in finding the shortest possible nucleotide deposition sequence to synthesize all oligos in order to reduce production time and increase oligo quality. Thus we study the shortest common supersequence problem of several thousand short strings over a four-letter alphabet.
Results: We present a statistical analysis of the basic ALPHABET-LEFTMOST approximation algorithm and propose several practical heuristics to reduce the length of the supersequence. Our results show that it is hard to beat ALPHABET-LEFTMOST in the microarray production setting by more than 2 characters, but these savings can improve overall oligo quality by more than four percent.
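To make the problem concrete, the following is a minimal Python sketch of one plausible reading of ALPHABET-LEFTMOST. The abstract does not define the algorithm, so this sketch assumes that the deposition sequence cycles through the alphabet in a fixed order (here ACGT) and that each oligo is embedded as far left as possible. The function names are hypothetical and chosen for illustration only.

```python
def cyclic_deposition_length(oligo, alphabet="ACGT"):
    """Steps needed to embed `oligo` leftmost in the cyclic sequence ACGTACGT..."""
    k = len(alphabet)
    pos = 0  # number of deposition steps consumed so far
    for ch in oligo:
        idx = alphabet.index(ch)
        # Step j deposits alphabet[j % k]; skip ahead to the first step >= pos
        # that deposits `ch`, then consume that step.
        pos += (idx - pos) % k + 1
    return pos


def alphabet_leftmost(oligos, alphabet="ACGT"):
    """Assumed reading: cyclic alphabet, truncated once the slowest oligo is done."""
    length = max((cyclic_deposition_length(o, alphabet) for o in oligos), default=0)
    reps = -(-length // len(alphabet))  # ceiling division
    return (alphabet * reps)[:length]


def is_supersequence(sup, s):
    """True if `s` can be obtained from `sup` by deleting characters."""
    it = iter(sup)
    return all(ch in it for ch in s)


oligos = ["AA", "CG", "GAT"]
sup = alphabet_leftmost(oligos)
print(sup, all(is_supersequence(sup, o) for o in oligos))
```

Under this reading, the length of the deposition sequence is set by the single oligo that needs the most cycles, which is why saving even a few characters requires a different letter order or a different embedding.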