Superstrings have many applications in data compression and genetics. However, the decision version of the shortest superstring problem is NP-complete. In this paper we examine the complexity of approximating shortest superstrings. There are two basic measures of approximation quality: the length factor and the compression factor. A well-known and practical approximation algorithm is the sequential algorithm GREEDY, which approximates the shortest superstring with a compression factor of 1/2 and a length factor of 4. Our main results are: (1) A sequential length approximation algorithm that achieves a length factor of 2.83. This result improves the best previously known bound of 2.89, due to Teng and Yao. Very recently, this bound was improved by Kosaraju, Park, and Stein to 2.79, and by Armen and Stein to 2.75. (2) A proof that the algorithm GREEDY is not parallelizable: computing its output is P-complete. (3) An NC algorithm that achieves a compression factor of 1/(4 + ε). (4) The design of an RNC algorithm with constant length factor and an NC algorithm with logarithmic length factor.
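For readers unfamiliar with GREEDY, the following is a minimal Python sketch of the algorithm as it is usually described: repeatedly merge the two strings with the largest overlap until one string remains. It is an illustration only and is not taken from the paper. The function names (overlap, greedy_superstring, compression) and the tie-breaking rule (first maximal pair found) are assumptions, and compression is taken here to mean the total input length minus the superstring length, which is how the measure is commonly defined.

```python
def overlap(a: str, b: str) -> int:
    """Length of the longest proper suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)) - 1, 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_superstring(strings: list[str]) -> str:
    """Sketch of GREEDY: merge the pair with maximum overlap until one string is left.

    Ties are broken by taking the first maximal pair found (an assumption).
    """
    # Drop duplicates and strings already contained in another input string.
    items = list(dict.fromkeys(strings))
    items = [s for s in items if not any(s != t and s in t for t in items)]

    while len(items) > 1:
        best = (-1, 0, 1)  # (overlap length, index of left string, index of right string)
        for i, a in enumerate(items):
            for j, b in enumerate(items):
                if i != j:
                    k = overlap(a, b)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        merged = items[i] + items[j][k:]
        items = [s for idx, s in enumerate(items) if idx not in (i, j)]
        items.append(merged)

    return items[0] if items else ""


def compression(strings: list[str], superstring: str) -> int:
    """Total input length minus superstring length (assumed definition)."""
    return sum(len(s) for s in strings) - len(superstring)


if __name__ == "__main__":
    parts = ["abc", "bcd", "cde"]
    result = greedy_superstring(parts)
    print(result, compression(parts, result))
```

On the input abc, bcd, cde the sketch returns abcde, a compression of 4 out of a total input length of 9. The quadratic search for the best pair in each round keeps the code short; it is not meant to be efficient.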
ASJC Scopus subject areas
- Control and Optimization
- Computational Mathematics
- Computational Theory and Mathematics