TY - GEN

T1 - Randomized efficient algorithms for compressed strings

T2 - 7th Annual Symposium on Combinatorial Pattern Matching, CPM 1996

AU - Gasieniec, Leszek

AU - Karpinski, Marek

AU - Plandowski, Wojciech

AU - Rytter, Wojciech

PY - 1996/1/1

Y1 - 1996/1/1

N2 - Denote by LZ(w) the coded form of a string w produced by Lempel-Ziv encoding algorithm. We consider several classical algorithmic problems for texts in the compressed setting. The first of them is the equality-testing: given LZ(w) and integers i, j, k test the equality: w[i…i+k] = w[j…j+k]. We give a simple and efficient randomized algorithm for this problem using the finger-printing idea. The equality testing is reduced to the equivalence of certain context-free grammars generating single strings. The equality-testing is the bottleneck in other algorithms for compressed texts. We relate the time complexity of several classical problems for texts to the complexity Eq(n) of equality-testing. Assume n = |LZ(T)|, m = |LZ(P)| and U = |T|. Then we can compute the compressed representations of the sets of occurrences of P in T, periods of T, palindromes of T, and squares of T respectively in times O(n log^2 U · Eq(m) + n^2 log U), O(n log^2 U · Eq(n) + n^2 log U), O(n log^2 U · Eq(n) + n^2 log U) and O(n^2 log^3 U · Eq(n) + n^3 log^2 U), where Eq(n) = O(n log log n). The randomization improves considerably upon the known deterministic algorithms.

AB - Denote by LZ(w) the coded form of a string w produced by Lempel-Ziv encoding algorithm. We consider several classical algorithmic problems for texts in the compressed setting. The first of them is the equality-testing: given LZ(w) and integers i, j, k test the equality: w[i…i+k] = w[j…j+k]. We give a simple and efficient randomized algorithm for this problem using the finger-printing idea. The equality testing is reduced to the equivalence of certain context-free grammars generating single strings. The equality-testing is the bottleneck in other algorithms for compressed texts. We relate the time complexity of several classical problems for texts to the complexity Eq(n) of equality-testing. Assume n = |LZ(T)|, m = |LZ(P)| and U = |T|. Then we can compute the compressed representations of the sets of occurrences of P in T, periods of T, palindromes of T, and squares of T respectively in times O(n log^2 U · Eq(m) + n^2 log U), O(n log^2 U · Eq(n) + n^2 log U), O(n log^2 U · Eq(n) + n^2 log U) and O(n^2 log^3 U · Eq(n) + n^3 log^2 U), where Eq(n) = O(n log log n). The randomization improves considerably upon the known deterministic algorithms.

UR - http://www.scopus.com/inward/record.url?scp=84957638409&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84957638409&partnerID=8YFLogxK

U2 - 10.1007/3-540-61258-0_3

DO - 10.1007/3-540-61258-0_3

M3 - Conference contribution

AN - SCOPUS:84957638409

SN - 3540612580

SN - 9783540612582

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 39

EP - 49

BT - Combinatorial Pattern Matching - 7th Annual Symposium, CPM 1996, Proceedings

A2 - Myers, Gene

A2 - Hirschberg, Dan

PB - Springer Verlag

Y2 - 10 June 1996 through 12 June 1996

ER -