TY - GEN
T1 - Efficiency of fast parallel pattern searching in highly compressed texts
AU - Gąsieniec, Leszek
AU - Gibbons, Alan
AU - Rytter, Wojciech
PY - 1999/1/1
Y1 - 1999/1/1
N2 - We consider the efficiency of NC-algorithms for pattern-searching in highly compressed one- and two-dimensional texts. “Highly compressed” means that the text can be exponentially large with respect to its compressed version, and “fast” means “in polylogarithmic time”. Given an uncompressed pattern P and a compressed version of a text T, the compressed matching problem is to test if P occurs in T. Two types of closely related compressed representations of 1-dimensional texts are considered: the Lempel-Ziv encodings (LZ, in short) and restricted LZ encodings (RLZ, in short). For highly compressed texts there is only a small difference between them; in extreme situations both of them compress text exponentially, e.g. Fibonacci words of size N have compressed versions of size O(log N) for both LZ and RLZ encodings. Despite these similarities, we prove that LZ-compressed matching is P-complete while RLZ-compressed matching is rather trivially in NC. We show how to improve a naive straightforward NC algorithm and obtain almost optimal parallel RLZ-compressed matching by applying tree-contraction techniques to directed acyclic graphs with polynomial tree-size. As a corollary we obtain an almost optimal parallel algorithm for LZW-compressed matching which is simpler than the (more general) algorithm in [11]. Highly compressed 2-dimensional texts are also considered.
AB - We consider the efficiency of NC-algorithms for pattern-searching in highly compressed one- and two-dimensional texts. “Highly compressed” means that the text can be exponentially large with respect to its compressed version, and “fast” means “in polylogarithmic time”. Given an uncompressed pattern P and a compressed version of a text T, the compressed matching problem is to test if P occurs in T. Two types of closely related compressed representations of 1-dimensional texts are considered: the Lempel-Ziv encodings (LZ, in short) and restricted LZ encodings (RLZ, in short). For highly compressed texts there is only a small difference between them; in extreme situations both of them compress text exponentially, e.g. Fibonacci words of size N have compressed versions of size O(log N) for both LZ and RLZ encodings. Despite these similarities, we prove that LZ-compressed matching is P-complete while RLZ-compressed matching is rather trivially in NC. We show how to improve a naive straightforward NC algorithm and obtain almost optimal parallel RLZ-compressed matching by applying tree-contraction techniques to directed acyclic graphs with polynomial tree-size. As a corollary we obtain an almost optimal parallel algorithm for LZW-compressed matching which is simpler than the (more general) algorithm in [11]. Highly compressed 2-dimensional texts are also considered.
UR - http://www.scopus.com/inward/record.url?scp=84949211315&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84949211315&partnerID=8YFLogxK
U2 - 10.1007/3-540-48340-3_5
DO - 10.1007/3-540-48340-3_5
M3 - Conference contribution
AN - SCOPUS:84949211315
SN - 3540664084
SN - 9783540664086
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 48
EP - 58
BT - Mathematical Foundations of Computer Science 1999 - 24th International Symposium, MFCS 1999, Proceedings
A2 - Kutyłowski, Mirosław
A2 - Pacholski, Leszek
A2 - Wierzbicki, Tomasz
PB - Springer Verlag
T2 - 24th International Symposium on Mathematical Foundations of Computer Science, MFCS 1999
Y2 - 6 September 1999 through 10 September 1999
ER -