Efficient algorithms for Lempel-Ziv encoding

Leszek Gasieniec; Marek Karpinski; Wojciech Plandowski; Wojciech Rytter

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We consider several basic problems for texts and show that if the input texts are given by their Lempel-Ziv codes, then the problems can be solved deterministically in polynomial time even when the original (uncompressed) texts are of exponential size. The growing importance of massively stored information requires new approaches to algorithms that work on compressed texts without decompressing them. Denote by LZ(w) the version of a string w produced by the Lempel-Ziv encoding algorithm. For given compressed strings LZ(T) and LZ(P), we give the first known deterministic polynomial-time algorithms to compute compressed representations of the set of all occurrences of the pattern P in T, all periods of T, all palindromes of T, and all squares of T.

Then we consider several classical language recognition problems:

• regular language recognition: given LZ(T) and a language L described by a regular expression, test if T ∈ L;
• extended regular language recognition: given LZ(T) and a language L described by an LZ-compressed regular expression, test if T ∈ L, where the alphabet is unary;
• context-free language recognition: given LZ(T) and a language L described by a context-free grammar, test if T ∈ L, where the alphabet is unary.

We show that the first recognition problem has a polynomial-time algorithm and that the other two problems are NP-hard. We also show that the LZ encoding can be computed on-line with polynomial time delay and in small space (i.e., proportional to the size of the compressed text). The compressed representation of a pattern-matching automaton for the compressed pattern is also computed in polynomial time.
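To make the "exponential size" claim concrete, here is a minimal Python sketch of one common greedy Lempel-Ziv factorization. It is an illustration under stated assumptions, not the paper's algorithm: the non-overlapping copy rule, the factor representation, and the names `lz_factorize` and `lz_decode` are all hypothetical choices made for this example, and the paper's exact definition of LZ(w) may differ. The encoder is a naive one that works on the uncompressed text, so it does not reflect the paper's on-line, small-space construction.

```python
from typing import List, Tuple, Union

# A factor is either a single new letter or a (start, length) reference
# into the text decoded so far. This representation is an assumption.
Factor = Union[str, Tuple[int, int]]


def lz_factorize(w: str) -> List[Factor]:
    """Greedy factorization: each factor is the longest prefix of the
    remaining text that occurs entirely inside the processed prefix,
    or a single letter if no such prefix exists."""
    factors: List[Factor] = []
    i = 0
    while i < len(w):
        best_len, best_pos = 0, -1
        length = 1
        while i + length <= len(w):
            # Search only inside w[0:i], so copies never overlap position i.
            pos = w.find(w[i:i + length], 0, i)
            if pos == -1:
                break  # longer candidates cannot occur either
            best_len, best_pos = length, pos
            length += 1
        if best_len == 0:
            factors.append(w[i])
            i += 1
        else:
            factors.append((best_pos, best_len))
            i += best_len
    return factors


def lz_decode(factors: List[Factor]) -> str:
    """Rebuild the text by appending letters and copying earlier substrings."""
    text = ""
    for f in factors:
        if isinstance(f, str):
            text += f
        else:
            start, length = f
            text += text[start:start + length]
    return text


# A unary text of length 2**k needs only k + 1 factors under this scheme.
print(lz_factorize("a" * 16))
print(len(lz_factorize("a" * 1024)))
```

Under this scheme each factor of a unary text doubles the decoded length, so a code with k + 1 factors describes a text of length 2^k. This is the regime the abstract refers to: an algorithm that is polynomial in the number of factors cannot afford to decode the text first.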

Original language	English (US)
Title of host publication	Algorithm Theory - SWAT 1996 - 5th Scandinavian Workshop on Algorithm Theory, Proceedings
Editors	Rolf Karlsson, Andrzej Lingas
Publisher	Springer Verlag
Pages	404-415
Number of pages	12
ISBN (Print)	3540614222, 9783540614227
State	Published - 1996
Externally published	Yes
Event	5th Scandinavian Workshop on Algorithm Theory, SWAT 1996 - Reykjavik, Iceland
Duration	Jul 3 1996 → Jul 5 1996

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	1097
ISSN (Print)	0302-9743
ISSN (Electronic)	1611-3349

Conference

Conference	5th Scandinavian Workshop on Algorithm Theory, SWAT 1996
Country/Territory	Iceland
City	Reykjavik
Period	7/3/96 → 7/5/96

ASJC Scopus subject areas

Theoretical Computer Science
General Computer Science

Cite this

Gasieniec, L., Karpinski, M., Plandowski, W., & Rytter, W. (1996). Efficient algorithms for Lempel-Ziv encoding. In R. Karlsson, & A. Lingas (Eds.), Algorithm Theory - SWAT 1996 - 5th Scandinavian Workshop on Algorithm Theory, Proceedings (pp. 404-415). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 1097). Springer Verlag.

@inproceedings{5d99fdc51efe464f95e5f92b45916330,
title = "Efficient algorithms for Lempel-Ziv encoding",
abstract = "We consider several basic problems for texts and show that if the input texts are given by their Lempel-Ziv codes, then the problems can be solved deterministically in polynomial time even when the original (uncompressed) texts are of exponential size. The growing importance of massively stored information requires new approaches to algorithms that work on compressed texts without decompressing them. Denote by LZ(w) the version of a string w produced by the Lempel-Ziv encoding algorithm. For given compressed strings LZ(T) and LZ(P), we give the first known deterministic polynomial-time algorithms to compute compressed representations of the set of all occurrences of the pattern P in T, all periods of T, all palindromes of T, and all squares of T. Then we consider several classical language recognition problems: • regular language recognition: given LZ(T) and a language L described by a regular expression, test if T ∈ L; • extended regular language recognition: given LZ(T) and a language L described by an LZ-compressed regular expression, test if T ∈ L, where the alphabet is unary; • context-free language recognition: given LZ(T) and a language L described by a context-free grammar, test if T ∈ L, where the alphabet is unary. We show that the first recognition problem has a polynomial-time algorithm and that the other two problems are NP-hard. We also show that the LZ encoding can be computed on-line with polynomial time delay and in small space (i.e., proportional to the size of the compressed text). The compressed representation of a pattern-matching automaton for the compressed pattern is also computed in polynomial time.",
author = "Leszek Gasieniec and Marek Karpinski and Wojciech Plandowski and Wojciech Rytter",
note = "Publisher Copyright: {\textcopyright} Springer-Verlag Berlin Heidelberg 1996.; 5th Scandinavian Workshop on Algorithm Theory, SWAT 1996 ; Conference date: 03-07-1996 Through 05-07-1996",
year = "1996",
language = "English (US)",
isbn = "3540614222",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "404--415",
editor = "Rolf Karlsson and Andrzej Lingas",
booktitle = "Algorithm Theory - SWAT 1996 - 5th Scandinavian Workshop on Algorithm Theory, Proceedings",
}

TY - GEN
T1 - Efficient algorithms for Lempel-Ziv encoding
AU - Gasieniec, Leszek
AU - Karpinski, Marek
AU - Plandowski, Wojciech
AU - Rytter, Wojciech
N1 - Publisher Copyright: © Springer-Verlag Berlin Heidelberg 1996.
PY - 1996
Y1 - 1996
AB - We consider several basic problems for texts and show that if the input texts are given by their Lempel-Ziv codes, then the problems can be solved deterministically in polynomial time even when the original (uncompressed) texts are of exponential size. The growing importance of massively stored information requires new approaches to algorithms that work on compressed texts without decompressing them. Denote by LZ(w) the version of a string w produced by the Lempel-Ziv encoding algorithm. For given compressed strings LZ(T) and LZ(P), we give the first known deterministic polynomial-time algorithms to compute compressed representations of the set of all occurrences of the pattern P in T, all periods of T, all palindromes of T, and all squares of T. Then we consider several classical language recognition problems: • regular language recognition: given LZ(T) and a language L described by a regular expression, test if T ∈ L; • extended regular language recognition: given LZ(T) and a language L described by an LZ-compressed regular expression, test if T ∈ L, where the alphabet is unary; • context-free language recognition: given LZ(T) and a language L described by a context-free grammar, test if T ∈ L, where the alphabet is unary. We show that the first recognition problem has a polynomial-time algorithm and that the other two problems are NP-hard. We also show that the LZ encoding can be computed on-line with polynomial time delay and in small space (i.e., proportional to the size of the compressed text). The compressed representation of a pattern-matching automaton for the compressed pattern is also computed in polynomial time.
UR - http://www.scopus.com/inward/record.url?scp=84947917249&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84947917249&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84947917249
SN - 3540614222
SN - 9783540614227
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 404
EP - 415
BT - Algorithm Theory - SWAT 1996 - 5th Scandinavian Workshop on Algorithm Theory, Proceedings
A2 - Karlsson, Rolf
A2 - Lingas, Andrzej
PB - Springer Verlag
T2 - 5th Scandinavian Workshop on Algorithm Theory, SWAT 1996
Y2 - 3 July 1996 through 5 July 1996
ER -
