Randomized efficient algorithms for compressed strings: The finger-print approach

Leszek Gasieniec, Marek Karpinski, Wojciech Plandowski, Wojciech Rytter

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

25 Scopus citations

Abstract

Denote by LZ(w) the coded form of a string w produced by the Lempel-Ziv encoding algorithm. We consider several classical algorithmic problems for texts in the compressed setting. The first of them is equality testing: given LZ(w) and integers i, j, k, test the equality w[i…i+k] = w[j…j+k]. We give a simple and efficient randomized algorithm for this problem using the fingerprinting idea. Equality testing is reduced to the equivalence of certain context-free grammars, each generating a single string. Equality testing is the bottleneck in other algorithms for compressed texts. We relate the time complexity of several classical problems for texts to the complexity Eq(n) of equality testing. Assume n = |LZ(T)|, m = |LZ(P)| and U = |T|. Then we can compute the compressed representations of the sets of occurrences of P in T, periods of T, palindromes of T, and squares of T, respectively, in times O(n log^2 U · Eq(m) + n^2 log U), O(n log^2 U · Eq(n) + n^2 log U), O(n log^2 U · Eq(n) + n^2 log U) and O(n^2 log^3 U · Eq(n) + n^3 log^2 U), where Eq(n) = O(n log log n). The randomization improves considerably upon the known deterministic algorithms.
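
To make the fingerprinting idea concrete, the following is a minimal sketch, not the authors' algorithm. It assumes the compressed text has already been turned into a grammar in which every rule is either a single character or a concatenation of two earlier rules, and it computes a Karp-Rabin style polynomial fingerprint of each generated string without expanding it. The rule encoding, the names `slp_fingerprints` and `plain_fingerprint`, and the choice of modulus are all assumptions made for illustration. The sketch covers only whole nonterminals; testing arbitrary substrings w[i…i+k], as in the abstract, needs additional machinery that is not shown here.

```python
import random

# Hypothetical grammar encoding (an assumption of this sketch): each rule is
# either a one-character string (terminal) or a pair (left, right) of rule names.
def slp_fingerprints(rules, order, base, mod):
    """Return {name: (length, fingerprint)} for every rule.

    `order` must list rule names so that each rule comes after the
    rules it refers to.
    """
    info = {}
    for name in order:
        rhs = rules[name]
        if isinstance(rhs, str):
            # Terminal rule: a single character.
            info[name] = (1, ord(rhs) % mod)
        else:
            # Binary rule X -> Y Z.
            len_y, fp_y = info[rhs[0]]
            len_z, fp_z = info[rhs[1]]
            # fp(YZ) = fp(Y) * base^|Z| + fp(Z)  (mod `mod`)
            fp = (fp_y * pow(base, len_z, mod) + fp_z) % mod
            info[name] = (len_y + len_z, fp)
    return info


def plain_fingerprint(text, base, mod):
    """Same polynomial fingerprint, computed on the uncompressed string."""
    fp = 0
    for ch in text:
        fp = (fp * base + ord(ch)) % mod
    return fp


MOD = (1 << 61) - 1                 # the Mersenne prime 2^61 - 1
BASE = random.randrange(256, MOD)   # random base, chosen once

# Fibonacci-style grammar: F1 = "b", F2 = "a", Fk = F(k-1) F(k-2).
rules = {"F1": "b", "F2": "a"}
order = ["F1", "F2"]
for k in range(3, 11):
    rules[f"F{k}"] = (f"F{k-1}", f"F{k-2}")
    order.append(f"F{k}")

info = slp_fingerprints(rules, order, BASE, MOD)
```

Two rules that generate the same string always receive the same (length, fingerprint) pair. Two rules that generate different strings of equal length collide only with small probability over the random choice of `BASE`, which is why the test is randomized rather than exact.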

Original language: English (US)
Title of host publication: Combinatorial Pattern Matching - 7th Annual Symposium, CPM 1996, Proceedings
Editors: Gene Myers, Dan Hirschberg
Publisher: Springer Verlag
Pages: 39-49
Number of pages: 11
ISBN (Print): 3540612580, 9783540612582
State: Published - 1996
Externally published: Yes
Event: 7th Annual Symposium on Combinatorial Pattern Matching, CPM 1996 - Laguna Beach, United States
Duration: Jun 10 1996 to Jun 12 1996

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 1075
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 7th Annual Symposium on Combinatorial Pattern Matching, CPM 1996
Country/Territory: United States
City: Laguna Beach
Period: 6/10/96 to 6/12/96

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
