Communication and memory optimal parallel data cube construction

Ruoming Jin, Ge Yang, K. Vaidyanathan, G. Agrawal

Research output: Chapter in Book/Report/Conference proceedingConference contribution

19 Scopus citations

Abstract

Data cube construction is a commonly used operation in data warehouses. Because of the volume of data that is stored and analyzed in a data warehouse and the amount of computation involved in data cube construction, it is natural to consider parallel machines for this operation. We address a number of algorithmic issues in parallel data cube construction. First, we present an aggregation tree for sequential (and parallel) data cube construction, which has minimally bounded memory requirements. An aggregation tree is parameterized by the ordering of dimensions. We present a parallel algorithm based upon the aggregation tree. We analyze the interprocessor communication volume and construct a closed form expression for it. We prove that the same ordering of the dimensions minimizes both the computational and communication requirements. We also describe a method for partitioning the initial array and prove that it minimizes the communication volume. Experimental results from implementation of our algorithms on a cluster of workstations validate our theoretical results.

Original languageEnglish (US)
Title of host publicationProceedings - 2003 International Conference on Parallel Processing, ICPP 2003
EditorsP. Sadayappan, Chu-Sing Yang
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages573-580
Number of pages8
Volume16
Edition12
ISBN (Electronic)0769520170
DOIs
StatePublished - 2003
Externally publishedYes
Event2003 International Conference on Parallel Processing, ICPP 2003 - Kaohsiung, Taiwan, Province of China
Duration: Oct 6 2003Oct 9 2003

Publication series

NameProceedings of the International Conference on Parallel Processing
Volume2003-January
ISSN (Print)0190-3918

Conference

Conference2003 International Conference on Parallel Processing, ICPP 2003
CountryTaiwan, Province of China
CityKaohsiung
Period10/6/0310/9/03

Keywords

  • Aggregates
  • Clustering algorithms
  • Companies
  • Concurrent computing
  • Data analysis
  • Data warehouses
  • Parallel algorithms
  • Parallel machines
  • Partitioning algorithms
  • Performance analysis

ASJC Scopus subject areas

  • Software
  • Mathematics(all)
  • Hardware and Architecture

Fingerprint Dive into the research topics of 'Communication and memory optimal parallel data cube construction'. Together they form a unique fingerprint.

Cite this