Using tiling to scale parallel data cube construction

Ruoming Jin; Karthik Vaidyanathan; Ge Yang; Gagan Agrawal

doi:10.1109/ICPP.2004.1327944

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Scopus citations

Abstract

Data cube construction is a commonly used operation in data warehouses. Because of the volume of data stored and analyzed in a data warehouse and the amount of computation involved in data cube construction, it is natural to consider parallel machines for this operation. For both sequential and parallel data cube construction, using main memory effectively is also an important challenge. In our prior work, we developed parallel algorithms for this problem. In this paper, we show how sequential and parallel data cube construction algorithms can be scaled further to handle larger problems when memory requirements are a constraint. This is done by tiling the input and output arrays on each node. We address the challenges of using tiling while still maintaining the other desired properties of a data cube construction algorithm, namely using minimal parents and achieving maximal cache and memory reuse. We present a parallel algorithm that combines tiling with interprocessor communication. Our experimental results show two things. First, tiling helps in scaling data cube construction in both sequential and parallel environments. Second, choosing tiling parameters according to our theoretical results yields better performance.
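
The idea of tiling the input array can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a small dense array held in a Python dict, uses sum as the aggregate, computes every group-by directly from the input rather than from minimal parents, and involves no interprocessor communication. The names `tiles` and `tiled_cube` are hypothetical. It only shows that visiting the input one tile at a time produces the same cube as a single pass over the whole array.

```python
from itertools import combinations, product


def tiles(shape, tile_shape):
    """Yield each tile as a tuple of index ranges, one range per dimension."""
    per_dim = [
        [range(start, min(start + step, size)) for start in range(0, size, step)]
        for size, step in zip(shape, tile_shape)
    ]
    return product(*per_dim)


def tiled_cube(cells, shape, tile_shape):
    """Sum-aggregate a dense array over every subset of its dimensions.

    cells maps a full index tuple to a number. The result maps each tuple
    of kept dimensions to a dict from projected index to the summed value.
    """
    ndim = len(shape)
    group_bys = [
        dims for k in range(ndim + 1) for dims in combinations(range(ndim), k)
    ]
    cube = {dims: {} for dims in group_bys}
    # Only the cells of the current tile are touched in each outer iteration;
    # in an out-of-core setting this is where a tile would be loaded.
    for tile in tiles(shape, tile_shape):
        for index in product(*tile):
            value = cells[index]
            for dims in group_bys:
                key = tuple(index[d] for d in dims)
                cube[dims][key] = cube[dims].get(key, 0) + value
    return cube


# Illustrative 2 x 3 x 2 array whose cells hold the values 0..11.
shape = (2, 3, 2)
cells = {
    idx: idx[0] * 6 + idx[1] * 2 + idx[2]
    for idx in product(*map(range, shape))
}
tiled = tiled_cube(cells, shape, tile_shape=(1, 2, 1))
untiled = tiled_cube(cells, shape, tile_shape=shape)
print(tiled == untiled)  # True: tiling does not change the result
print(tiled[()][()])     # 66: the grand total over all cells
```

The sketch keeps every output array in memory at once. The paper also tiles the output arrays and chooses tiling parameters from its theoretical results, neither of which is modeled here.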

Original language	English (US)
Title of host publication	Proceedings - 2004 International Conference on Parallel Processing, ICPP 2004
Editors	R. Eigenmann
Pages	365-372
Number of pages	8
DOIs	https://doi.org/10.1109/ICPP.2004.1327944
State	Published - 2004
Externally published	Yes
Event	Proceedings - 2004 International Conference on Parallel Processing, ICPP 2004 - Montreal, Que, Canada
Duration	Aug 15 2004 → Aug 18 2004

Publication series

Name	Proceedings of the International Conference on Parallel Processing
ISSN (Print)	0190-3918

Conference

Conference	Proceedings - 2004 International Conference on Parallel Processing, ICPP 2004
Country/Territory	Canada
City	Montreal, Que
Period	8/15/04 → 8/18/04

ASJC Scopus subject areas

Hardware and Architecture
General Engineering

Cite this

Using tiling to scale parallel data cube construction. / Jin, Ruoming; Vaidyanathan, Karthik; Yang, Ge et al.
Proceedings - 2004 International Conference on Parallel Processing, ICPP 2004. ed. / R. Eigenmann. 2004. p. 365-372 (Proceedings of the International Conference on Parallel Processing).

Jin, R, Vaidyanathan, K, Yang, G & Agrawal, G 2004, Using tiling to scale parallel data cube construction. in R Eigenmann (ed.), Proceedings - 2004 International Conference on Parallel Processing, ICPP 2004. Proceedings of the International Conference on Parallel Processing, pp. 365-372, Proceedings - 2004 International Conference on Parallel Processing, ICPP 2004, Montreal, Que, Canada, 8/15/04. https://doi.org/10.1109/ICPP.2004.1327944

@inproceedings{b24cb9a5e0d24329939981c87b4dc32d,
  title = "Using tiling to scale parallel data cube construction",
  abstract = "Data cube construction is a commonly used operation in data warehouses. Because of the volume of data that is stored and analyzed in a data warehouse and the amount of computation involved in data cube construction, it is natural to consider parallel machines for this operation. Also, for both sequential and parallel data cube construction, effectively using the main memory is an important challenge. In our prior work, we have developed parallel algorithms for this problem. In this paper, we show how sequential and parallel data cube construction algorithms can be further scaled to handle larger problems, when the memory requirements could be a constraint. This is done by tiling the input and output arrays on each node. We address the challenges in using tiling while still maintaining the other desired properties of a data cube construction algorithm, which are, using minimal parents, and achieving maximal cache and memory reuse. We present a parallel algorithm that combines tiling with interprocessor communication. Our experimental results show the following. First, tiling helps in scaling data cube construction in both sequential and parallel environments. Second, choosing tiling parameters as per our theoretical results does result in better performance.",
  author = "Ruoming Jin and Karthik Vaidyanathan and Ge Yang and Gagan Agrawal",
  year = "2004",
  doi = "10.1109/ICPP.2004.1327944",
  language = "English (US)",
  isbn = "0769521975",
  series = "Proceedings of the International Conference on Parallel Processing",
  pages = "365--372",
  editor = "R. Eigenmann",
  booktitle = "Proceedings - 2004 International Conference on Parallel Processing, ICPP 2004",
  note = "Proceedings - 2004 International Conference on Parallel Processing, ICPP 2004 ; Conference date: 15-08-2004 Through 18-08-2004",
}

TY  - GEN
T1  - Using tiling to scale parallel data cube construction
AU  - Jin, Ruoming
AU  - Vaidyanathan, Karthik
AU  - Yang, Ge
AU  - Agrawal, Gagan
PY  - 2004
Y1  - 2004
AB  - Data cube construction is a commonly used operation in data warehouses. Because of the volume of data that is stored and analyzed in a data warehouse and the amount of computation involved in data cube construction, it is natural to consider parallel machines for this operation. Also, for both sequential and parallel data cube construction, effectively using the main memory is an important challenge. In our prior work, we have developed parallel algorithms for this problem. In this paper, we show how sequential and parallel data cube construction algorithms can be further scaled to handle larger problems, when the memory requirements could be a constraint. This is done by tiling the input and output arrays on each node. We address the challenges in using tiling while still maintaining the other desired properties of a data cube construction algorithm, which are, using minimal parents, and achieving maximal cache and memory reuse. We present a parallel algorithm that combines tiling with interprocessor communication. Our experimental results show the following. First, tiling helps in scaling data cube construction in both sequential and parallel environments. Second, choosing tiling parameters as per our theoretical results does result in better performance.
UR  - http://www.scopus.com/inward/record.url?scp=10044240505&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=10044240505&partnerID=8YFLogxK
U2  - 10.1109/ICPP.2004.1327944
DO  - 10.1109/ICPP.2004.1327944
M3  - Conference contribution
AN  - SCOPUS:10044240505
SN  - 0769521975
T3  - Proceedings of the International Conference on Parallel Processing
SP  - 365
EP  - 372
BT  - Proceedings - 2004 International Conference on Parallel Processing, ICPP 2004
A2  - Eigenmann, R.
T2  - Proceedings - 2004 International Conference on Parallel Processing, ICPP 2004
Y2  - 15 August 2004 through 18 August 2004
ER  - 
