TY - JOUR
T1 - Shared memory parallelization of data mining algorithms
T2 - Techniques, programming interface, and performance
AU - Jin, Ruoming
AU - Yang, Ge
AU - Agrawal, Gagan
N1 - Funding Information:
This research was supported by the US National Science Foundation CAREER award ACI-9733520, US National Science Foundation grant CCR-9808522, and US National Science Foundation grant ACR-9982087. The equipment for this research was purchased under US National Science Foundation grant EIA-9703088.
Copyright:
Copyright 2011 Elsevier B.V., All rights reserved.
PY - 2005/1
Y1 - 2005/1
N2 - With recent technological advances, shared memory parallel machines have become more scalable, and offer large main memories and high bus bandwidths. They are emerging as good platforms for data warehousing and data mining. In this paper, we focus on shared memory parallelization of data mining algorithms. We have developed a series of techniques for parallelization of data mining algorithms, including full replication, full locking, fixed locking, optimized full locking, and cache-sensitive locking. Unlike previous work on shared memory parallelization of specific data mining algorithms, all of our techniques apply to a large number of popular data mining algorithms. In addition, we propose a reduction-object-based interface for specifying a data mining algorithm. We show how our runtime system can apply any of the techniques we have developed starting from a common specification of the algorithm. We have carried out a detailed evaluation of the parallelization techniques and the programming interface. We have experimented with apriori and fp-tree-based association mining, k-means clustering, k-nearest neighbor classifier, and decision tree construction. The main results from our experiments are as follows: 1) Among full replication, optimized full locking, and cachesensitive locking, there is no clear winner. Each of these three techniques can outperform others depending upon machine and dataset parameters. These three techniques perform significantly better than the other two techniques. 2) Good parallel efficiency is achieved for each of the four algorithms we experimented with, using our techniques and runtime system. 3) The overhead of the interface is within 10 percent in almost all cases. 4) In the case of decision tree construction, combining different techniques turned out to be crucial for achieving high performance.
AB - With recent technological advances, shared memory parallel machines have become more scalable, and offer large main memories and high bus bandwidths. They are emerging as good platforms for data warehousing and data mining. In this paper, we focus on shared memory parallelization of data mining algorithms. We have developed a series of techniques for parallelization of data mining algorithms, including full replication, full locking, fixed locking, optimized full locking, and cache-sensitive locking. Unlike previous work on shared memory parallelization of specific data mining algorithms, all of our techniques apply to a large number of popular data mining algorithms. In addition, we propose a reduction-object-based interface for specifying a data mining algorithm. We show how our runtime system can apply any of the techniques we have developed starting from a common specification of the algorithm. We have carried out a detailed evaluation of the parallelization techniques and the programming interface. We have experimented with apriori and fp-tree-based association mining, k-means clustering, k-nearest neighbor classifier, and decision tree construction. The main results from our experiments are as follows: 1) Among full replication, optimized full locking, and cachesensitive locking, there is no clear winner. Each of these three techniques can outperform others depending upon machine and dataset parameters. These three techniques perform significantly better than the other two techniques. 2) Good parallel efficiency is achieved for each of the four algorithms we experimented with, using our techniques and runtime system. 3) The overhead of the interface is within 10 percent in almost all cases. 4) In the case of decision tree construction, combining different techniques turned out to be crucial for achieving high performance.
KW - Association mining
KW - Clustering
KW - Decision tree construction
KW - Programming interfaces
KW - Shared memory parallelization
UR - http://www.scopus.com/inward/record.url?scp=17444402472&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=17444402472&partnerID=8YFLogxK
U2 - 10.1109/TKDE.2005.18
DO - 10.1109/TKDE.2005.18
M3 - Article
AN - SCOPUS:17444402472
SN - 1041-4347
VL - 17
SP - 71
EP - 89
JO - IEEE Transactions on Knowledge and Data Engineering
JF - IEEE Transactions on Knowledge and Data Engineering
IS - 1
ER -