TY - GEN
T1 - Active learning based frequent itemset mining over the deep web
AU - Liu, Tantan
AU - Agrawal, Gagan
PY - 2011
Y1 - 2011
N2 - In recent years, the deep web has become an extremely popular mode of data dissemination. A key characteristic of deep web data sources is that data can only be accessed through the limited query interfaces they support. This paper develops a methodology for mining the deep web. Because these data sources cannot be accessed directly, data mining must be performed on samples of the datasets. The samples, in turn, can only be obtained by querying the deep web databases with specific inputs. Unlike in existing sampling-based methods, which are typically applied to relational databases or streaming data, sampling costs, rather than computation or memory costs, are the dominant consideration in designing the algorithm.
AB - In recent years, the deep web has become an extremely popular mode of data dissemination. A key characteristic of deep web data sources is that data can only be accessed through the limited query interfaces they support. This paper develops a methodology for mining the deep web. Because these data sources cannot be accessed directly, data mining must be performed on samples of the datasets. The samples, in turn, can only be obtained by querying the deep web databases with specific inputs. Unlike in existing sampling-based methods, which are typically applied to relational databases or streaming data, sampling costs, rather than computation or memory costs, are the dominant consideration in designing the algorithm.
UR - http://www.scopus.com/inward/record.url?scp=79957820915&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79957820915&partnerID=8YFLogxK
U2 - 10.1109/ICDE.2011.5767919
DO - 10.1109/ICDE.2011.5767919
M3 - Conference contribution
AN - SCOPUS:79957820915
SN - 9781424489589
T3 - Proceedings - International Conference on Data Engineering
SP - 219
EP - 230
BT - 2011 IEEE 27th International Conference on Data Engineering, ICDE 2011
T2 - 2011 IEEE 27th International Conference on Data Engineering, ICDE 2011
Y2 - 11 April 2011 through 16 April 2011
ER -