Classification trees as proxies

Anthony Scime, Nilay Saiya, Gregory Roy Murray, Steven J. Jurek

Research output: Contribution to journal, Article

2 Citations (Scopus)

Abstract

In data analysis, when data are unattainable, it is common to select a closely related attribute as a proxy. But sometimes substitution of one attribute for another is not sufficient to satisfy the needs of the analysis. In these cases, a classification model based on one dataset can be investigated as a possible proxy for another closely related domain's dataset. If the model's structure is sufficient to classify data from the related domain, the model can be used as a proxy tree. Such a proxy tree also provides an alternative characterization of the related domain. Just as important, if the original model does not successfully classify the related domain data, the domains are not as closely related as believed. This paper presents a methodology for evaluating datasets as proxies along with three cases that demonstrate the methodology and the three types of results.
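
The idea in the abstract can be illustrated with a minimal sketch: grow a classification tree on one dataset, then check how well it classifies records from a related domain. Everything below is an assumption made for illustration and is not taken from the paper: the attribute names (`income`, `education`), the toy records, the Gini-based split rule, and the 0.75 acceptance threshold are all hypothetical.

```python
from collections import Counter

def majority(labels):
    """Most common label in a list."""
    return Counter(labels).most_common(1)[0][0]

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(rows, labels, attributes, max_depth=3):
    """Grow a small classification tree over categorical attributes."""
    if len(set(labels)) == 1 or not attributes or max_depth == 0:
        return majority(labels)  # leaf: a plain class label

    def split_impurity(attr):
        groups = {}
        for row, label in zip(rows, labels):
            groups.setdefault(row[attr], []).append(label)
        return sum(len(g) / len(labels) * gini(g) for g in groups.values())

    best = min(attributes, key=split_impurity)
    node = {"attr": best, "default": majority(labels), "children": {}}
    remaining = [a for a in attributes if a != best]
    for value in set(row[best] for row in rows):
        pairs = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        node["children"][value] = build_tree(
            [r for r, _ in pairs], [l for _, l in pairs], remaining, max_depth - 1
        )
    return node

def classify(tree, row):
    """Walk the tree; unseen attribute values fall back to the node default."""
    while isinstance(tree, dict):
        tree = tree["children"].get(row.get(tree["attr"]), tree["default"])
    return tree

def accuracy(tree, rows, labels):
    """Fraction of rows the tree classifies correctly."""
    return sum(classify(tree, r) == l for r, l in zip(rows, labels)) / len(labels)

def evaluate_proxy(tree, rows, labels, threshold=0.75):
    """Hypothetical acceptance rule: accuracy on the related domain >= threshold."""
    acc = accuracy(tree, rows, labels)
    return acc, acc >= threshold

# Hypothetical source-domain data used to build the tree.
source_rows = [
    {"income": "high", "education": "college"},
    {"income": "high", "education": "school"},
    {"income": "low", "education": "college"},
    {"income": "low", "education": "school"},
]
source_labels = ["yes", "yes", "yes", "no"]

# Hypothetical related-domain data the tree has never seen.
related_rows = source_rows + [{"income": "low", "education": "school"}]
related_labels = ["yes", "yes", "yes", "no", "yes"]

tree = build_tree(source_rows, source_labels, ["income", "education"])
acc, is_proxy = evaluate_proxy(tree, related_rows, related_labels)
print(f"accuracy on related domain: {acc:.2f}, usable as proxy: {is_proxy}")
```

In this sketch a low accuracy on the related-domain records would suggest the two domains are less closely related than assumed, which mirrors the negative result described in the abstract. The choice of accuracy as the sole criterion is a simplification; the paper's actual evaluation procedure may differ.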

Original language: English (US)
Pages (from-to): 31-44
Number of pages: 14
Journal: International Journal of Business Analytics
Volume: 2
Issue number: 2
DOI: 10.4018/IJBAN.2015040103
State: Published - Apr 1 2015
Externally published: Yes

Fingerprint

Methodology
Substitution

Keywords

  • Classification
  • Data analysis
  • Data mining
  • Proxy
  • Social science

ASJC Scopus subject areas

  • Business and International Management
  • Strategy and Management

Cite this

Classification trees as proxies. / Scime, Anthony; Saiya, Nilay; Murray, Gregory Roy; Jurek, Steven J.

In: International Journal of Business Analytics, Vol. 2, No. 2, 01.04.2015, p. 31-44.

@article{bb3b7ce4e7f541e7a5aa1ba3cc186694,
title = "Classification trees as proxies",
keywords = "Classification, Data analysis, Data mining, Proxy, Social science",
author = "Anthony Scime and Nilay Saiya and Murray, {Gregory Roy} and Jurek, {Steven J.}",
year = "2015",
month = "4",
day = "1",
doi = "10.4018/IJBAN.2015040103",
language = "English (US)",
volume = "2",
pages = "31--44",
journal = "International Journal of Business Analytics",
issn = "2334-4547",
publisher = "IGI Global Publishing",
number = "2",

}

TY - JOUR

T1 - Classification trees as proxies

AU - Scime, Anthony

AU - Saiya, Nilay

AU - Murray, Gregory Roy

AU - Jurek, Steven J.

PY - 2015/4/1

Y1 - 2015/4/1

KW - Classification

KW - Data analysis

KW - Data mining

KW - Proxy

KW - Social science

UR - http://www.scopus.com/inward/record.url?scp=85046757913&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85046757913&partnerID=8YFLogxK

U2 - 10.4018/IJBAN.2015040103

DO - 10.4018/IJBAN.2015040103

M3 - Article

VL - 2

SP - 31

EP - 44

JO - International Journal of Business Analytics

JF - International Journal of Business Analytics

SN - 2334-4547

IS - 2

ER -