The use of classification and regression trees to predict the likelihood of seasonal influenza

Anna M. Afonso, Mark H. Ebell, Ralph Gonzales, John Stein, Blaise Genton, Nicolas Senn

Research output: Contribution to journal › Article

9 Citations (Scopus)

Abstract

Background: Individual signs and symptoms are of limited value for the diagnosis of influenza.

Objective: To develop a decision tree for the diagnosis of influenza based on a classification and regression tree (CART) analysis.

Methods: Data from two previous similar cohort studies were assembled into a single dataset. The data were randomly divided into a development set (70%) and a validation set (30%). We used CART analysis to develop three models that maximize the number of patients who do not require diagnostic testing prior to treatment decisions. The validation set was used to evaluate overfitting of the model to the training set.

Results: Model 1 has seven terminal nodes based on temperature, the onset of symptoms and the presence of chills, cough and myalgia. Model 2 was a simpler tree with only two splits based on temperature and the presence of chills. Model 3 was developed with temperature as a dichotomous variable (≥38°C) and had only two splits based on the presence of fever and myalgia. The area under the receiver operating characteristic curves (AUROCC) for the development and validation sets, respectively, were 0.82 and 0.80 for Model 1, 0.75 and 0.76 for Model 2 and 0.76 and 0.77 for Model 3. Model 2 classified 67% of patients in the validation group into a high- or low-risk group compared with only 38% for Model 1 and 54% for Model 3.

Conclusions: A simple decision tree (Model 2) classified two-thirds of patients as low or high risk and had an AUROCC of 0.76. After further validation in an independent population, this CART model could support clinical decision making regarding influenza, with low-risk patients requiring no further evaluation for influenza and high-risk patients being candidates for empiric symptomatic or drug therapy.
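The evaluation steps named in the abstract (a random 70%/30% split, AUROCC on each set, and the share of patients placed in a low- or high-risk group) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the random seed, and the risk cut-offs `low_cut` and `high_cut` are hypothetical, and the abstract does not report the thresholds or software actually used.

```python
import random


def split_development_validation(records, dev_fraction=0.7, seed=0):
    """Randomly split records into development and validation sets.

    dev_fraction=0.7 mirrors the 70%/30% split described in the abstract;
    the seed is an arbitrary choice made for reproducibility.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * dev_fraction)
    return shuffled[:cut], shuffled[cut:]


def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case receives
    a higher score than a randomly chosen negative case (ties count 0.5).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def fraction_classified(scores, low_cut, high_cut):
    """Share of patients whose predicted risk falls in the low-risk group
    (score <= low_cut) or the high-risk group (score >= high_cut).

    Both cut-offs are hypothetical placeholders, not values from the study.
    """
    decided = sum(1 for s in scores if s <= low_cut or s >= high_cut)
    return decided / len(scores)


if __name__ == "__main__":
    # Toy predicted risks and true labels, invented purely for illustration.
    toy_scores = [0.1, 0.4, 0.35, 0.8]
    toy_labels = [0, 0, 1, 1]
    print(auroc(toy_scores, toy_labels))                 # 0.75
    print(fraction_classified(toy_scores, 0.15, 0.6))    # 0.5
```

In a tree model, the score for each patient would typically be the influenza prevalence in the terminal node that patient falls into, so a tree with few terminal nodes produces many tied scores; the pairwise definition above handles those ties explicitly.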

Original language: English (US)
Article number: cms020
Pages (from-to): 671-677
Number of pages: 7
Journal: Family Practice
Volume: 29
Issue number: 6
DOI: https://doi.org/10.1093/fampra/cms020
State: Published - Dec 1 2012

Fingerprint

Human Influenza
Chills
Decision Trees
Myalgia
ROC Curve
Temperature
Regression Analysis
Cough
Signs and Symptoms
Cohort Studies
Fever
Drug Therapy
Population
Therapeutics

Keywords

  • Clinical decision rules
  • Common illnesses
  • Infectious disease
  • Influenza
  • Primary care
  • Public health
  • Respiratory infections

ASJC Scopus subject areas

  • Family Practice

Cite this

Afonso, A. M., Ebell, M. H., Gonzales, R., Stein, J., Genton, B., & Senn, N. (2012). The use of classification and regression trees to predict the likelihood of seasonal influenza. Family Practice, 29(6), 671-677. [cms020]. https://doi.org/10.1093/fampra/cms020

@article{8f34f01687a242b4a7360a89291ba6ef,
title = "The use of classification and regression trees to predict the likelihood of seasonal influenza",
abstract = "Background: Individual signs and symptoms are of limited value for the diagnosis of influenza. Objective: To develop a decision tree for the diagnosis of influenza based on a classification and regression tree (CART) analysis. Methods: Data from two previous similar cohort studies were assembled into a single dataset. The data were randomly divided into a development set (70{\%}) and a validation set (30{\%}). We used CART analysis to develop three models that maximize the number of patients who do not require diagnostic testing prior to treatment decisions. The validation set was used to evaluate overfitting of the model to the training set. Results: Model 1 has seven terminal nodes based on temperature, the onset of symptoms and the presence of chills, cough and myalgia. Model 2 was a simpler tree with only two splits based on temperature and the presence of chills. Model 3 was developed with temperature as a dichotomous variable (≥38°C) and had only two splits based on the presence of fever and myalgia. The area under the receiver operating characteristic curves (AUROCC) for the development and validation sets, respectively, were 0.82 and 0.80 for Model 1, 0.75 and 0.76 for Model 2 and 0.76 and 0.77 for Model 3. Model 2 classified 67{\%} of patients in the validation group into a high- or low-risk group compared with only 38{\%} for Model 1 and 54{\%} for Model 3. Conclusions: A simple decision tree (Model 2) classified two-thirds of patients as low or high risk and had an AUROCC of 0.76. After further validation in an independent population, this CART model could support clinical decision making regarding influenza, with low-risk patients requiring no further evaluation for influenza and high-risk patients being candidates for empiric symptomatic or drug therapy.",
keywords = "Clinical decision rules, Common illnesses, Infectious disease, Influenza, Primary care, Public health, Respiratory infections",
author = "Afonso, {Anna M.} and Ebell, {Mark H.} and Ralph Gonzales and John Stein and Blaise Genton and Nicolas Senn",
year = "2012",
month = "12",
day = "1",
doi = "10.1093/fampra/cms020",
language = "English (US)",
volume = "29",
pages = "671--677",
journal = "Family Practice",
issn = "0263-2136",
publisher = "Oxford University Press",
number = "6",

}

TY - JOUR
T1 - The use of classification and regression trees to predict the likelihood of seasonal influenza
AU - Afonso, Anna M.
AU - Ebell, Mark H.
AU - Gonzales, Ralph
AU - Stein, John
AU - Genton, Blaise
AU - Senn, Nicolas
PY - 2012/12/1
Y1 - 2012/12/1
AB - Background: Individual signs and symptoms are of limited value for the diagnosis of influenza. Objective: To develop a decision tree for the diagnosis of influenza based on a classification and regression tree (CART) analysis. Methods: Data from two previous similar cohort studies were assembled into a single dataset. The data were randomly divided into a development set (70%) and a validation set (30%). We used CART analysis to develop three models that maximize the number of patients who do not require diagnostic testing prior to treatment decisions. The validation set was used to evaluate overfitting of the model to the training set. Results: Model 1 has seven terminal nodes based on temperature, the onset of symptoms and the presence of chills, cough and myalgia. Model 2 was a simpler tree with only two splits based on temperature and the presence of chills. Model 3 was developed with temperature as a dichotomous variable (≥38°C) and had only two splits based on the presence of fever and myalgia. The area under the receiver operating characteristic curves (AUROCC) for the development and validation sets, respectively, were 0.82 and 0.80 for Model 1, 0.75 and 0.76 for Model 2 and 0.76 and 0.77 for Model 3. Model 2 classified 67% of patients in the validation group into a high- or low-risk group compared with only 38% for Model 1 and 54% for Model 3. Conclusions: A simple decision tree (Model 2) classified two-thirds of patients as low or high risk and had an AUROCC of 0.76. After further validation in an independent population, this CART model could support clinical decision making regarding influenza, with low-risk patients requiring no further evaluation for influenza and high-risk patients being candidates for empiric symptomatic or drug therapy.

KW - Clinical decision rules
KW - Common illnesses
KW - Infectious disease
KW - Influenza
KW - Primary care
KW - Public health
KW - Respiratory infections
UR - http://www.scopus.com/inward/record.url?scp=84870029784&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84870029784&partnerID=8YFLogxK
U2 - 10.1093/fampra/cms020
DO - 10.1093/fampra/cms020
M3 - Article
VL - 29
SP - 671
EP - 677
JO - Family Practice
JF - Family Practice
SN - 0263-2136
IS - 6
M1 - cms020
ER -