Bayesian criterion-based variable selection

Arnab Kumar Maity, Sanjib Basu, Santu Ghosh

Research output: Contribution to journal › Article › peer-review

Abstract

Bayesian approaches for criterion-based selection include the marginal-likelihood-based highest posterior model (HPM) and the deviance information criterion (DIC). The DIC is popular in practice as it can often be estimated with relative ease from sampling-based methods, and it is readily available in various Bayesian software packages. We find that the sensitivity of DIC-based selection can be high, in the range of 90–100%; however, the correct-selection rate of DIC can be in the range of 0–2%. These performances persist consistently as the sample size increases. We establish that both the marginal likelihood and DIC asymptotically disfavour under-fitted models, which explains the high sensitivities of both criteria. However, the mis-selection probability of DIC remains bounded below by a positive constant in linear models with g-priors, whereas the mis-selection probability of the marginal likelihood converges to 0 under certain conditions. A consequence of our results is that not only can the DIC not asymptotically differentiate between the data-generating model and an over-fitted model, it cannot asymptotically differentiate between two over-fitted models either. We illustrate these results in multiple simulation studies and in a biomarker selection problem on cancer cachexia in non-small cell lung cancer patients. We further study the performance of HPM and DIC in generalized linear models, as practitioners often choose to use DIC, which is readily available in software, in such non-conjugate settings.
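As a minimal sketch (not taken from the paper), the DIC discussed in the abstract can be computed from posterior draws as DIC = D̄ + p_D, where D̄ is the posterior mean deviance and p_D = D̄ − D(θ̄) is the effective number of parameters. The example below assumes a Gaussian linear model with known error variance and a Zellner g-prior with g = n; all data, dimensions, and settings are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data from a linear model y = X @ beta + noise.
n, p = 100, 3
X = rng.standard_normal((n, p))
beta_true = np.array([1.5, 0.0, -2.0])
sigma = 1.0  # assumed known for this sketch
y = X @ beta_true + sigma * rng.standard_normal(n)

# Conjugate posterior for beta under a Zellner g-prior (g = n),
# with known sigma^2: beta | y ~ N(m, V).
g = n
XtX = X.T @ X
V = (g / (g + 1)) * sigma**2 * np.linalg.inv(XtX)
m = (g / (g + 1)) * np.linalg.solve(XtX, X.T @ y)

# Posterior samples of beta.
S = 4000
draws = rng.multivariate_normal(m, V, size=S)

def deviance(beta):
    """Deviance = -2 * Gaussian log-likelihood at beta."""
    resid = y - X @ beta
    return n * np.log(2 * np.pi * sigma**2) + resid @ resid / sigma**2

D_samples = np.array([deviance(b) for b in draws])
D_bar = D_samples.mean()                  # posterior mean deviance
D_at_mean = deviance(draws.mean(axis=0))  # deviance at posterior mean
p_D = D_bar - D_at_mean                   # effective number of parameters
DIC = D_bar + p_D

print(f"p_D = {p_D:.2f}, DIC = {DIC:.2f}")
```

With known variance, p_D should land near the number of regression coefficients (here 3), shrunk slightly by the g-prior factor g/(g+1).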

Original language: English (US)
Pages (from-to): 835-857
Number of pages: 23
Journal: Journal of the Royal Statistical Society. Series C: Applied Statistics
Volume: 70
Issue number: 4
DOIs
State: Published - Aug 2021
