The medical AI insurgency: what physicians must know about data to practice with intelligent machines

D. Douglas Miller

doi:10.1038/s41746-019-0138-5

Cardiology

Research output: Contribution to journal › Article › peer-review

43 Scopus citations

Abstract

Machine learning (ML) and its parent technology trend, artificial intelligence (AI), are deriving novel insights from ever larger and more complex datasets. Efficient and accurate AI analytics require fastidious data science: the careful curating of knowledge representations in databases, decomposition of data matrices to reduce dimensionality, and preprocessing of datasets to mitigate the confounding effects of messy (i.e., missing, redundant, and outlier) data. Messier, bigger, and more dynamic medical datasets create the potential for ML computing systems querying databases to draw erroneous data inferences, portending real-world human health consequences. High-dimensional medical datasets can be static or dynamic. For example, principal component analysis (PCA) used within R computing packages can speed and scale disease association analytics for deriving polygenic risk scores from static gene-expression microarrays. Robust PCA of k-dimensional subspace data accelerates image acquisition and reconstruction of dynamic 4-D magnetic resonance imaging studies, enhancing tracking of organ physiology, tissue relaxation parameters, and contrast agent effects. Unlike users in other data-dense business and scientific sectors, medical AI users must be aware that input data quality limitations can have health implications, potentially reducing analytic model accuracy for predicting clinical disease risks and patient outcomes. As AI technologies find more health applications, physicians should contribute their health domain expertise to rules-/ML-based computer system development, inform input data provenance, and recognize the importance of data preprocessing quality assurance before interpreting the clinical implications of intelligent machine outputs to patients.
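
The preprocessing and decomposition steps named in the abstract can be illustrated with a minimal sketch. It is not taken from the article: the abstract mentions R packages, whereas this example uses Python with NumPy, and the data are synthetic. The helper names `preprocess` and `pca`, the median imputation, the median/MAD clipping rule, and the threshold `k` are all assumptions chosen for illustration only.

```python
import numpy as np


def preprocess(X, k=5.0):
    """Handle missing, redundant, and outlier values (illustrative choices)."""
    X = np.asarray(X, dtype=float).copy()

    # Missing data: replace NaNs with the per-feature median.
    medians = np.nanmedian(X, axis=0)
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = medians[cols]

    # Redundant data: keep only the first copy of exactly duplicated features.
    _, first_idx = np.unique(X, axis=1, return_index=True)
    X = X[:, np.sort(first_idx)]

    # Outliers: clip each feature to median +/- k robust standard deviations,
    # where the robust scale is 1.4826 * MAD (assumes the MAD is nonzero).
    med = np.median(X, axis=0)
    scale = 1.4826 * np.median(np.abs(X - med), axis=0)
    return np.clip(X, med - k * scale, med + k * scale)


def pca(X, n_components):
    """Project X onto its leading principal components via SVD."""
    Xc = X - X.mean(axis=0)                       # center each feature
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:n_components].T             # low-dimensional coordinates
    explained = (S ** 2) / np.sum(S ** 2)         # variance fraction per component
    return scores, explained[:n_components]


# Synthetic data: 200 samples, 10 features driven by 2 hidden factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
X = latent @ rng.normal(size=(2, 10)) + 0.05 * rng.normal(size=(200, 10))
X = np.hstack([X, X[:, :1]])                      # a redundant (duplicated) feature
X[5, 3] = np.nan                                  # a missing value
X[7, 4] = 1e3                                     # a gross outlier

clean = preprocess(X)
scores, explained = pca(clean, n_components=2)
print(clean.shape, scores.shape, float(explained.sum()))
```

Running PCA on `X` without the preprocessing step would fail on the missing value and, if that were imputed alone, would let the single outlier dominate the first component, which is the kind of erroneous inference from messy input data that the abstract warns about.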

Original language	English (US)
Article number	62
Journal	npj Digital Medicine
Volume	2
Issue number	1
DOIs	https://doi.org/10.1038/s41746-019-0138-5
State	Published - Dec 1 2019

ASJC Scopus subject areas

Medicine (miscellaneous)
Health Informatics
Computer Science Applications
Health Information Management

Cite this

@article{02586ef478284568acbcc5d9c16d8500,

title = "The medical AI insurgency: what physicians must know about data to practice with intelligent machines",

abstract = "Machine learning (ML) and its parent technology trend, artificial intelligence (AI), are deriving novel insights from ever larger and more complex datasets. Efficient and accurate AI analytics require fastidious data science: the careful curating of knowledge representations in databases, decomposition of data matrices to reduce dimensionality, and preprocessing of datasets to mitigate the confounding effects of messy (i.e., missing, redundant, and outlier) data. Messier, bigger, and more dynamic medical datasets create the potential for ML computing systems querying databases to draw erroneous data inferences, portending real-world human health consequences. High-dimensional medical datasets can be static or dynamic. For example, principal component analysis (PCA) used within R computing packages can speed and scale disease association analytics for deriving polygenic risk scores from static gene-expression microarrays. Robust PCA of k-dimensional subspace data accelerates image acquisition and reconstruction of dynamic 4-D magnetic resonance imaging studies, enhancing tracking of organ physiology, tissue relaxation parameters, and contrast agent effects. Unlike users in other data-dense business and scientific sectors, medical AI users must be aware that input data quality limitations can have health implications, potentially reducing analytic model accuracy for predicting clinical disease risks and patient outcomes. As AI technologies find more health applications, physicians should contribute their health domain expertise to rules-/ML-based computer system development, inform input data provenance, and recognize the importance of data preprocessing quality assurance before interpreting the clinical implications of intelligent machine outputs to patients.",

author = "Miller, {D. Douglas}",

note = "Publisher Copyright: {\textcopyright} 2019, The Author(s).",

year = "2019",

month = dec,

day = "1",

doi = "10.1038/s41746-019-0138-5",

language = "English (US)",

volume = "2",

journal = "npj Digital Medicine",

issn = "2398-6352",

publisher = "Nature Publishing Group",

number = "1",

}

TY - JOUR

T1 - The medical AI insurgency

T2 - what physicians must know about data to practice with intelligent machines

AU - Miller, D. Douglas

PY - 2019/12/1

Y1 - 2019/12/1

N2 - Machine learning (ML) and its parent technology trend, artificial intelligence (AI), are deriving novel insights from ever larger and more complex datasets. Efficient and accurate AI analytics require fastidious data science: the careful curating of knowledge representations in databases, decomposition of data matrices to reduce dimensionality, and preprocessing of datasets to mitigate the confounding effects of messy (i.e., missing, redundant, and outlier) data. Messier, bigger, and more dynamic medical datasets create the potential for ML computing systems querying databases to draw erroneous data inferences, portending real-world human health consequences. High-dimensional medical datasets can be static or dynamic. For example, principal component analysis (PCA) used within R computing packages can speed and scale disease association analytics for deriving polygenic risk scores from static gene-expression microarrays. Robust PCA of k-dimensional subspace data accelerates image acquisition and reconstruction of dynamic 4-D magnetic resonance imaging studies, enhancing tracking of organ physiology, tissue relaxation parameters, and contrast agent effects. Unlike users in other data-dense business and scientific sectors, medical AI users must be aware that input data quality limitations can have health implications, potentially reducing analytic model accuracy for predicting clinical disease risks and patient outcomes. As AI technologies find more health applications, physicians should contribute their health domain expertise to rules-/ML-based computer system development, inform input data provenance, and recognize the importance of data preprocessing quality assurance before interpreting the clinical implications of intelligent machine outputs to patients.

AB - Machine learning (ML) and its parent technology trend, artificial intelligence (AI), are deriving novel insights from ever larger and more complex datasets. Efficient and accurate AI analytics require fastidious data science: the careful curating of knowledge representations in databases, decomposition of data matrices to reduce dimensionality, and preprocessing of datasets to mitigate the confounding effects of messy (i.e., missing, redundant, and outlier) data. Messier, bigger, and more dynamic medical datasets create the potential for ML computing systems querying databases to draw erroneous data inferences, portending real-world human health consequences. High-dimensional medical datasets can be static or dynamic. For example, principal component analysis (PCA) used within R computing packages can speed and scale disease association analytics for deriving polygenic risk scores from static gene-expression microarrays. Robust PCA of k-dimensional subspace data accelerates image acquisition and reconstruction of dynamic 4-D magnetic resonance imaging studies, enhancing tracking of organ physiology, tissue relaxation parameters, and contrast agent effects. Unlike users in other data-dense business and scientific sectors, medical AI users must be aware that input data quality limitations can have health implications, potentially reducing analytic model accuracy for predicting clinical disease risks and patient outcomes. As AI technologies find more health applications, physicians should contribute their health domain expertise to rules-/ML-based computer system development, inform input data provenance, and recognize the importance of data preprocessing quality assurance before interpreting the clinical implications of intelligent machine outputs to patients.

UR - http://www.scopus.com/inward/record.url?scp=85089606173&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85089606173&partnerID=8YFLogxK

U2 - 10.1038/s41746-019-0138-5

DO - 10.1038/s41746-019-0138-5

M3 - Article

AN - SCOPUS:85089606173

SN - 2398-6352

VL - 2

JO - npj Digital Medicine

JF - npj Digital Medicine

IS - 1

M1 - 62

ER -
