An Integrated TCGA Pan-Cancer Clinical Data Resource to Drive High-Quality Survival Outcome Analytics

The Cancer Genome Atlas Research Network

Research output: Contribution to journal, Article

46 Citations (Scopus)

Abstract

For a decade, The Cancer Genome Atlas (TCGA) program collected clinicopathologic annotation data along with multi-platform molecular profiles of more than 11,000 human tumors across 33 different cancer types. TCGA clinical data contain key features representing the democratized nature of the data collection process. To ensure proper use of this large clinical dataset associated with genomic features, we developed a standardized dataset named the TCGA Pan-Cancer Clinical Data Resource (TCGA-CDR), which includes four major clinical outcome endpoints. In addition to detailing major challenges and statistical limitations encountered during the effort of integrating the acquired clinical data, we present a summary that includes endpoint usage recommendations for each cancer type. These TCGA-CDR findings appear to be consistent with cancer genomics studies independent of the TCGA effort and provide opportunities for investigating cancer biology using clinical correlates at an unprecedented scale. Analysis of clinicopathologic annotations for over 11,000 cancer patients in the TCGA program leads to the generation of TCGA Clinical Data Resource, which provides recommendations of clinical outcome endpoint usage for 33 cancer types.

Original language: English (US)
Pages (from-to): 400-416.e11
Journal: Cell
Volume: 173
Issue number: 2
DOI: 10.1016/j.cell.2018.02.052
State: Published - Apr 5 2018

Fingerprint

Atlases
Genes
Genome
Survival
Neoplasms
Tumors
Genomics

Keywords

  • Cox proportional hazards regression model
  • TCGA
  • The Cancer Genome Atlas
  • clinical data resource
  • disease-free interval
  • disease-specific survival
  • follow-up time
  • overall survival
  • progression-free interval
  • translational research

ASJC Scopus subject areas

  • Biochemistry, Genetics and Molecular Biology (all)

Cite this

An Integrated TCGA Pan-Cancer Clinical Data Resource to Drive High-Quality Survival Outcome Analytics. / The Cancer Genome Atlas Research Network.

In: Cell, Vol. 173, No. 2, 05.04.2018, p. 400-416.e11.

@article{5dd103cdf663489c8612d45349110f75,
title = "An Integrated TCGA Pan-Cancer Clinical Data Resource to Drive High-Quality Survival Outcome Analytics",
keywords = "Cox proportional hazards regression model, TCGA, The Cancer Genome Atlas, clinical data resource, disease-free interval, disease-specific survival, follow-up time, overall survival, progression-free interval, translational research",
author = "{The Cancer Genome Atlas Research Network} and Jianfang Liu and Tara Lichtenberg and Hoadley, {Katherine A.} and Poisson, {Laila M.} and Lazar, {Alexander J.} and Cherniack, {Andrew D.} and Kovatich, {Albert J.} and Benz, {Christopher C.} and Levine, {Douglas A.} and Lee, {Adrian V.} and Larsson Omberg and Wolf, {Denise M.} and Shriver, {Craig D.} and Vesteinn Thorsson and Caesar-Johnson, {Samantha J.} and Demchok, {John A.} and Ina Felau and Melpomeni Kasapi and Ferguson, {Martin L.} and Hutter, {Carolyn M.} and Sofia, {Heidi J.} and Roy Tarnuzzer and Zhining Wang and Liming Yang and Zenklusen, {Jean C.} and Zhang, {Jiashan (Julia)} and Sudha Chudamani and Jia Liu and Laxmi Lolla and Rashi Naresh and Todd Pihl and Qiang Sun and Yunhu Wan and Ye Wu and Juok Cho and Timothy DeFreitas and Scott Frazer and Nils Gehlenborg and Gad Getz and Heiman, {David I.} and Jaegil Kim and Lawrence, {Michael S.} and Pei Lin and Sam Meier and Noble, {Michael S.} and Gordon Saksena and Doug Voet and Hailei Zhang and Brady Bernard and Zaren, {Howard A.}",
year = "2018",
month = "4",
day = "5",
doi = "10.1016/j.cell.2018.02.052",
language = "English (US)",
volume = "173",
pages = "400--416.e11",
journal = "Cell",
issn = "0092-8674",
publisher = "Cell Press",
number = "2",

}

TY - JOUR

T1 - An Integrated TCGA Pan-Cancer Clinical Data Resource to Drive High-Quality Survival Outcome Analytics

AU - The Cancer Genome Atlas Research Network

AU - Liu, Jianfang

AU - Lichtenberg, Tara

AU - Hoadley, Katherine A.

AU - Poisson, Laila M.

AU - Lazar, Alexander J.

AU - Cherniack, Andrew D.

AU - Kovatich, Albert J.

AU - Benz, Christopher C.

AU - Levine, Douglas A.

AU - Lee, Adrian V.

AU - Omberg, Larsson

AU - Wolf, Denise M.

AU - Shriver, Craig D.

AU - Thorsson, Vesteinn

AU - Caesar-Johnson, Samantha J.

AU - Demchok, John A.

AU - Felau, Ina

AU - Kasapi, Melpomeni

AU - Ferguson, Martin L.

AU - Hutter, Carolyn M.

AU - Sofia, Heidi J.

AU - Tarnuzzer, Roy

AU - Wang, Zhining

AU - Yang, Liming

AU - Zenklusen, Jean C.

AU - Zhang, Jiashan (Julia)

AU - Chudamani, Sudha

AU - Liu, Jia

AU - Lolla, Laxmi

AU - Naresh, Rashi

AU - Pihl, Todd

AU - Sun, Qiang

AU - Wan, Yunhu

AU - Wu, Ye

AU - Cho, Juok

AU - DeFreitas, Timothy

AU - Frazer, Scott

AU - Gehlenborg, Nils

AU - Getz, Gad

AU - Heiman, David I.

AU - Kim, Jaegil

AU - Lawrence, Michael S.

AU - Lin, Pei

AU - Meier, Sam

AU - Noble, Michael S.

AU - Saksena, Gordon

AU - Voet, Doug

AU - Zhang, Hailei

AU - Bernard, Brady

AU - Zaren, Howard A.

PY - 2018/4/5

Y1 - 2018/4/5

KW - Cox proportional hazards regression model

KW - TCGA

KW - The Cancer Genome Atlas

KW - clinical data resource

KW - disease-free interval

KW - disease-specific survival

KW - follow-up time

KW - overall survival

KW - progression-free interval

KW - translational research

UR - http://www.scopus.com/inward/record.url?scp=85044905247&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85044905247&partnerID=8YFLogxK

U2 - 10.1016/j.cell.2018.02.052

DO - 10.1016/j.cell.2018.02.052

M3 - Article

VL - 173

SP - 400-416.e11

JO - Cell

JF - Cell

SN - 0092-8674

IS - 2

ER -