Accurate and efficient estimation of small P-values with the cross-entropy method: Applications in genomic data analysis

Yang Shi, Mengqiao Wang, Weiping Shi, Ji Hyun Lee, Huining Kang, Hui Jiang

Research output: Contribution to journal › Article

Abstract

Motivation: Large-scale genomic studies often require accurate estimates of small P-values, both to adjust for multiple hypothesis tests and to rank genomic features by statistical significance. For complicated test statistics whose cumulative distribution functions are analytically intractable, existing methods usually perform poorly on small P-values, either because they lack accuracy or because they are computationally restrictive. We propose a general approach for accurately and efficiently estimating small P-values for a broad range of complicated test statistics, based on the principle of the cross-entropy method and Markov chain Monte Carlo sampling techniques.

Results: We evaluate the performance of the proposed algorithm through simulations and demonstrate its application to three real-world examples in genomic studies. The results show that our approach can accurately evaluate small to extremely small P-values (e.g. 10^-6 to 10^-100). The proposed algorithm can help improve some existing test procedures and support the development of new test procedures in genomic studies.
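For orientation, the general cross-entropy idea can be illustrated on a toy problem. The sketch below is not the authors' algorithm: it assumes a standard normal test statistic, a one-parameter family of shifted normal proposals, and direct sampling in place of the Markov chain Monte Carlo step named in the abstract. The function name `ce_tail_probability` and all parameter values are hypothetical choices for illustration. The proposal mean is raised in stages until the threshold is no longer rare under it, and the P-value is then estimated by importance sampling.

```python
import math
import random


def ce_tail_probability(gamma, n=10000, rho=0.1, seed=0, max_iter=50):
    """Toy cross-entropy estimate of P(Z > gamma) for Z ~ N(0, 1).

    Proposals are N(mu, 1); mu is tuned over a sequence of
    intermediate levels that rise towards gamma.
    """
    rng = random.Random(seed)
    mu = 0.0
    for _ in range(max_iter):
        xs = [rng.gauss(mu, 1.0) for _ in range(n)]
        # Intermediate level: the (1 - rho) sample quantile, capped at gamma.
        level = min(sorted(xs)[int((1.0 - rho) * n)], gamma)
        elite = [x for x in xs if x >= level]
        # Likelihood ratio of N(0, 1) to N(mu, 1), evaluated at each elite sample.
        ws = [math.exp(-mu * x + 0.5 * mu * mu) for x in elite]
        # Cross-entropy update for a normal mean: weighted mean of the elite.
        mu = sum(w * x for w, x in zip(ws, elite)) / sum(ws)
        if level >= gamma:
            break
    # Final importance-sampling estimate using the tuned proposal.
    xs = [rng.gauss(mu, 1.0) for _ in range(n)]
    total = sum(math.exp(-mu * x + 0.5 * mu * mu) for x in xs if x > gamma)
    return total / n, mu


if __name__ == "__main__":
    estimate, tuned_mu = ce_tail_probability(6.0)
    exact = 0.5 * math.erfc(6.0 / math.sqrt(2.0))  # closed form, for comparison
    print(estimate, exact, tuned_mu)
```

In this toy case the exact tail probability is available in closed form, so the estimate can be checked directly. For an analytically intractable statistic no such check exists, which is the setting the abstract addresses.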

Original language: English (US)
Article number: bty1005
Pages (from-to): 2441-2448
Number of pages: 8
Journal: Bioinformatics
Volume: 35
Issue number: 14
State: Published - Jul 15 2019

ASJC Scopus subject areas

  • Statistics and Probability
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics
