Journal of Data Science ›› 2020, Vol. 18 ›› Issue (3): 409-432. doi: 10.6339/JDS.202007_18(3).0003


An epidemiological forecast model and software assessing interventions on the COVID-19 epidemic in China (with discussion)

Lili Wang1, Yiwang Zhou1, Jie He1, Bin Zhu2, Fei Wang3, Lu Tang4, Michael Kleinsasser1, Daniel Barker1, Marisa C. Eisenberg5, and Peter X.K. Song1

    1 Department of Biostatistics, University of Michigan, Ann Arbor, MI
    2 Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD
    3 Data Science Team, CarGurus, Cambridge, MA
    4 Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA
    5 Department of Epidemiology, University of Michigan, Ann Arbor, MI

  • Online: 2020-07-21 Published: 2020-07-22

Abstract: We develop a health informatics toolbox that enables timely analysis and evaluation of the time-course dynamics of a range of infectious disease epidemics. As a case study, we examine the novel coronavirus (COVID-19) epidemic using the publicly available data from the China CDC. This toolbox is built upon a hierarchical epidemiological model in which two observed time series of daily proportions of infected and removed cases are generated from the underlying infection dynamics governed by a Markov Susceptible-Infectious-Removed (SIR) infectious disease process. We extend the SIR model to incorporate various types of time-varying quarantine protocols, including government-level ‘macro’ isolation policies and community-level ‘micro’ social distancing (e.g. self-isolation and self-quarantine) measures. We develop a calibration procedure for under-reported infected cases. This toolbox provides forecasts, in both online and offline forms, and simulates the overall dynamics of the epidemic. An R software package is made publicly available, and examples of its use are illustrated. Some possible extensions of our novel epidemiological models are discussed.

Key words: coronavirus, infectious disease, MCMC, prediction, Runge–Kutta approximation, SIR model, turning point, under-reporting
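
As an illustration of the ideas named in the abstract, the following minimal sketch integrates a deterministic SIR system whose transmission rate is scaled by a time-varying modifier, using a classical fourth-order Runge–Kutta step. It is not the authors' R package and omits the hierarchical state-space layer, the MCMC fitting, and the under-reporting calibration. The function names, the symbol pi(t) for the modifier, the step-function form of the intervention, and all numerical values are assumptions chosen for illustration only.

```python
from typing import Callable, List, Tuple

State = Tuple[float, float, float]  # (S, I, R) as population proportions


def sir_rhs(state: State, t: float, beta: float, gamma: float,
            pi: Callable[[float], float]) -> State:
    """Right-hand side of an SIR system with transmission scaled by pi(t)."""
    s, i, _ = state
    new_infections = beta * pi(t) * s * i
    removals = gamma * i
    return (-new_infections, new_infections - removals, removals)


def rk4_step(state: State, t: float, h: float, beta: float, gamma: float,
             pi: Callable[[float], float]) -> State:
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(k: State, scale: float) -> State:
        return tuple(x + scale * dx for x, dx in zip(state, k))

    k1 = sir_rhs(state, t, beta, gamma, pi)
    k2 = sir_rhs(shifted(k1, h / 2), t + h / 2, beta, gamma, pi)
    k3 = sir_rhs(shifted(k2, h / 2), t + h / 2, beta, gamma, pi)
    k4 = sir_rhs(shifted(k3, h), t + h, beta, gamma, pi)
    return tuple(
        x + h / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(initial: State, days: int, beta: float, gamma: float,
             pi: Callable[[float], float], h: float = 1.0) -> List[State]:
    """Return the trajectory [state_0, ..., state_days] with step size h = 1 day."""
    trajectory = [initial]
    for day in range(days):
        trajectory.append(rk4_step(trajectory[-1], day * h, h, beta, gamma, pi))
    return trajectory


def step_quarantine(t: float) -> float:
    """Hypothetical modifier: no intervention before day 10, halved contact after."""
    return 1.0 if t < 10 else 0.5


if __name__ == "__main__":
    start = (0.999, 0.001, 0.0)  # assumed initial proportions
    baseline = simulate(start, 200, beta=0.3, gamma=0.1, pi=lambda t: 1.0)
    intervened = simulate(start, 200, beta=0.3, gamma=0.1, pi=step_quarantine)
    print("peak infected, no intervention:", max(s[1] for s in baseline))
    print("peak infected, with intervention:", max(s[1] for s in intervened))
```

Comparing the two runs shows how a modifier below 1 lowers the peak infected proportion. In a fuller treatment the modifier could take other shapes, such as a smooth decay, and the ODE solution would serve as the mean of a stochastic model rather than as the final forecast.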