Top Read Articles

    Design Considerations for Vaccine Trials with a Special Focus on COVID-19 Vaccine Development
    Jie Chen and Naitee Ting
    Journal of Data Science    2020, 18 (3): 550-580.   DOI: 10.6339/JDS.202007_18(3).0020
    Abstract (305)      PDF (476KB) (320)
    The COVID-19 pandemic has triggered explosive activities in searching for cures, including vaccines against the SARS-CoV-2 infection. As of April 30, 2020, there are at least 102 COVID-19 vaccine development programs worldwide, the majority of which are in preclinical development phases, five are in phase I trials, and three are in phase I/II trials. Experts caution against rushing COVID-19 vaccine development, not only because the knowledge about SARS-CoV-2 is lacking (albeit rapidly accumulating), but also because vaccine development is a complex, lengthy process with its own rules and timelines. Clinical trials are critically important in vaccine development, usually starting from small-scale phase I trials and gradually moving to the next phases (II and III) after the primary objectives are met. This paper is intended to provide an overview of design considerations for vaccine clinical trials, with a special focus on COVID-19 vaccine development. Given the current pandemic paradigm and unique features of vaccine development, our recommendations from a statistical design perspective for COVID-19 vaccine trials include: (1) novel trial design (e.g., master protocol) to expedite the simultaneous evaluation of multiple candidate vaccines or vaccine doses, (2) human challenge studies to accelerate clinical development, (3) adaptive design strategies (e.g., group sequential designs) for early termination due to futility, efficacy, and/or safety, (4) extensive modeling and simulation to characterize and establish long-term efficacy based on early-phase or short-term follow-up data, (5) safety evaluation as one of the primary focuses throughout all phases of clinical trials, (6) leveraging real-world data and evidence in vaccine trial design and analysis to establish vaccine effectiveness, and (7) global collaboration to form a joint development effort for more efficient use of resources and expertise and data sharing.
    Gene Set Enrichment Analysis in RNA-Seq Data
    Chen-An Tsai and Pei-Hsun Li
    Journal of Data Science    2020, 18 (4): 632-648.   DOI: 10.6339/JDS.202010_18(4).0003
    Abstract (185)      PDF (674KB) (54)
    Editorial: A reformed Journal of Data Science for the era of data science
    Jun Yan
    Journal of Data Science    2020, 18 (3): 405-406.   DOI: 10.6339/JDS.202007_18(3).0001
    Abstract (154)      PDF (143KB) (122)
    [ Discussion Paper ] An Epidemiological Forecast Model and Software Assessing Interventions on COVID-19 Epidemic in China
    Lili Wang, Yiwang Zhou, Jie He, Bin Zhu, Fei Wang, Lu Tang, Marisa C. Eisenberg, Peter X.K. Song
    Journal of Data Science    0, (): 1-.  
    Accepted: 03 April 2020

    Abstract (146)      PDF (774KB) (147)
    We develop a health informatics toolbox that enables timely analysis and evaluation of the time-course dynamics of a range of infectious disease epidemics. As a case study, we examine the novel coronavirus (COVID-19) epidemic using the publicly available data from the China CDC. This toolbox is built upon a hierarchical epidemiological model in which two observed time series of daily proportions of infected and removed cases are generated from the underlying infection dynamics governed by a Markov Susceptible-Infectious-Removed (SIR) infectious disease process. We extend the SIR model to incorporate various types of time-varying quarantine protocols, including government-level ‘macro’ isolation policies and community-level ‘micro’ social distancing (e.g. self-isolation and self-quarantine) measures. We develop a calibration procedure for under-reported infected cases. This toolbox provides forecasts, in both online and offline forms, as well as simulating the overall dynamics of the epidemic. An R software package is made available for the public, and examples on the use of this software are illustrated. Some possible extensions of our novel epidemiological models are discussed.
    [ Discussion Paper ] Tracking Reproductivity of COVID-19 Epidemic in China with Varying Coefficient SIR Model
    Haoxuan Sun, Yumou Qiu, Han Yan, Yaxuan Huang, Yuru Zhu, Jia Gu, Song Xi Chen
    Journal of Data Science    0, (): 2-.  
    Accepted: 23 April 2020

    Abstract (135)      PDF (1138KB) (128)
    We propose a varying coefficient Susceptible-Infected-Removal (vSIR) model that allows changing infection and removal rates for the latest coronavirus (COVID-19) outbreak in China. The vSIR model together with proposed estimation procedures allow one to track the reproductivity of COVID-19 through time and to assess the effectiveness of the control measures implemented since Jan 23 2020, when the city of Wuhan was locked down, followed by an extremely high level of self-isolation in the population. Our study finds that the reproductivity of COVID-19 had been significantly slowed down in the three weeks from January 27 to February 17 with 96.3% and 95.1% reductions in the effective reproduction numbers R among the 30 provinces and 15 Hubei cities, respectively. Predictions of the ending times and the total numbers of infected are made under three scenarios of the removal rates. The paper provides a timely model and associated estimation and prediction methods which may be applied in other countries to track, assess and predict the epidemic of COVID-19 or other infectious diseases.
    Data Visualization and Descriptive Analysis for Understanding Epidemiological Characteristics of COVID-19: A Case Study of a Dataset from January 22, 2020 to March 29, 2020
    Yasin Khadem Charvadeh and Grace Y. Yi
    Journal of Data Science    2020, 18 (3): 526-535.   DOI: 10.6339/JDS.202007_18(3).0018
    Abstract (121)      PDF (475KB) (82)
    COVID-19 is a disease caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) that was reported to spread in people in December 2019. Understanding epidemiological features of COVID-19 is important for the ongoing global efforts to contain the virus. As a complement to the available work, in this article we analyze the Kaggle novel coronavirus dataset of 3397 patients dated from January 22, 2020 to March 29, 2020. We employ semiparametric and nonparametric survival models as well as text mining and data visualization techniques to examine the clinical manifestations and epidemiological features of COVID-19. Our analysis shows that: (i) the median incubation time is about 5 days and older people tend to have a longer incubation period; (ii) the median time for infected people to recover is about 20 days, and the recovery time is significantly associated with age but not gender; (iii) the fatality rate is higher for older infected patients than for younger patients.
    An epidemiological forecast model and software assessing interventions on the COVID-19 epidemic in China (with discussion)
    Lili Wang, Yiwang Zhou, Jie He, Bin Zhu, Fei Wang, Lu Tang, Michael Kleinsasser, Daniel Barker, Marisa C. Eisenberg, and Peter X.K. Song
    Journal of Data Science    2020, 18 (3): 409-432.   DOI: 10.6339/JDS.202007_18(3).0003
    Abstract (115)      PDF (1444KB) (122)
    We develop a health informatics toolbox that enables timely analysis and evaluation of the time-course dynamics of a range of infectious disease epidemics. As a case study, we examine the novel coronavirus (COVID-19) epidemic using the publicly available data from the China CDC. This toolbox is built upon a hierarchical epidemiological model in which two observed time series of daily proportions of infected and removed cases are generated from the underlying infection dynamics governed by a Markov Susceptible-Infectious-Removed (SIR) infectious disease process. We extend the SIR model to incorporate various types of time-varying quarantine protocols, including government-level ‘macro’ isolation policies and community-level ‘micro’ social distancing (e.g. self-isolation and self-quarantine) measures. We develop a calibration procedure for under-reported infected cases. This toolbox provides forecasts, in both online and offline forms, as well as simulating the overall dynamics of the epidemic. An R software package is made available for the public, and examples on the use of this software are illustrated. Some possible extensions of our novel epidemiological models are discussed.
    Meta-Analysis of Several Epidemic Characteristics of COVID-19
    Panpan Zhang, Tiandong Wang, and Sharon X. Xie
    Journal of Data Science    2020, 18 (3): 536-549.   DOI: 10.6339/JDS.202007_18(3).0019
    Abstract (107)      PDF (376KB) (57)
    As the COVID-19 pandemic has strongly disrupted people’s daily work and life, a great amount of scientific research has been conducted to understand the key characteristics of this new epidemic. In this manuscript, we focus on four crucial epidemic metrics with regard to COVID-19, namely the basic reproduction number, the incubation period, the serial interval and the epidemic doubling time. We collect relevant studies based on the COVID-19 data in China and conduct a meta-analysis to obtain pooled estimates on the four metrics. From the summary results, we conclude that COVID-19 has stronger transmissibility than SARS, implying that stringent public health strategies are necessary.
    COVID-19 Fatality: A Cross-Sectional Study using Adaptive Lasso Penalized Sliced Inverse Regression
    Kaida Cai, Wenqing He, and Grace Y. Yi
    Journal of Data Science    2020, 18 (3): 483-494.   DOI: 10.6339/JDS.202007_18(3).0015
    Abstract (106)      PDF (406KB) (60)
    Coronavirus disease 2019 (COVID-19) is an infectious disease caused by severe acute respiratory syndrome coronavirus, which was declared a global pandemic by the World Health Organization on March 11, 2020. In this work, we conduct a cross-sectional study to investigate how the infection fatality rate (IFR) of COVID-19 may be associated with possible geographical or demographical features of the infected population. We employ a multiple index model in combination with sliced inverse regression to facilitate the relationship between the IFR and possible risk factors. To select associated features for the infection fatality rate, we utilize an adaptive Lasso penalized sliced inverse regression method, which achieves variable selection and sufficient dimension reduction simultaneously with unimportant features removed automatically. We apply the proposed method to conduct a cross-sectional study for the COVID-19 data obtained from two time points of the outbreak.
    Editorial: Data science in action in response to the outbreak of COVID-19
    Dean Follmann, Peter X. K. Song, Hansheng Wang, and Jun Yan
    Journal of Data Science    2020, 18 (3): 407-408.   DOI: 10.6339/JDS.202007_18(3).0002
    Abstract (94)      PDF (195KB) (93)
    Tracking Reproductivity of COVID-19 Epidemic in China with Varying Coefficient SIR Model
    Haoxuan Sun, Yumou Qiu, Han Yan, Yaxuan Huang, Yuru Zhu, Jia Gu, and Song Xi Chen
    Journal of Data Science    2020, 18 (3): 455-472.   DOI: 10.6339/JDS.202007_18(3).0010
    Abstract (94)      PDF (894KB) (276)
    We propose a varying coefficient Susceptible-Infected-Removal (vSIR) model that allows changing infection and removal rates for the latest coronavirus (COVID-19) outbreak in China. The vSIR model together with proposed estimation procedures allow one to track the reproductivity of COVID-19 through time and to assess the effectiveness of the control measures implemented since Jan 23 2020, when the city of Wuhan was locked down, followed by an extremely high level of self-isolation in the population. Our study finds that the reproductivity of COVID-19 had been significantly slowed down in the three weeks from January 27th to February 17th with 96.3% and 95.1% reductions in the effective reproduction numbers R among the 30 provinces and 15 Hubei cities, respectively. Predictions of the ending times and the total numbers of infected are made under three scenarios of the removal rates. The paper provides a timely model and associated estimation and prediction methods which may be applied in other countries to track, assess and predict the epidemic of COVID-19 or other infectious diseases.
    A Meta Analysis for the Basic Reproduction Number of COVID-19 with Application in Evaluating the Effectiveness of Isolation Measures in Different Countries
    Jianghu (James) Dong, Yongdao Zhou, Ying Zhang, Thomas Flaherty, and Douglas Franz
    Journal of Data Science    2020, 18 (3): 496-510.   DOI: 10.6339/JDS.202007_18(3).0016
    Abstract (86)      PDF (1163KB) (69)
    COVID-19 is quickly spreading around the world and carries along with it a significant threat to public health. This study sought to apply meta-analysis to more accurately estimate the basic reproduction number (R0) because prior estimates of R0 have a broad range from 1.95 to 6.47 in the existing literature. Utilizing meta-analysis techniques, we can determine a more robust estimation of R0, which is substantially larger than that provided by the World Health Organization (WHO). A susceptible-infectious-removed (SIR) model for the new infection cases based on R0 from meta-analysis is proposed to estimate the effective reproduction number Rt. The curves of estimated Rt values over time can illustrate that the isolation measures enforced in China and South Korea were substantially more effective in controlling COVID-19 compared to the measures enacted early in both Italy and the United States. Finally, we present the daily standardized infection cases per million population over time across countries, which is a good index to indicate the effectiveness of isolation measures on the prevention of COVID-19. This standardized infection case determines whether the current infection severity status is out of range of the national health capacity to care for patients.
    Discussion of “An epidemiological forecast model and software assessing interventions on the COVID-19 epidemic in China”
    Shannon Gallagher
    Journal of Data Science    2020, 18 (3): 437-437.   DOI: 10.6339/JDS.202007_18(3).0005
    Abstract (81)      PDF (359KB) (41)
    On Identification of High Risk Carriers of COVID-19 Using Masked Mobile Device Data 
    Da Huang, Xuening Zhu, Weidong Luo, Hao Yin, Jing Hong, Yu Chen, Jing Zhou, and Hansheng Wang
    Journal of Data Science    2020, 18 (Volume S1): 3-.   DOI: 10.6339/JDS.202012_18(S1).0002
    Abstract (69)      PDF (289KB) (29)
    Rejoinder: An epidemiological forecast model and software assessing interventions on COVID-19 epidemic in China
    Lili Wang, Yiwang Zhou, Jie He, Bin Zhu, Fei Wang, Lu Tang, Michael Kleinsasser, Daniel Barker, Marisa C. Eisenberg, and Peter X.K. Song
    Journal of Data Science    2020, 18 (3): 446-454.   DOI: 10.6339/JDS.202007_18(3).0009
    Abstract (68)      PDF (1721KB) (96)
    Discussion of “An epidemiological forecast model and software assessing interventions on the COVID-19 epidemic in China”
    Tianjian Zhou and Yuan Ji
    Journal of Data Science    2020, 18 (3): 440-442.   DOI: 10.6339/JDS.202007_18(3).0007
    Abstract (68)      PDF (448KB) (46)      PDF (mobile) (448KB) (1)
    Assessing the Impacts of Mutations to the Structure of COVID-19 Spike Protein via Sequential Monte Carlo
    Samuel W. K. Wong
    Journal of Data Science    2020, 18 (3): 511-525.   DOI: 10.6339/JDS.202007_18(3).0017
    Abstract (67)      PDF (3056KB) (34)
    Proteins play a key role in facilitating the infectiousness of the 2019 novel coronavirus. A specific spike protein enables this virus to bind to human cells, and a thorough understanding of its 3-dimensional structure is therefore critical for developing effective therapeutic interventions. However, its structure may continue to evolve over time as a result of mutations. In this paper, we use a data science perspective to study the potential structural impacts due to ongoing mutations in its amino acid sequence. To do so, we identify a key segment of the protein and apply a sequential Monte Carlo sampling method to detect possible changes to the space of low-energy conformations for different amino acid sequences. Such computational approaches can further our understanding of this protein structure and complement laboratory efforts.
    Discussion of “An epidemiological forecast model and software assessing interventions on the COVID-19 epidemic in China”
    Debangan Dey and Vadim Zipunnikov
    Journal of Data Science    2020, 18 (3): 433-436.   DOI: 10.6339/JDS.202007_18(3).0004
    Abstract (64)      PDF (474KB) (99)
    Discussion of “Tracking reproductivity of COVID-19 epidemic in China with varying coefficient SIR model”
    Yukang Jiang, Jianbin Tan, Ting Tian, and Xueqin Wang
    Journal of Data Science    2020, 18 (3): 473-474.   DOI: 10.6339/JDS.202007_18(3).0011
    Abstract (63)      PDF (457KB) (36)
    Subsampled Data Based Alternative Regularized Estimators 
    Subir Ghosh, Gabriel Ruiz, and Brandon Wales
    Journal of Data Science    2020, 18 (2): 238-256.   DOI: 10.6339/JDS.202004_18(2).0002
    Abstract (61)      PDF (308KB) (48)
    Subsampling the data is used in this paper as a learning method about the influence of the data points for drawing inference on the parameters of a fitted logistic regression model. The alternative, alternative regularized, alternative regularized lasso, and alternative regularized ridge estimators are proposed for the parameter estimation of logistic regression models and are then compared with the maximum likelihood estimators. The proposed alternative regularized estimators are obtained by using a tuning parameter but the proposed alternative estimators are not regularized. The proposed alternative regularized lasso estimators are the averaged standard lasso estimators and the alternative regularized ridge estimators are also the averaged standard ridge estimators over subsets of groups where the number of subsets could be smaller than the number of parameters. The values of the tuning parameters are obtained to make the alternative regularized estimators very close to the maximum likelihood estimators and the process is explained with two real data sets as well as a simulation study. The alternative and alternative regularized estimators always have closed form expressions in terms of observations, which the maximum likelihood estimators do not have. When the maximum likelihood estimators do not have closed form expressions, the alternative regularized estimators thus obtained provide approximate closed form expressions for them.
    Discussion of “Tracking reproductivity of COVID-19 epidemic in China with varying coefficient SIR model”
    Lu Tang
    Journal of Data Science    2020, 18 (3): 475-476.   DOI: 10.6339/JDS.202007_18(3).0012
    Abstract (61)      PDF (390KB) (38)
    Rejoinder: “Tracking Reproductivity of COVID-19 Epidemic with Varying Coefficient SIR Model”
    Haoxuan Sun, Yumou Qiu, Han Yan, Yaxuan Huang, Yuru Zhu, Jia Gu, and Song Xi Chen
    Journal of Data Science    2020, 18 (3): 480-482.   DOI: 10.6339/JDS.202007_18(3).0014
    Abstract (59)      PDF (488KB) (49)
    Discussion of “An epidemiological forecast model and software assessing interventions on the COVID-19 epidemic in China”
    Yifan Zhu and Ying Qing Chen
    Journal of Data Science    2020, 18 (3): 443-445.   DOI: 10.6339/JDS.202007_18(3).0008
    Abstract (59)      PDF (410KB) (38)
    Discussion of “An epidemiological forecast model and software assessing interventions on the COVID-19 epidemic in China”
    Kelly R. Moran
    Journal of Data Science    2020, 18 (3): 438-439.   DOI: 10.6339/JDS.202007_18(3).0006
    Abstract (58)      PDF (345KB) (41)
    Discussion of “Tracking reproductivity of COVID-19 epidemic in China with varying coefficient SIR model”
    Lili Wang, Fei Wang, Yiwang Zhou, and Peter X.K. Song
    Journal of Data Science    2020, 18 (3): 477-479.   DOI: 10.6339/JDS.202007_18(3).0013
    Abstract (57)      PDF (514KB) (34)
    Editorial: Data Science in Action in Response to the Outbreak of COVID-19 in China
    Dean Follmann, Peter X. K. Song, Hansheng Wang, and Jun Yan
    Journal of Data Science    0, (): 1-.   DOI: 10.6339/JDS.202007_18(S1).0001
    Abstract (55)      PDF (167KB) (53)
    The Real-time Effect of Public Health Interventions on the COVID-19 Epidemic in Hubei Province 
    Jiamin Liu, Ze Chen, Jianqiang Zhang, Yanyan Ouyang, Xu Guo, and Wangli Xu
    Journal of Data Science    2020, 18 (Volume S1): 61-.   DOI: 10.6339/JDS.202012_18(S1).0006
    Abstract (55)      PDF (643KB) (45)
    Extended Poisson-Frechet Distribution: Mathematical Properties and Applications to Survival and Repair Times
    M. S. Hamed
    Journal of Data Science    2020, 18 (2): 319-342.   DOI: 10.6339/JDS.202004_18(2).0006
    Abstract (52)      PDF (859KB) (30)
    In this paper, a new four parameter zero truncated Poisson Frechet distribution is defined and studied. Various structural mathematical properties of the proposed model including ordinary moments, incomplete moments, generating functions, order statistics, residual and reversed residual life functions are investigated. The maximum likelihood method is used to estimate the model parameters. We assess the performance of the maximum likelihood method by means of a numerical simulation study. The new distribution is applied for modeling two real data sets to illustrate empirically its flexibility.
    Investigating the Repeatability of the Extracted Factors in Relation to the Type of Rotation Used, and the Level of Random Error: A Simulation Study
    Dimitris Panaretos, George Tzavelas, Malvina Vamvakari, Demosthenes Panagiotakos
    Journal of Data Science    2020, 18 (2): 390-404.   DOI: 10.6339/JDS.202004_18(2).0010
    Abstract (48)      PDF (377KB) (28)
    Factor analysis (FA) is the most commonly used pattern recognition methodology in social and health research. A technique that may help to better retrieve true information from FA is the rotation of the information axes. The purpose of this study was to evaluate whether the selection of rotation type affects the repeatability of the patterns derived from FA, under various scenarios of random error introduced, based on simulated data from the Standard Normal distribution. It was observed that when applying promax non-orthogonal rotation, the results were more repeatable as compared to the orthogonal rotation, irrespective of the level of random error introduced in the model.
    Zografos Balakrishnan Power Lindley Distribution
    Noor Shahid, Rashida Khalil and Javeria Khokhar
    Journal of Data Science    2020, 18 (2): 279-298.   DOI: 10.6339/JDS.202004_18(2).0004
    Abstract (47)      PDF (701KB) (43)
    In this paper, the Zografos Balakrishnan Power Lindley (ZB-PL) distribution is obtained through the generalization of the Power Lindley distribution using the Zografos and Balakrishnan (2009) technique. For this technique, density of upper record values exists as their special case. Probability density (pdf), cumulative distribution (cdf) and hazard rate function (hrf) of the proposed distribution are obtained. The probability density and cumulative distribution function are expanded as linear combination of the density and distribution function of Exponentiated Power Lindley (EPL) distribution. This expansion is further used to study different properties of the new distribution. Some mathematical and statistical properties such as asymptotes, quantile function, moments, mgf, mean deviation, Rényi entropy and reliability are also discussed. Probability density (pdf), cumulative distribution (cdf) and hazard rate (hrf) functions are graphically presented for different values of the parameters. In the end, the maximum likelihood method is used to estimate the unknown parameters, and an application to a real data set is provided. It has been observed that the proposed distribution provides a better fit than many useful distributions for the given data set.
    Statistical Inference for K Exponential Populations Under Joint Progressive Type-I Censored Scheme
    O.E. Abo-Kasem and Mazen Nassar
    Journal of Data Science    2020, 18 (2): 376-389.   DOI: 10.6339/JDS.202004_18(2).0009
    Abstract (45)      PDF (478KB) (17)
    In this article, the maximum likelihood estimators of the parameters of k independent exponential populations are obtained based on the joint progressive type-I censored (JPC-I) scheme. The Bayes estimators are also obtained by considering three different loss functions. The approximate confidence, two bootstrap confidence and the Bayes credible intervals for the unknown parameters are discussed. Simulated and real data sets are analyzed to illustrate the theoretical results.
    Estimation of the Bivariate Kumaraswamy Lifetime Distribution under Progressive Type-I Censoring 
    Hanan. M. Aly, Hiba. Z. Muhammed, and Ola. A. Abuelamayem
    Journal of Data Science    2020, 18 (4): 739-749.   DOI: 10.6339/JDS.202010_18(4).0009
    Abstract (42)      PDF (332KB) (61)
    [ Comment ] Discussion of “Tracking Reproductivity of COVID-19 Epidemic in China with Varying Coefficient SIR Model”
    Yukang Jiang, Jianbin Tan, Ting Tian, Xueqin Wang
    Journal of Data Science    0, (): 1-.  
    Accepted: 23 April 2020

    Abstract (40)      PDF (222KB) (45)
    Inference and Optimal Design of Accelerated Life Test using Geometric Process for Generalized Half-Logistic Distribution under Progressive Type-II Censoring
    H. M. Aly, S. O. Bleed, and H. Z. Muhammed
    Journal of Data Science    2020, 18 (2): 358-375.   DOI: 10.6339/JDS.202001_18(2).0008
    Abstract (40)      PDF (295KB) (31)
    In this paper, the geometric process model is used for analyzing constant stress accelerated life testing. The generalized half logistic lifetime distribution is considered under progressive type-II censoring. Statistical inference is developed on the basis of maximum likelihood approach for estimating the unknown parameters and getting both the asymptotic and bootstrap confidence intervals. Besides, the predictive values of the reliability function under usual conditions are found. Moreover, the method of finding the optimal value of the ratio of the geometric process is presented. Finally, a simulation study is presented to illustrate the proposed procedures and to evaluate the performance of the geometric process model.
    A Two-Parameter Distribution with Increasing and Bathtub Hazard Rate
    Pedro L. Ramos, Francisco Louzada, and Fernando A. Moala
    Journal of Data Science    2020, 18 (4): 813-827.   DOI: 10.6339/JDS.202010_18(4).0014
    Abstract (39)      PDF (434KB) (23)
    Stationary Bootstrap Based Multi-Step Forecasts for Unrestricted VAR Models 
    U. Beyaztas and Abdel-Salam G. Abdel-Salam
    Journal of Data Science    2020, 18 (4): 682-696.   DOI: 10.6339/JDS.202010_18(4).0006
    Abstract (38)      PDF (396KB) (43)
    Wavelet-Based Robust Estimation of Hurst Exponent with Application in Visual Impairment Classification
    Chen Feng, Yajun Mei, and Brani Vidakovic
    Journal of Data Science    2020, 18 (4): 581-605.   DOI: 10.6339/JDS.202010_18(4).0001
    Abstract (37)      PDF (1730KB) (44)
    Cubic Rank Transmuted Modified Burr III Distribution: Development, Properties, Characterizations and Applications
    Fiaz Ahmad Bhatti, G.G. Hamedani, Seyed Morteza Najibi and Munir Ahmad
    Journal of Data Science    2020, 18 (2): 299-318.   DOI: 10.6339/JDS.202004_18(2).0005
    Abstract (37)      PDF (634KB) (17)
    We propose a lifetime distribution with flexible hazard rate called cubic rank transmuted modified Burr III (CRTMBIII) distribution. We develop the proposed distribution on the basis of the cubic ranking transmutation map. The density function of CRTMBIII is symmetrical, right-skewed, left-skewed, exponential, arc, J and bimodal shaped. The flexible hazard rate of the proposed model can accommodate almost all types of shapes such as unimodal, bimodal, arc, increasing, decreasing, decreasing-increasing-decreasing, inverted bathtub and modified bathtub. To show the importance of proposed model, we present mathematical properties such as moments, incomplete moments, inequality measures, residual life function and stress strength reliability measure. We characterize the CRTMBIII distribution via techniques. We address the maximum likelihood method for the model parameters. We evaluate the performance of the maximum likelihood estimates (MLEs) via simulation study. We establish empirically that the proposed model is suitable for strengths of glass fibers. We apply goodness of fit statistics and the graphical tools to examine the potentiality and utility of the CRTMBIII distribution.
    The Performance of Largest Caliper Matching Through Monte Carlo Simulation and an Application to Support Data
    Sharif Mahmood
    Journal of Data Science    2020, 18 (2): 343-357.   DOI: 10.6339/JDS.202001_18(2).0007
    Abstract (36)      PDF (383KB) (19)
    The paper presents an investigation of estimating treatment effect using different matching methods through Monte Carlo simulation. The study proposes a new method, largest caliper matching, which is computationally efficient and convenient in implication, and compares its performance with five other popular matching methods. The bias, empirical standard deviation and the mean square error of the estimates in the simulation are checked under different treatment prevalence and different distributions of covariates. It is shown that largest caliper matching improves estimation of the population treatment effect in a wide range of settings compared to other methods. It reduces the bias if the data contains the selection on observables and treatment imbalances. Also, findings about the relative performance of the different matching methods are provided to help practitioners determine which method should be used under certain situations. An application of these methods is implemented on the Study to Understand Prognoses and Preferences for Outcomes and Risks of Treatments (SUPPORT) data, and important demographic and socioeconomic factors that may affect the clinical outcome are also reported in this paper.
    Parameter Estimation of a Seasonal Poisson INAR(1) Model with Different Monthly Means 
    Turaj Vazifedan, Homa Jalaeian Taghadomi, Xixi Wang, and Mujde Erten-Unal
    Journal of Data Science    2020, 18 (4): 697-717.   DOI: 10.6339/JDS.202010_18(4).0007
    Abstract (36)      PDF (663KB) (62)