Time: 9:30-11:30am, July 11th, 2019, Thursday
Venue: Room A1514, Science Building, North Zhongshan Road Campus
Speaker: Professor Ying Wei, Columbia University
Title 1: Inference of quantile hurdle model with its application in scRNA-seq data
Abstract: Developing differential gene expression (DE) analysis methods for scRNA-seq data is challenging due to the special characteristics of such data. First, multimodality of gene expression and heterogeneity among different cell conditions lead to divergences in the tail events or crossings of the cell distributions. As a result, existing parametric approaches that target the mean difference in gene expression levels are limited, while quantile regression, which examines various locations in the distribution, offers higher power. In addition, scRNA-seq data generally contain a large fraction of dropout events, causing zero-inflation in the expression. We propose a rank-score based test at a fixed quantile under a quantile hurdle model for zero-inflated outcomes, and use a minP procedure to combine the test statistics over multiple quantile levels. We evaluated the proposed tests in simulation studies and showed higher power in detecting true differentially expressed genes (DEGs) compared with existing methods. We also applied them to a real human scRNA-seq dataset to study DEGs between neoplastic and regular cells, and successfully detected a group of crucial genes associated with tumors.
Title 2: Conditional Quantile Decision Trees and Forests with application to predicting the risk of Post-Traumatic Stress Disorder (PTSD) after experiencing an acute coronary syndrome
Abstract: Classification and regression trees (CART) are a classic statistical learning method that efficiently partitions the sample space into mutually exclusive subspaces with distinct means of an outcome of interest. They are a powerful tool for efficient subgroup analysis and allow for complex associations and interactions to achieve high prediction accuracy and stability. Hence, they are appealing for precision health applications that deal with large amounts of data from electronic medical records (EMRs), genomics, and mobile devices and aim to provide a transparent decision mechanism. Although there is a vast literature on decision trees and random forests, most algorithms identify subspaces with distinct outcome means. The most vulnerable or high-risk groups for certain diseases are often patients with extremely high (or low) biomarker and phenotype values, and means-based partitioning may not be effective for identifying such patients. We propose a new regression tree framework based on quantile regression that partitions the sample space and predicts the outcome of interest based on conditional quantiles of the outcome variable. We implemented the conditional quantile trees/forests and evaluated their performance in predicting the risk of developing PTSD after experiencing an acute coronary syndrome (ACS), using observational cohort data from the REactions to Acute Care and Hospitalization (REACH) study at New York Presbyterian Hospital. The results show that the conditional quantile-based trees/forests have better discrimination power for identifying patients with severe PTSD symptoms than the classical mean-based CART.
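Speaker’s Bio: Professor Ying Wei received bachelor's and master's degrees from the University of Science and Technology of China and a Ph.D. from the University of Illinois at Urbana-Champaign. Professor Wei was an Assistant Professor at Columbia University from 2004 to 2011 and an Associate Professor from 2011 to 2017, and has been a Professor there since 2017. Professor Wei has published more than forty papers in journals including the top statistics journals Annals of Statistics, Journal of American Statistical Association, and Biometrika.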