Modern science requires
modern statistics.
Nonparametric and semiparametric (causal) inference
In many cases, answering the scientific question of interest requires, as an intermediate step, estimation of a high- or indeed infinite-dimensional object. For example, to estimate the causal effect of a treatment on a given outcome, we require an estimate of the probability of the outcome as a function of treatment and confounders, an estimate of the probability of treatment as a function of confounders, or both. In modern applications, measured confounders may be high-dimensional, with little background knowledge informing the analyst as to how they relate to treatment probability or outcome. Thus, we are motivated to employ flexible regression techniques, such as those developed in machine learning, to estimate these quantities. Demonstrating valid statistical properties of estimators based on these flexible approaches can be quite challenging and requires expertise in empirical process theory and non/semiparametric efficiency theory. My research leverages these areas to propose new approaches for tackling the challenges that arise in these difficult settings.
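To make the two intermediate quantities concrete, here is a minimal sketch of one standard way to combine them, the augmented inverse probability weighted (AIPW) estimator of an average treatment effect. It is an illustration only, not necessarily the approach taken in this research. The function name `aipw_ate` and its arguments are hypothetical, and the sketch assumes the outcome predictions (`q1`, `q0`) and treatment probabilities (`g`) have already been produced by some regression method.

```python
from statistics import mean


def aipw_ate(y, a, q1, q0, g):
    """AIPW estimate of the average treatment effect (illustrative sketch).

    y  : observed outcomes
    a  : binary treatment indicators (0 or 1)
    q1 : predicted outcome under treatment, given each unit's confounders
    q0 : predicted outcome under control, given each unit's confounders
    g  : predicted probability of treatment, given each unit's confounders
    All names are assumptions made for this example.
    """
    terms = []
    for yi, ai, q1i, q0i, gi in zip(y, a, q1, q0, g):
        if not 0.0 < gi < 1.0:
            raise ValueError("treatment probabilities must lie strictly between 0 and 1")
        # Outcome prediction at the treatment level actually received.
        qa = q1i if ai == 1 else q0i
        # Inverse probability weight: positive for treated, negative for control.
        weight = ai / gi - (1 - ai) / (1 - gi)
        # Plug-in contrast plus a weighted residual correction.
        terms.append(q1i - q0i + weight * (yi - qa))
    return mean(terms)


# Toy inputs, invented for illustration only.
estimate = aipw_ate(
    y=[1, 0, 1, 0],
    a=[1, 0, 1, 0],
    q1=[1.0, 0.5, 1.0, 0.5],
    q0=[0.5, 0.0, 0.5, 0.0],
    g=[0.5, 0.5, 0.5, 0.5],
)
```

The correction term uses the treatment probabilities to adjust the outcome-based contrast. Under suitable conditions, this lets the estimator remain consistent if either of the two regressions is estimated well, which is one reason estimators of this form pair naturally with flexible regression techniques.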
Machine learning
Recent advances in theoretical statistics and computer science have led to the development of machine learning algorithms that accurately assess risks of future adverse health outcomes. These assessments may be used by clinicians and public health practitioners to appropriately guide decision making. Machine learning can also identify subgroups of patients who benefit more (or less) from treatments or public health interventions, which may lead to targeted treatment and intervention strategies. In light of these opportunities, machine learning is poised to have a large impact on clinical and public health decision making in our lifetime. My research involves developing methods for building accurate and interpretable machine learning models, as well as the statistical methods needed to evaluate the practical performance of these models. I am also interested in areas of machine learning that touch on causal inference, such as fairness in machine learning and using machine learning for precision medicine.
Analysis of preventive vaccines
A large component of my research is motivated by applications in studies of preventive vaccines. I have worked on clinical trials and observational studies of vaccines aimed at preventing many diseases, including HIV, malaria, dengue fever, and influenza. One of the key areas of my research is developing statistical methodology that helps elucidate mechanisms of vaccine protection. This often involves understanding mediating pathways of vaccine effects, as well as the influence of pathogen genetics on the effect of vaccines. Statistically, this work draws on methodology from the causal inference and survival analysis literature and tackles issues such as causal mediation and competing risks.
Education
I received my B.S. in Statistics and MPH in Biostatistics from the University of Georgia (2010) and my PhD in Biostatistics from the University of Washington (2015), before spending time as a postdoctoral fellow at the University of California, Berkeley (2015-17).
News
Congratulations to Ziyue Wu for passing his dissertation proposal!
Ziyue's proposal involves using ensemble machine learning (a.k.a. super learning) for modeling healthcare expenditures. His first project proposes a two-stage super learner for dealing with healthcare expenditures with …