A how-to guide for conducting retrospective analyses: example COVID-19 study

  • Michael Powell
  • Allison Koenecke
  • James Byrd
  • Akihiko Nishimura
  • Maximilian Konig
  • Ruoxuan Xiong
  • Sadiqa Mahmood
  • Vera Mucaj
  • Chetan Bettegowda
  • Liam Rose
  • Suzanne Tamang
  • Adam Sacarny
  • Brian Caffo
  • Susan Athey
  • Elizabeth Stuart
  • Joshua Vogelstein

In the urgent setting of the COVID-19 pandemic, treatment hypotheses abound, each of which requires careful evaluation. A randomized controlled trial generally provides the strongest possible evaluation of a treatment, but the efficiency and effectiveness of the trial depend on the existing evidence supporting the treatment. The researcher must therefore compile a body of evidence justifying the use of time and resources to further investigate a treatment hypothesis in a trial. An observational study can help provide this evidence, but the lack of randomized exposure and the researcher’s inability to control treatment administration and data collection introduce significant challenges for nonexperimental studies. A proper analysis of observational health care data thus requires an extensive background in a diverse set of topics ranging from epidemiology and causal analysis to relevant medical specialties and data sources. Here we provide 10 rules that serve as an end-to-end introduction to retrospective analyses of observational health care data. A running example of a COVID-19 study presents a practical implementation of each rule in the context of a specific treatment hypothesis. When carefully designed and properly executed, a retrospective analysis framed around these rules will inform the decisions of whether and how to investigate a treatment hypothesis in a randomized controlled trial.