Multimodal AI generates virtual population for tumor microenvironment modeling
- Jeya Maria Jose Valanarasu,
- Hanwen Xu,
- Naoto Usuyama,
- Chanwoo Kim,
- Cliff Wong,
- Peniel Argaw,
- Racheli Ben Shimol,
- Angela Crabtree,
- Kevin Matlock,
- Alexandra Q. Bartlett,
- Jaspreet Bagga,
- Yu Gu,
- Sheng Zhang,
- Tristan Naumann,
- Bernard A. Fox,
- Bill Wright,
- Ari Robicsek,
- Brian Piening,
- Carlo Bifulco,
- Sheng Wang,
- Hoifung Poon
Cell
The tumor immune microenvironment (TIME) critically impacts cancer progression and immunotherapy response. Multiplex immunofluorescence (mIF) is a powerful imaging modality for deciphering the TIME, but its applicability is limited by high cost and low throughput. We propose GigaTIME, a multimodal AI framework for population-scale TIME modeling that bridges cell morphology and cell states. GigaTIME learns a cross-modal translator that generates virtual mIF images from hematoxylin and eosin (H&E) slides, trained on 40 million cells with paired H&E and mIF data across 21 proteins. We applied GigaTIME to 14,256 patients from 51 hospitals and over 1,000 clinics across seven US states in Providence Health, generating 299,376 virtual mIF slides spanning 24 cancer types and 306 subtypes. This virtual population uncovered 1,234 statistically significant associations linking proteins, biomarkers, staging, and survival. Such analyses were previously infeasible owing to the scarcity of mIF data. Independent validation on 10,200 TCGA patients further corroborated our findings.
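To make the cross-modal translation task concrete, the sketch below shows one plausible shape of an H&E-to-mIF translator in PyTorch: a small U-Net-style encoder-decoder that maps a 3-channel H&E patch to 21 per-protein virtual mIF channels, trained with a pixel-wise reconstruction loss on paired patches. The architecture, layer sizes, and loss here are illustrative assumptions for exposition; the abstract does not specify GigaTIME's actual design, and the `HEtoMIF` class is hypothetical.

```python
# Hypothetical sketch of the H&E -> mIF translation task described above.
# NOT the published GigaTIME model: architecture and loss are assumptions.
import torch
import torch.nn as nn

N_PROTEINS = 21  # number of mIF protein channels reported in the abstract

def conv_block(in_ch, out_ch):
    # Two 3x3 convolutions with ReLU: the basic U-Net building block.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
    )

class HEtoMIF(nn.Module):
    """Toy U-Net mapping a 3-channel H&E patch to N_PROTEINS mIF channels."""

    def __init__(self):
        super().__init__()
        self.enc1 = conv_block(3, 32)   # H&E is RGB, hence 3 input channels
        self.enc2 = conv_block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(64, 128)
        self.up2 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec2 = conv_block(128, 64)
        self.up1 = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec1 = conv_block(64, 32)
        # One output channel per protein; sigmoid keeps intensities in [0, 1].
        self.head = nn.Conv2d(32, N_PROTEINS, 1)

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        b = self.bottleneck(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return torch.sigmoid(self.head(d1))

# One toy training step on random tensors standing in for paired patches.
model = HEtoMIF()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
he = torch.rand(2, 3, 128, 128)             # batch of H&E patches
mif = torch.rand(2, N_PROTEINS, 128, 128)   # paired ground-truth mIF stains
loss = nn.functional.l1_loss(model(he), mif)  # pixel-wise reconstruction loss
loss.backward()
opt.step()
print(loss.item())
```

At inference, such a translator would run over whole-slide H&E tiles to produce the virtual mIF channels from which downstream cell-state and association analyses are computed; the 21-channel output head mirrors the 21 proteins stated in the abstract.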