
The convergence of digital transformation and the GenAI revolution creates an unprecedented opportunity for accelerating progress in precision health. Precision immunotherapy is a poster child for this transformation. Emerging technologies such as multiplex immunofluorescence (mIF) can assess internal states of individual cells along with their spatial locations, which is critical for deciphering how tumors interact with the immune system. The resulting insights, often referred to as the “grammar” of the tumor microenvironment, can help predict whether a tumor will respond to immunotherapy. If it is unlikely to respond, these insights can also inform strategies to reprogram the tumor from “cold” to “hot,” increasing its susceptibility to treatment.
This is exciting, but progress is hindered by the high cost and limited scalability of current technology. For example, obtaining mIF data for a couple dozen protein channels from a single tissue sample can cost thousands of dollars, and even the most advanced labs can apply it to only a tiny fraction of their available tissue samples.
In our paper published in Cell on December 9, “Multimodal AI generates virtual population for tumor microenvironment modeling,” we present GigaTIME, a multimodal AI model for translating routinely available hematoxylin and eosin (H&E) pathology slides into virtual mIF images. Developed in collaboration with Providence and the University of Washington, GigaTIME was trained on a Providence dataset of 40 million cells with paired H&E and mIF images across 21 protein channels. We applied GigaTIME to 14,256 cancer patients from 51 hospitals and over a thousand clinics within the Providence system, generating a virtual population of around 300,000 mIF images spanning 24 cancer types and 306 cancer subtypes. This virtual population uncovered 1,234 statistically significant associations linking mIF protein activations with key clinical attributes such as biomarkers, staging, and patient survival. Independent external validation on 10,200 patients from The Cancer Genome Atlas (TCGA) further corroborated our findings.
To our knowledge, this is the first population-scale study of the tumor immune microenvironment (TIME) based on spatial proteomics. Such studies were previously infeasible due to the scarcity of mIF data. By translating readily available H&E pathology slides into high-resolution virtual mIF data, GigaTIME provides a novel research framework for exploring precision immuno-oncology through population-scale TIME analysis and discovery. We have made our GigaTIME model publicly available at Microsoft Foundry Labs and on Hugging Face to help accelerate clinical research in precision oncology.
“GigaTIME is about unlocking insights that were previously out of reach,” explained Carlo Bifulco, MD, chief medical officer of Providence Genomics and medical director of cancer genomics and precision oncology at the Providence Cancer Institute. “By analyzing the tumor microenvironment of thousands of patients, GigaTIME has the potential to accelerate discoveries that will shape the future of precision oncology and improve patient outcomes.”
GigaTIME generates a virtual population for tumor microenvironment modeling
Digital pathology transforms a microscopy slide of stained tumor tissue into a high-resolution digital image, revealing details of cell morphology such as the nucleus and cytoplasm. Digitizing a slide costs only $5 to $10 per image and has become routinely available in cancer care. It is well established that H&E-based cell morphology carries information about cellular states. Last year, we released GigaPath, the first digital pathology foundation model to scale transformer architectures to gigapixel H&E slides. Subsequently, researchers at Mount Sinai Hospital and Memorial Sloan Kettering Cancer Center showed in a global prospective trial that GigaPath can reliably predict a key biomarker from H&E slides for precision oncology triaging. However, such prior work is generally limited to the average biomarker status across an entire tissue. GigaTIME thus represents a major step forward by learning to predict spatially resolved, single-cell states essential for tumor microenvironment modeling. In turn, this enables us to generate a virtual population of mIF images for large-scale TIME analysis (Figure 1).
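For readers who want to experiment, the sketch below shows one common way digital pathology pipelines prepare a gigapixel H&E whole-slide image for tile-wise prediction. The tile size, pyramid level, and file name are illustrative assumptions, not parameters prescribed by GigaTIME.

```python
# Generic tiling of a gigapixel H&E whole-slide image for tile-wise processing.
# Tile size, pyramid level, and file path are illustrative assumptions.
import numpy as np
import openslide  # pip install openslide-python


def iter_tiles(slide_path: str, tile: int = 256):
    """Yield ((x, y), tile_array) pairs of non-overlapping RGB tiles at level 0."""
    slide = openslide.OpenSlide(slide_path)
    width, height = slide.dimensions  # full-resolution (level 0) size
    for y in range(0, height - tile + 1, tile):
        for x in range(0, width - tile + 1, tile):
            # read_region returns an RGBA PIL image; convert to an RGB float array
            region = slide.read_region((x, y), 0, (tile, tile)).convert("RGB")
            yield (x, y), np.asarray(region, dtype=np.float32) / 255.0


# Example: peek at the first tile of a hypothetical slide file
for (x, y), tile in iter_tiles("example_he_slide.svs"):
    print(f"tile at ({x}, {y}) with shape {tile.shape}")
    break
```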

GigaTIME learns a multimodal AI model to translate pathology slides into spatial proteomics images, bridging cell morphology and cell states

GigaTIME learns a cross-modal AI translator from digital pathology to spatial multiplex proteomics by training on 40 million cells with paired H&E slides and mIF images from Providence. To our knowledge, this is the first large-scale study exploring multimodal AI for scaling virtual mIF generation. The high-quality paired data enabled much more accurate cross-modal translation than prior state-of-the-art methods (Figure 2).
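As a rough illustration of the training objective, the sketch below regresses a 21-channel virtual mIF tile from a paired 3-channel H&E tile with a toy convolutional network and an MSE loss. The actual GigaTIME architecture and loss are described in the paper; everything here, including the network and the random placeholder batch, is a simplified stand-in.

```python
# Schematic training step for cross-modal translation on paired tiles:
# a 3-channel H&E tile is regressed onto its 21-channel mIF counterpart.
# The architecture and loss are simplified stand-ins, not the published model.
import torch
import torch.nn as nn


class ToyTranslator(nn.Module):
    """Tiny convolutional network mapping H&E (3 channels) to virtual mIF (21 channels)."""

    def __init__(self, n_channels: int = 21):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, n_channels, 1),
        )

    def forward(self, he_tile):
        return self.net(he_tile)


model = ToyTranslator()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

# Placeholder batch: 8 paired 256x256 tiles
he_batch = torch.rand(8, 3, 256, 256)    # H&E input
mif_batch = torch.rand(8, 21, 256, 256)  # measured mIF target

optimizer.zero_grad()
pred = model(he_batch)
loss = loss_fn(pred, mif_batch)
loss.backward()
optimizer.step()
print(f"training loss: {loss.item():.4f}")
```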
Virtual population enables population-scale discovery of associations between cell states and key biomarkers

By applying GigaTIME to Providence real-world data, we generated a virtual population of 14,256 patients with virtual mIF and key clinical attributes. After correcting for multiple hypothesis testing, we identified 1,234 statistically significant associations between tumor immune cell states (e.g., CD138, CD20, CD4) and clinical biomarkers (e.g., tumor mutation burden, KRAS, KMT2D), from pan-cancer to cancer subtypes (Figure 3). Many of these findings are supported by existing literature; for example, MSI-high and TMB-high status was associated with increased activation of TIME-related channels such as CD138. The virtual population also uncovered previously unknown associations, such as pan-cancer links between immune activation and key tumor biomarkers, including the tumor suppressor KMT2D and the oncogene KRAS.
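A minimal sketch of this kind of association screen is shown below: each (virtual protein, biomarker) pair is tested for a difference in activation between biomarker-positive and biomarker-negative patients, and the p-values are then corrected with Benjamini-Hochberg FDR control. The column names, the choice of a Mann-Whitney test, and the synthetic data are assumptions for illustration; the paper's exact statistical procedure may differ.

```python
# Sketch of a channel-vs-biomarker association screen with FDR correction.
# Columns, test choice, and data are illustrative placeholders.
import numpy as np
import pandas as pd
from scipy.stats import mannwhitneyu
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(0)
# Placeholder virtual population: per-patient channel activations plus biomarker flags
df = pd.DataFrame({
    "CD138": rng.normal(size=500),
    "CD20": rng.normal(size=500),
    "TMB_high": rng.integers(0, 2, size=500).astype(bool),
    "KRAS_mut": rng.integers(0, 2, size=500).astype(bool),
})

channels = ["CD138", "CD20"]
biomarkers = ["TMB_high", "KRAS_mut"]

results = []
for ch in channels:
    for bm in biomarkers:
        pos, neg = df.loc[df[bm], ch], df.loc[~df[bm], ch]
        _, p = mannwhitneyu(pos, neg, alternative="two-sided")
        results.append((ch, bm, p))

# Benjamini-Hochberg correction across all channel-biomarker tests
reject, p_adj, _, _ = multipletests([p for _, _, p in results], alpha=0.05, method="fdr_bh")
for (ch, bm, _), padj, sig in zip(results, p_adj, reject):
    print(f"{ch} ~ {bm}: adjusted p = {padj:.3g}, significant = {sig}")
```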
Virtual population enables population-scale discovery of tumor immune signatures for patient stratification

The virtual population also uncovered GigaTIME signatures for effective patient stratification across staging and survival profiles (Figure 4), from pan-cancer to cancer subtypes. Prior studies have explored patient stratification based on individual immune proteins such as CD3 and CD8, and we found that GigaTIME-simulated CD3 and CD8 are similarly effective. Moreover, the combined GigaTIME signature across all 21 protein channels attained even better patient stratification than any individual channel.
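The sketch below illustrates the general recipe for this kind of survival stratification: split patients at the median of a signature score, compare the two groups with a log-rank test, and plot Kaplan-Meier curves. The synthetic data, the median split, and the column names are assumptions; they are not taken from the paper.

```python
# Sketch of survival stratification by a virtual-mIF signature score.
# Data, split rule, and column names are illustrative placeholders.
import numpy as np
import pandas as pd
from lifelines import KaplanMeierFitter
from lifelines.statistics import logrank_test

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "signature": rng.normal(size=400),                    # e.g., combined 21-channel score
    "time_months": rng.exponential(36, size=400),         # follow-up time
    "event": rng.integers(0, 2, size=400).astype(bool),   # death observed
})

high = df["signature"] >= df["signature"].median()
result = logrank_test(
    df.loc[high, "time_months"], df.loc[~high, "time_months"],
    event_observed_A=df.loc[high, "event"], event_observed_B=df.loc[~high, "event"],
)
print(f"log-rank p = {result.p_value:.3g}")

# Kaplan-Meier curves for the two signature groups
kmf = KaplanMeierFitter()
kmf.fit(df.loc[high, "time_months"], df.loc[high, "event"], label="signature high")
ax = kmf.plot_survival_function()
kmf.fit(df.loc[~high, "time_months"], df.loc[~high, "event"], label="signature low")
kmf.plot_survival_function(ax=ax)
```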
Virtual population uncovers interesting spatial and combinatorial interactions

The virtual population uncovered interesting non-linear interactions across the GigaTIME virtual protein channels, revealing associations with spatial features such as sharpness and entropy, as well as with key clinical biomarkers like APC and KMT2D (Figure 6). Such combinatorial studies were previously out of reach given the scarcity of mIF data.
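As an illustration of such spatial features, the sketch below computes two simple per-channel statistics on a virtual mIF tile: the Shannon entropy of the intensity histogram and a Laplacian-variance sharpness score. These definitions are generic stand-ins and may differ from the exact features used in the study.

```python
# Generic per-channel spatial features: histogram entropy and Laplacian sharpness.
# Feature definitions are illustrative, not the study's exact formulas.
import numpy as np
from scipy.ndimage import laplace


def channel_entropy(channel: np.ndarray, bins: int = 64) -> float:
    """Shannon entropy (bits) of a channel's intensity distribution."""
    hist, _ = np.histogram(channel, bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())


def channel_sharpness(channel: np.ndarray) -> float:
    """Variance of the Laplacian, a common proxy for edge content / sharpness."""
    return float(laplace(channel.astype(np.float64)).var())


virtual_mif = np.random.rand(21, 256, 256)  # placeholder 21-channel virtual mIF tile
features = {
    f"channel_{i}": (channel_entropy(ch), channel_sharpness(ch))
    for i, ch in enumerate(virtual_mif)
}
print(features["channel_0"])
```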
Independent external validation on TCGA

We conducted an independent external validation by applying GigaTIME to 10,200 patients in The Cancer Genome Atlas (TCGA) dataset and studied associations between GigaTIME-simulated virtual mIF and clinical biomarkers available in TCGA. We observed significant concordance across the virtual populations from Providence and TCGA, with a Spearman correlation of 0.88 for virtual protein activations across cancer subtypes. The two populations also showed a significant overlap of associations between GigaTIME-simulated protein activations and clinical biomarkers (Fisher’s exact test p < 2 × 10⁻⁹). At the same time, the Providence virtual population yielded 33% more significant associations than TCGA, highlighting the value of large and diverse real-world data for clinical discovery.
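The two concordance checks can be reproduced in outline as shown below: a Spearman correlation between per-subtype channel activations in the two cohorts, and a Fisher's exact test on a 2x2 table of which associations reach significance in each cohort. All numbers in the sketch are placeholders, not the study's data.

```python
# Sketch of the external-validation concordance checks with placeholder inputs.
import numpy as np
from scipy.stats import spearmanr, fisher_exact

rng = np.random.default_rng(2)

# Mean virtual protein activation per (cancer subtype, channel) in each cohort
providence_activations = rng.normal(size=300)
tcga_activations = providence_activations + rng.normal(scale=0.3, size=300)
rho, p = spearmanr(providence_activations, tcga_activations)
print(f"Spearman rho = {rho:.2f} (p = {p:.2g})")

# 2x2 contingency table of association significance in the two cohorts:
# rows = significant in Providence (yes/no), columns = significant in TCGA (yes/no)
table = np.array([[180, 60],
                  [40, 220]])
odds_ratio, p_overlap = fisher_exact(table, alternative="greater")
print(f"Fisher's exact test: OR = {odds_ratio:.1f}, p = {p_overlap:.2g}")
```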
GigaTIME is a promising step toward the moonshot of a “virtual patient”
By learning to translate across modalities, GigaTIME is a promising step toward “learning the language of patients” and the ultimate goal of developing a “virtual patient”: a high-fidelity digital twin that could one day accurately forecast disease progression and counterfactual treatment response. By converting routinely available cell morphology data into otherwise scarce, high-resolution cell-state signals, GigaTIME demonstrates the potential of harnessing multimodal AI to scale real-world evidence (RWE) generation.
Going forward, growth opportunities abound. GigaTIME can be extended to handle more spatial modalities and cell-state channels. It can be integrated into advanced multimodal frameworks such as LLaVA-Med to facilitate conversational image analysis by “talking to the data.” To support research in tumor microenvironment modeling, we have made GigaTIME open source on Foundry Labs and Hugging Face.
GigaTIME is joint work with Providence and the University of Washington’s Paul G. Allen School of Computer Science & Engineering. It reflects Microsoft’s larger commitment to advancing multimodal generative AI for precision health, with other exciting progress such as GigaPath, BiomedCLIP, LLaVA-Rad, BiomedJourney, BiomedParse, TrialScope, and Curiosity.
Paper co-authors: Jeya Maria Jose Valanarasu, Hanwen Xu, Naoto Usuyama, Chanwoo Kim, Cliff Wong, Peniel Argaw, Racheli Ben Shimol, Angela Crabtree, Kevin Matlock, Alexandra Q. Bartlett, Jaspreet Bagga, Yu Gu, Sheng Zhang, Tristan Naumann, Bernard A. Fox, Bill Wright, Ari Robicsek, Brian Piening, Carlo Bifulco, Sheng Wang, Hoifung Poon