Temporal Vision-Language Processing (BioViL-T)
BioViL-T is a Vision-Language model trained on sequences of biomedical image and text data at a scale. It does not require manual annotations and can leverage historical raw clinical image acquisitions and clinical notes. The…