News & Features
Microsoft Research Blog
ACAV100M: Scaling up self-supervised audio-visual learning with automatically curated internet videos
| Yale Song
The natural association between visual o…
Microsoft Research Blog
Microsoft and NVIDIA introduce parameter-efficient multimodal transformers for video representation learning
| Yale Song
Understanding video is one of the most c…