Novel Image Captioning
- Lijuan Wang, Microsoft
When does a machine “understand” an image? One definition is when it can generate a novel caption that summarizes the salient content of the image. This content may include the objects present, their attributes, their actions, and their relations to one another. Determining the salient content requires not only knowing what an image contains, but also using commonsense knowledge to deduce which aspects of the scene may be interesting or novel. This video compares the quality of the latest image captioning model (after) with that of the old model (before).
Lijuan Wang
Principal Research Manager