Multimodal
Vision, audio, video, and the messy edges where modalities meet. Charts and documents the models still fail on.
Feature · MAY 19, 2026
Google Gemini Omni: world-understanding multimodal at scale, any-input-to-any-output
Announced at Google I/O on May 19, Gemini Omni is positioned as a leap in world understanding, multimodality, and editing: it is meant to generate any output from any input, starting with video.