Vision for Intelligent Task Assistants

VITA

The 1st Workshop in Conjunction with CVPR 2026

June 3, 2026,
1:00 PM

About VITA 2026

Advances in computer vision, multimodal learning, AR/VR/XR technologies, and smart glasses are converging toward Virtual Intelligent Task Assistants (VITAs): systems that observe, interpret, and guide humans in complex real-world activities. This workshop bridges computer vision foundations and interactive AR/VR/XR research to enable long-term task understanding and assistance. Topics include learning from long streaming egocentric and exocentric videos, multimodal reasoning, task and step prediction, procedure planning and correction, human-AI collaboration and coaching, and new datasets and benchmarks. By fostering dialogue across disciplines, the workshop aims to define the core challenges and opportunities for building practical and generalizable VITAs.

Where

Room 108,
Colorado Convention Center

When

June 3, 2026,
1:00 PM

Speakers

Kristen Grauman

University of Texas at Austin

Ivan Laptev

MBZUAI

Marc Pollefeys

ETH Zurich

Antonino Furnari

University of Catania

Gedas Bertasius

University of North Carolina, Chapel Hill

Steven Feiner

Columbia University

Organizers

Mohsen Moghaddam

Georgia Institute of Technology

Angela Yao

National University of Singapore

Jason Corso

University of Michigan

Ehsan Elhamifar

Northeastern University