Recent Advances in Egocentric Vision: Data, Multimodal, and Generative AI

Speaker: Miao Liu, Meta GenAI
Time: 2024-10-21, 10:45-12:00
Location: Zoom Meeting (ID: 850 4716 8329, Passcode: 196471)

Egocentric vision has rapidly advanced in recent years due to its transformative potential in applications like Augmented Reality (AR) and Robot Learning. In this talk, I will present an overview of the unique characteristics of egocentric data and highlight our efforts to harness it for skill transfer. Specifically, I will discuss the evolution of the datasets we have developed, and our models that address critical egocentric vision tasks spanning varying modalities, time intervals and granularities. Moreover, I will provide a deep dive on our latest work on generative models for egocentric action frame generation. This research has been recognized as one of the top 15 Best Paper award candidates at ECCV 2024. We believe this work marks a significant step forward in leveraging generative models to enhance action prediction and skill transfer in egocentric contexts.


Miao Liu is currently a Senior Research Scientist at Meta GenAI. His research focuses on egocentric vision and generative AI models, such as Llama and EMU. Miao’s work has gained significant recognition: he has over 20 lead-author publications in top-tier AI conferences and journals, including CVPR, ECCV, ACL, and TPAMI, and 7 of his papers were selected for oral presentations. He and the students he mentored have received a number of awards, including 2nd place in the Epic Kitchens challenge, the BMVC 2022 Best Student Paper Award, a Best Paper finalist selection at CVPR 2022, and a Best Paper Award Candidate nomination at ECCV 2024. Miao earned his Ph.D. from Georgia Tech, a master’s degree from Carnegie Mellon University, and a bachelor’s degree from Beihang University.