Building Embodied 3D Foundation Models

Speaker: Yining Hong (UCLA)
Time: 2024-01-09 13:30-2024-01-09 14:30
Venue: C19-2 or https://meeting.tencent.com/dm/jIFqTqXeb7T5


Powerful as recent large language models and vision-language models are, they are not grounded in the 3D physical world the way human beings are, let alone able to explore and interact within the richer realm of 3D embodied environments. Yining Hong's work focuses on developing 3D embodied foundation models, with the goal of building general-purpose embodied agents that can actively explore and interact with the 3D physical world and perform common-sense reasoning within embodied environments. These models facilitate dynamic interaction with 3D spaces, incorporating essential embodied concepts such as spatial relationships, affordances, physics, layout, and multisensory learning. Yining Hong's research emphasizes three critical perspectives in building such generalist embodied agents: building 3D world models, embodied foundation models, and common-sense reasoning.

Short Bio:

Yining Hong is a fourth-year PhD candidate at UCLA and a research member at the MIT-IBM Watson AI Lab, supervised by Prof. Chuang Gan, Prof. Song-Chun Zhu and Prof. Ying-Nian Wu. She obtained her bachelor's degree from Shanghai Jiao Tong University under the supervision of Prof. Xinbing Wang and Prof. Weinan Zhang. Her research interests include embodied AI, 3D large language models and machine reasoning. She has published multiple first-authored papers at NeurIPS, CVPR, ICCV, AAAI and other venues. She received the Baidu Scholarship in 2022.