Towards Learning 3D Foundation Models Using Mixed 2D and 3D Data

Speaker: Qixing Huang, University of Texas at Austin
Time: 2024-08-06 14:00-2024-08-06 15:00
Venue: Seminar Room 2, 19th Floor, Tower C, TusPark

Abstract:

We are in the era of foundation models, in which large language models and visual-language models have enabled many practical applications, and this trend will continue to evolve. One message from the community is that the scale of clean data and model size matter more than data representations and training objectives. However, this creates a fundamental issue in the 3D domain, because we will not have sufficient large-scale 3D data to learn 3D foundation models. This talk will cover our recent efforts to tackle this problem. I will talk about how to learn 3D foundation models from image/video foundation models by developing generalizable 3D representations that possess efficient neural rendering capabilities, learning 3D foundation models from real images, and distilling 3D knowledge from video foundation models. I will also talk about opportunities and challenges in learning foundation models grounded in physics and geometry.

Short Bio:

Qixing Huang is a tenured associate professor in the computer science department of the University of Texas at Austin. His research sits at the intersection of graphics, geometry, optimization, vision, and machine learning. He has published more than 100 papers at leading venues across these areas. His research has received several awards, including multiple best paper awards, the best dataset award at the Symposium on Geometry Processing 2018, an IJCAI 2019 early career spotlight, and a 2021 NSF CAREER award. He has also served as an area chair for CVPR, ECCV, and ICCV, served on the technical papers committees of SIGGRAPH and SIGGRAPH Asia, and co-chaired the Symposium on Geometry Processing 2020.