Speaker: Xueyan Zou (UCSD)
Time: 2024-10-29 10:00-2024-10-29 11:00
Venue: FIT 1-222
Abstract:
The evolution of neural networks, from foundational linear regression to sophisticated Generative Pre-trained Transformers (GPTs), exemplifies a consistent narrative of alignment between spaces. Foundation models, in particular, provide a generalized framework for bridging human and machine intelligence. These models reveal unprecedented potential when integrated with embodied robotics, facilitating direct interaction between machine intelligence and the physical realm. Along the way, alignment has moved from human intelligence to machine intelligence and, more recently, to embodiment. The talk will focus on a series of works building toward a multimodal foundation model and will introduce attempts to bridge the gap between foundation models and robotics.
Short Bio:
Xueyan Zou is a Postdoctoral Researcher at the UCSD Contextual Robotics Institute, advised by Prof. Xiaolong Wang and supported by an NSF TILOS fellowship. She received her Ph.D. degree from the University of Wisconsin-Madison in May 2024, supervised by Prof. Yong Jae Lee. Her research interests lie in multimodal foundation models and robotics, with a recent emphasis on their intersection. She developed and contributed to a series of impactful works, including X-Decoder, SEEM, SemanticSAM, and SoM, toward a generalized multimodal foundation model. Her first publication, on which she was first author, received the Best Paper Award at BMVC 2020, and her recent joint proposal was selected for Microsoft’s accelerating foundation model program. She has 12 publications in CVPR, ICCV, ECCV, NeurIPS, CoRL, IJCV, and BMVC.