Leveraging Generative Models to Understand the Visual Cortex

Speaker: Andrew Luo, Carnegie Mellon University
Time: 2024-05-28, 10:00-11:00
Venue: FIT 1-222

Understanding the functional organization of the higher visual cortex is a fundamental goal in neuroscience. Traditional approaches have focused on mapping the visual and semantic selectivity of neural populations using hand-selected, non-naturalistic stimuli, which require a priori hypotheses about visual cortex selectivity. To address these limitations, we introduce two data-driven methods: Brain Diffusion for Visual Exploration ('BrainDiVE') and Semantic Captioning Using Brain Alignments ('BrainSCUBA'). BrainDiVE is trained on a dataset of natural images and paired fMRI recordings, and synthesizes images predicted to activate specific brain regions, thus bypassing the need for hand-crafted visual stimuli. It combines large-scale diffusion models with brain-gradient-guided image synthesis. We demonstrate the synthesis of preferred images with high semantic specificity for category-selective regions of interest. BrainSCUBA, in turn, generates natural language descriptions of the images predicted to maximally activate individual voxels, which enables efficient, fine-grained labeling of the entire higher visual cortex. Together, these two methods offer well-specified constraints for future hypothesis-driven examinations and demonstrate the potential of data-driven approaches in uncovering brain organization.


Andrew Luo is a Ph.D. candidate in the joint program for Neural Computation and Machine Learning at Carnegie Mellon University, co-advised by Profs. Michael Tarr and Leila Wehbe. His research focuses on understanding the functional organization of the higher visual cortex using learnable generative models across modalities. Before joining CMU, he earned a B.S. degree in Computer Science from MIT in 2019.