panbowen0607 [at] gmail [dot] com
Bowen Pan
I am a research scientist on the Apple Foundation Model (AFM) team. I work on multimodal model training and RL.
I completed my Ph.D. and M.Sc. at MIT CSAIL, advised by Aude Oliva. My Ph.D. thesis focuses on efficient algorithms for the training and inference of multimodal agents. Prior to that, I obtained my B.E. from Shanghai Jiao Tong University.
Publications
[Full list]
*: equal contribution
Brain Netflix: Scaling Data to Reconstruct Videos from Brain Signals
HoloAssist: an Egocentric Human Interaction Dataset for Interactive AI Assistants in the Real World
Egocentric Vision 2022/2023 Distinguished Paper Award
Multi-Moments in Time: Learning and Interpreting Models for Multi-Action Video Understanding
Argoverse 2.0: Next Generation Datasets for Self-Driving Perception and Forecasting
IA-RED2: Interpretability-Aware Redundancy Reduction for Vision Transformer
An interpretable run-time token pruning strategy for vision transformers.
VA-RED2: Video Adaptive Redundancy Reduction
Recurrent Residual Module for Fast Inference in Videos
Misc
In my spare time, I play soccer and go to the gym.