Bowen Pan (潘柏文)

I am a third-year Ph.D. student at MIT CSAIL, advised by Prof. Aude Oliva. Before coming to MIT, I received my B.E. in Electronic Engineering from Shanghai Jiao Tong University in 2019.

My research interests lie in Computer Vision and Machine Learning. In particular, I am interested in how we can learn high-level knowledge of human behavior from video and apply it to embodied agents.

Email  /  CV  /  Scholar  /  Github


Massachusetts Institute of Technology, USA
Ph.D. student • Sept. 2019 to Present

Shanghai Jiao Tong University, China
Bachelor of Engineering • Sept. 2015 to Jun. 2019

Multi-Moments in Time: Learning and Interpreting Models for Multi-Action Video Understanding
Mathew Monfort, Bowen Pan, Kandan Ramakrishnan, Alex Andonian, Barry A McNamara, Alex Lascelles, Quanfu Fan, Dan Gutfreund, Rogerio Feris, Aude Oliva
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Area: Video Understanding
[Project Page] [PDF]

Argoverse 2.0: Next Generation Datasets for Self-Driving Perception and Forecasting
Benjamin Wilson, William Qi, Tanmay Agarwal, John Lambert, Jagjeet Singh, Siddhesh Khandelwal, Bowen Pan, Ratnesh Kumar, Andrew Hartnett, Jhony Kaesemodel Pontes, Deva Ramanan, Peter Carr, James Hays
Thirty-fifth Conference on Neural Information Processing Systems Track on Datasets and Benchmarks (NeurIPS 2021)
Area: Self-driving, Outdoor Scene Understanding
[Paper] [Code]

IA-RED2: Interpretability-Aware Redundancy Reduction for Vision Transformer
Bowen Pan, Rameswar Panda, Yifan Jiang, Zhangyang Wang, Rogerio Feris, Aude Oliva
Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS 2021)
Area: Interpretability, Efficient Inference, Video Understanding, Vision Transformer
[Project Page] [PDF]

VA-RED2: Video Adaptive Redundancy Reduction
Bowen Pan, Rameswar Panda, Camilo Fosco, Chung-Ching Lin, Alex Andonian, Yue Meng, Kate Saenko, Aude Oliva, Rogerio Feris
Ninth International Conference on Learning Representations (ICLR 2021)
Area: Efficient Inference, Video Understanding
[Project Page] [PDF]

Cross-view Semantic Segmentation for Sensing Surroundings
Bowen Pan*, Jiankai Sun*, Ho Yin Tiga Leung, Alex Andonian, Bolei Zhou
IEEE Robotics and Automation Letters (RA-L) and IROS 2020
Area: Environment Understanding, Visual Navigation
[Project Page] [PDF]

Recurrent Residual Module for Fast Inference in Videos
Bowen Pan, Wuwei Lin, Xiaolin Fang, Chaoqin Huang, Bolei Zhou, Cewu Lu
Computer Vision and Pattern Recognition (CVPR 2018) [PDF]
Area: Efficient Inference, Video Analysis

We introduce a highly efficient algorithm for video inference that exploits the appearance similarity between adjacent frames.

PoseHD: Boosting Human Detectors using Human Pose Information
Zhijian Liu, Bowen Pan, Yuliang Xiu, Cewu Lu
AAAI Conference on Artificial Intelligence (AAAI 2018) [PDF]
Area: Pedestrian Detection, Pose Estimation

We propose an original framework to boost human detection by utilizing skeleton information.
