Bowen Pan

I am a second-year Ph.D. student at MIT CSAIL, advised by Prof. Aude Oliva. Before coming to MIT, I received my B.E. in Electronic Engineering from Shanghai Jiao Tong University in 2019.

My research interests lie in Computer Vision and Machine Learning. In particular, I am interested in how we can learn high-level knowledge of human behavior from video and apply it to an embodied agent.

Email  /  CV  /  Scholar  /  Github


Massachusetts Institute of Technology, USA
Ph.D. student • Sept. 2019 to Present

Shanghai Jiao Tong University, China
Bachelor of Engineering • Sept. 2015 to Jun. 2019

IA-RED2: Interpretability-Aware Redundancy Reduction for Vision Transformer
Bowen Pan, Yifan Jiang, Rameswar Panda, Zhangyang Wang, Rogerio Feris, Aude Oliva
arXiv, 2021
Area: Interpretability, Efficient Inference, Video Understanding, Vision Transformer
[Project Page] [PDF]

Multi-Moments in Time: Learning and Interpreting Models for Multi-Action Video Understanding
Mathew Monfort, Kandan Ramakrishnan, Alex Andonian, Barry A McNamara, Alex Lascelles, Bowen Pan, Quanfu Fan, Dan Gutfreund, Rogerio Feris, Aude Oliva
arXiv:1911.00232, 2019
Area: Video Understanding
[Project Page] [PDF]

VA-RED2: Video Adaptive Redundancy Reduction
Bowen Pan, Rameswar Panda, Camilo Fosco, Chung-Ching Lin, Alex Andonian, Yue Meng, Kate Saenko, Aude Oliva, Rogerio Feris
Ninth International Conference on Learning Representations (ICLR 2021)
Area: Efficient Inference, Video Understanding
[Project Page] [PDF]

Cross-view Semantic Segmentation for Sensing Surroundings
Bowen Pan*, Jiankai Sun*, Ho Yin Tiga Leung, Alex Andonian, Bolei Zhou
IEEE Robotics and Automation Letters (RA-L) and IROS 2020
Area: Environment Understanding, Visual Navigation
[Project Page] [PDF]

Recurrent Residual Module for Fast Inference in Videos
Bowen Pan, Wuwei Lin, Xiaolin Fang, Chaoqin Huang, Bolei Zhou, Cewu Lu
Computer Vision and Pattern Recognition (CVPR 2018) [PDF]
Area: Efficient Inference, Video Analysis

We introduce a highly efficient algorithm for video inference that exploits the appearance similarity between adjacent frames.

PoseHD: Boosting Human Detectors using Human Pose Information
Zhijian Liu, Bowen Pan, Yuliang Xiu, Cewu Lu
AAAI Conference on Artificial Intelligence (AAAI 2018) [PDF]
Area: Pedestrian Detection, Pose Estimation

We propose an original framework to boost human detection by utilizing skeleton information.
