Chuang Gan


I am a principal research staff member and director of the robotics lab at IBM Research. I am also a research scientist at MIT, working closely with Prof. Antonio Torralba and Prof. Josh Tenenbaum. Before that, I completed my PhD with highest honors at Tsinghua University, where I was supervised by Prof. Andrew Chi-Chih Yao. My research lies at the intersection of computer vision, AI, cognitive science, and robotics. The overarching goal of my research is to build a machine with human-like common sense that is capable of sensing, reasoning, and acting in the physical world. My work has been recognized with a Microsoft Fellowship and a Baidu Fellowship, and has received media coverage from CNN, BBC, The New York Times, WIRED, Forbes, and MIT Tech Review.

 


 


Email: ganchuang [at] csail (dot) mit (dot) edu


News

  • I am serving as an Area Chair for ICLR 2022, CVPR 2022, ICML 2022, ECCV 2022, ICCV 2021, ACL 2021 and NeurIPS 2021.
  • Code and dataset of PlasticineLab, DCL, and Order-Memory Policy Network have been released.
  • Code and dataset of Foley Music have been released.
  • Code, dataset and evaluation server of Video CLEVRER have been released.

    Research Highlight

    ThreeDWorld (TDW)

    A Multi-Modal Interactive Physical Simulation Platform for
    Computer Vision, Robotics and Cognitive Science

    Transport Challenge for Visually Guided Task and Motion Planning | Agent Benchmark for Psychological Reasoning | ThreeDWorld Website

    Publications (by date / by topic)

    2022

    Prompting Decision Transformer for Few-shot Policy Generalization

    Mengdi Xu, Yikang Shen, Shun Zhang, Yuchen Lu, Ding Zhao, Joshua B. Tenenbaum, Chuang Gan

    ICML 2022

    Finding Fallen Objects Via Asynchronous Audio-Visual Integration

    Chuang Gan*, Yi Gu*, Siyuan Zhou, Jeremy Schwartz, Seth Alter, James Traer, Dan Gutfreund, Joshua B. Tenenbaum, Josh McDermott*, Antonio Torralba*

    CVPR 2022

    Fixing Malfunctional Objects With Learned Physical Simulation and Functional Prediction

    Yining Hong, Kaichun Mo, Li Yi, Leonidas J. Guibas, Antonio Torralba, Joshua B. Tenenbaum, Chuang Gan

    CVPR 2022

    The ThreeDWorld Transport Challenge: A Visually Guided Task-and-Motion Planning Benchmark Towards Physically Realistic Embodied AI

    Chuang Gan, Siyuan Zhou, Jeremy Schwartz, Seth Alter, Abhishek Bhandwaldar, Dan Gutfreund, Daniel L.K. Yamins, James J DiCarlo, Josh McDermott, Antonio Torralba, Joshua B. Tenenbaum

    ICRA 2022

    RISP: Rendering-Invariant State Predictor with Differentiable Simulation and Rendering for Cross-Domain Parameter Estimation

    Pingchuan Ma*, Tao Du*, Joshua B. Tenenbaum, Wojciech Matusik, Chuang Gan

    ICLR 2022 (Oral)

    DiffSkill: Skill Abstraction from Differentiable Physics for Deformable Object Manipulations with Tools

    Xingyu Lin, Zhiao Huang, Yunzhu Li, Joshua B. Tenenbaum, David Held, Chuang Gan

    ICLR 2022

    Linking Emergent and Natural Languages via Corpus Transfer

    Shunyu Yao, Mo Yu, Yang Zhang, Karthik R Narasimhan, Joshua B. Tenenbaum, Chuang Gan

    ICLR 2022 (Spotlight)

    ComPhy: Compositional Physical Reasoning of Objects and Events from Videos

    Zhenfang Chen, Kexin Yi, Yunzhu Li, Mingyu Ding, Antonio Torralba, Joshua B. Tenenbaum, Chuang Gan

    ICLR 2022

    Contact Points Discovery for Soft-Body Manipulations with Differentiable Physics

    Sizhe Li*, Zhiao Huang*, Tao Du, Hao Su, Joshua B. Tenenbaum, Chuang Gan

    ICLR 2022 (Spotlight)

    FALCON: Fast Visual Concept Learning by Integrating Images, Linguistic descriptions, and Conceptual Relations

    Lingjie Mei*, Jiayuan Mao*, Ziqi Wang, Chuang Gan, Joshua B. Tenenbaum

    ICLR 2022

    2021

    ThreeDWorld: A Platform for Interactive Multi-Modal Physical Simulation

    Chuang Gan, Jeremy Schwartz, Seth Alter, Damian Mrowca, Martin Schrimpf, James Traer, Julian De Freitas, Jonas Kubilius, Abhishek Bhandwaldar, Nick Haber, Megumi Sano, Kuno Kim, Elias Wang, Michael Lingelbach, Aidan Curtis, Kevin Tyler Feigelis, Daniel Bear, Dan Gutfreund, David Daniel Cox, Antonio Torralba, James J. DiCarlo, Joshua B. Tenenbaum, Josh McDermott, Daniel L.K. Yamins

    NeurIPS Dataset 2021 (Oral)

    Dynamic Visual Reasoning by Learning Differentiable Physics Models from Video and Language

    Mingyu Ding, Zhenfang Chen, Tao Du, Ping Luo, Joshua B. Tenenbaum, Chuang Gan

    NeurIPS 2021

    PTR: A Benchmark for Part-based Conceptual, Relational, and Physical Reasoning

    Yining Hong, Li Yi, Joshua B. Tenenbaum, Antonio Torralba, Chuang Gan

    NeurIPS 2021

    When Does Contrastive Learning Preserve Adversarial Robustness from Pretraining to Finetuning?

    Lijie Fan, Sijia Liu, Pin-Yu Chen, Gaoyuan Zhang, Chuang Gan

    NeurIPS 2021

    STAR: A Benchmark for Situated Reasoning in Real-World Videos

    Bo Wu, Shoubin Yu, Zhenfang Chen, Joshua B. Tenenbaum, Chuang Gan

    NeurIPS Dataset 2021

    Curious Representation Learning for Embodied Intelligence

    Yilun Du, Chuang Gan, Phillip Isola

    ICCV 2021

    OPEn: An Open-ended Physics Environment for Learning Without a Task

    Chuang Gan, Abhishek Bhandwaldar, Antonio Torralba, Joshua B. Tenenbaum, Phillip Isola

    IROS 2021

    AGENT: A Benchmark for Core Psychological Reasoning

    Tianmin Shu, Abhishek Bhandwaldar, Chuang Gan, Kevin A. Smith, Shari Liu, Dan Gutfreund, Elizabeth Spelke, Joshua B. Tenenbaum, Tomer D. Ullman

    ICML 2021

    Temporal and Object Quantification Networks

    Jiayuan Mao, Zhezheng Luo, Chuang Gan, Joshua B. Tenenbaum, Jiajun Wu, Leslie P. Kaelbling, Tomer D. Ullman

    IJCAI 2021

    PlasticineLab: A Soft-Body Manipulation Benchmark with Differentiable Physics

    Zhiao Huang, Yuanming Hu, Tao Du, Siyuan Zhou, Hao Su, Joshua B. Tenenbaum, Chuang Gan

    ICLR 2021 (Spotlight)

    Grounding Physical Concepts of Objects and Events Through Dynamic Visual Reasoning

    Zhenfang Chen, Jiayuan Mao, Jiajun Wu, Kwan-Yee Kenneth Wong, Joshua B. Tenenbaum, Chuang Gan

    ICLR 2021

    Learning Task Decomposition with Order-Memory Policy Network

    Yuchen Lu, Yikang Shen, Siyuan Zhou, Aaron Courville, Joshua B. Tenenbaum, Chuang Gan

    ICLR 2021

    2020

    Foley Music: Learning to Generate Music from Videos

    Chuang Gan, Deng Huang, Peihao Chen, Joshua B. Tenenbaum, Antonio Torralba

    ECCV 2020

    Music Gesture for Visual Sound Separation

    Chuang Gan, Deng Huang, Hang Zhao, Joshua B. Tenenbaum, Antonio Torralba

    CVPR 2020

    Dense Regression Network For Video Grounding

    Runhao Zeng, Haoming Xu, Wenbing Huang, Peihao Chen, Mingkui Tan, Chuang Gan

    CVPR 2020

    TinyTL: Reduce Activations, Not Trainable Parameters for Efficient On-Device Learning

    Han Cai, Chuang Gan, Ligeng Zhu, Song Han

    NeurIPS 2020

    MCUNet: Tiny Deep Learning on IoT Devices

    Ji Lin, Wei-Ming Chen, Yujun Lin, John Cohn, Chuang Gan, Song Han

    NeurIPS 2020 (Spotlight)

    CLEVRER: CoLlision Events for Video REpresentation and Reasoning

    Kexin Yi*, Chuang Gan*, Yunzhu Li, Pushmeet Kohli, Jiajun Wu, Antonio Torralba, Joshua B. Tenenbaum

    ICLR 2020 (Spotlight)

    Deep Audio Priors Emerge From Harmonic Convolutional Networks

    Zhoutong Zhang, Yunyun Wang, Chuang Gan, Jiajun Wu, Joshua B. Tenenbaum, Antonio Torralba, William T. Freeman

    ICLR 2020

    Once for All: Train One Network and Specialize it for Efficient Deployment

    Han Cai, Chuang Gan, Tianzhe Wang, Zhekai Zhang, Song Han

    ICLR 2020

    Look, Listen, and Act: Towards Audio-Visual Embodied Navigation

    Chuang Gan*, Yiwei Zhang*, Jiajun Wu, Boqing Gong, Joshua B. Tenenbaum

    ICRA 2020

    2019

    Self-supervised Moving Vehicle Tracking with Stereo Sound

    Chuang Gan, Hang Zhao, Peihao Chen, David Cox, Antonio Torralba

    ICCV 2019

    The Sound of Motions

    Hang Zhao, Chuang Gan, Wei-Chiu Ma, Antonio Torralba

    ICCV 2019

    TSM: Temporal Shift Module for Efficient Video Understanding

    Ji Lin, Chuang Gan, Song Han

    ICCV 2019

    Graph Convolutional Networks for Temporal Action Localization

    Runhao Zeng, Wenbing Huang, Mingkui Tan, Yu Rong, Peilin Zhao, Junzhou Huang, Chuang Gan

    ICCV 2019

    Imitation Learning from Observations by Minimizing Inverse Dynamics Disagreement

    Chao Yang, Xiaojian Ma, Wenbing Huang, Fuchun Sun, Huaping Liu, Junzhou Huang, Chuang Gan

    NeurIPS 2019 (Spotlight)

    Visual Concept-Metaconcept Learning

    Chi Han, Jiayuan Mao, Chuang Gan, Josh Tenenbaum, Jiajun Wu

    NeurIPS 2019

    Cross-channel Communication Networks

    Jianwei Yang, Zhile Ren, Chuang Gan, Hongyuan Zhu, Devi Parikh

    NeurIPS 2019

    The Neuro-Symbolic Concept Learner: Interpreting Scenes, Words, and Sentences From Natural Supervision

    Jiayuan Mao, Chuang Gan, Pushmeet Kohli, Joshua B. Tenenbaum, Jiajun Wu

    ICLR 2019 (Oral)

    Defensive Quantization: When Efficiency Meets Robustness

    Ji Lin, Chuang Gan, Song Han

    ICLR 2019

    2018

    Weakly Supervised Dense Event Captioning in Videos

    Xuguang Duan, Wenbing Huang, Chuang Gan, Jingdong Wang, Wenwu Zhu, Junzhou Huang

    NeurIPS 2018

    Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding

    Kexin Yi, Jiajun Wu, Chuang Gan, Antonio Torralba, Pushmeet Kohli, Joshua B. Tenenbaum

    NeurIPS 2018 (Spotlight)

    The Sound of Pixels

    Hang Zhao, Chuang Gan, Andrew Rouditchenko, Carl Vondrick, Josh McDermott, Antonio Torralba

    ECCV 2018

    Unsupervised Domain Adaptation for 3D Keypoint Estimation via View Consistency

    Xingyi Zhou, Arjun Karpur, Chuang Gan, Linjie Luo, Qixing Huang

    ECCV 2018

    Geometry-Guided CNNs for Self-supervised Video Representation Learning

    Chuang Gan, Boqing Gong, Kun Liu, Hao Su, Leonidas Guibas

    CVPR 2018

    Attention Clusters: Purely Attention Based Local Feature Integration for Video Classification

    Xiang Long, Chuang Gan, Gerard de Melo, Jiajun Wu, Xiao Liu, Shilei Wen

    CVPR 2018

    End-to-End Learning of Motion Representation for Video Understanding

    Lijie Fan, Wenbing Huang, Chuang Gan, Stefano Ermon, Boqing Gong, Junzhou Huang

    CVPR 2018 (Spotlight)

    Sparse, Smart Contours to Represent and Edit Images

    Tali Dekel, Chuang Gan, Dilip Krishnan, Ce Liu, William T. Freeman

    CVPR 2018

    Video Captioning with Multi-Faceted Attention

    Xiang Long, Chuang Gan, Gerard de Melo

    TACL 2018


    2017

    StyleNet: Generating Attractive Visual Captions with Styles

    Chuang Gan, Zhe Gan, Xiaodong He, Jianfeng Gao, Li Deng

    CVPR 2017

    Semantic Compositional Networks for Visual Captioning

    Zhe Gan, Chuang Gan, Xiaodong He, Yunchen Pu, Kenneth Tran, Jianfeng Gao, Lawrence Carin, Li Deng

    CVPR 2017 (Spotlight)

    VQS: Linking Segmentations to Questions and Answers for Supervised Attention in VQA and Question-Focused Semantic Segmentation

    Chuang Gan, Yandong Li, Haoxiang Li, Chen Sun, Boqing Gong

    ICCV 2017

    Recurrent Topic-Transition GAN for Visual Paragraph Generation

    Xiaodan Liang, Zhiting Hu, Hao Zhang, Chuang Gan, Eric P. Xing

    ICCV 2017


    2016

    Learning Attributes Equals Multi-Source Domain Generalization

    Chuang Gan, Tianbao Yang, Boqing Gong

    CVPR 2016 (Spotlight)

    You Lead, We Exceed: Labor-Free Video Concept Learning by Jointly Exploiting Web Videos and Images

    Chuang Gan, Ting Yao, Kuiyuan Yang, Yi Yang, Tao Mei

    CVPR 2016 (Spotlight)

    Recognizing an Action Using Its Name: A Knowledge-Based Approach

    Chuang Gan, Yi Yang, Linchao Zhu, Deli Zhao, Yueting Zhuang

    IJCV 2016

    Webly-Supervised Video Recognition by Mutually Voting for Relevant Web Images and Web Video Frames

    Chuang Gan, Chen Sun, Lixin Duan, Boqing Gong

    ECCV 2016


    2015

    DevNet: A Deep Event Network for Multimedia Event Detection and Evidence Recounting

    Chuang Gan, Naiyan Wang, Yi Yang, Dit-Yan Yeung, Alexander G. Hauptmann

    CVPR 2015

    Automatic Concept Discovery from Parallel Text and Visual Corpora

    Chen Sun, Chuang Gan, Ram Nevatia

    ICCV 2015

    Embodied Intelligence

    ThreeDWorld: A Platform for Interactive Multi-Modal Physical Simulation

    Chuang Gan, Jeremy Schwartz, Seth Alter, Damian Mrowca, Martin Schrimpf, James Traer, Julian De Freitas, Jonas Kubilius, Abhishek Bhandwaldar, Nick Haber, Megumi Sano, Kuno Kim, Elias Wang, Michael Lingelbach, Aidan Curtis, Kevin Tyler Feigelis, Daniel Bear, Dan Gutfreund, David Daniel Cox, Antonio Torralba, James J. DiCarlo, Joshua B. Tenenbaum, Josh McDermott, Daniel L.K. Yamins

    NeurIPS Dataset 2021 (Oral)

    Prompting Decision Transformer for Few-shot Policy Generalization

    Mengdi Xu, Yikang Shen, Shun Zhang, Yuchen Lu, Ding Zhao, Joshua B. Tenenbaum, Chuang Gan

    ICML 2022

    The ThreeDWorld Transport Challenge: A Visually Guided Task-and-Motion Planning Benchmark Towards Physically Realistic Embodied AI

    Chuang Gan, Siyuan Zhou, Jeremy Schwartz, Seth Alter, Abhishek Bhandwaldar, Dan Gutfreund, Daniel L.K. Yamins, James J DiCarlo, Josh McDermott, Antonio Torralba, Joshua B. Tenenbaum

    ICRA 2022

    RISP: Rendering-Invariant State Predictor with Differentiable Simulation and Rendering for Cross-Domain Parameter Estimation

    Pingchuan Ma*, Tao Du*, Joshua B. Tenenbaum, Wojciech Matusik, Chuang Gan

    ICLR 2022 (Oral)

    DiffSkill: Skill Abstraction from Differentiable Physics for Deformable Object Manipulations with Tools

    Xingyu Lin, Zhiao Huang, Yunzhu Li, Joshua B. Tenenbaum, David Held, Chuang Gan

    ICLR 2022

    Contact Points Discovery for Soft-Body Manipulations with Differentiable Physics

    Sizhe Li*, Zhiao Huang*, Tao Du, Hao Su, Joshua B. Tenenbaum, Chuang Gan

    ICLR 2022 (Spotlight)

    OPEn: An Open-ended Physics Environment for Learning Without a Task

    Chuang Gan, Abhishek Bhandwaldar, Antonio Torralba, Joshua B. Tenenbaum, Phillip Isola

    IROS 2021

    Curious Representation Learning for Embodied Intelligence

    Yilun Du, Chuang Gan, Phillip Isola

    ICCV 2021

    PlasticineLab: A Soft-Body Manipulation Benchmark with Differentiable Physics

    Zhiao Huang, Yuanming Hu, Tao Du, Siyuan Zhou, Hao Su, Joshua B. Tenenbaum, Chuang Gan

    ICLR 2021 (Spotlight)

    Learning Task Decomposition with Order-Memory Policy Network

    Yuchen Lu, Yikang Shen, Siyuan Zhou, Aaron Courville, Joshua B. Tenenbaum, Chuang Gan

    ICLR 2021

    Imitation Learning from Observations by Minimizing Inverse Dynamics Disagreement

    Chao Yang, Xiaojian Ma, Wenbing Huang, Fuchun Sun, Huaping Liu, Junzhou Huang, Chuang Gan

    NeurIPS 2019 (Spotlight)


    Audio-Visual Scene Analysis

    Finding Fallen Objects Via Asynchronous Audio-Visual Integration

    Chuang Gan*, Yi Gu*, Siyuan Zhou, Jeremy Schwartz, Seth Alter, James Traer, Dan Gutfreund, Joshua B. Tenenbaum, Josh McDermott*, Antonio Torralba*

    CVPR 2022

    Look, Listen, and Act: Towards Audio-Visual Embodied Navigation

    Chuang Gan*, Yiwei Zhang*, Jiajun Wu, Boqing Gong, Joshua B. Tenenbaum

    ICRA 2020

    Foley Music: Learning to Generate Music from Videos

    Chuang Gan, Deng Huang, Peihao Chen, Joshua B. Tenenbaum, Antonio Torralba

    ECCV 2020

    Music Gesture for Visual Sound Separation

    Chuang Gan, Deng Huang, Hang Zhao, Joshua B. Tenenbaum, Antonio Torralba

    CVPR 2020

    The Sound of Pixels

    Hang Zhao, Chuang Gan, Andrew Rouditchenko, Carl Vondrick, Josh McDermott, Antonio Torralba

    ECCV 2018

    Self-supervised Moving Vehicle Tracking with Stereo Sound

    Chuang Gan, Hang Zhao, Peihao Chen, David Cox, Antonio Torralba

    ICCV 2019

    The Sound of Motions

    Hang Zhao, Chuang Gan, Wei-Chiu Ma, Antonio Torralba

    ICCV 2019


    Visual Commonsense Reasoning

    Fixing Malfunctional Objects With Learned Physical Simulation and Functional Prediction

    Yining Hong, Kaichun Mo, Li Yi, Leonidas J. Guibas, Antonio Torralba, Joshua B. Tenenbaum, Chuang Gan

    CVPR 2022

    ComPhy: Compositional Physical Reasoning of Objects and Events from Videos

    Zhenfang Chen, Kexin Yi, Yunzhu Li, Mingyu Ding, Antonio Torralba, Joshua B. Tenenbaum, Chuang Gan

    ICLR 2022

    Linking Emergent and Natural Languages via Corpus Transfer

    Shunyu Yao, Mo Yu, Yang Zhang, Karthik R Narasimhan, Joshua B. Tenenbaum, Chuang Gan

    ICLR 2022 (Spotlight)

    FALCON: Fast Visual Concept Learning by Integrating Images, Linguistic descriptions, and Conceptual Relations

    Lingjie Mei*, Jiayuan Mao*, Ziqi Wang, Chuang Gan, Joshua B. Tenenbaum

    ICLR 2022

    Dynamic Visual Reasoning by Learning Differentiable Physics Models from Video and Language

    Mingyu Ding, Zhenfang Chen, Tao Du, Ping Luo, Joshua B. Tenenbaum, Chuang Gan

    NeurIPS 2021

    PTR: A Benchmark for Part-based Conceptual, Relational, and Physical Reasoning

    Yining Hong, Li Yi, Joshua B. Tenenbaum, Antonio Torralba, Chuang Gan

    NeurIPS 2021

    STAR: A Benchmark for Situated Reasoning in Real-World Videos

    Bo Wu, Shoubin Yu, Zhenfang Chen, Joshua B. Tenenbaum, Chuang Gan

    NeurIPS Dataset 2021

    AGENT: A Benchmark for Core Psychological Reasoning

    Tianmin Shu, Abhishek Bhandwaldar, Chuang Gan, Kevin A. Smith, Shari Liu, Dan Gutfreund, Elizabeth Spelke, Joshua B. Tenenbaum, Tomer D. Ullman

    ICML 2021

    Temporal and Object Quantification Networks

    Jiayuan Mao, Zhezheng Luo, Chuang Gan, Joshua B. Tenenbaum, Jiajun Wu, Leslie P. Kaelbling, Tomer D. Ullman

    IJCAI 2021

    Grounding Physical Concepts of Objects and Events Through Dynamic Visual Reasoning

    Zhenfang Chen, Jiayuan Mao, Jiajun Wu, Kwan-Yee Kenneth Wong, Joshua B. Tenenbaum, Chuang Gan

    ICLR 2021

    CLEVRER: CoLlision Events for Video REpresentation and Reasoning

    Kexin Yi*, Chuang Gan*, Yunzhu Li, Pushmeet Kohli, Jiajun Wu, Antonio Torralba, Joshua B. Tenenbaum

    ICLR 2020 (Spotlight)

    Dense Regression Network For Video Grounding

    Runhao Zeng, Haoming Xu, Wenbing Huang, Peihao Chen, Mingkui Tan, Chuang Gan

    CVPR 2020

    The Neuro-Symbolic Concept Learner: Interpreting Scenes, Words, and Sentences From Natural Supervision

    Jiayuan Mao, Chuang Gan, Pushmeet Kohli, Joshua B. Tenenbaum, Jiajun Wu

    ICLR 2019 (Oral)

    Visual Concept-Metaconcept Learning

    Chi Han, Jiayuan Mao, Chuang Gan, Josh Tenenbaum, Jiajun Wu

    NeurIPS 2019

    Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding

    Kexin Yi, Jiajun Wu, Chuang Gan, Antonio Torralba, Pushmeet Kohli, Joshua B. Tenenbaum

    NeurIPS 2018 (Spotlight)

    VQS: Linking Segmentations to Questions and Answers for Supervised Attention in VQA and Question-Focused Semantic Segmentation

    Chuang Gan, Yandong Li, Haoxiang Li, Chen Sun, Boqing Gong

    ICCV 2017


    Visual Representation Learning

    When Does Contrastive Learning Preserve Adversarial Robustness from Pretraining to Finetuning?

    Lijie Fan, Sijia Liu, Pin-Yu Chen, Gaoyuan Zhang, Chuang Gan

    NeurIPS 2021

    TinyTL: Reduce Activations, Not Trainable Parameters for Efficient On-Device Learning

    Han Cai, Chuang Gan, Ligeng Zhu, Song Han

    NeurIPS 2020

    MCUNet: Tiny Deep Learning on IoT Devices

    Ji Lin, Wei-Ming Chen, Yujun Lin, John Cohn, Chuang Gan, Song Han

    NeurIPS 2020 (Spotlight)

    Once for All: Train One Network and Specialize it for Efficient Deployment

    Han Cai, Chuang Gan, Tianzhe Wang, Zhekai Zhang, Song Han

    ICLR 2020

    Cross-channel Communication Networks

    Jianwei Yang, Zhile Ren, Chuang Gan, Hongyuan Zhu, Devi Parikh

    NeurIPS 2019

    TSM: Temporal Shift Module for Efficient Video Understanding

    Ji Lin, Chuang Gan, Song Han

    ICCV 2019

    Graph Convolutional Networks for Temporal Action Localization

    Runhao Zeng, Wenbing Huang, Mingkui Tan, Yu Rong, Peilin Zhao, Junzhou Huang, Chuang Gan

    ICCV 2019

    Attention Clusters: Purely Attention Based Local Feature Integration for Video Classification

    Xiang Long, Chuang Gan, Gerard de Melo, Jiajun Wu, Xiao Liu, Shilei Wen

    CVPR 2018

    End-to-End Learning of Motion Representation for Video Understanding

    Lijie Fan, Wenbing Huang, Chuang Gan, Stefano Ermon, Boqing Gong, Junzhou Huang

    CVPR 2018 (Spotlight)

    DevNet: A Deep Event Network for Multimedia Event Detection and Evidence Recounting

    Chuang Gan, Naiyan Wang, Yi Yang, Dit-Yan Yeung, Alexander G. Hauptmann

    CVPR 2015


    Learning from Unlabeled Videos

    Geometry-Guided CNNs for Self-supervised Video Representation Learning

    Chuang Gan, Boqing Gong, Kun Liu, Hao Su, Leonidas Guibas

    CVPR 2018

    You Lead, We Exceed: Labor-Free Video Concept Learning by Jointly Exploiting Web Videos and Images

    Chuang Gan, Ting Yao, Kuiyuan Yang, Yi Yang, Tao Mei

    CVPR 2016 (Spotlight)

    Recognizing an Action Using Its Name: A Knowledge-Based Approach

    Chuang Gan, Yi Yang, Linchao Zhu, Deli Zhao, Yueting Zhuang

    IJCV 2016

    Webly-Supervised Video Recognition by Mutually Voting for Relevant Web Images and Web Video Frames

    Chuang Gan, Chen Sun, Lixin Duan, Boqing Gong

    ECCV 2016


    Generative Models for Vision and Language

    Weakly Supervised Dense Event Captioning in Videos

    Xuguang Duan, Wenbing Huang, Chuang Gan, Jingdong Wang, Wenwu Zhu, Junzhou Huang

    NeurIPS 2018

    Video Captioning with Multi-Faceted Attention

    Xiang Long, Chuang Gan, Gerard de Melo

    TACL 2018

    StyleNet: Generating Attractive Visual Captions with Styles

    Chuang Gan, Zhe Gan, Xiaodong He, Jianfeng Gao, Li Deng

    CVPR 2017

    Semantic Compositional Networks for Visual Captioning

    Zhe Gan, Chuang Gan, Xiaodong He, Yunchen Pu, Kenneth Tran, Jianfeng Gao, Lawrence Carin, Li Deng

    CVPR 2017 (Spotlight)

    Recurrent Topic-Transition GAN for Visual Paragraph Generation

    Xiaodan Liang, Zhiting Hu, Hao Zhang, Chuang Gan, Eric P. Xing

    ICCV 2017

    Automatic Concept Discovery from Parallel Text and Visual Corpora

    Chen Sun, Chuang Gan, Ram Nevatia

    ICCV 2015

    Sparse, Smart Contours to Represent and Edit Images

    Tali Dekel, Chuang Gan, Dilip Krishnan, Ce Liu, William T. Freeman

    CVPR 2018


    Domain Adaptation

    Learning Attributes Equals Multi-Source Domain Generalization

    Chuang Gan, Tianbao Yang, Boqing Gong

    CVPR 2016 (Spotlight)

    Unsupervised Domain Adaptation for 3D Keypoint Estimation via View Consistency

    Xingyi Zhou, Arjun Karpur, Chuang Gan, Linjie Luo, Qixing Huang

    ECCV 2018


    Competitions

    • Ranked 1st in ActivityNet AVA Challenge 2018

    • Ranked 1st in ActivityNet Kinetics Challenge 2017

    • Ranked 1st in NIST TRECVID MED and MER 2014

    • Ranked 2nd in Moments in Time 2018

    • Ranked 3rd in Youtube8M Challenge 2017

    • Ranked 3rd in ActivityNet Classification Challenge 2016


    Data & Software

    NS-VQA. Neural-Symbolic Visual Reasoning.

    WSDEC. Weakly-supervised Dense Event Captioning.

    The Sound of Pixels. Listen to the sound of pixels.

    Smart Contours. Edit images using contours.

    Attention Clusters. Multiple and diverse attention for video classification.

    SCN. Semantic composition network for image and video captioning.

    VQS. Visual question segmentation.

    TVNET. End-to-end video motion learning.

    Youtube8M. Temporal modeling for video classification.


    Honors

    • Outstanding Doctoral Thesis Award at Tsinghua University (2018)

    • Excellent Graduate Student at Tsinghua University (2018)

    • Top Talented Graduate Student at Tsinghua University (2017)

    • Academic Rising Star Finalist at Tsinghua University (2016, 2017)

    • Microsoft Fellowship (2016)

    • Baidu Fellowship (2016)

    • National Scholarship, by Ministry of Education of China (2015)