Contact
Email: ganchuang [at] csail (dot) mit (dot) edu
News
Publications (by date / by topic)
2020
Foley Music: Learning to Generate Music from Videos
ECCV 2020
Music Gesture for Visual Sound Separation
CVPR 2020
Dense Regression Network For Video Grounding
CVPR 2020
CLEVRER: CoLlision Events for Video REpresentation and Reasoning
ICLR 2020 (Oral Spotlight)
Deep Audio Priors Emerge From Harmonic Convolutional Networks
ICLR 2020
Once for All: Train One Network and Specialize it for Efficient Deployment
ICLR 2020
Look, Listen, and Act: Towards Audio-Visual Embodied Navigation
ICRA 2020
2019
Self-supervised Moving Vehicle Tracking with Stereo Sound
ICCV 2019
ICCV 2019
TSM: Temporal Shift Module for Efficient Video Understanding
ICCV 2019
Graph Convolutional Networks for Temporal Action Localization
ICCV 2019
Imitation Learning from Observations by Minimizing Inverse Dynamics Disagreement
NeurIPS 2019 (Spotlight)
Visual Concept-Metaconcept Learning
NeurIPS 2019
Cross-channel Communication Networks
NeurIPS 2019
ICLR 2019 (Oral)
Defensive Quantization: When Efficiency Meets Robustness
ICLR 2019
2018
Weakly Supervised Dense Event Captioning in Videos
NeurIPS 2018
Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding
NeurIPS 2018 (Spotlight)
ECCV 2018
Unsupervised Domain Adaptation for 3D Keypoint Estimation via View Consistency
ECCV 2018
Geometry-Guided CNNs for Self-supervised Video Representation Learning
CVPR 2018
Attention Clusters: Purely Attention Based Local Feature Integration for Video Classification
CVPR 2018
End-to-End Learning of Motion Representation for Video Understanding
CVPR 2018 (Spotlight)
Sparse, Smart Contours to Represent and Edit Images
CVPR 2018
Video Captioning with Multi-Faceted Attention
TACL 2018
2017
StyleNet: Generating Attractive Visual Captions with Styles
CVPR 2017
Semantic Compositional Networks for Visual Captioning
CVPR 2017 (Spotlight)
ICCV 2017
Recurrent Topic-Transition GAN for Visual Paragraph Generation
ICCV 2017
2016
Automatic Concept Discovery from Parallel Text and Visual Corpora
ICCV 2015
Audio-Visual Scene Analysis
Foley Music: Learning to Generate Music from Videos
ECCV 2020
Music Gesture for Visual Sound Separation
CVPR 2020
ECCV 2018
Self-supervised Moving Vehicle Tracking with Stereo Sound
ICCV 2019
ICCV 2019
Deep Audio Priors Emerge From Harmonic Convolutional Networks
ICLR 2020
Look, Listen, and Act: Towards Audio-Visual Embodied Navigation
ICRA 2020
Visual Reasoning
CLEVRER: CoLlision Events for Video REpresentation and Reasoning
ICLR 2020 (Oral Spotlight)
Dense Regression Network For Video Grounding
CVPR 2020
ICLR 2019 (Oral)
Visual Concept-Metaconcept Learning
NeurIPS 2019
Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding
NeurIPS 2018 (Spotlight)
ICCV 2017
Visual Representation Learning
Once for All: Train One Network and Specialize it for Efficient Deployment
ICLR 2020
Cross-channel Communication Networks
NeurIPS 2019
TSM: Temporal Shift Module for Efficient Video Understanding
ICCV 2019
Graph Convolutional Networks for Temporal Action Localization
ICCV 2019
Attention Clusters: Purely Attention Based Local Feature Integration for Video Classification
CVPR 2018
End-to-End Learning of Motion Representation for Video Understanding
CVPR 2018 (Spotlight)
DevNet: A Deep Event Network for Multimedia Event Detection and Evidence Recounting
CVPR 2015
Learning from Unlabeled Videos
Imitation Learning from Observations by Minimizing Inverse Dynamics Disagreement
NeurIPS 2019 (Spotlight)
Geometry-Guided CNNs for Self-supervised Video Representation Learning
CVPR 2018
You Lead, We Exceed: Labor-Free Video Concept Learning by Jointly Exploiting Web Videos and Images
CVPR 2016 (Spotlight)
Recognizing an Action Using Its Name: A Knowledge-Based Approach
IJCV 2016
Webly-Supervised Video Recognition by Mutually Voting for Relevant Web Images and Web Video Frames
ECCV 2016
Generative Models for Vision and Language
Weakly Supervised Dense Event Captioning in Videos
NeurIPS 2018
Video Captioning with Multi-Faceted Attention
TACL 2018
StyleNet: Generating Attractive Visual Captions with Styles
CVPR 2017
Semantic Compositional Networks for Visual Captioning
CVPR 2017 (Spotlight)
Recurrent Topic-Transition GAN for Visual Paragraph Generation
ICCV 2017
Automatic Concept Discovery from Parallel Text and Visual Corpora
ICCV 2015
Sparse, Smart Contours to Represent and Edit Images
CVPR 2018
Domain Adaptation
Learning Attributes Equals Multi-Source Domain Generalization
CVPR 2016 (Spotlight)
Unsupervised Domain Adaptation for 3D Keypoint Estimation via View Consistency
ECCV 2018
Competitions
• Ranked 1st in ActivityNet AVA Challenge 2018
• Ranked 1st in ActivityNet Kinetics Challenge 2017
• Ranked 1st in NIST TRECVID MED and MER 2014
• Ranked 2nd in Moments in Time 2018
• Ranked 3rd in Youtube8M Challenge 2017
• Ranked 3rd in ActivityNet Classification Challenge 2016
Data & Software
• NS-VQA. Neural-Symbolic Visual Reasoning.
• WSDEC. Weakly-supervised Dense Event Captioning.
• The Sound of Pixels. Listen to the sound of pixels.
• Smart Contours. Edit images using contours.
• Attention Clusters. Multiple and diverse attention for video classification.
• SCN. Semantic compositional networks for image and video captioning.
• VQS. Visual question segmentation.
• TVNET. End-to-end video motion learning.
• Youtube8M. Temporal modeling for video classification.
Talks
Video Understanding: From Tags to Language.
Stanford University, MSR, AI2, NEC, NVIDIA, Baidu, MERL, IBM, UCF (2017)
Honors
• Outstanding Doctoral Thesis Award at Tsinghua University (2018)
• Excellent Graduate Student at Tsinghua University (2018)
• Top Talented Graduate Student at Tsinghua University (2017)
• Academic Rising Star Finalist at Tsinghua University (2016, 2017)
• Microsoft Fellowship (2016)
• Baidu Fellowship (2016)
• National Scholarship, by Ministry of Education of China (2015)