A Multi-Modal Interactive Physical Simulation Platform for
Computer Vision, Robotics and Cognitive Science
• Ranked 1st in ActivityNet AVA Challenge 2018
• Ranked 1st in ActivityNet Kinetics Challenge 2017
• Ranked 1st in NIST TRECVID MED and MER 2014
• Ranked 2nd in Moments in Time 2018
• Ranked 3rd in YouTube-8M Challenge 2017
• Ranked 3rd in ActivityNet Classification Challenge 2016
Data & Software
• NS-VQA. Neural-Symbolic Visual Reasoning.
• WSDEC. Weakly-supervised Dense Event Captioning.
• The Sound of Pixels. Listen to the sound of pixels.
• Smart Contours. Edit images using contours.
• Attention Clusters. Multiple and diverse attention for video classification.
• SCN. Semantic composition network for image and video captioning.
• VQS. Visual question segmentation.
• TVNET. End-to-end video motion learning.
• Youtube8M. Temporal modeling for video classification.
Stanford University, MSR, AI2, NEC, NVIDIA, Baidu, MERL, IBM, UCF (2017)
• Outstanding Doctoral Thesis Award at Tsinghua University (2018)
• Excellent Graduate Student at Tsinghua University (2018)
• Top Talented Graduate Student at Tsinghua University (2017)
• Academic Rising Star Finalist at Tsinghua University (2016, 2017)
• Microsoft Fellowship (2016)
• Baidu Fellowship (2016)
• National Scholarship, by Ministry of Education of China (2015)