2019

  1. Transfer Learning from Audio-Visual Grounding to Speech Recognition Hsu, Wei-Ning, Harwath, David, and Glass, James In Interspeech 2019 [Paper]
  2. An Unsupervised Autoregressive Model for Speech Representation Learning Chung, Yu-An, Hsu, Wei-Ning, Tang, Hao, and Glass, James In Interspeech 2019 [Paper] [Code]
  3. Hierarchical Generative Modeling for Controllable Speech Synthesis Hsu, Wei-Ning, Zhang, Yu, Weiss, Ron J, Zen, Heiga, Wu, Yonghui, Wang, Yuxuan, Cao, Yuan, Jia, Ye, Chen, Zhifeng, Shen, Jonathan, and others In International Conference on Learning Representations (ICLR) 2019 [Paper] [Webpage]
  4. Disentangling Correlated Speaker and Noise for Speech Synthesis via Data Augmentation and Adversarial Factorization Hsu, Wei-Ning, Zhang, Yu, Weiss, Ron J, Chung, Yu-An, Wang, Yuxuan, Wu, Yonghui, and Glass, James In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019 [Paper] [Webpage]
  5. Semi-Supervised Training for Improving Data Efficiency in End-to-End Speech Synthesis Chung, Yu-An, Wang, Yuxuan, Hsu, Wei-Ning, Zhang, Yu, and Skerry-Ryan, RJ In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019 [Paper] [Webpage]
  6. Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling Shen, Jonathan, Nguyen, Patrick, Wu, Yonghui, Chen, Zhifeng, Chen, Mia X, Jia, Ye, Kannan, Anjuli, Sainath, Tara, Cao, Yuan, Chiu, Chung-Cheng, and others arXiv preprint arXiv:1902.08295 2019 [Paper] [Code]

2018

  1. Disentangling by Partitioning: A Representation Learning Framework for Multimodal Sensory Data Hsu, Wei-Ning, and Glass, James arXiv preprint arXiv:1805.11264 2018 [Paper]
  2. Unsupervised Representation Learning of Speech for Dialect Identification Shon, Suwon, Hsu, Wei-Ning, and Glass, James In IEEE Spoken Language Technology Workshop (SLT) 2018 [Paper]
  3. Unsupervised Adaptation with Interpretable Disentangled Representation for Distant Conversational Speech Recognition Hsu, Wei-Ning, Tang, Hao, and Glass, James In Interspeech 2018 [Paper]
  4. A Study of Enhancement, Augmentation, and Autoencoder Methods for Domain Adaptation in Distant Speech Recognition Tang, Hao, Hsu, Wei-Ning, Grondin, François, and Glass, James In Interspeech 2018 [Paper]
  5. Scalable Factorized Hierarchical Variational Autoencoder Training Hsu, Wei-Ning, and Glass, James In Interspeech 2018 [Paper] [Code]
  6. A Noise-Robust Self-Adaptive Multitarget Speaker Detection System Zheng, Siqi, Wang, Jianzong, Xiao, Jing, Hsu, Wei-Ning, and Glass, James In International Conference on Pattern Recognition 2018
  7. Extracting Domain Invariant Features by Unsupervised Learning for Robust Automatic Speech Recognition Hsu, Wei-Ning, and Glass, James In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2018 [Paper] [Poster]

2017

  1. Unsupervised Learning of Disentangled and Interpretable Representations from Sequential Data Hsu, Wei-Ning, Zhang, Yu, and Glass, James In Advances in Neural Information Processing Systems (NIPS) 2017 [Paper] [Supp] [Poster] [Code]
  2. Unsupervised Domain Adaptation for Robust Speech Recognition via Variational Autoencoder-Based Data Augmentation Hsu, Wei-Ning, Zhang, Yu, and Glass, James In IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 2017 (Best Paper Honorable Mention) [Paper] [Poster]
  3. Automatic Speech Recognition of Arabic Multi-Genre Broadcast Media Najafian, Maryam, Hsu, Wei-Ning, and Glass, James In IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 2017
  4. Learning Latent Representations for Speech Generation and Transformation Hsu, Wei-Ning, Zhang, Yu, and Glass, James In Interspeech 2017 [Paper] [Slides] [Code] [Webpage]

2016

  1. Neural Attention for Learning to Rank Questions in Community Question Answering Romeo, Salvatore, Da San Martino, Giovanni, Barrón-Cedeño, Alberto, Moschitti, Alessandro, Belinkov, Yonatan, Hsu, Wei-Ning, Zhang, Yu, Mohtarami, Mitra, and Glass, James In International Conference on Computational Linguistics (COLING) 2016 [Paper]
  2. A Prioritized Grid Long Short-Term Memory RNN for Speech Recognition Hsu, Wei-Ning, Zhang, Yu, and Glass, James In IEEE Spoken Language Technology Workshop (SLT) 2016 [Paper]
  3. Development of the MIT ASR System for the 2016 Arabic Multi-Genre Broadcast Challenge AlHanai, Tuka, Hsu, Wei-Ning, and Glass, James In IEEE Spoken Language Technology Workshop (SLT) 2016 [Paper]
  4. SLS at SemEval-2016 Task 3: Neural-based approaches for ranking in community question answering Mohtarami, Mitra, Belinkov, Yonatan, Hsu, Wei-Ning, Zhang, Yu, Lei, Tao, Bar, Kfir, Cyphers, Scott, and Glass, James In International Workshop on Semantic Evaluation (SemEval) 2016 [Paper]
  5. Exploiting Depth and Highway Connections in Convolutional Recurrent Deep Neural Networks for Speech Recognition Hsu, Wei-Ning, Zhang, Yu, Lee, Ann, and Glass, James In Interspeech 2016 [Paper]
  6. Multi-Channel Speech Recognition: LSTMs All the Way Through Erdogan, Hakan, Hayashi, Tomoki, Hershey, John R, Hori, Takaaki, Hori, Chiori, Hsu, Wei-Ning, Kim, Suyoun, Le Roux, Jonathan, Meng, Zhong, and Watanabe, Shinji In The 4th CHiME Speech Separation and Recognition Challenge 2016 [Paper]
  7. Recurrent Neural Network Encoder with Attention for Community Question Answering Hsu, Wei-Ning, Zhang, Yu, and Glass, James arXiv preprint arXiv:1603.07044 2016 [Paper] [Code]

2015

  1. Active Learning by Learning Hsu, Wei-Ning, and Lin, Hsuan-Tien In AAAI Conference on Artificial Intelligence (AAAI) 2015 [Paper] [Code]
  2. Enhancing Automatically Discovered Multi-Level Acoustic Patterns Considering Context Consistency with Applications in Spoken Term Detection Chung, Cheng-Tao, Hsu, Wei-Ning, Lee, Cheng-Yi, and Lee, Lin-Shan In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2015 [Paper]