Nov. 2019: Talk on "Marrying Neural Networks and Undirected Graphical Models for Efficient Speech Recognition" at the in-person technical exchange meeting of the ASRU Mandarin-English Code-Switching Speech Recognition Challenge, Nov. 23, 2019.
[ slides ]
Nov. 2019: We release CAT (CRF-based ASR Toolkit) for data-efficient end-to-end speech recognition. Hope it's fun!
[ code at github ]
Oct. 2019: Co-chair the panel "Frontiers and challenges of intelligent speech processing" at CNCC (China National Computer Congress) 2019, Suzhou, China.
August 2019: Chair the session "Speech perception and recognition" at CCF Speech Conversation and Auditory TC Annual Meeting, Xining, China.
July 2019: Invited to teach a short course "Theory and Applications of Probabilistic Graphical Models" (1 credit) at Xinjiang University, China.
June 2019: Congratulations to Keyu An: Tsinghua University Outstanding Undergraduate Thesis - "Random Fields based Speech Recognition".
April 2019: Talk on "Toward Simple Speaker Recognition and Speech Recognition" at The Symposium on Speaker Recognition Research and Application, Kunshan, China.
[ slides ]
Nov. 2018: Congratulations to Yutian Li: ISCSLP (International Symposium on Chinese Spoken Language Processing) Best Student Paper Award.
[ News ]
Nov. 26, 2018: Tutorial "Language Modeling: State of the Art" at the 11th International Symposium on Chinese Spoken Language Processing (ISCSLP), Taipei, given by Zhijian Ou and Bin Wang.
[ URL | slides ]
Oct. 2018: Talk on "Neural random fields with applications to modeling of languages and images" at Apple Boston, Toyota Technological Institute at Chicago (TTIC), and UIUC (link).
July 2018: Congratulations to Bin Wang: Tsinghua University Outstanding Ph.D. Thesis.
[ Thesis pdf ]
Feb. 2018: We release the Iqiyi movie dialog dataset collected and labeled under a crowdsourcing Wizard-of-Oz framework, which is used in our paper - "Tracking of enriched dialog states for flexible conversational information access", published at ICASSP 2018.
[ arxiv | data readme | data download ]
Oct. 2017: Talk on "Future Artificial Intelligence with Efficient Learning" at Tsinghua University Science and Technology Forum, Suzhou, China.
Oct. 2017: Bin Wang presents the PAMI paper “Learning Trans-dimensional Random Fields with Applications to Language Modeling” at the Special Session on Journal Papers, NCMMSC (National Conference on Man-Machine Speech Communication) 2017, Lianyungang, China.
[ slides ]
Dec. 2016: Present “The THU-SPMI SRE-16 System with Joint Bayesian Scoring and Ladder Network based Feature Learning” at the 2016 NIST Speaker Recognition Workshop (SRE2016), San Diego, USA.
[ pdf | teaser presentation | poster ]
Nov. 2016: We release the SPMIArray processing toolkit, which was used in our work - "The THU-SPMI CHiME-4 system: Lightweight design with advanced multi-channel processing, feature enhancement, and language modeling", published at the CHiME Workshop, 2016.
The toolkit implements a bundle of beamforming methods, including MVDR, PMWF, GSC, and ME. The paper compares these techniques using the same back-end; see it for details.
[ pdf | poster | SPMIArray toolkit at github ]
Nov. 2016: We release a 9-hour real-world audio stream recorded from a public radio broadcast in Beijing, together with its motif annotations, which is used in our paper - "Scalable Discovery of Audio Fingerprint Motifs in Broadcast Streams With Determinantal Point Process Based Motif Clustering", published in IEEE/ACM Transactions on Audio, Speech and Language Processing, 2016.
[ pdf | data readme | data download ]
Oct. 17, 2016: Tutorial "Undirected Graphical Models: Theory and Applications to Speech and Language Processing" at the 10th International Symposium on Chinese Spoken Language Processing (ISCSLP), Tianjin, China.
[ URL | slides ]
August 2016: We release the SPMILM toolkit on GitHub, which not only implements TRF-LMs but also includes third-party code for n-gram LMs, RNN-LMs, and LSTM-LMs. Example scripts are provided to evaluate different LMs by rescoring n-best lists.
[ SPMILM toolkit at github | doxygen documentation ]
July 2016: We release the code for our SSP 2016 paper "Block-Wise MAP Inference for Determinantal Point Processes with Application to Change-Point Detection"! The change-point selection problem is elegantly addressed through DPP-based subset selection. Experiments on five real-world datasets are shown.
[ pdf | poster | longer version at arxiv | code readme | code download ]
Oct. 2015: Present "Trans-dimensional Random Fields (TDRF) for Sequence Modeling" at SANE workshop, held at Google in New York City, NY.
We point out the potential of applying random fields to sequence modeling, as demonstrated by their success in language modeling.
[ poster ]
July 2015: We release the code for our ACL-2015 long paper "Trans-dimensional Random Fields for Language Modeling"! The TRF-LMs perform as well as the RNN-LMs but are computationally more efficient in computing sentence probabilities (200x faster).
[ pdf | poster | code readme | code download ]
June 2015: Talk on "Probabilistic Modeling of Speech and Language" at IBM TJ Watson Research Center, New York.
[ slides ]
June 2015: Talk on "Probabilistic Modeling of Speech and Language" at Mitsubishi Electric Research Laboratories (MERL), Boston.
May 2015: Talk on "Probabilistic Modeling of Speech and Language" at ICSI, Berkeley.
[ abstract ]
August 2014: A brief introduction to the SPMI (Speech Processing and Machine Intelligence) Lab, prepared as orientation material for new postgraduate students in the EE Department.
[ two slides in Chinese ]