Mar 25, 2022: The Department (Electronic Engineering Department, Tsinghua University) is launching the Speech And Language Technologies (SALT) Frontiers Lecture Series (语音语言技术前沿讲座).
Feb. 2022: We are organizing the SereTOD workshop "Towards Semi-Supervised and Reinforced Task-Oriented Dialog Systems" and the related challenge, co-located with EMNLP 2022.
Workshop Call for Papers | Challenge Call for Participation | Chinese Blog
Dec 18, 2021: Invited Talk "Data-efficient Multilingual and Crosslingual Speech Recognition" (数据高效的多语言与跨语言语音识别) at CCF First Seminar on Southeast-Asia Language Technology and Application (首届东南亚非通用语种处理技术与应用研讨会), Nanning.
URL | slides
Nov 9, 2021: Invited Tutorial "State-of-the-Art of End-to-End Speech Recognition" at The 6th Asian Conference on Pattern Recognition (ACPR2021), Jeju Island, Korea.
URL | slides | video (part 1) | video (part 2)
Sep 24, 2021: Invited Talk "Semi-Supervised Task-Oriented Dialog Systems and Natural Language Labeling" at UIUC NLP Seminar.
July 9, 2021: Invited Talk "Data-efficient Automatic Speech Recognition" at Apple office, Beijing.
June 5, 2021:
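Chinese Blog: Talk at the 2021 China Mobile Jiutian AI New Technology Forum: Building Strongly Generalizable Man-Machine Spoken Dialogue Systems Towards Strong Artificial Intelligence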
Invited talk "Building Strongly Generalizable Man-Machine Spoken Dialogue Systems Towards Strong Artificial Intelligence" at China Mobile AI Forum 2021.
June 2021: Congratulations to Hong Liu: Tsinghua University Outstanding Undergraduate Thesis - "Data-efficient end-to-end dialog systems".
May 31, 2021:
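Chinese Blog: Project from Zhijian Ou's research group in the EE Department wins first prize at the National Radio and Television Administration's MediaAIAC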
Congratulations: First-class prize in Media Artificial Intelligence Application Innovation National Competition.
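Mar 29, 2021: Chinese Blog: Talk at the speech acoustics session of the 2021 National Conference on Acoustics: a first exploration of third-generation speech recognition technology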
Invited talk "On exploring the third generation of speech recognition technology" at National Conference on Acoustics, Shanghai.
Mar 14, 2021: Chinese Blog: The Tsinghua SPMI group is hiring postdocs, with annual salary starting at 300,000 RMB; send us your résumé
Jan 30, 2021: Invited Talk "Neural Architecture Search for End-to-end Speech Recognition via Straight-Through Gradients" at SLT2021 Children Speech Recognition Challenge (CSRC) Workshop, held virtually.
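Dec 28, 2020: Chinese Blog: Ten figures and ten sentences for a quick look at SPMI's ten papers of 2020
Nov 23, 2020: Chinese Blog: SPMI@EMNLP2020: Semi-supervised learning of end-to-end dialog models with latent dialog states
Aug 3, 2020: Chinese Blog: SPMI@ICASSP2020: Mixed-feature random field language models and their application to speech recognition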
June 2020: Congratulations to Silin Gao: Tsinghua University Outstanding Undergraduate Thesis - "Mixed-feature Random Field Language Models with Applications to Speech Recognition".
June 2020: Co-chair the online workshop "Frontiers in Speech Conversation and Auditory", organized by CCF Speech Conversation and Auditory TC.
Nov. 2019: Invited Talk "Marrying Neural Networks and Undirected Graphical Models for Efficient Speech Recognition" (联合神经网络与无向图模型的高效语音识别) at ASRU2019 Code-Switching Challenge Seminar (ASRU2019中英混杂语音识别挑战赛线下技术交流会), Beijing, 2019/11/23.
Nov. 2019: We release CAT (CRF-based ASR Toolkit) for data-efficient end-to-end speech recognition. Hope it's fun!
code at github
Oct. 2019: Co-chair the panel "Frontiers and challenges of intelligent speech processing" at CNCC (China National Computer Congress) 2019, Suzhou, China.
August 2019: Chair the session "Speech perception and recognition" at CCF Speech Conversation and Auditory TC Annual Meeting, Xining, China.
July 2019: Invited to teach a short course "Theory and Applications of Probabilistic Graphical Models" (1 credit) at Xinjiang University, China.
June 2019: Congratulations to Keyu An: Tsinghua University Outstanding Undergraduate Thesis - "Random Fields based Speech Recognition".
April 2019: Invited Talk "Toward Simple Speaker Recognition and Speech Recognition" (简洁的说话人识别及语音识别) at The Symposium on Speaker Recognition Research and Application, Kunshan, China.
Mar 24, 2019: Chinese Blog: SPMI@ICASSP2019: Building simple and flexible end-to-end speech recognition based on conditional random fields
Nov. 2018: Congratulations to Yutian Li: ISCSLP (International Symposium on Chinese Spoken Language Processing) Best Student Paper Award.
Nov. 26, 2018: Tutorial "Language Modeling: State of the Art" at the 11th International Symposium on Chinese Spoken Language Processing (ISCSLP), Taipei, given by Zhijian Ou and Bin Wang.
URL | slides
Oct. 2018: Invited Talk "Neural random fields with applications to modeling of languages and images" at Apple Boston, Toyota Technological Institute at Chicago (TTIC), and UIUC (link).
July 2018: Congratulations to Bin Wang: Tsinghua University Outstanding Ph.D. Thesis.
Feb. 2018: We release the Iqiyi movie dialog dataset collected and labeled under a crowdsourcing Wizard-of-Oz framework, which is used in our paper - "Tracking of enriched dialog states for flexible conversational information access", published at ICASSP 2018.
arxiv | data readme | data download
Oct. 2017: Invited Talk "Future Artificial Intelligence with Efficient Learning" at Tsinghua University Science and Technology Forum, Suzhou, China.
Oct. 2017: Bin Wang presents the PAMI paper "Learning Trans-dimensional Random Fields with Applications to Language Modeling" at the Special Session on Journal Papers, NCMMSC (National Conference on Man-Machine Speech Communication) 2017, Lianyungang, China.
Dec. 2016: Present "The THU-SPMI SRE-16 System with Joint Bayesian Scoring and Ladder Network based Feature Learning" at 2016 NIST Speaker Recognition Workshop (SRE2016), San Diego, USA.
pdf | teaser presentation | poster
Nov. 2016: We release the SPMI array processing toolkit, which was used in our work - "The THU-SPMI CHiME-4 system: Lightweight design with advanced multi-channel processing, feature enhancement, and language modeling", published in CHiME Workshop, 2016.
The toolkit implements a bundle of beamforming methods, including MVDR, PMWF, GSC, and ME; we compare these techniques using the same back-end. See the paper for details.
pdf | poster | SPMIArray toolkit at github
Nov. 2016: We release a 9-hour real-world audio stream recorded from a public broadcast radio station in Beijing, together with its motif annotations, which is used in our paper - "Scalable Discovery of Audio Fingerprint Motifs in Broadcast Streams With Determinantal Point Process Based Motif Clustering", published in IEEE/ACM Transactions on Audio, Speech and Language Processing, 2016.
pdf | data readme | data download
Oct. 17, 2016: Tutorial "Undirected Graphical Models: Theory and Applications to Speech and Language Processing" at the 10th International Symposium on Chinese Spoken Language Processing (ISCSLP), Tianjin, China.
URL | slides
August 2016: We release the SPMILM toolkit at github, which not only implements TRF-LMs but also includes third-party code for ngram-LMs, RNN-LMs and LSTM-LMs. Example scripts are provided to evaluate different LMs by rescoring n-best lists.
SPMILM toolkit at github | doxygen documentation
July 2016: We release the code for the SSP 2016 paper "Block-Wise MAP Inference for Determinantal Point Processes with Application to Change-Point Detection"! The change-point selection problem is addressed through DPP-based subset selection. Experiments on five real-world datasets are shown.
pdf | poster | longer version at arxiv | code readme | code download
Oct. 2015: Present "Trans-dimensional Random Fields (TDRF) for Sequence Modeling" at SANE workshop, held at Google in New York City, NY.
We point out the potential of applying random fields to sequence modeling, as demonstrated by their success in language modeling.
July 2015: We release the code for our ACL-2015 long paper "Trans-dimensional Random Fields for Language Modeling"! The TRF-LMs perform as well as the RNN-LMs but are computationally more efficient in computing sentence probabilities (200x faster).
pdf | poster | code readme | code download
June 2015: Invited Talk "Probabilistic Modeling of Speech and Language" at IBM TJ Watson Research Center, New York.
June 2015: Invited Talk "Probabilistic Modeling of Speech and Language" at Mitsubishi Electric Research Laboratories (MERL), Boston.
May 2015: Invited Talk "Probabilistic Modeling of Speech and Language" at ICSI, Berkeley.
August 2014: A brief introduction to the SPMI (Speech Processing and Machine Intelligence) Lab, prepared as orientation material for first-year graduate students in the EE Department.
two slides in Chinese