2021

Nov 9, 2021: Invited Tutorial "State-of-the-Art of End-to-End Speech Recognition" at the 6th Asian Conference on Pattern Recognition (ACPR2021)
URL | slides | video (part 1) | video (part 2)

Sep 24, 2021: Invited Talk "Semi-Supervised Task-Oriented Dialog Systems and Natural Language Labeling" at UIUC NLP Seminar.
slides

July 9, 2021: Invited Talk "Data-efficient Automatic Speech Recognition" at the Apple office, Beijing.
slides

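June 5, 2021: Chinese Blog: Talk at the 2021 China Mobile Jiutian Artificial Intelligence New Technology Forum: Building Strongly Generalizable Man-Machine Spoken Dialogue Systems Towards Strong Artificial Intelligence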
Invited talk "Building Strongly Generalizable Man-Machine Spoken Dialogue Systems Towards Strong Artificial Intelligence" at China Mobile AI Forum 2021. video

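May 31, 2021: Chinese Blog: Project from Zhijian Ou's research group in the Department of Electronic Engineering wins first prize at the national radio and television MediaAIAC competition
Congratulations: First prize in the National Media Artificial Intelligence Application Innovation Competition (MediaAIAC).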

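Mar 29, 2021: Chinese Blog: Talk at the Language Acoustics Sub-forum of the 2021 National Conference on Acoustics: A First Exploration of Third-Generation Speech Recognition Technology, Part-1 Part-2.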
Invited talk "On exploring the third generation of speech recognition technology" at National Conference on Acoustics, Shanghai.

Mar 14, 2021: Chinese Blog: The Tsinghua SPMI group is hiring postdocs, with annual salary starting at 300,000 RMB; send in your resumes

Feb 3, 2021: Chinese Blog: SPMI@SLT2021: Integrating Efficient Neural Architecture Search via Straight-Through Gradients with End-to-End Speech Recognition

Jan 30, 2021: Invited Talk "Neural Architecture Search for End-to-end Speech Recognition via Straight-Through Gradients" at the SLT2021 Children Speech Recognition Challenge (CSRC) Workshop, Virtual.
slides

2020

Dec 28, 2020: Chinese Blog: Ten figures and ten sentences for a quick look at SPMI's ten papers of 2020

Nov 23, 2020: Chinese Blog: SPMI@EMNLP2020: Semi-Supervised Learning of End-to-End Dialog Models with Latent Dialog States

Oct 29, 2020: Chinese Blog: SPMI@INTERSPEECH2020: Data-Efficient End-to-End Speech Recognition, and Knowledge-Integrated Word Representation Learning

Aug 3, 2020: Chinese Blog: SPMI@ICASSP2020: Mixed-Feature Random Field Language Models with Applications to Speech Recognition

June 2020: Congratulations to Silin Gao: Tsinghua University Outstanding Undergraduate Thesis - "Mixed-feature Random Field Language Models with Applications to Speech Recognition".

June 2020: Co-chair the online workshop "Frontiers in Speech Conversation and Auditory", organized by CCF Speech Conversation and Auditory TC.

2019

Nov. 2019: Invited Talk "Marrying Neural Networks and Undirected Graphical Models for Efficient Speech Recognition" at the ASRU2019 Code-Switching Challenge Seminar (the in-person technical exchange meeting of the ASRU2019 Mandarin-English Code-Switching Speech Recognition Challenge), Beijing, 2019/11/23.
slides

Nov. 2019: We release CAT (CRF-based ASR Toolkit) for data-efficient end-to-end speech recognition. Hope it's fun!
code at github

Oct. 2019: Co-chair the panel "Frontiers and challenges of intelligent speech processing" at CNCC (China National Computer Congress) 2019, Suzhou, China.

August 2019: Chair the session "Speech perception and recognition" at CCF Speech Conversation and Auditory TC Annual Meeting, Xining, China.

July 2019: Invited to teach a short course "Theory and Applications of Probabilistic Graphical Models" (1 credit) at Xinjiang University, China.

June 2019: Congratulations to Keyu An: Tsinghua University Outstanding Undergraduate Thesis - "Random Fields based Speech Recognition".

April 2019: Invited Talk "Toward Simple Speaker Recognition and Speech Recognition" at the Symposium on Speaker Recognition Research and Application, Kunshan, China.
slides

Mar 24, 2019: Chinese Blog: SPMI@ICASSP2019: Building Simple and Flexible End-to-End Speech Recognition Based on Conditional Random Fields

2018

Nov. 2018: Congratulations to Yutian Li: ISCSLP (International Symposium on Chinese Spoken Language Processing) Best Student Paper Award.
News

Nov. 26, 2018: Tutorial "Language Modeling: State of the Art" at the 11th International Symposium on Chinese Spoken Language Processing (ISCSLP), Taipei, given by Zhijian Ou and Bin Wang.
URL | slides

Oct. 2018: Invited Talk "Neural random fields with applications to modeling of languages and images" at Apple Boston, Toyota Technological Institute at Chicago (TTIC), and UIUC (link).

July 2018: Congratulations to Bin Wang: Tsinghua University Outstanding Ph.D. Thesis.
Thesis pdf

Feb. 2018: We release the Iqiyi movie dialog dataset collected and labeled under a crowdsourcing Wizard-of-Oz framework, which is used in our paper - "Tracking of enriched dialog states for flexible conversational information access", published at ICASSP 2018.
arxiv | data readme | data download

2017

Oct. 2017: Invited Talk "Future Artificial Intelligence with Efficient Learning" at Tsinghua University Science and Technology Forum, Suzhou, China.

Oct. 2017: Bin Wang presents the PAMI paper "Learning Trans-dimensional Random Fields with Applications to Language Modeling" at the Special Session on Journal Papers, NCMMSC (National Conference on Man-Machine Speech Communication) 2017, Lianyungang, China.
slides

2016

Dec. 2016: Present "The THU-SPMI SRE-16 System with Joint Bayesian Scoring and Ladder Network based Feature Learning" at the 2016 NIST Speaker Recognition Workshop (SRE2016), San Diego, USA.
pdf | teaser presentation | poster

Nov. 2016: We release the SPMI Array processing toolkit, which was used in our work "The THU-SPMI CHiME-4 system: Lightweight design with advanced multi-channel processing, feature enhancement, and language modeling", published at the CHiME Workshop, 2016.
The toolkit implements a bundle of beamforming methods, including MVDR, PMWF, GSC, and ME; we compare these techniques using the same back-end. See the paper for details.
pdf | poster | SPMIArray toolkit at github

Nov. 2016: We release a 9-hour real-world audio stream recorded from a public radio broadcast in Beijing, together with its motif annotations, which is used in our paper "Scalable Discovery of Audio Fingerprint Motifs in Broadcast Streams With Determinantal Point Process Based Motif Clustering", published in IEEE/ACM Transactions on Audio, Speech and Language Processing, 2016.
pdf | data readme | data download

Oct. 17, 2016: Tutorial "Undirected Graphical Models: Theory and Applications to Speech and Language Processing" at the 10th International Symposium on Chinese Spoken Language Processing (ISCSLP), Tianjin, China.
URL | slides

August 2016: We release the SPMILM toolkit at github, which not only implements TRF-LMs but also includes third-party code for n-gram LMs, RNN-LMs and LSTM-LMs. Example scripts are provided to evaluate different LMs by rescoring n-best lists.
SPMILM toolkit at github | doxygen documentation

July 2016: We release the code for our SSP 2016 paper "Block-Wise MAP Inference for Determinantal Point Processes with Application to Change-Point Detection"! The change-point selection problem is elegantly addressed through DPP-based subset selection. Experiments on five real-world datasets are shown.
pdf | poster | longer version at arxiv | code readme | code download

Jan. 2016: Invited Talk "Machine Intelligence with Speech and Language" at Tsinghua Data Science Institute.
URL | slides

2015

Oct. 2015: Present "Trans-dimensional Random Fields (TDRF) for Sequence Modeling" at the SANE workshop, held at Google in New York City, NY. We point out the potential of applying random fields to sequence modeling, as demonstrated by their success in language modeling.
poster

July 2015: We release the code for our ACL-2015 long paper "Trans-dimensional Random Fields for Language Modeling"! The TRF-LMs perform as well as RNN-LMs but are computationally more efficient at computing sentence probabilities (200x faster).
pdf | poster | code readme | code download

June 2015: Invited Talk "Probabilistic Modeling of Speech and Language" at IBM TJ Watson Research Center, New York.
slides

June 2015: Invited Talk "Probabilistic Modeling of Speech and Language" at Mitsubishi Electric Research Laboratories (MERL), Boston.

May 2015: Invited Talk "Probabilistic Modeling of Speech and Language" at ICSI, Berkeley.
abstract

March 2015: Invited Talk "Probabilistic Modeling of Speech" at the DSP Seminar, UIUC.
slides

2014

August 2014: A brief introduction to the SPMI (Speech Processing and Machine Intelligence) Lab, as orientation material for first-year graduate students in the EE Department.
two slides in Chinese