Zhijian Ou @ Tsinghua University


2024

April, 2024: We are organizing the 2nd FutureDial Challenge - Dialog Systems with Retrieval Augmented Generation (FutureDial-RAG), co-located with SLT 2024.
Challenge Website | Chinese Blog

March, 2024: My monograph "Energy-Based Models with Applications to Speech and Language Processing" is now officially published in the "Foundations and Trends® in Signal Processing" series. It grew out of the tutorial of the same title that I delivered at ICASSP 2022, which led to an invitation to write for the series. I am humbled to contribute a stepping stone to this promising field!
URL | arxiv

2023

Dec. 9, 2023: Organize Special Session at NCMMSC 2023: "Semi-Supervised Speech and Language AI Technologies" (Semi-SALT) (半监督智能语音语言技术), Suzhou (苏州), China.
URL

Nov. 19, 2023: Invited Talk at SpeechHome Conference on Speech Technology 2023: "Some Thoughts and Projections on Speech Foundation Models" (语音大模型的若干思考与猜测), Beijing, China.
slides

Oct. 14, 2023: Invited Talk at NLPCC 2023: "Knowledge-Retrieval Dialog Systems with Semi-Supervision" (半监督知识检索对话系统), Foshan (佛山), China.
URL

Sep. 23, 2023: Organize CCF workshop: "Multilingual Speech and Language AI Technologies" (Multilingual-SALT) (多语言智能语音语言技术), Nanning (南宁), China.

Aug. 24, 2023: Talk at 2023 Interspeech: "Knowledge-Retrieval Task-Oriented Dialog Systems with Semi-Supervision" (半监督式知识检索的任务导向对话系统), Dublin, Ireland.
video | slides

July 8, 2023: Invited talk at 2023 World Artificial Intelligence Conference (WAIC): "The philosophy of semi-supervision and alignment behind ChatGPT" (ChatGPT背后的半监督和对齐的哲学), Shanghai, China.
video

June 9, 2023: Panel Discussion at 2023 BAAI Conference for the session "Large-scale model new infrastructure and intellectual operation" (大模型新基建与智力运营), Beijing, China.

Apr 26, 2023: Invited talk at 2023 China Mobile Cloud Conference: "Progress and Future Challenges of Large-scale Human-Machine Dialog" (大模型人机对话的进展及未来挑战), Suzhou, China.
video

Mar 23, 2023: A Chinese blog: "A rigorous look at the progress, shortcomings, and AGI challenges of ChatGPT" (严谨谈谈ChatGPT的进步、不足及AGI挑战).
URL

2022

Nov 20, 2022: Invited Tutorial (2 hours) "Data-efficient Multilingual and Crosslingual Speech Recognition" (数据高效的多语言与跨语言语音识别) at CCF Advanced Disciplines Lectures (ADL), 2022.
slides

May 22, 2022: Invited Tutorial (3.5 hours) "Energy-Based Models with Applications to Speech and Language Processing" at ICASSP 2022.
slides & videos

April 29, 2022: SereTOD Challenge Kickoff Seminar (EMNLP2022半监督和强化对话系统挑战赛发布暨研讨会).
Chinese Blog | Seminar video

Mar 25, 2022: The Department (Electronic Engineering Department, Tsinghua University) is launching the Speech and Language Technologies (THUEE-SALT) Frontiers Lecture Series (语音语言技术前沿讲座).
URL

Feb, 2022: We are organizing the SereTOD workshop "Towards Semi-Supervised and Reinforced Task-Oriented Dialog Systems" and its related challenge, co-located with EMNLP 2022.
Workshop Call for Papers | Challenge Call for Participation | Chinese Blog

2021

Dec 18, 2021: Invited Talk "Data-efficient Multilingual and Crosslingual Speech Recognition" (数据高效的多语言与跨语言语音识别) at CCF First Seminar on Southeast-Asia Language Technology and Application (首届东南亚非通用语种处理技术与应用研讨会), Nanning.
URL | slides

Nov 9, 2021: Invited Tutorial (3 hours) "State-of-the-Art of End-to-End Speech Recognition" at The 6th Asian Conference on Pattern Recognition (ACPR2021), Jeju Island, Korea.
URL | slides | video (part 1) | video (part 2)

Sep 24, 2021: Invited Talk "Semi-Supervised Task-Oriented Dialog Systems and Natural Language Labeling" at UIUC NLP Seminar.
slides

July 9, 2021: Invited Talk "Data-efficient Automatic Speech Recognition" at Apple office, Beijing.
slides

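June 5, 2021: Chinese Blog: "Talk at the 2021 China Mobile Jiutian AI New Technology Forum: Building Strongly Generalizable Man-Machine Spoken Dialogue Systems Towards Strong Artificial Intelligence"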
Invited talk "Building Strongly Generalizable Man-Machine Spoken Dialogue Systems Towards Strong Artificial Intelligence" at China Mobile AI Forum 2021.
video

June 2021: Congratulations to Hong Liu: Tsinghua University Outstanding Undergraduate Thesis - "Data-efficient end-to-end dialog systems".

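May 31, 2021: Chinese Blog: "Project by Zhijian Ou's research group in the Department of Electronic Engineering wins first prize at the MediaAIAC competition of the National Radio and Television Administration"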
Congratulations: First-class prize in Media Artificial Intelligence Application Innovation National Competition.

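Mar 29, 2021: Chinese Blog: "Talk at the Language Acoustics Session of the 2021 National Conference on Acoustics: A First Look at the Third Generation of Speech Recognition Technology", Part-1 | Part-2.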
Invited talk "On exploring the third generation of speech recognition technology" at National Conference on Acoustics, Shanghai.

Mar 14, 2021: Chinese Blog: "The Tsinghua SPMI group is hiring postdocs, with annual salary starting at 300,000 RMB; send us your CV!"

Feb 3, 2021: Chinese Blog: "SPMI@SLT2021: Combining Efficient Neural Architecture Search via Straight-Through Gradients with End-to-End Speech Recognition"

Jan 30, 2021: Invited Talk "Neural Architecture Search for End-to-end Speech Recognition via Straight-Through Gradients" at SLT2021 Children Speech Recognition Challenge (CSRC) Workshop, Virtual, 2021/1/30.
slides

2020

Dec 28, 2020: Chinese Blog: "Ten Figures and Ten Sentences: A Quick Tour of SPMI's Ten Papers of 2020"

Nov 23, 2020: Chinese Blog: "SPMI@EMNLP2020: Semi-Supervised Learning of End-to-End Dialog Models with Latent Dialog States"

Oct 29, 2020: Chinese Blog: "SPMI@INTERSPEECH2020: Data-Efficient End-to-End Speech Recognition, and Knowledge-Infused Word Representation Learning"

Aug 3, 2020: Chinese Blog: "SPMI@ICASSP2020: Mixed-Feature Random Field Language Models and Their Application to Speech Recognition"

June 2020: Congratulations to Silin Gao: Tsinghua University Outstanding Undergraduate Thesis - "Mixed-feature Random Field Language Models with Applications to Speech Recognition".

June 2020: Co-chair the online workshop "Frontiers in Speech Conversation and Auditory", organized by CCF Speech Conversation and Auditory TC.

2019

Nov. 2019: Invited Talk "Marrying Neural Networks and Undirected Graphical Models for Efficient Speech Recognition" (联合神经网络与无向图模型的高效语音识别) at ASRU2019 Code-Switching Challenge Seminar (ASRU2019中英混杂语音识别挑战赛线下技术交流会), Beijing, 2019/11/23.
slides

Nov. 2019: We release CAT (CRF-based ASR Toolkit) for data-efficient end-to-end speech recognition. Hope it's fun!
code at github

Oct. 2019: Co-chair the panel "Frontiers and challenges of intelligent speech processing" at CNCC (China National Computer Congress) 2019, Suzhou, China.

August 2019: Chair the session "Speech perception and recognition" at CCF Speech Conversation and Auditory TC Annual Meeting, Xining, China.

July 2019: Invited to teach a short course "Theory and Applications of Probabilistic Graphical Models" (1 credit) at Xinjiang University, China.

June 2019: Congratulations to Keyu An: Tsinghua University Outstanding Undergraduate Thesis - "Random Fields based Speech Recognition".

April 2019: Invited Talk "Toward Simple Speaker Recognition and Speech Recognition" (简洁的说话人识别及语音识别) at The Symposium on Speaker Recognition Research and Application, Kunshan, China.
slides

Mar 24, 2019: Chinese Blog: "SPMI@ICASSP2019: Building Simple and Flexible End-to-End Speech Recognition Based on Conditional Random Fields"

2018

Nov. 2018: Congratulations to Yutian Li: ISCSLP (International Symposium on Chinese Spoken Language Processing) Best Student Paper Award.
News

Nov. 26, 2018: Tutorial "Language Modeling: State of the Art" at the 11th International Symposium on Chinese Spoken Language Processing (ISCSLP), Taipei, given by Zhijian Ou and Bin Wang.
URL | slides

Oct. 2018: Invited Talk "Neural random fields with applications to modeling of languages and images" at Apple Boston, Toyota Technological Institute at Chicago (TTIC), and UIUC (link).

July 2018: Congratulations to Bin Wang: Tsinghua University Outstanding Ph.D. Thesis.
Thesis pdf

Feb. 2018: We release the Iqiyi movie dialog dataset collected and labeled under a crowdsourcing Wizard-of-Oz framework, which is used in our paper - "Tracking of enriched dialog states for flexible conversational information access", published at ICASSP 2018.
arxiv | data readme | data download

2017

Oct. 2017: Invited Talk "Future Artificial Intelligence with Efficient Learning" at Tsinghua University Science and Technology Forum, Suzhou, China.

Oct. 2017: Bin Wang presents the PAMI paper “Learning Trans-dimensional Random Fields with Applications to Language Modeling” at the Special Session on Journal Papers, NCMMSC (National Conference on Man-Machine Communication) 2017, Lianyungang, China.
slides

2016

Dec. 2016: Present “The THU-SPMI SRE-16 System with Joint Bayesian Scoring and Ladder Network based Feature Learning”, at 2016 NIST Speaker Recognition Workshop (SRE2016), San Diego, USA.
pdf | teaser presentation | poster

Nov. 2016: We release the SPMI Array processing toolkit, which was used in our work - "The THU-SPMI CHiME-4 system: Lightweight design with advanced multi-channel processing, feature enhancement, and language modeling", published in CHiME Workshop, 2016.
The toolkit implements a bundle of beamforming methods, including MVDR, PMWF, GSC, and ME, and we compare these techniques using the same back-end. See the paper for details.
pdf | poster | SPMIArray toolkit at github

Nov. 2016: We release the 9-hour real-world audio stream recorded from a public broadcast radio in Beijing with its motif annotation, which is used in our paper - "Scalable Discovery of Audio Fingerprint Motifs in Broadcast Streams With Determinantal Point Process Based Motif Clustering", published in IEEE/ACM Transactions on Audio, Speech and Language Processing, 2016.
pdf | data readme | data download

Oct. 17, 2016: Tutorial "Undirected Graphical Models: Theory and Applications to Speech and Language Processing" at the 10th International Symposium on Chinese Spoken Language Processing (ISCSLP), Tianjin, China.
URL | slides

August 2016: We release the SPMILM toolkit at github, which not only implements TRF-LMs, but also includes third-party code for ngram-LMs, RNN-LMs and LSTM-LMs. Example scripts are provided to evaluate different LMs by rescoring n-best lists.
SPMILM toolkit at github | doxygen documentation

July 2016: We release the code for the SSP 2016 paper "Block-Wise MAP Inference for Determinantal Point Processes with Application to Change-Point Detection"! The change-point selection problem is elegantly addressed through DPP-based subset selection. Experiments on five real-world datasets are shown.
pdf | poster | longer version at arxiv | code readme | code download

Jan. 2016: Invited Talk "Machine Intelligence with Speech and Language" at Tsinghua Data Science Institute.
URL | slides

2015

Oct. 2015: Present "Trans-dimensional Random Fields (TDRF) for Sequence Modeling" at SANE workshop, held at Google in New York City, NY. We point out the potential of applying random fields to sequence modeling, as demonstrated by their success in language modeling.
poster

July 2015: We release the code for our ACL-2015 Long Paper "Trans-dimensional Random Fields for Language Modeling"! The TRF-LMs perform as well as the RNN-LMs but are computationally more efficient at computing sentence probabilities (200x faster).
pdf | poster | code readme | code download

June 2015: Invited Talk "Probabilistic Modeling of Speech and Language" at IBM TJ Watson Research Center, New York.
slides

June 2015: Invited Talk "Probabilistic Modeling of Speech and Language" at Mitsubishi Electric Research Laboratories (MERL), Boston.

May 2015: Invited Talk "Probabilistic Modeling of Speech and Language" at ICSI, Berkeley.
abstract

March 2015: Invited Talk "Probabilistic Modeling of Speech" at the DSP Seminar, UIUC.
slides

2014

August 2014: A brief introduction to the SPMI (Speech Processing and Machine Intelligence) Lab, as orientation material for first-year postgraduate students in the EE Department.
two slides in Chinese
