Talk videos can be found at bilibili, Chinese blogs at zhihu :)
2024
- Yucheng Cai, Si Chen, Yuxuan Wu, Yi Huang, Junlan Feng, Zhijian Ou.
The 2nd FutureDial Challenge: Dialog Systems with Retrieval Augmented Generation (FutureDial-RAG).
SLT, 2024.
arxiv
- Yuetong Zhao, Hongyu Cao, Xianyu Zhao, Zhijian Ou.
An Empirical Study of Retrieval Augmented Generation with Chain-of-Thought.
ISCSLP, 2024.
arxiv
- Lukuan Dong, Donghong Qin, Fengbo Bai, Fanhua Song, Yan Liu, Chen Xu, Zhijian Ou.
Low-Resourced Speech Recognition for Iu Mien Language via Weakly-Supervised Phoneme-based Multilingual Pre-training.
ISCSLP, 2024.
arxiv
- Wenbo Zhao, Ziwei Li, Chuan Yu, Zhijian Ou.
CUSIDE-T: Chunking, Simulating Future and Decoding for Transducer based Streaming ASR.
ISCSLP, 2024.
arxiv
- Xiangzhu Kong, Tianqi Ning, Hao Huang, Zhijian Ou.
A Streaming Multi-Channel End-to-End Speech Recognition System with Realistic Evaluations.
ISCSLP, 2024.
arxiv
- Saierdaer Yusuyin, Te Ma, Hao Huang, Wenbo Zhao, Zhijian Ou.
Whistle: Data-Efficient Multilingual and Crosslingual Speech Recognition via Weakly Phonetic Supervision.
arXiv 2406.02166.
arxiv
- Yucheng Cai, Wentao Ma, Yuchuan Wu, Shuzheng Si, Yuan Shao, Zhijian Ou, Yongbin Li.
UniPCM: Universal Pre-trained Conversation Model with Task-aware Automatic Prompt.
LREC-COLING, 2024.
pdf
- Zhijian Ou.
Energy-Based Models with Applications to Speech and Language Processing.
Foundations and Trends® in Signal Processing, Vol. 18, No. 1-2, pp 1-199, Now Publishers.
URL |
arxiv
2023
- Qiyan Song, Zhijian Ou.
Modified Frequency-Sliding Generalized Cross Correlation for Time Delay Difference Estimation of Microphone Array.
IEEE Sensors Journal, 2023, Volume 23, Issue 24, Page 31038-31049.
URL
- Hong Liu, Yucheng Cai, Yuan Zhou, Zhijian Ou, Yi Huang, Junlan Feng.
Prompt Pool Based Class-Incremental Continual Learning for Dialog State Tracking.
ASRU, 2023.
pdf |
arxiv
- Xinwei Zhang, Zhiqiang Tan, Zhijian Ou.
Persistently Trained, Diffusion-assisted Energy-based Models.
Stat, 2023.
URL |
arxiv
- Yucheng Cai, Hong Liu, Zhijian Ou, Yi Huang, Junlan Feng.
Knowledge-Retrieval Task-Oriented Dialog Systems with Semi-Supervision.
INTERSPEECH, 2023.
pdf |
arxiv |
slides |
video
- Hong Liu, Zhaobiao Lv, Zhijian Ou, Wenbo Zhao, Qing Xiao.
Exploring Energy-based Language Models with Different Architectures and Training Methods for Speech Recognition.
INTERSPEECH, 2023.
pdf |
arxiv |
slides
- Hong Liu, Yucheng Cai, Zhenru Lin, Zhijian Ou, Yi Huang, Junlan Feng.
Variational Latent-State GPT for Semi-Supervised Task-Oriented Dialog Systems.
IEEE/ACM Transactions on Audio, Speech and Language Processing, 2023, Vol. 31, Page 970-984.
URL |
arxiv
2022
- Hong Liu, Hao Peng, Zhijian Ou, Juanzi Li, Yi Huang, Junlan Feng.
Information Extraction and Human-Robot Dialogue towards Real-life Tasks: A Baseline Study with the MobileCS Dataset.
EMNLP 2022 SereTOD Workshop.
pdf |
arxiv |
code at github
- Hong Liu, Yucheng Cai, Zhijian Ou, Yi Huang, Junlan Feng.
A Generative User Simulator with GPT-based Architecture and Goal State Tracking for Reinforced Multi-Domain Dialog Systems.
EMNLP 2022 SereTOD Workshop.
pdf |
arxiv |
code at github
- Hong Liu, Yucheng Cai, Zhijian Ou, Yi Huang, Junlan Feng.
Building Markovian Generative Architectures over Pretrained LM Backbones for Efficient Task-Oriented Dialog Systems.
SLT, 2022.
pdf |
arxiv |
code at github
- Keyu An, Ji Xiao, Zhijian Ou.
Exploiting Single-Channel Speech for Multi-Channel End-to-End Speech Recognition: A Comparative Study.
ISCSLP, 2022.
pdf |
arxiv |
code at github
- Yucheng Cai, Hong Liu, Zhijian Ou, Yi Huang, Junlan Feng.
Advancing Semi-Supervised Task Oriented Dialog Systems by JSA Learning of Discrete Latent Variable Models.
SIGDIAL, 2022.
pdf |
arxiv |
code at github
- Huahuan Zheng, Keyu An, Zhijian Ou, Chen Huang, Ke Ding, Guanglu Wan.
An Empirical Study of Language Model Integration for Transducer based Speech Recognition.
INTERSPEECH, 2022.
pdf |
arxiv |
code at CAT toolkit
- Keyu An, Huahuan Zheng, Zhijian Ou, Hongyu Xiang, Ke Ding, Guanglu Wan.
CUSIDE: Chunking, Simulating Future Context and Decoding for Streaming ASR.
INTERSPEECH, 2022.
pdf |
arxiv |
code at CAT toolkit
- Zhijian Ou, Junlan Feng, Juanzi Li, Yakun Li, Hong Liu, Hao Peng, Yi Huang, Jiangjiang Zhao.
A Challenge on Semi-Supervised and Reinforced Task-Oriented Dialog Systems.
arxiv |
challenge website
- Hong Liu, Zhijian Ou, Yi Huang, Junlan Feng.
Jointly Reinforced User Simulator and Task-oriented Dialog System with Simplified Generative Architecture.
An early version of Markovian Generative Architectures (MGA) and Generative User Simulator (GUS)
arxiv
2021
- Huahuan Zheng*, Wenjie Peng*, Zhijian Ou, Jinsong Zhang. (* Equal contribution and random listing)
Advancing CTC-CRF Based End-to-End Speech Recognition with Wordpieces and Conformers.
arxiv |
code at CAT toolkit
- Yinpei Dai, Yichi Zhang, Hong Liu, Zhijian Ou, Yi Huang, Junlan Feng.
Elastic CRFs for Open-Ontology Slot Filling.
Applied Sciences, vol.11, 2021. (Selected Papers from 16th National Conference on Man-Machine Speech Communication (NCMMSC2021))
URL |
pdf
- Chengrui Zhu, Keyu An, Huahuan Zheng, Zhijian Ou.
Multilingual and crosslingual speech recognition using phonological-vector based phone embeddings.
IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), 2021.
pdf |
slides |
video |
code at CAT toolkit |
arxiv
- Yunfu Song, Huahuan Zheng, Zhijian Ou.
An empirical comparison of joint-training and pre-training for domain-agnostic semi-supervised learning via energy-based models.
IEEE Workshop on Machine Learning for Signal Processing (MLSP), 2021.
pdf |
slides |
video |
code at github
- Keyu An, Yi Zhang, Zhijian Ou.
Deformable TDNN with adaptive receptive fields for speech recognition.
INTERSPEECH, 2021.
pdf |
code at CAT toolkit
- Huahuan Zheng, Keyu An, Zhijian Ou.
Efficient Neural Architecture Search for End-to-end Speech Recognition via Straight-Through Gradients.
SLT, 2021.
pdf |
arxiv |
code at github
- Fan Yu, Zhuoyuan Yao, Xiong Wang, Keyu An, Lei Xie, Zhijian Ou, Bo Liu, Xiulin Li, Guanqiong Miao.
The SLT 2021 children speech recognition challenge: Open datasets, rules and baselines.
SLT, 2021.
pdf |
arxiv |
challenge website
2020
- Yichi Zhang, Zhijian Ou, Huixin Wang, Junlan Feng.
A Probabilistic End-To-End Task-Oriented Dialog Model with Latent Belief States towards Semi-Supervised Learning.
EMNLP, 2020.
pdf |
arxiv |
code at github
- Keyu An, Hongyu Xiang, Zhijian Ou.
CAT: A CTC-CRF based ASR Toolkit Bridging the Hybrid and the End-to-end Approaches towards Data Efficiency and Low Latency.
INTERSPEECH, 2020.
pdf |
arxiv |
code at github
- Yichi Zhang, Yinpei Dai, Zhijian Ou, Huixin Wang, Junlan Feng.
Improved Learning of Word Embeddings with Word Definitions and Semantic Injection.
INTERSPEECH, 2020.
pdf |
code at github
- Zhijian Ou, Yunfu Song.
Joint Stochastic Approximation and Its Application to Learning Discrete Latent Variable Models.
UAI, 2020.
pdf |
arxiv |
presentation video |
code at github
- Silin Gao, Yichi Zhang, Zhijian Ou and Zhou Yu.
Paraphrase Augmented Task-Oriented Dialog Generation.
ACL, 2020.
pdf |
code at github
- Yunfu Song, Zhijian Ou, Zitao Liu, Songfan Yang.
Upgrading CRFs to JRFs and its benefits to sequence modeling and labeling.
ICASSP, Barcelona, Spain, 2020.
URL |
pdf |
slides
- Silin Gao, Zhijian Ou, Wei Yang, Huifang Xu.
Integrating discrete and neural features via mixed-feature trans-dimensional random field language models.
ICASSP, Barcelona, Spain, 2020. (oral)
URL |
pdf |
slides
- Keyu An, Hongyu Xiang, Zhijian Ou.
CAT: CRF-based ASR Toolkit.
arxiv |
code at github
- Yichi Zhang, Zhijian Ou, Zhou Yu.
Task-Oriented Dialog Systems that Consider Multiple Appropriate Responses under the Same Context.
AAAI, New York, USA, 2020.
pdf |
code at github
- Yunfu Song, Zhijian Ou.
Generative Modeling by Inclusive Neural Random Fields with Applications in Image Generation and Anomaly Detection.
arxiv |
code at github
2019
- Zhiqiang Tan, Yunfu Song, Zhijian Ou.
Calibrated Adversarial Algorithms for Generative Modeling.
Stat, 2019.
URL |
pdf
- Yunfu Song, Zhijian Ou.
Semi-supervised Seq2seq Joint-stochastic-approximation Autoencoders with Applications to Semantic Parsing.
IEEE Signal Processing Letters, vol. 27, pp. 31-35, 2019.
URL |
pdf |
code at github
- Hongyu Xiang, Zhijian Ou.
CRF-based Single-stage Acoustic Modeling with CTC Topology.
ICASSP, Brighton, UK, 2019.
URL |
pdf |
slides |
oral video |
code at github
- Kai Hu, Zhijian Ou, Min Hu, Junlan Feng.
Neural CRF Transducers for Sequence Labeling.
ICASSP, Brighton, UK, 2019.
URL |
pdf |
poster |
SPMISeq toolkit at github
2018
- Zhijian Ou.
A Review of Learning with Deep Generative Models from Perspective of Graphical Modeling.
arxiv
- Yunfu Song, Zhijian Ou.
Learning Neural Random Fields with Inclusive Auxiliary Generators.
arxiv |
code at github
- Yutian Li, Zhijian Ou.
THU-SPMI System For NIST 2018 Speaker Recognition Evaluation.
NIST SRE-18 Workshop, Athens, Greece, 2018 Dec.
poster
- Bin Wang, Zhijian Ou.
Improved training of neural trans-dimensional random field language models with dynamic noise-contrastive estimation.
IEEE Workshop on Spoken Language Technology (SLT), Athens, Greece, 2018 Dec.
pdf |
poster |
arxiv
- Zhangyu Xiao, Zhijian Ou, Wei Chu, Hui Lin.
Hybrid CTC-Attention based End-to-End Speech Recognition using Subword Units.
International Symposium on Chinese Spoken Language Processing (ISCSLP), Taipei, 2018 Nov.
pdf |
slides |
arxiv
- Yutian Li, Feng Gao, Zhijian Ou, Jiasong Sun.
Angular Softmax Loss for End-to-end Speaker Verification. (Best Student Paper Award)
International Symposium on Chinese Spoken Language Processing (ISCSLP), Taipei, 2018 Nov.
pdf |
slides |
arxiv
- Yichi Zhang, Zhijian Ou.
Learning Sparse Structured Ensembles With Stochastic Gradient MCMC Sampling and Network Pruning.
IEEE Workshop on Machine Learning for Signal Processing (MLSP), Aalborg, Denmark, 2018 Sept.
pdf |
slides |
longer version at arxiv
- Yinpei Dai, Zhijian Ou, Dawei Ren, Pengfei Yu.
Tracking of enriched dialog states for flexible conversational information access.
ICASSP, Calgary, Canada, 2018.
pdf |
poster |
data readme |
data download |
code at github
- Bin Wang, Zhijian Ou.
Learning neural trans-dimensional random field language models with noise-contrastive estimation.
ICASSP, Calgary, Canada, 2018.
pdf |
poster
- Bin Wang, Zhijian Ou, Zhiqiang Tan.
Learning Trans-dimensional Random Fields with Applications to Language Modeling.
IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 2018, 40(4):876-890.
pdf (including appendices) |
SPMILM toolkit at github |
doxygen documentation
2017
- Bin Wang, Zhijian Ou.
Language modeling with Neural trans-dimensional random fields.
IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), Okinawa, Japan, 2017 Dec.
pdf |
poster
- Yiyan Wang, Haotian Xu, Zhijian Ou.
Joint Bayesian Gaussian discriminant analysis for speaker verification.
ICASSP, New Orleans, USA, 2017 Mar.
pdf |
poster
2016
- Yiyan Wang, Haotian Xu, Zhijian Ou.
The THU-SPMI SRE-16 System with Joint Bayesian Scoring and Ladder Network based Feature Learning.
NIST SRE-16 Workshop, San Diego, USA, 2016 Dec.
pdf |
teaser presentation |
poster
- Hongyu Xiang, Bin Wang and Zhijian Ou.
The THU-SPMI CHiME-4 system: Lightweight design with advanced multi-channel processing, feature enhancement, and language modeling.
CHiME Workshop, San Francisco, USA, 2016 Sept.
pdf |
poster |
SPMIArray toolkit at github
- Bin Wang, Zhijian Ou, Yong He, Akinori Kawamura.
Model Interpolation with Trans-dimensional Random Field Language Models for Speech Recognition.
arxiv
- Haotian Xu, Zhijian Ou.
Scalable Discovery of Audio Fingerprint Motifs in Broadcast Streams With Determinantal Point Process Based Motif Clustering.
IEEE/ACM Transactions on Audio, Speech and Language Processing, 2016, 24(5).
pdf |
data readme |
data download
- Haotian Xu, Zhijian Ou.
Joint Stochastic Approximation Learning of Helmholtz Machines.
International Conference on Learning Representations (ICLR) 2016 Workshop Track, Puerto Rico, USA, 2016 May.
pdf |
poster
- Ruobai Wang, Yang Zhang, Zhijian Ou and Mark Hasegawa-Johnson.
Use of Particle Filtering and MCMC for Inference in Probabilistic Acoustic Tube Model.
IEEE Workshop on Statistical Signal Processing (SSP), Palma de Mallorca, Spain, 2016 June.
pdf |
poster
- Jinye Zhang, Zhijian Ou.
Block-Wise MAP Inference for Determinantal Point Processes with Application to Change-Point Detection.
IEEE Workshop on Statistical Signal Processing (SSP), Palma de Mallorca, Spain, 2016 June.
pdf |
poster |
longer version at arxiv |
code readme |
code download
2015
- Yang Zhang, Zhijian Ou and Mark Hasegawa-Johnson.
Incorporating AM-FM effect in voiced speech for probabilistic acoustic tube model.
IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, USA, 2015 Oct.
pdf |
poster
- Bin Wang, Zhijian Ou and Zhiqiang Tan.
Trans-dimensional Random Fields for Language Modeling.
Annual Meeting of the Association for Computational Linguistics (ACL Long Paper), Beijing, China, 2015 July.
pdf |
poster |
code readme |
code download
2014
- Yang Zhang, Zhijian Ou, Mark Hasegawa-Johnson.
Improvement of Probabilistic Acoustic Tube Model for Speech Decomposition.
ICASSP, Florence, Italy, 2014 May.
pdf |
poster
- Bin Wang, Zhijian Ou, Jian Li, Akinori Kawamura.
Joint-Character-POC N-gram Language Modeling For Chinese Speech Recognition.
International Symposium on Chinese Spoken Language Processing (ISCSLP), Singapore, 2014 Sept.
pdf |
slides
2012
- Xin He, Zhijian Ou, Jiasong Sun.
Joint N-gram Chinese Language Modeling with an Application to Chinese Word Segmentation.
IEEE International Conference on Audio, Language and Image Processing (ICALIP), Shanghai, 2012.
pdf
- Zhijian Ou, Yang Zhang.
Probabilistic Acoustic Tube: A Probabilistic Generative Model of Speech for Speech Analysis/Synthesis.
International Conference on Artificial Intelligence and Statistics (AISTATS), La Palma, Spain, 2012 Apr.
pdf |
appendix |
demo |
poster
- Zhijian Ou, Huaqing Luo.
CRF-based Confidence Measures of Recognized Candidates for Lattice-based Audio Indexing.
ICASSP, Kyoto, Japan, 2012 Mar.
pdf |
poster
- Zhijian Ou, Kan Deng.
Combining Eigenvoice Speaker Modeling and VTS-based Environment Compensation for Robust Speech Recognition.
ICASSP, Kyoto, Japan, 2012 Mar.
pdf |
poster
2011
- Yun Wang, Zhijian Ou.
Combining HMM-based Melody Extraction and NMF-based Soft Masking for Separating Voice and Accompaniment from Monaural Audio.
ICASSP, Prague, Czech Republic, 2011 May.
pdf |
oral video |
demo |
github code
2010
- Zhijian Ou, Ji Xiao.
A Study of Large Vocabulary Speech Recognition Decoding Using Finite-state Graphs.
International Symposium on Chinese Spoken Language Processing (ISCSLP), Tainan, Taiwan, 2010 Dec.
pdf
- Yimin Tan, Zhijian Ou.
Topic-weak-correlated Latent Dirichlet Allocation.
International Symposium on Chinese Spoken Language Processing (ISCSLP), Tainan, Taiwan, 2010 Dec.
pdf |
poster |
code readme |
code download
- Yi Sun, Zhijian Ou, Wei Hu, Yimin Zhang.
Excited Commentator Speech Detection with Unsupervised Model Adaptation for Soccer Highlight Extraction.
IEEE International Conference on Audio, Language and Image Processing (ICALIP), Shanghai, 2010 Nov.
pdf |
slides
- Qin Shi, Kun Li, Shilei Zhang, Stephen Chu, Ji Xiao, Zhijian Ou.
Spoken English assessment system for non-native speakers using acoustic and prosodic features.
INTERSPEECH, Makuhari, Japan, 2010 Sept.
pdf
- Nan Ding, Zhijian Ou.
Variational nonparametric Bayesian hidden Markov model.
ICASSP, Dallas, USA, 2010 Mar.
pdf |
poster
2008
- Zhijian Ou, Jun Luo.
Eigenspace Estimation with Missing Values and its Application to Eigenvoice Adaptation for Speech Recognition.
IEEE International Conference on Audio, Language and Image Processing (ICALIP), Shanghai, 2008 July.
pdf
- Cong Li, Zhijian Ou, Wei Hu, Tao Wang, Yimin Zhang.
Caption-aided speech detection in videos.
ICASSP, Las Vegas, USA, 2008 Apr.
pdf |
poster
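- Yi Sun, Zhijian Ou, Wei Hu.
Excited Commentary Detection with Unsupervised Adaptation and Highlight Extraction for Sports Games. (in Chinese)
Computer Applications and Software, 2008, 25(11): 20~22.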
2007
- Zhijian Ou, Jun Luo.
Latent correlation analysis of HMM parameters for speech recognition.
ICASSP, Hawaii, USA, 2007 Apr.
pdf |
poster
- Hui Lin, Zhijian Ou.
Switching Auxiliary Chains for Speech Recognition.
IEEE Signal Processing Letters, 2007, 14(8): 568~571.
pdf
- Xianyu Zhao, Zhijian Ou.
Closely Coupled Array Processing and Model-Based Compensation for Microphone Array Speech Recognition.
IEEE Transactions on Audio, Speech and Language Processing, 2007, 15(3): 1114~1122.
pdf
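- Yi Sun, Zhijian Ou, Jiasong Sun.
A Hierarchical Message Passing Algorithm for Inference in Graphical Models. (in Chinese)
2007 National Conference on Pattern Recognition, Beijing, 2007 Dec. (Science Press)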
pdf
2006
- Hui Lin, Zhijian Ou.
Switching Auxiliary Chains for Speech Recognition based on Dynamic Bayesian Networks.
International Conference on Pattern Recognition (ICPR), Hong Kong, 2006 Aug.
pdf
- Hui Lin, Zhijian Ou.
Partial-tied-mixture Auxiliary Chain Models for Speech Recognition Based on Dynamic Bayesian Networks.
IEEE International Conference on Systems, Man and Cybernetics (SMC), Taipei, Taiwan, 2006 Oct.
pdf
- Hui Lin, Zhijian Ou, Xi Xiao.
Generalized Time-series Active Search with Kullback-Leibler Distance for Audio Fingerprinting.
IEEE Signal Processing Letters, 2006, 13(8): 464~468.
pdf
- Jun Luo, Zhijian Ou, Zuoying Wang.
Eigenvoice-based MAP adaptation within correlation subspace.
Frontiers of Electrical and Electronic Engineering in China - Selected Publications from Chinese Universities, Higher Education Press and Springer-Verlag, 2006, Vol.1, No.2: 130~134.
- Xianyu Zhao, Zhijian Ou, Zuoying Wang.
Using Vector Taylor Series with Noise Clustering for Speech Recognition in Non-stationary Noisy Environments.
High Technology Letters, 2006, 12(1): 18~23.
pdf
- Jun Luo, Zhijian Ou.
An Efficient Spoken Keyword Retrieval System. (in Chinese)
Journal on Communications, 2006, 27(2): 113~118.
pdf
2005
- Jun Luo, Zhijian Ou, Zuoying Wang.
Discriminative Speaker Adaptation with Eigenvoices.
EUROSPEECH, Lisbon, Portugal, 2005 Sept.
pdf
- Xianyu Zhao, Zhijian Ou, Minhua Chen, Zuoying Wang.
Closely coupled array processing and model-based compensation for microphone array speech recognition.
ICASSP, Philadelphia, USA, 2005 Mar.
pdf
- Xianyu Zhao, Zhijian Ou, Zuoying Wang.
Space Discriminative Function For Microphone Array Robust Speech Recognition.
High Technology Letters, 2005, 11(4): 351~354.
pdf
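- Jingying Wang, Zuoying Wang, Zhijian Ou.
Application of the Maximum Likelihood Linear Regression Speaker Adaptation Algorithm to LPHMM. (in Chinese; Conference Best Paper)
8th National Conference on Man-Machine Speech Communication (collected in: Technical Acoustics, Vol.24: 206~209), Beijing, 2005 Oct.
- Jun Luo, Zhijian Ou.
An Efficient Spoken Keyword Retrieval System. (in Chinese; recommended for publication in Journal on Communications)
National Symposium on Network and Information Security Technology (NetSec2005), Beijing, 2005 Aug., Vol.2, 121~128.
- Jun Luo, Zhijian Ou, Zuoying Wang.
A Two-Stage Keyword Retrieval System Based on Pinyin Lattices. (in Chinese)
Journal of Tsinghua University, 2005, 45(10): 1356~1359.
- Xianyu Zhao, Zhijian Ou, Zuoying Wang.
A Study of VTS-based Robust Speech Recognition. (in Chinese)
Journal of Tsinghua University, 2005, 45(7): 892~895.
- Shucai Xiao, Zhijian Ou, Zuoying Wang.
Speaker Clustering Algorithms in Speech Recognition. (in Chinese)
Journal of Chinese Information Processing, 2005, 19(4): 84~88.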
2004
- Zhijian Ou, Zuoying Wang.
Discriminative combination of multiple linear predictions for speech recognition.
ICSLP, Jeju, Korea, 2004 Oct.
pdf |
slides
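- Zhijian Ou, Jun Luo, Dadong Xie, Xianyu Zhao, Hui Lin, Zuoying Wang.
Research and Implementation of a Multi-Functional Speech/Audio Information Retrieval System. (in Chinese)
National Symposium on Network and Information Security Technology (NetSec2004), Beijing, 2004 Aug., 106~112.
- Xianyu Zhao, Zhijian Ou, Zuoying Wang.
A Study of Unsupervised Vector Taylor Series Expansion Using Word Lattices for Robust Speech Recognition. (in Chinese)
Chinese High Technology Letters, 2004, 14: 17~22.
- Jun Luo, Zhijian Ou, Zuoying Wang.
Fast MAP Adaptation Based on Eigenvoice Analysis within Correlation Subspace. (in Chinese)
Journal of Tsinghua University, 2004, 44(6): 829~832.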
pdf
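- Jun Luo, Zhijian Ou, Zuoying Wang.
Application of Speaker Adaptive Training to Continuous Speech Recognition. (in Chinese)
Journal of Chinese Information Processing, 2004, 18(3): 61~65.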
2003
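- Zhijian Ou, Zuoying Wang.
A Study of Polynomial-Fitting Speech Trajectory Models in Mandarin Continuous Speech Recognition. (in Chinese)
Acta Electronica Sinica, 2003, 31(4): 608~611.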
pdf
2002
- Zhijian Ou, Zuoying Wang.
A combined model of statics-dynamics of speech optimized using maximum mutual information.
ICSLP, Denver, Colorado, 2002 Sept., pp. 2629~2632.
pdf
- Zhijian Ou, Zuoying Wang.
A new combined model of statics-dynamics of speech.
ICASSP, Orlando, Florida, 2002 May, pp. 965~968.
pdf |
poster
- Zhijian Ou, Zuoying Wang.
From Linear Prediction HMM to a New Hybrid Model for Speech Recognition. (in Chinese)
Acta Electronica Sinica, 2002, 30(9): 1313~1316.
pdf
2001
- Zhijian Ou, Zuoying Wang.
A New DP-like Speaker Clustering Algorithm.
EUROSPEECH, Aalborg, Denmark, 2001 Sept., pp. 791~794.
pdf
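- Zhijian Ou, Zuoying Wang.
A DDBHMM-based Hybrid Model Exploiting Inter-Frame Correlation. (in Chinese)
6th National Conference on Man-Machine Speech Communication, Shenzhen, 2001 Nov., 277~280.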