Department of Electronic Engineering


    Wu Ji (吴及), Ph.D., Professor

    Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
    Tel: +86-10-62781706
    Fax: +86-10-62770317
    Email: wuji_ee@tsinghua.edu.cn

 


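        Wu Ji is Vice Chair of the Department of Electronic Engineering at Tsinghua University, a tenured professor, and a doctoral supervisor.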

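        He received his B.Eng. and Ph.D. degrees from the Department of Electronic Engineering, Tsinghua University, in 1996 and 2001, respectively, and was a visiting scholar at the Georgia Institute of Technology, USA, from 2013 to 2015. His research covers artificial intelligence, machine learning, natural language processing, pattern recognition, and data mining. He has served as director of the Tsinghua-iFlytek Joint Laboratory since 2006. He is an IEEE Senior Member and head of the Technical Working Group of the China Speech Industry Alliance.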

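        He has undertaken a number of national research projects funded by the National Key R&D Program, the 863 Program, the National Natural Science Foundation of China (NSFC), and the Electronic Development Fund of the Ministry of Industry and Information Technology (MIIT). The project "Key Technologies for Intelligent Speech Interaction and an Application Development Platform", in which he participated, received the Second Prize of the National Science and Technology Progress Award in 2011. The project he led, "Recognition, Retrieval, and Content Analysis Technologies for Massive Speech Data and Their Applications", received the First Prize of the 2014 Beijing Science and Technology Award. In 2018 he became principal investigator of the project "Medical AI Management and Service Model for Public Healthcare" under the national key special program "Digital Diagnosis and Treatment Equipment R&D". He has published nearly 120 papers in major journals and conferences, including Nature Communications, IEEE Trans. on ASLP, AAAI, and ACL.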

 

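Education

1996: B.Eng. in Radio Technology and Information Systems, Department of Electronic Engineering, Tsinghua University
2001: Ph.D. in Signal and Information Processing, Department of Electronic Engineering, Tsinghua University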


Work Experience

2001 to present: Department of Electronic Engineering, Tsinghua University


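Academic Service

May 2018 to present: Member, Speech Dialogue and Auditory Processing Task Force, China Computer Federation (CCF)
February 2015: IEEE Senior Member
August 2009 to present: Member, Standing Committee of the National Conference on Man-Machine Speech Communication (NCMMSC Standing Committee)
IEEE BigComp 2019: Workshop Chair
ISCSLP 2018: Session Chair
ICBSB 2018: Technical Committee member
IEEE MLSP 2018: Program Committee member
IEEE ISCSLP 2008, 2010, 2012, 2014, 2016: Program Committee member
IEEE SLT 2014: panelist for the panel discussion "Next generation SLT scientists and engineers"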


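Professional Service

July 2018 to present: Member, Academic Committee, State Key Laboratory of Cognitive Intelligence
September 2016 to present: Member, Second Technical Committee, National Engineering Laboratory for Digital and Material Technology of Stomatology
August 2012 to present: Head, Technical Working Group, China Speech Industry Alliance
April 2004 to present: Member, Chinese Speech Interaction Technology Standards Working Group, Ministry of Industry and Information Technology (MIIT)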


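Research Areas

Machine learning
Knowledge engineering
Human-computer interaction
Medical text understanding
Medical image recognition
Data mining and pattern recognition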


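Research Projects

2018.8-2021.6: National Key R&D Program of the Ministry of Science and Technology, "Medical AI Management and Service Model for Public Healthcare", 2018YFC0116800, principal investigator
2015.6-2019.6: Tsinghua-iFlytek Joint Laboratory for Speech Technology (Phase IV), Anhui USTC iFlytek Information Technology Co., Ltd.
2016.1-2019.12: NSFC General Program project "Research on Audio Event Detection Technology", 61571266, principal investigator
2012.1-2015.12: NSFC General Program project "Research on Automatic Summarization of Spoken Chinese", 61170197, principal investigator
2012.9-2014.12: Sub-task of the 863 Program project "Integrated Management and Analysis of Massive Unstructured Data, with a Public Opinion Analysis Demonstration Application", 2012AA011004, key researcher
2012.6-2015.6: Tsinghua-iFlytek Joint Laboratory for Speech Technology (Phase III), Anhui USTC iFlytek Information Technology Co., Ltd.
2009.5-2012.5: Tsinghua-iFlytek Joint Laboratory for Speech Technology (Phase II), Anhui USTC iFlytek Information Technology Co., Ltd.
2006.2-2009.2: Tsinghua-iFlytek Joint Laboratory for Speech Technology (Phase I), Anhui USTC iFlytek Information Technology Co., Ltd.
2012.2-2013.2: Joint R&D project on speech recognition, Tencent Technology (Shenzhen) Co., Ltd.
2012.8-2012.12: Text corpus collection and language model training, Beijing Samsung Telecom R&D Center
2006.6.1-2010.10: Sub-task "Personalized Speech Synthesis Based on Statistical Modeling" of the 863 Program 11th Five-Year Plan key project "Key Technologies for Multilingual Speech Synthesis and Application Product Development", 2006AA010104, sub-task leader
2006.11-2008.12: 863 Program general project (exploratory category) "Exploratory Research on Content-Based High-Performance Speech Search", 2006AA01Z149, principal investigator
2004.4-2005.5: Research on robust speech recognition, Beijing Toshiba Research Center
2001-2003: 863 Program project "Intelligent Chinese Speech Information Processing Platform", 2001AA114071, principal investigator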


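Awards and Honors

2018: "Analysis Technologies for Massive Speech and Web Text and Their Applications", China Industry-University-Research Cooperation Innovation Achievement Award, Second Prize
2014: "R&D and Application of Recognition, Retrieval, and Content Analysis Technologies for Massive Speech Data", Beijing Science and Technology Award, First Prize
2011: National Science and Technology Progress Award, Second Prize (individual ranking: 8th)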

 

Publications

Journal Papers:

[1] Wu J, Liu X, Zhang X, et al. Master clinical medical knowledge at certificated-doctor-level with deep learning model[J]. Nature Communications, 2018, 9(1): 4352.
[2] Dong F, Tao C, Wu J, et al. Detection of cervical lymph node metastasis from oral cavity cancer using a non-radiating, noninvasive digital infrared thermal imaging system[J]. Scientific reports, 2018, 8(1): 7219.
[3] Zhang T, Wu J. Learning long-term filter banks for audio source separation and audio scene classification[J]. EURASIP Journal on Audio, Speech, and Music Processing, 2018, 2018(1): 4.
[4] Zhang T, Wu J. Discriminative frequency filter banks learning with neural networks. EURASIP Journal on Audio, Speech, and Music Processing. 2019 Dec;2019(1):1.
[5] Zhang, K., Wu, J., Chen, H., & Lyu, P. (2018). An effective teeth recognition method using label tree with cascade network structure. Computerized Medical Imaging and Graphics, 68, 61-70.
[6] Shi J, Wu J, Lv P, et al. BreastNet: Entropy-Regularized Transferable Multi-task Learning for Classification with Limited Breast Data[J]. International Journal of Bioscience, Biochemistry and Bioinformatics, 2018, 9(1)
[7] He, Z., Wu, J., & Lv, P. (2017). Multi-label text classification based on the label correlation mixture model. Intelligent Data Analysis, 21(6), 1371-1392.
[8] Chen, Q., Wu, J., Li, S., Lyu, P., Wang, Y., & Li, M. (2016). An ontology-driven, case-based clinical decision support model for removable partial denture design. Scientific reports, 6, 27855.
[9] Wu J, Li M, Lee C H. A probabilistic framework for representing dialog systems and entropy-based dialog management through dynamic stochastic state evolution[J]. IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), 2015, 23(11): 2026-2035.
[10] He Z, Wu J, Li T. Label correlation mixture model: a supervised generative approach to multilabel spoken document categorization[J]. IEEE Transactions on Emerging Topics in Computing, 2015, 3(2): 235-245.
[11] Zhang X L, Wu J. Deep belief networks based voice activity detection[J]. IEEE Transactions on Audio, Speech, and Language Processing, 2013, 21(4): 697-710.
[12] Yang L, Wu J, Lv P. Construction Algorithm of Sub-word Unit in Speech Retrieval[J]. Computer Engineering, 2012, 38(24): 251-253.
[13] Zhang X L, Wu J. Linearithmic time sparse and convex maximum margin clustering[J]. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 2012, 42(6): 1669-1692.
[14] YU X, WU J, KONG F, et al. Fusing Multi-information for Automatic Story Segmentation of Broadcast News[J]. Journal of Chinese Information Processing, 2012 (2): 21.
[15] Wu J, Zhang X L. Sparse Kernel Maximum Margin Clustering[J]. EN,2011,21(6).
[16] Li Wei, Wu Ji, Ping Lv. Query expansion based high performance Chinese voice retrieval. Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence, August 2011, 24(4), pp. 561-566.
[17] Zhang X, Wu J, Lv P. Support vector machine based VAD using the multiple observation compound feature[J]. Journal of Tsinghua University, 2011, 51(9): 1209-1214.
[18] Wu, Ji, and Xiao-Lei Zhang. "Efficient multiple kernel support vector machine based voice activity detection." IEEE Signal Processing Letters 18.8 (2011): 466-469.
[19] Wu, J., & Zhang, X. L. (2011). An efficient voice activity detection algorithm by combining statistical model and energy detection. EURASIP Journal on Advances in Signal Processing, 2011(1), 18.
[20] Wu, J., & Zhang, X. L. (2011). Maximum margin clustering based statistical VAD with multiple observation compound feature. IEEE Signal Processing Letters, 18(5), 283-286.
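[21] Li W, Wu J, Lv P. A low space complexity composition algorithm for weighted finite-state transducers (in Chinese). Application Research of Computers, 2011, 28(8): 2931-2934.
[22] Li W, Wu J, Lv P. A sensitive speech information detection system for massive data (in Chinese)[J]. Journal of Information Engineering University, 2010, 11(5).
[23] Li W, Wu J, Lv P. A word lattice generation algorithm for speech recognition based on forward and backward language models (in Chinese)[J]. Journal of Computer Applications, 2010, 30(10): 2563-2566.
[24] Su T, Wu J, Wang Z. Acoustic model training based on spatial correlation transformation (in Chinese)[J]. Journal of Electronics & Information Technology, 2010, 32(4): 1003-1007.
[25] Su T, Wu J, Wang Z, et al. An improved HMM using spatial correlation (in Chinese)[J]. Computer Engineering and Design, 2010, 31(5): 1023-1026.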

Conference Papers:

[1] Yu Hao, Xien Liu, Ji Wu, et al. Exploiting Sentence Embedding for Medical Question Answering. AAAI 2019 (acceptance rate: 16.2%).
[2] Zhang T, Zhang K, Wu J. Temporal transformer networks for acoustic scene classification[J]. Proc. Interspeech 2018, 2018: 1349-1353.
[3] Zhang T, Zhang K, Wu J. Data independent sequence augmentation method for acoustic scene classification, Interspeech 2018.
[4] Zhang T, Zhang K, Wu J. Multi-modal attention mechanisms in LSTM and its application to acoustic scene classification, Interspeech 2018.
[5] Zhang, X., Wu, J., He, Z., Liu, X., & Su, Y. (2018, April). Medical exam question answering with large-scale reading comprehension. In Thirty-Second AAAI Conference on Artificial Intelligence.
[6] Zhang, T., Zhou, X., & Wu, J. (2018, July). Dropframe Scheme in Recurrent Neural Networks for Time Series Modeling. In 2018 International Conference on Audio, Language and Image Processing (ICALIP) (pp. 355-360). IEEE.
[7] Li, M., & Wu, J. (2017). The MSIIP system for dialog state tracking challenge 4. In Dialogues With Social Robots (pp. 465-474). Springer, Singapore.
[8] Chen, Z., & Wu, J. (2017). A Rescoring Approach for Keyword Search Using Lattice Context Information. INTERSPEECH 2017 (pp. 3592-3596).
[9] Wang, H. D., Zhang, T., & Wu, J. (2017). The monkeytyping solution to the youtube-8m video understanding challenge. CVPR 2017 Workshop on YouTube-8M Large-Scale Video Understanding.
[10] He, Z., Liu, X., Lv, P., & Wu, J. (2016). Hidden softmax sequence model for dialogue structure analysis. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (Vol. 1, pp. 2063-2072).
[11] Zhang T, Chen Z, Wu J, et al. Objective Evaluation Methods for Chinese Text-To-Speech Systems[C]//INTERSPEECH. 2016: 332-336.
[12] Wang, H. D., & Wu, J. (2015, December). Collaborative filtering of call for papers. In 2015 IEEE Symposium Series on Computational Intelligence (pp. 963-970). IEEE.
[13] Wang, H. D., & Wu, J. (2015, December). Optimizing seed set for new user cold start. In 2015 IEEE Symposium Series on Computational Intelligence (pp. 957-962). IEEE.
[14] Zhang, T., & Wu, J. (2015, July). Speech emotion recognition with i-vector feature and RNN model. In 2015 IEEE China Summit and International Conference on Signal and Information Processing (ChinaSIP) (pp. 524-528). IEEE.
[15] Huang Z, Li J, Siniscalchi S M, et al. Rapid adaptation for deep neural networks through multi-task learning[C]//Sixteenth Annual Conference of the International Speech Communication Association. 2015.
[16] Wu J, Li M, Lee C H. An entropy minimization framework for goal-driven dialogue management[C]//Sixteenth Annual Conference of the International Speech Communication Association. 2015.
[17] Ding H, Wu J. Predicting retweet scale using log-normal distribution[C]//2015 IEEE International Conference on Multimedia Big Data. IEEE, 2015: 56-63.
[18] Ding H, Wu J. A Retweet Scale Prediction Model Based on Truncated Distribution Estimation. WSDM-BData 2015.
[19] Zhang T, Wu J, Wang D, et al. Audio retrieval based on perceptual similarity[C]//10th IEEE International Conference on Collaborative Computing: Networking, Applications and Worksharing. IEEE, 2014: 342-348.
[20] Ma Y, Wu J. Combining n-gram and dependency word pair for multi-document summarization[C]//2014 IEEE 17th International Conference on Computational Science and Engineering. IEEE, 2014: 27-31.
[21] He Z, Wu J, Lv P. Label correlation mixture model for multi-label text categorization[C]//2014 IEEE Spoken Language Technology Workshop (SLT). IEEE, 2014: 83-88.
[22] Chen Z, Zhang T, Wu J. Subword scheme for keyword search[C]//2014 IEEE Spoken Language Technology Workshop (SLT). IEEE, 2014: 483-488.
[23] He Z, Lv P, Wu J. An effective and robust approach to Mandarin spoken language understanding in specific domain[C]//The 9th International Symposium on Chinese Spoken Language Processing. IEEE, 2014: 604-608.
[24] He Z, Lv P, Wu J. Minimum classification error rate training of supervised topic mixture model for multi-label text categorization[C]//The 9th International Symposium on Chinese Spoken Language Processing. IEEE, 2014: 39-43.
[25] Chen Z, He Z, Lv P, et al. Improving keyword search by query expansion in a probabilistic framework[C]//The 9th International Symposium on Chinese Spoken Language Processing. IEEE, 2014: 187-191.
[26] Li M, Ding H, Wu J. Global discriminative model for dependency parsing in NLP pipeline[C]//The 9th International Symposium on Chinese Spoken Language Processing. IEEE, 2014: 614-618.
[27] Li S, He Z, Wu J. An ontology semantic tree based natural language interface[C]//The 9th International Symposium on Chinese Spoken Language Processing. IEEE, 2014: 226-230.
[28] Zhang X L, Wu J. Denoising deep neural networks based voice activity detection[C]//2013 IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 2013: 853-857.
[29] Zhang X L, Wu J. Weight optimization and layered clustering-based ECOC[C]//2013 IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 2013: 3477-3481.
[30] He Z, Lv P, Li W, et al. A synchronized pruning composition algorithm of weighted finite state transducers for large vocabulary speech recognition[C]//2012 8th International Symposium on Chinese Spoken Language Processing. IEEE, 2012: 11-15.
[31] Wu Q, Zhang X, Lv P, et al. Perceptual similarity between audio clips and feature selection for its measurement[C]//2012 8th International Symposium on Chinese Spoken Language Processing. IEEE, 2012: 387-391.
[32] Zhang X L, Wu J, Chen Z P, et al. Optimized weighted decoding for error-correcting output codes[C]//2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2012: 2101-2104.
[33] Du Z, Li X, Wu J. Accelerating the Training of HTK on GPU with CUDA[C]//2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum. IEEE, 2012: 1907-1914.
[34] Li Wei, He Zhiyang, Ping Lv, Wu Ji. Topology-related ε-Removal Algorithm for Weighted Finite-state Transducer. National Conference on Man-Machine Speech Communication (NCMMSC 2011), Xi'an, October 2011.
[35] Yu Xiaojie, Shao Yang, Wu Ji, Wang Xia. A Fusion Summarization Framework based on SVM and MMR. National Conference on Man-Machine Speech Communication (NCMMSC 2011), Xi'an, October 2011.
[36] Wu, J., He, Z., & Lv, P. (2011). An active learning approach to task adaptation. In Twelfth Annual Conference of the International Speech Communication Association.
[37] Li, W., Wu, J., & Lv, P. (2010, November). High performance Chinese Spoken Term Detection based on term expansion. In 2010 7th International Symposium on Chinese Spoken Language Processing (pp. 430-434). IEEE.
[38] Shen, W., Wu, J., & Li, W. (2010, November). Web-based keyword adapted Language Modeling for Keyword Spotting. In 2010 7th International Symposium on Chinese Spoken Language Processing (pp. 251-255). IEEE.
[39] Wu, J., Zhang, X. L., & Li, W. (2010). A new VAD framework using statistical model and human knowledge based empirical rule. In Eleventh Annual Conference of the International Speech Communication Association.