
Principal Architect
Natural Language Processing Department at Baidu Inc.
I have been a principal architect and tech lead of the deep question answering team at Baidu NLP since December 2017. Before that, I was a researcher at Microsoft Research Asia (MSRA) from September 2014 to December 2017. I obtained my Ph.D. degree in computer science from Harbin Institute of Technology (HIT) in September 2014, under the supervision of Prof. Hsiao-Wuen Hon (MSRA), Prof. Ting Liu (HIT) and Dr. Chin-Yew Lin (MSRA). My research interests include question answering, information extraction and social computing.
Please contact me via legendarydan (at) gmail (dot) com
My Sina Weibo (in Chinese) and Twitter (in English)
News
- Please email me your resume (for internships or FTE positions) if you are interested in working with us on question answering and machine reading comprehension. Experience with machine learning (including but not limited to deep learning) for NLP is preferred.
- Oct 2020: We proposed RocketQA [paper], an optimized training approach to dense passage retrieval for open-domain question answering. RocketQA ranked 1st on the leaderboard of the MSMARCO Passage Ranking Task. It was featured in zh-cn and en-us.
- Aug 2020: Baidu, CCF (China Computer Federation) and CIPSC (Chinese Information Processing Society of China) jointly launched LUGE (千言) [portal], an open-source project of Chinese NLP benchmarks. Our aim is to advance Chinese NLP technologies through these new benchmarks. Specifically, LUGE evaluates models beyond accuracy alone, in terms of robustness, generalization, multi-task capability, etc., and covers a rich range of tasks, including language understanding, language generation and multimodality. Currently, we have collected more than 20 NLP datasets for 7 tasks from contributors at 11 organizations. LUGE was featured in videos (zh-cn, en-us) and articles (zh-cn). If you are interested in LUGE, please contact me.
- Apr 2020: We released DuReaderrobust [paper][data & code], a Chinese dataset for evaluating the robustness of machine reading comprehension models. We hosted a DuReaderrobust shared task [leaderboard] at the 2020 Language and Intelligence Challenge, which drew more than 1,500 teams and more than 4,600 submissions. The shared task was featured in zh-cn.
- Nov 2019: Our machine reading comprehension system D-NET [paper][code] ranked 1st in the MRQA 2019 Shared Task, which tests whether MRC systems can generalize beyond the datasets on which they were trained. D-NET was featured in zh-cn (1, 2) and en-us.
Professional Activities
- Area Chair: ACL 2021 (Question Answering)
- Session Chair: AACL 2020 (Question Answering)
- Program committee/reviewer: ACL, EMNLP, NAACL, EACL, AACL, SIGIR, KDD, WSDM, WWW, CIKM, ICWSM, ACM Transactions on the Web (TWEB), ACM Transactions on Intelligent Systems and Technology (TIST), ACM Transactions on Information Systems (TOIS), Frontiers of Computer Science (FCS)
Papers [Google Scholar]
-
RocketQA: An Optimized Training Approach to Dense Passage Retrieval for Open-Domain Question Answering,
preprint
Yingqi Qu, Yuchen Ding, Jing Liu, Kai Liu, Ruiyang Ren, Xin Zhao, Daxiang Dong, Hua Wu and Haifeng Wang
[Blog]
-
DuReaderrobust: A Chinese Dataset Towards Evaluating the Robustness of Machine Reading Comprehension Models,
preprint
Hongxuan Tang, Jing Liu, Hongyu Li, Yu Hong, Hua Wu and Haifeng Wang
[Data & Code] [Leaderboard]
-
A Robust Adversarial Training Approach to Machine Reading Comprehension,
AAAI 2020
Kai Liu, Xin Liu, An Yang, Jing Liu, Jinsong Su, Sujian Li and Qiaoqiao She
-
CoKE: Contextualized Knowledge Graph Embedding,
preprint
Quan Wang, Pingping Huang, Haifeng Wang, Songtai Dai, Wenbin Jiang, Jing Liu, Yajuan Lyu, Yong Zhu, Hua Wu
[Code]
-
D-NET: A Simple Framework for Improving the Generalization of Machine Reading Comprehension,
EMNLP 2019 Workshop on Machine Reading for Question Answering (MRQA)
Hongyu Li, Xiyuan Zhang, Yibing Liu, Yiming Zhang, Quan Wang, Xiangyang Zhou, Jing Liu, Hua Wu and Haifeng Wang
[Blog] [Code]
-
Enhancing Pre-trained Language Representations with Rich Knowledge for Machine Reading Comprehension,
ACL 2019
An Yang, Quan Wang, Jing Liu, Kai Liu, Yajuan Lyu, Hua Wu, Qiaoqiao She and Sujian Li
[Blog] [Code]
-
Towards Robust Neural Machine Reading Comprehension via Question Paraphrases,
IALP 2019
Ying Li, Hongyu Li and Jing Liu
-
Towards Time-Aware Distant Supervision for Relation Extraction,
preprint
Tianwen Jiang, Sendong Zhao, Jing Liu, Jin-Ge Yao, Ming Liu, Bing Qin, Ting Liu, Chin-Yew Lin
-
Answer-focused and Position-aware Neural Question Generation,
EMNLP 2018
Xingwu Sun, Jing Liu, Yajuan Lyu, Yanjun Ma and Shi Wang
-
Aggregated Semantic Matching for Short Text Entity Linking,
CoNLL 2018
Feng Nie, Shuyan Zhou, Jing Liu, Jinpeng Wang, Chin-Yew Lin and Rong Pan
-
Neural Math Word Problem Solver with Reinforcement Learning,
COLING 2018
Danqing Huang, Jing Liu, Chin-Yew Lin and Jian Yin
-
DuReader: a Chinese Machine Reading Comprehension Dataset from Real-world Applications,
ACL 2018 Workshop on Machine Reading for Question Answering (MRQA)
Wei He, Kai Liu, Jing Liu, Yajuan Lyu, Shiqi Zhao, Xinyan Xiao, Yuan Liu, Yizhong Wang, Hua Wu, Qiaoqiao She, Xuan Liu, Tian Wu, Haifeng Wang
[Data] [Code]
-
Adaptations of ROUGE and BLEU to Better Evaluate Machine Reading Comprehension Task,
ACL 2018 Workshop on Machine Reading for Question Answering (MRQA)
An Yang, Kai Liu, Jing Liu, Yajuan Lyu, Sujian Li
-
Multi-Passage Machine Reading Comprehension with Cross-Passage Answer Verification,
ACL 2018
Yizhong Wang, Kai Liu, Jing Liu, Wei He, Yajuan Lyu, Hua Wu, Sujian Li, Haifeng Wang
-
Revisiting Distant Supervision for Relation Extraction,
LREC 2018
Tingsong Jiang, Jing Liu and Chin-Yew Lin
[Data]
-
A Statistical Framework for Product Description Generation,
IJCNLP 2017
Jinpeng Wang, Yutai Hou, Jing Liu, Yunbo Cao and Chin-Yew Lin
-
News Citation Recommendation with Implicit and Explicit Semantics,
ACL 2016
Hao Peng, Jing Liu and Chin-Yew Lin
-
Knowledge Base Completion via Coupled Path Ranking,
ACL 2016
Quan Wang, Jing Liu, Yuanfei Luo, Bin Wang and Chin-Yew Lin
-
RBPB: Regularization-Based Pattern Balancing Method for Event Extraction,
ACL 2016
Lei Sha, Jing Liu, Chin-Yew Lin, Sujian Li, Baobao Chang and Zhifang Sui
-
Improving Ranking Consistency for Web Search by Leveraging Knowledge Base and Search Logs,
CIKM 2015
Jyun-Yu Jiang, Jing Liu, Chin-Yew Lin and Pu-Jen Cheng
-
A Regularized Competition Model for Question Difficulty Estimation in Community Question Answering Services,
EMNLP 2014
Quan Wang, Jing Liu, Bin Wang and Li Guo
-
A Computational Approach to Measuring the Correlation between Expertise and Social Media Influence for Celebrities on Microblogs,
ASONAM 2014
Xin Zhao, Jing Liu, Yulan He, Chin-Yew Lin and Ji-Rong Wen
-
Question Difficulty Estimation in Community Question Answering Services,
EMNLP 2013
Jing Liu, Quan Wang, Chin-Yew Lin and Hsiao-Wuen Hon
-
A Hierarchical Entity-based Approach to Structuralize User Generated Content in Social Media: A Case of Yahoo! Answers,
EMNLP 2013
Baichuan Li, Jing Liu, Chin-Yew Lin, Irwin King and Michael R. Lyu
-
What's in a Name? An Unsupervised Approach to Link Users across Communities,
WSDM 2013
Jing Liu, Fan Zhang, Xinying Song, Young-In Song, Chin-Yew Lin and Hsiao-Wuen Hon
-
An Unsupervised Method for Author Extraction from Web Pages Containing User-Generated Content,
CIKM 2012
Jing Liu, Xinying Song, Jingtian Jiang and Chin-Yew Lin
-
Competition-based User Expertise Score Estimation,
SIGIR 2011
Jing Liu, Young-In Song and Chin-Yew Lin
-
Automatic Extraction of Web Data Records Containing User-Generated Content,
CIKM 2010
Xinying Song, Jing Liu, Yunbo Cao, Chin-Yew Lin and Hsiao-Wuen Hon
-
Microsoft Research Asia with Redmond at the NTCIR-8 Community QA Pilot Task,
NTCIR 2010
Young-In Song, Jing Liu, Tetsuya Sakai, Xinjing Wang, Guwen Feng, Yunbo Cao, Hisami Suzuki and Chin-Yew Lin
Working Experience
- Principal Architect, Baidu NLP, Dec. 2017 - present
- Researcher, Microsoft Research Asia, Sep. 2014 - Dec. 2017
- Intern, Microsoft Research Asia, Jul. 2009 - Sep. 2014
Education
- PhD, Computer Science, Harbin Institute of Technology, Sep. 2009 - Sep. 2014
- M.Sc, Computer Science, Harbin Institute of Technology, Sep. 2007 - Jul. 2009
- B.Sc, Computer Science, Xidian University, Sep. 2003 - Jul. 2007