About me

I am an Assistant Professor in the Department of Computer Science at George Mason University, where I co-lead the George Mason NLP group. I am also affiliated with the C4I & Cyber Center, the Center for Advancing Human-Machine Partnership, and the Institute for Digital InnovAtion at GMU. I received my PhD from the Department of Computer Science and Engineering at the Ohio State University (OSU) in 2021. I have spent time interning at Microsoft Semantic Machines, Carnegie Mellon University, Microsoft Research, Fujitsu Lab of America, and Tsinghua University. I work in natural language processing (NLP) and artificial intelligence (AI), with a focus on building natural language interfaces that can reliably assist humans in knowledge acquisition and task completion. Specific topics include:

  • Question Answering: This includes building QA systems over structured and unstructured data.
  • Human-AI interaction: I explore how machine learning systems can proactively collaborate with and learn from humans during decision making, as demonstrated in the setting of interactive semantic parsing. I am also interested in general dialogue systems.
  • Language and code: I seek to build natural language interfaces that allow humans to communicate with computers/machines easily. This requires modeling natural language, programming languages, and their interplay. Applications of this research include semantic parsing and general-purpose code generation.
  • Efficient NLP/AI: I study building machine learning models with limited supervision, especially for low-resource domains (e.g., healthcare).

I am looking for highly self-motivated students. If you are interested in doing NLP/AI/machine learning/data mining research with me, see how to reach out! (Due to the large volume of email I receive, I cannot reply to every message; however, I read each one very carefully.)


  • George Mason will be organizing MASC-SLL 2023, an annual NLP event in the Mid-Atlantic area. Stay tuned and join us!

  • 02/2023: I will be attending AAAI'23 in D.C. and serving as Session Chair!
  • 01/2023: I was invited to serve as Area Chair at ACL'23 (Question Answering).
  • 12/2022: Received a grant from Commonwealth Cyber Initiative (CCI) on the topic of algorithm explanation and human trust. Thanks to CCI and my collaborator Dr. Tyler Shaw from GMU Psychology!
  • 11/2022: Congrats to my student Daking Rai on having a student abstract accepted to AAAI! Thanks to my collaborators!
  • 11/2022: Our work on GPT-3 for psychological test item generation was accepted to the Journal of Business and Psychology. Thanks to my collaborators!
  • 10/2022: One paper accepted to EMNLP 2022!
  • 10/2022: I will be giving a talk in the Department of Statistics at GMU!
  • 09/2022: I will be giving a (virtual) talk at ServiceNow Research!
  • 08/2022: I will be attending KDD'22 in person! I will serve as a mentor in the KDD Undergraduate Consortium.
  • 07/2022: I will be attending NAACL'22 in person! Please join our SUKI workshop on July 14!
  • 07/2022: I was invited to serve as Senior PC member at AAAI'23.
  • 06/2022: I was invited to serve as Area Chair at EMNLP'22 (Efficient NLP Track).
  • 04/2022: Gave a talk in the JHU CLSP seminar!
  • 02/2022: Consider submitting to the SUKI workshop at NAACL2022!
  • 12/2021: Our paper "CliniQG4QA" won the Best Paper Award at IEEE BIBM 2021!
  • 11/2021: Our (w/ Penn State, UW, UCSB, Stanford, UC Berkeley, Google Research) workshop proposal "SUKI: Workshop on Structured and Unstructured Knowledge Integration" was accepted to be co-located with NAACL 2022! Stay tuned!
  • 11/2021: Invited to serve as ACM SIGAI Newsletter co-editor.
  • 08/2021: Started my journey at George Mason University! Check out our George Mason NLP group website (co-led with Prof. Antonios Anastasopoulos)!
  • 05/2021: I will be interning at Microsoft Semantic Machines this summer (virtually)!
  • 04/2021: I was awarded the Graduate Student Research Award by the OSU CSE department!
  • 01/2021: One paper accepted to ICLR'21! Thanks to my collaborators at CMU!
  • 11/2020: Super honored to receive the Presidential Fellowship from OSU Graduate School! ("The Presidential Fellowship is the most prestigious award given by the Graduate School. Recipients of this award embody the highest standards of scholarship in the full range of Ohio State's graduate programs.")
  • 11/2020: Our workshop proposal on Natural Language Processing for Programming has been accepted to be co-located with ACL 2021! Congrats to collaborators from Bar-Ilan University, UT Austin, CMU, and OSU! Please stay tuned!
  • 10/2020: Honored to be selected to the Rising Stars in EECS workshop (hosted by UC Berkeley this year)!
  • 09/2020: Invited poster at Microsoft Research AI Breakthroughs Workshop (virtually).
  • 08/2020: Invited talk at VMware, Beijing (virtually).
  • 05/2020: Excited to start my summer internship at the CMU Language Technologies Institute with Prof. Graham Neubig!
  • 08/2019: Our work on a principled Interactive Semantic Parsing framework has been accepted to EMNLP! See you in Hong Kong!
  • 07/2019: Attended ACL'19 in Florence.
  • 07/2019: Talk at ETH Zurich: "Towards Building Interactive and Collaborative Natural Language Interfaces".
  • 05/2019: One paper accepted to ACL'19 (my first ACL paper ever). Congrats to my collaborator Boyuan!
  • 01/2019: Our work exploring machine collaborations between Code Annotation and Code Retrieval has been accepted by WWW'19!
  • 10/2018: We built an Interactive Semantic Parser: talk to your parser to resolve NL ambiguities (accepted by AAAI'19)!
  • 08/2018: Want to know more about StaQC? Check out our work "A Comprehensive Study of StaQC for Deep Code Summarization" (accepted by SIGKDD'18 Deep Learning Day)!
  • 05/2018: Thrilled to start my internship at Microsoft Research @ Redmond this summer!
  • 04/2018: Attended the WWW 2018 conference @ Lyon, France and presented our work "StaQC: A Systematically Mined Question-Code Dataset from Stack Overflow". Check out the slides and quick StaQC examples!
  • 04/2018: Attended CRA Grad Cohort Workshop for Women (CRA-W) @ San Francisco.
  • 09/2017: Gave a talk about "Mining Code Answers to Natural Language Questions" at OSU CSE AI seminar.


(my advisee*)
  • Explaining Large Language Model-based Semantic Parsers [Paper]
    Daking Rai*, Yilun Zhou, Bailin Wang, Ziyu Yao
    AAAI 2023 (Student Abstract)
  • A Paradigm Shift from “Human Writing” to “Machine Generation” in Personality Test Development: An Application of State-of-the-art Natural Language Processing [Paper]
    Philseok Lee, Shea Fyffe, Mina Son, Zihao Jia, Ziyu Yao
    Journal of Business and Psychology 2022 (a top I/O Psychology journal)
  • UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models [Paper][Code][Website]
    Tianbao Xie, Chen Henry Wu, Peng Shi, Ruiqi Zhong, Torsten Scholak, Michihiro Yasunaga, Chien-Sheng Wu, Ming Zhong, Pengcheng Yin, Sida Wang, Victor Zhong, Bailin Wang, Chengzu Li, Connor Boyle, Ansong Ni, Ziyu Yao, Dragomir Radev, Caiming Xiong, Lingpeng Kong, Rui Zhang, Noah A. Smith, Luke Zettlemoyer, Tao Yu
    EMNLP 2022
  • Code Editing from Few Exemplars by Adaptive Multi-Extent Composition [Paper]
    Peizhao Li, Xuchao Zhang, Ziyu Yao, Wei Cheng, Haifeng Chen, Hongfu Liu
    ICLR 2022 Deep Learning for Code
  • Synthetic Question Value Estimation for Domain Adaptation of Question Answering [Paper][Code]
    Xiang Yue, Ziyu Yao, Huan Sun
    ACL 2022
  • On Advancing Natural Language Interfaces: Data Collection, Model Development, and User Interaction [Dissertation]
    Ziyu Yao
    The Ohio State University, Ph.D. Dissertation, 2021
  • Learning Structural Edits via Incremental Tree Transformations [Paper][Code]
    Ziyu Yao, Frank F. Xu, Pengcheng Yin, Huan Sun, Graham Neubig
    ICLR 2021
  • Proceedings of the 1st Workshop on Natural Language Processing for Programming (NLP4Prog 2021) [Workshop Proceedings][Website]
    Royi Lachmy, Ziyu Yao, Greg Durrett, Milos Gligoric, Junyi Jessy Li, Ray Mooney, Graham Neubig, Yu Su, Huan Sun, Reut Tsarfaty
    ACL 2021
  • CliniQG4QA: Generating Diverse Questions for Domain Adaptation of Clinical Question Answering [Paper][Code]
    Xiang Yue, Xinliang Frederick Zhang, Ziyu Yao, Simon Lin, Huan Sun
    IEEE BIBM 2021 (Best Paper Award)
  • An Imitation Game for Learning Semantic Parsers from User Interaction [Paper][Code][Slides]
    Ziyu Yao, Yiqi Tang, Scott Wen-tau Yih, Huan Sun, Yu Su
    EMNLP 2020
  • Model-based Interactive Semantic Parsing: A Unified Formulation and A Text-to-SQL Case Study [Paper][Code]
    Ziyu Yao, Yu Su, Huan Sun, Scott Wen-tau Yih
    EMNLP 2019
  • Reinforced Dynamic Reasoning for Conversational Question Generation [Paper][Code]
    Boyuan Pan, Hao Li, Ziyu Yao, Deng Cai, Huan Sun
    ACL 2019
  • CoaCor: Code Annotation for Code Retrieval with Reinforcement Learning [Paper][Code]
    Ziyu Yao, Jayavardhan Reddy Peddamail, Huan Sun
    The Web Conference (WWW) 2019 (Acceptance rate: 18%)
  • Interactive Semantic Parsing for If-Then Recipes via Hierarchical Reinforcement Learning [Paper][Appendix][Code][Slides][Poster]
    Ziyu Yao, Xiujun Li, Jianfeng Gao, Brian Sadler, Huan Sun
    AAAI 2019 (Acceptance rate: 16.2%, SPOTLIGHT)
  • IEC: Towards Interest-Eliciting Neural Conversational Agents [Paper]
    Ziyu Yao, Yizhe Zhang, Xiujun Li, Jianfeng Gao, Michel Galley, Chris Brockett, Huan Sun, Bill Dolan
    Manuscript 2019
  • A Comprehensive Study of StaQC for Deep Code Summarization [Paper][*Update]
    Jayavardhan Reddy Peddamail, Ziyu Yao, Zhen Wang, Huan Sun
    KDD 2018 Deep Learning Day
  • StaQC: A Systematically Mined Question-Code Dataset from Stack Overflow [Paper][Slides][Data&Code][Quick examples]
    Ziyu Yao, Daniel S. Weld, Wei-Peng Chen, Huan Sun
    The Web Conference (WWW) 2018 (Acceptance rate: 14.8%)
  • Semi-Supervised Multinomial Naive Bayes for Text Classification by Leveraging Word-Level Statistical Constraint [Paper]
    Li Zhao, Minlie Huang, Ziyu Yao, Rongwei Su, Yingying Jiang, Xiaoyan Zhu
    AAAI 2016
  • A Semi-Supervised Method for Filtering Chinese Spam Tweets [Paper]
    Ziyu Yao, Shouzhong Tu, Minlie Huang, Xiaoyan Zhu
    Journal of Chinese Information Processing 2016



  • Best Paper Award, IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2021.
  • Graduate Student Research Award, OSU, CSE, 2021.
  • Presidential Fellowship, OSU, 2020.
  • Selected to Rising Stars in EECS, UC Berkeley, 2020.
  • Graduate Students Poster Award (Honorable Mention), OSU, 2020.
  • NSF Student Travel Award for The Web Conference 2019 (WWW'19).
  • Graduate Students Poster Award (Honorable Mention), OSU, 2019.
  • Travel Grant for CRA-W Graduate Cohort for Women, San Francisco, 2018.
  • Beijing Outstanding Graduate Award, Beijing, 2015.
  • First-class Scholarship (top 3%), BUPT, 2012, 2014.
  • Second-class Scholarship (top 5%), BUPT, 2013.


  • 2023 Spring: Natural Language Processing (CS 478)
  • 2022 Fall: Advanced Natural Language Processing (CS 678)
  • 2022 Spring: Natural Language Processing (CS 499) (grateful for receiving warm words from my lovely students)
  • 2021 Fall: Special Topic on Natural Language Processing (CS 695)
  • 2020 Fall: Foundations of Speech and Language Processing (CSE 5525 at OSU), guest lecturer.
  • 2018 Fall: Data Management in the Cloud (CSE 3244 at OSU), guest panelist.
  • 2017 Spring: Introduction to Computer Programming in C++ (CSE 1222 at OSU), student instructor and grader.


  • Editor: ACM SIGAI Newsletter co-editor
  • Organizing Committee: MASC-SLL 2023, SUKI at NAACL2022, NLP4Prog at ACL2021
  • Area Chair: ACL 2023, EMNLP 2022
  • Senior Program Committee: AAAI 2023
  • Session Chair: AAAI 2023
  • Program Committee: ACL Rolling Review (ARR), ACL, EMNLP, NAACL, AAAI, AACL-IJNLP, NLPCC, CoNLL, Mining Software Repositories (MSR)
  • Program Committee (Workshop): DL4C at ICLR2022, HCI+NLP at NAACL2022, IntEx-SemPar at EMNLP2020, NLI at ACL2020
  • Journal Reviewer: TKDD, TKDE



PhD Students:

  • Saurabh Srivastava (2022 Spring-)
  • Daking Rai (2022 Spring-)
  • Hao Yan (2022 Spring-)
  • Murong Yue (2022 Fall-)
  • Long Doan (2022 Fall-)

Mentored/Collaborated Master's Students:

  • Gaurav Singh (2022 Spring)
  • Janit Bidhan (2022 Fall)

Mentored/Collaborated Undergraduate Students:

  • Mariana Ritchie (2022 Spring)
  • Brian Meike (2022 Spring)
  • Soumithri Gadepalli (2022 Spring)

We are grateful for the funding/computing support from UMD ARLIS, Commonwealth Cyber Initiative (CCI), GMU Office of Research Computing, Mason Libraries, and GMU CEC/CS!

Student Recruiting

For prospective students:

  • For prospective PhD students: I am looking for highly self-motivated PhD students with funding support. If you are interested in working with me, please email me with your CV, transcript, a description of your research interests, your future career plan (e.g., academia or industry), and your representative work (e.g., publication, thesis, research project). Please also let me know which semester (e.g., Spring/Fall 2022) you plan to apply for the PhD program. Please start your email subject with "[Prospective PhD Student - ${Your_Application_Semester}]".
  • For other interested students (at or outside GMU): I am also interested in advising passionate undergraduate and master's students at GMU, especially if you plan to apply to graduate school. I consider PhD students from other institutions for long-term collaboration as well. Funding can be provided to students with great records and performance. If you are interested in working with me, please email me with your CV, transcript, a description of your research interests, your plan after graduation (e.g., whether you will apply for a master's or PhD program), and your experience related to NLP/AI (if any). Please start your email subject with "[Prospective Intern Student]".
