Ziyu Yao's Personal Website

Nguyen Engineering Building, 4415

4400 University Drive

Fairfax, VA, 22030

Contact: ziyuyao at gmu dot edu

Hello! I am an Assistant Professor in the Department of Computer Science at George Mason University, where I co-lead the George Mason NLP group. I am also affiliated with the C4I & Cyber Center, the Center for Advancing Human-Machine Partnership, and the Institute for Digital InnovAtion at GMU. I received my PhD degree from the Department of Computer Science and Engineering at the Ohio State University (OSU) in 2021, and have spent time interning at Microsoft Semantic Machines, Carnegie Mellon University, Microsoft Research, Fujitsu Lab of America, and Tsinghua University. I work in natural language processing (NLP) and artificial intelligence (AI). My research has been funded by the National Science Foundation, the Microsoft Accelerate Foundation Models Research Award, the Virginia Commonwealth Cyber Initiative, and UMD’s Applied Research Laboratory for Intelligence and Security. I was the Diversity & Inclusion Co-Chair at NAACL 2024 and the lead faculty organizer of MASC-SLL 2023 (a local NLP event with a 10-year history in the Mid-Atlantic area). I co-organized SUKI at NAACL 2022 and NLP4Prog at ACL 2021.

My research interests include:

Knowledge Grounding, Reasoning, and Planning: we advance NLP/AI systems (including Large Language Models/LLMs) in various tasks demanding knowledge grounding, reasoning, and planning. Exemplar tasks include code generation, math reasoning, motion planning, and information extraction.
Responsible and Trustworthy Natural Language Interfaces: we study how language interfaces can be made more trustworthy and responsible in their interaction with humans. This includes: (1) Model interpretability: Check out our task-centric survey on Mechanistic Interpretability, interpretation of CoT via neuron activation in LLMs, and mechanistic interpretation of LLMs in syntactic code completion; (2) Human-AI interactive framework, represented by our multi-year effort in "interactive semantic parsing/code generation" -- Preprint'24, ACL'23, ICLR'21, EMNLP'20, EMNLP'19, AAAI'19; (3) Enhancing accessibility to LLMs, represented by our recent effort in saving the monetary cost of calling LLM APIs and optimizing user prompts to LLMs for task efficacy and safety.
Interdisciplinary Applications: I'm passionate about making a real impact with NLP/AI in critical domains and applications. Our recent efforts include building Multi-LLM Agents for Mathematics Education, a Foundation Model for Network Communication, and GPT3 for personality test generation in I/O Psychology.

During July 7-11, 2025, we (w/ Dr. Jennifer Suh) organized the first Math EdVenture Summer Camp at GMU's Fairfax campus, as part of our commitment to the NSF RITEL project. Check out our activities here!

We are organizing the First Workshop on the Application of LLM Explainability to Reasoning and Planning at COLM 2025! Submit your excellent work to our workshop!

We will be giving a Tutorial on Mechanistic Interpretability for Language Models at ICML 2025! Stay tuned for our schedule and materials.

Excited to release Version 2 of our task-centric survey on Mechanistic Interpretability, in collaboration with Salesforce Research, Purdue University, and George Washington University. Also check out our survey on Sparse Autoencoder (SAE) in collaboration with NJIT and University of Georgia.

I will recruit 1-3 PhD students for the Fall'26 cycle. I wrote a blog post "Why you should apply for Mason CS and work with me" here. If you are interested, please contact me following the instructions on this page.

news

06/2025	I gave a talk about “Interpretability Fills the Gap of Data Benchmarks” at the BI4LLMC workshop at FSE!
06/2025	One paper about VLM reasoning for social robot navigation is accepted to IROS 2025 (w/ Xuesu Xiao and students)!
05/2025	Two papers accepted to ACL Findings 2025 (Instruction-Tuning for Event Extraction, and Batch Prompting Attack)! Congrats to Murong, Saurabh, and Sweta!
05/2025	Saurabh, Murong, and Hao will intern at Adobe, Salesforce Research, and Amazon. Mohamed received the CVPR Travel Award. Sai received a $1k prize from the CEC Undergrad Research Celebration. Saurabh received the Outstanding GTA Award from GMU CS and was selected to present at the SDM Doctoral Forum. Congrats to them!!
04/2025	Our Tutorial on Mechanistic Interpretability for Language Models is accepted to ICML 2025. Hope to see everyone in Vancouver this summer!
04/2025	I gave a talk about Language Modeling Reasoning at Georgetown University! Slides
02/2025	Congrats to Mohamed and collaborators for a paper about Benchmarking VLMs in Path Plan Evaluation being accepted to CVPR’25!
01/2025	Congrats to Murong and collaborators for a paper about “dynamic LLM reasoning” being accepted to ICLR’25!
12/2024	I gave a talk about mechanistic understanding of LLMs in reasoning and code generation at University of Cambridge, Language Technology Lab Seminars!
08/2024	Grateful to receive a 3-year grant from National Science Foundation (NSF) on fostering the mathematical modeling competencies of middle-school students by leveraging the power of Generative AI/LLMs (see our prototype)! My first lead-PI NSF project ever;) I’m very lucky to make it with the wonderful Jennifer Suh from GMU Math Edu and Janice Zhang from William & Mary CS!
08/2024	Congrats to Daking and Saurabh for their two long papers being accepted to ACL 2024 (Main and Findings)! [EDIT] Daking’s paper about mechanistic interpretation of CoT is reported by MIT Technology Review China [English Translate]!
07/2024	Check out our latest review on Mechanistic Interpretability for Language Models. We particularly present a taxonomy and a beginner’s roadmap for people interested in this field.
05/2024	Great pleasure to speak about semantic parsing at the CTO Data Science Speaker Series at Bloomberg NYC!
05/2024	I was invited to talk about “Interactive Semantic Parsing” to the PROSE team at Microsoft Research!
04/2024	CONGRATS to Murong for being awarded an Outstanding PhD Student by GMU CS, and Hao for receiving the Summer 2024 GRA Fellowship from CAHMP!
04/2024	Check out our project on “LLM Agents for Mathematics Education” as a teamwork with experts from NLP, HCI, and Mathematics Education! The project was funded by Microsoft Research, Accelerating Foundation Models Research program.
04/2024	I was invited to talk about “Building Natural Language Interfaces in the Age of LLMs” at Indiana University, Indianapolis! Slides are available here.
03/2024	I was invited to talk about “Towards Enhancing the Utilization of Large Language Models for Humans” at University of Arizona and Virginia Tech NVC! Slides are available here.
03/2024	Invited to give a guest lecture about LLMs at Department of Health Administration and Policy and School of Education at GMU!
01/2024	Our paper on LLM Cascade for Cost Saving in Reasoning (Featured in Hugging Face Daily Papers) is now accepted to ICLR’24! Congrats to Murong and collaborators from Microsoft Research and VT!
12/2023	Received a grant from Commonwealth Cyber Initiative (CCI) on the topic of LLM supply chain security, with Dr. Xiaokuan Zhang. Thank you, CCI!
11/2023	I was selected as a Top Reviewer at NeurIPS’23, thanks to the discussions with my students!
10/2023	Check out our three new preprints on Cost-saving LLMs in reasoning (Featured in Hugging Face Daily Papers), Prompt optimization for zero-shot LLMs, and Benchmarking LLMs in spatial-temporal reasoning!
10/2023	Congrats to Saurabh and Murong for papers accepted to EMNLP’23 main conference (MailEx, a new conversational event extraction benchmark) and the demo track (Gentopia, our open-source platform streamlining the creation and sharing of augmented LLMs)! Many thanks to our collaborators. See U in Singapore!
09/2023	Received Azure credits from Microsoft Research, Accelerating Foundation Models Research program for our project on LLM4Edu (in collaboration w/ Dr. Anthony E. Kelly at GMU Educational Psychology)! Thank you, Microsoft!
08/2023	Received a grant from National Science Foundation (NSF) on the topic of interpretability and explainability of large language models for code! Many thanks to NSF and my collaborators Dr. Kevin Moran (GMU -> UCF) and Dr. Denys Poshyvanyk (W&M)!
08/2023	I gave in-person talks titled “Building Natural Language Interfaces in the Age of LLMs” at Google and Apple! (Good summer time in the Bay Area)
07/2023	I was invited to serve as Diversity & Inclusion Chair at NAACL 2024!
06/2023	Congrats to Hao and Daking for receiving the Graduate Student Travel Fund from GMU!
06/2023	I was invited to serve as a Senior PC member at AAAI’24!
05/2023	Two papers accepted to ACL’23 (main conference)! Congrats to Hao, Saurabh, and Daking!
04/2023	George Mason organized MASC-SLL 2023, an annual NLP event in the Mid-Atlantic area!
03/2023	I was invited to serve as a reviewer at NeurIPS’23!
02/2023	I will be attending AAAI’23 in D.C. and serving as Session Chair!
01/2023	I was invited to serve as Area Chair at ACL’23 (Question Answering).
12/2022	Received a grant from Commonwealth Cyber Initiative (CCI) on the topic of algorithm explanation and human trust. Thanks to CCI and my collaborator Dr. Tyler Shaw from GMU Psychology!
11/2022	Congrats to my student Daking Rai for a student abstract accepted to AAAI! Thanks to my collaborators!
11/2022	Work about GPT-3 for psychological test item generation got accepted to Journal of Business and Psychology. Thanks to my collaborators!
10/2022	One paper accepted to EMNLP 2022!
10/2022	I will be giving a talk in the Department of Statistics at GMU!
09/2022	I will be giving a (virtual) talk at ServiceNow Research!
08/2022	I will be attending KDD’22 in person! I will serve as a mentor in the KDD Undergraduate Consortium.
07/2022	I will be attending NAACL’22 in person! Please join our SUKI workshop on July 14!
07/2022	I was invited to serve as a Senior PC member at AAAI’23.
06/2022	I was invited to serve as Area Chair at EMNLP’22 (Efficient NLP Track).
04/2022	Gave a talk in the JHU CLSP seminar!
02/2022	Consider submitting to the SUKI workshop at NAACL2022!
12/2021	Our paper “CliniQG4QA” won the Best Paper Award at IEEE BIBM 2021!
11/2021	Our (w/ Penn State, UW, UCSB, Stanford, UC Berkeley, Google Research) workshop proposal “SUKI: Workshop on Structured and Unstructured Knowledge Integration” was accepted to be co-located with NAACL 2022! Stay tuned!
11/2021	Invited to serve as ACM SIGAI Newsletter co-editor.
08/2021	Started my journey at George Mason University! Check out our George Mason NLP group website (co-lead with Prof. Antonios Anastasopoulos)!
05/2021	I will be interning at Microsoft Semantic Machines this summer (virtually)!
04/2021	I was awarded the Graduate Student Research Award by the OSU CSE department!
01/2021	One paper accepted to ICLR’21! Thanks to my collaborators at CMU!
11/2020	Super honored to receive the Presidential Fellowship from OSU Graduate School! (“The Presidential Fellowship is the most prestigious award given by the Graduate School. Recipients of this award embody the highest standards of scholarship in the full range of Ohio State’s graduate programs.”)
11/2020	Our workshop proposal on Natural Language Processing for Programming has been accepted to be co-located with ACL 2021! Congrats to collaborators from Bar-Ilan University, UT Austin, CMU, and OSU! Please stay tuned!
10/2020	Honored to be selected to the Rising Stars in EECS workshop (hosted by UC Berkeley this year)!
09/2020	Invited poster at Microsoft Research AI Breakthroughs Workshop (virtually).
08/2020	Invited talk at VMware, Beijing (virtually).
05/2020	Excited to start a summer internship at CMU Language Technologies Institute with Prof. Graham Neubig!
08/2019	Our work on a principled Interactive Semantic Parsing framework is accepted to EMNLP! See you in Hong Kong!
07/2019	Talk at ETH Zurich: “Towards Building Interactive and Collaborative Natural Language Interfaces”.
07/2019	Attended ACL’19 in Florence.
05/2019	One paper accepted to ACL’19 (my first ACL paper ever). Congrats to my collaborator Boyuan!
01/2019	Our work exploring machine collaborations between Code Annotation and Code Retrieval is accepted by WWW’19!
10/2018	We built an Interactive Semantic Parser: talk to your parser to resolve NL ambiguities (accepted by AAAI’19)!
08/2018	Want to know more about StaQC? Check out our work “A Comprehensive Study of StaQC for Deep Code Summarization” (accepted by SIGKDD’18 Deep Learning Day)!
05/2018	Feeling thrilled to start an internship at Microsoft Research @ Redmond this summer!!
04/2018	Attended the WWW 2018 conference @ Lyon, France and presented our work “StaQC: A Systematically Mined Question-Code Dataset from Stack Overflow”. Check out the slides and quick StaQC examples!
04/2018	Attended CRA Grad Cohort Workshop for Women (CRA-W) @ San Francisco.
09/2017	Gave a talk about “Mining Code Answers to Natural Language Questions” at OSU CSE AI seminar.

selected publications

2025

ACL’25 Findings

Efficient but Vulnerable: Benchmarking and Defending LLM Batch Prompting Attack

Murong Yue*, and Ziyu Yao

Findings of ACL 2025, 2025

PDF Code
ACL’25 Findings

Instruction-Tuning LLMs for Event Extraction with Annotation Guidelines

Saurabh Srivastava^*, Sweta Pati^*, and Ziyu Yao

Findings of ACL 2025, 2025

PDF Code
CVPR’25

Evaluating Vision-Language Models as Evaluators in Path Planning

Mohamed Aghzal*, Xiang Yue, Erion Plaku, and Ziyu Yao

CVPR, 2025

PDF Code
ICLR’25

DOTS: Learning to Reason Dynamically in LLMs via Optimal Reasoning Trajectories Search

Murong Yue*, Wenlin Yao, Haitao Mi, Dian Yu, Ziyu Yao, and Dong Yu

The Thirteenth International Conference on Learning Representations, 2025

PDF
AAAI’25W

MathVC: An LLM-Simulated Multi-Character Virtual Classroom for Mathematics Education

(Invited Presentation at Wolfram Research LLM Agent Colloquium)

Murong Yue^*, Wenhan Lyu^, Wijdane Mifdal*, Yixuan Zhang, Jennifer Suh, and Ziyu Yao

arXiv Preprint (To present at AAAI 2025 AI4Edu Workshop), 2025

PDF Website

2024

EMNLP’24

Navigating the Shortcut Maze: A Comprehensive Analysis of Shortcut Learning in Text Classification by Language Models

Yuqing Zhou, Ruixiang Tang, Ziyu Yao, and Ziwei Zhu

Findings of EMNLP, 2024

PDF Code
CASE’24

Look Further Ahead: Testing the Limits of GPT-4 in Path Planning

Mohamed Aghzal*, Erion Plaku, and Ziyu Yao

IEEE CASE 2024 (also presented at AAAI 2025 LM4Plan Workshop), 2024

PDF Code
ACL’24

An Investigation of Neuron Activation as a Unified Lens to Explain Chain-of-Thought Eliciting Arithmetic Reasoning of LLMs

(Covered by MIT Technology Review China [English Translate])

Daking Rai*, and Ziyu Yao

ACL, 2024

PDF Code Website
ACL’24

Instances Need More Care: Rewriting Prompts for Instances with LLMs in the Loop Yields Better Zero-Shot Performance

Saurabh Srivastava*, Chengyue Huang, Weiguo Fan, and Ziyu Yao

Findings of ACL, 2024

PDF Code
ICLR’24W

Can Large Language Models be Good Path Planners? A Benchmark and Investigation on Spatial-temporal Reasoning

Mohamed Aghzal*, Erion Plaku, and Ziyu Yao

ICLR Workshop on LLM Agents, 2024

PDF Code
ICLR’24

Large Language Model Cascades with Mixture of Thoughts Representations for Cost-efficient Reasoning

(Featured in Hugging Face Daily Papers)

Murong Yue*, Jie Zhao, Min Zhang, Liang Du, and Ziyu Yao

The Twelfth International Conference on Learning Representations (also at ICLR Workshop on Reliable and Responsible Foundation Models), 2024

PDF Code