Publications | Ziyu Yao's Personal Website

2026

CVPR’26

Inside-Out: Measuring Generalization in Vision Transformers Through Inner Workings

(CVPR 2026 Highlight (Top 3%))

Yunxiang Peng, Mengmeng Ma, Ziyu Yao, and Xi Peng

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026

PDF
ACL’26

Why Do LLM-based Web Agents Fail? A Hierarchical Planning Perspective

(Selected for Oral)

Mohamed Aghzal*, Gregory J Stein, and Ziyu Yao

arXiv preprint arXiv:2603.14248 (To appear at ACL 2026), 2026

PDF
CHI’26

Designing AI Peers for Collaborative Mathematical Problem Solving with Middle School Students: A Participatory Design Study

Wenhan Lyu, Yimeng Wang, Murong Yue*, Yifan Sun, Jennifer Suh, Meredith Kier, and 2 more authors

arXiv preprint arXiv:2601.17962 (To appear at ACM CHI 2026), 2026

PDF

2025

ICDM’25W

Evaluating the Effectiveness of Persona Simulation in Opinion Prediction with GPT-4.1

Sarah Li* and Ziyu Yao

In 2025 IEEE International Conference on Data Mining Workshops (ICDMW - Undergraduate and Graduate Honor Symposium), 2025
Preprint

Revisiting Prompt Optimization with Large Reasoning Models-A Case Study on Event Extraction

Saurabh Srivastava* and Ziyu Yao

arXiv preprint arXiv:2504.07357, 2025

PDF
Preprint

A Practical Review of Mechanistic Interpretability for Transformer-Based Language Models

Daking Rai*, Yilun Zhou, Shi Feng, Abulhair Saparov, and Ziyu Yao

arXiv preprint arXiv:2407.02646 (Version 2), 2025

PDF Website
Preprint

A Survey on Large Language Models for Automated Planning

Mohamed Aghzal*, Erion Plaku, Gregory J Stein, and Ziyu Yao

arXiv preprint arXiv:2502.12435, 2025

PDF
Preprint

Reassessing Code Authorship Attribution in the Era of Language Models

Atish Kumar Dipongkor, Ziyu Yao, and Kevin Moran

arXiv preprint arXiv:2506.17120, 2025

PDF
Preprint

Guiding AI to Fix Its Own Flaws: An Empirical Study on LLM-Driven Secure Code Generation

Hao Yan*, Swapneel Suhas Vaidya, Xiaokuan Zhang, and Ziyu Yao

arXiv preprint arXiv:2506.23034, 2025

PDF
NeurIPS’25

Failure by Interference: Language Models Make Balanced Parentheses Errors When Faulty Mechanisms Overshadow Sound Ones

Daking Rai*, Samuel Miller*, Kevin Moran, and Ziyu Yao

arXiv preprint arXiv:2507.00322 (to appear at NeurIPS 2025), 2025

PDF
EMNLP’25

Feature Extraction and Steering for Enhanced Chain-of-Thought Reasoning in Language Models

Zihao Li, Xu Wang, Yuzhe Yang, Ziyu Yao, Haoyi Xiong, and Mengnan Du

arXiv preprint arXiv:2505.15634 (to appear at EMNLP 2025 Main), 2025

PDF
EMNLP’25

All for One: LLMs Solve Mental Math at the Last Token With Information Transferred From Other Tokens

(Covered by Rohan Paul (AI Influencer at X) and “AI: post transformers” Podcast)

Siddarth Mamidanna*, Daking Rai*, Ziyu Yao, and Yilun Zhou

To appear at EMNLP 2025 Main, 2025

PDF
EMNLP’25 Findings

A survey on sparse autoencoders: Interpreting the internal mechanisms of large language models

Dong Shu, Xuansheng Wu, Haiyan Zhao, Daking Rai*, Ziyu Yao, Ninghao Liu, and 1 more author

arXiv preprint arXiv:2503.05613 (to appear at EMNLP 2025 Findings), 2025

PDF
EMNLP’25 Findings

Beneath the Surface: How Large Language Models Reflect Hidden Bias

Jinhao Pan, Chahat Raj, Ziyu Yao, and Ziwei Zhu

arXiv preprint arXiv:2502.19749 (to appear at EMNLP 2025 Findings), 2025

PDF
COLM’25W

Can LLMs Simulate Personas with Reversed Performance? A Benchmark for Counterfactual Instruction Following

Sai Adith Senthil Kumar^*, Hao Yan^*, Saipavan Perepa*, Murong Yue*, and Ziyu Yao

COLM Workshop on Social Simulation with LLMs, 2025

PDF
ACL’25 Findings

Efficient but Vulnerable: Benchmarking and Defending LLM Batch Prompting Attack

Murong Yue* and Ziyu Yao

Findings of ACL 2025, 2025

PDF Code
ACL’25 Findings

Instruction-Tuning LLMs for Event Extraction with Annotation Guidelines

Saurabh Srivastava^*, Sweta Pati^*, and Ziyu Yao

Findings of ACL 2025, 2025

PDF Code
IROS’25

AutoSpatial: Visual-language reasoning for social robot navigation through efficient spatial reasoning learning

Yangzhe Kong, Daeun Song, Jing Liang, Dinesh Manocha, Ziyu Yao, and Xuesu Xiao

arXiv preprint arXiv:2503.07557 (to appear at IROS 2025), 2025

PDF
AAAI’25W

Mechanistic Understanding of Language Models in Syntactic Code Completion

Samuel Miller^*, Daking Rai^*, and Ziyu Yao

AAAI Workshop on Towards Knowledgeable Foundation Models, 2025

PDF
CVPR’25

Evaluating Vision-Language Models as Evaluators in Path Planning

Mohamed Aghzal*, Xiang Yue, Erion Plaku, and Ziyu Yao

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025

PDF Code
ICLR’25

DOTS: Learning to Reason Dynamically in LLMs via Optimal Reasoning Trajectories Search

(Covered by MIT Tech Review China)

Murong Yue*, Wenlin Yao, Haitao Mi, Dian Yu, Ziyu Yao, and Dong Yu

The Thirteenth International Conference on Learning Representations, 2025

PDF
AAAI’25W

MathVC: An LLM-Simulated Multi-Character Virtual Classroom for Mathematics Education

(Invited Presentation at Wolfram Research LLM Agent Colloquium; media coverage by GovTech and The George)

Murong Yue^*, Wenhan Lyu^, Wijdane Mifdal*, Yixuan Zhang, Jennifer Suh, and Ziyu Yao

AAAI AI4Edu Workshop, 2025

PDF Website

2024

Preprint

Understanding the Effect of Algorithm Transparency of Model Explanations in Text-to-SQL Semantic Parsing

Daking Rai*, Rydia R Weiland, Kayla Margaret Gabriella Herrera, Tyler H Shaw, and Ziyu Yao

arXiv preprint arXiv:2410.16283, 2024

PDF
Preprint

IntelliExplain: Enhancing Conversational Code Generation for Non-Professional Programmers

Hao Yan*, Thomas D. Latoza, and Ziyu Yao

arXiv Preprint, 2024

PDF Code Website
EMNLP’24

Navigating the Shortcut Maze: A Comprehensive Analysis of Shortcut Learning in Text Classification by Language Models

Yuqing Zhou, Ruixiang Tang, Ziyu Yao, and Ziwei Zhu

Findings of EMNLP, 2024

PDF Code
CASE’24

Look Further Ahead: Testing the Limits of GPT-4 in Path Planning

Mohamed Aghzal*, Erion Plaku, and Ziyu Yao

IEEE CASE 2024 (also presented at AAAI 2025 LM4Plan Workshop), 2024

PDF Code
ACL’24

An Investigation of Neuron Activation as a Unified Lens to Explain Chain-of-Thought Eliciting Arithmetic Reasoning of LLMs

(Covered by MIT Technology Review China [English Translation])

Daking Rai* and Ziyu Yao

ACL, 2024

PDF Code Website
ACL’24

Instances Need More Care: Rewriting Prompts for Instances with LLMs in the Loop Yields Better Zero-Shot Performance

Saurabh Srivastava*, Chengyue Huang, Weiguo Fan, and Ziyu Yao

Findings of ACL, 2024

PDF Code
ICLR’24W

Can Large Language Models be Good Path Planners? A Benchmark and Investigation on Spatial-temporal Reasoning

Mohamed Aghzal*, Erion Plaku, and Ziyu Yao

ICLR Workshop on LLM Agents, 2024

PDF Code
Preprint

Lens: A Foundation Model for Network Traffic

Qineng Wang, Chen Qian, Xiaochang Li, Ziyu Yao, and Huajie Shao

arXiv Preprint, 2024

PDF
ICLR’24

Large Language Model Cascades with Mixture of Thoughts Representations for Cost-efficient Reasoning

(Featured in Hugging Face Daily Papers and AWS Builder Blog)

Murong Yue*, Jie Zhao, Min Zhang, Liang Du, and Ziyu Yao

The Twelfth International Conference on Learning Representations (also at ICLR Workshop on Reliable and Responsible Foundation Models), 2024

PDF Code

2023

EMNLP’23 Demo

Gentopia: A Collaborative Platform for Tool-Augmented LLMs

(An open-source platform for creating, evaluating, and community-sharing Augmented Language Model (ALM)-based Agents)

Binfeng Xu, Xukun Liu, Hua Shen, Zeyu Han, Yuhan Li, Murong Yue*, and 4 more authors

EMNLP’23 System Demonstration, 2023

PDF Code
EMNLP’23

MAILEX: Email Event and Argument Extraction

Saurabh Srivastava*, Gaurav Singh*, Shou Matsumoto, Ali Raz, Paulo Costa, Joshua Poore, and 1 more author

EMNLP’23, 2023

PDF Code
ACL’23

Improving Generalization in Language Model-based Text-to-SQL Semantic Parsing: Two Simple Semantic Boundary-based Techniques

(Rank #8 on the Spider leaderboard as of Aug 2023)

Daking Rai*, Bailin Wang, Yilun Zhou, and Ziyu Yao

In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2023

PDF Code
ACL’23

Learning to Simulate Natural Language Feedback for Interactive Semantic Parsing

Hao Yan*, Saurabh Srivastava*, Yintao Tai*, Sida I. Wang, Wen-tau Yih, and Ziyu Yao

In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

PDF Code
AAAI’23 SA

Explaining Large Language Model-Based Neural Semantic Parsers (Student Abstract)

Daking Rai*, Yilun Zhou, Bailin Wang, and Ziyu Yao

AAAI Student Abstract, 2023

PDF
JBP’23

A paradigm shift from “human writing” to “machine generation” in personality test development: An application of state-of-the-art natural language processing

(Editor Commendation, one of 13 out of 1,000+ submissions in 2022)

Philseok Lee, Shea Fyffe, Mina Son, Zihao Jia, and Ziyu Yao

Journal of Business and Psychology, 2023

PDF

2022

EMNLP’22

UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models

Tianbao Xie^, Chen Henry Wu^, Peng Shi, Ruiqi Zhong, Torsten Scholak, Michihiro Yasunaga, and 17 more authors

In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

PDF Code Website
ICLR’22 DL4Code

Code Editing from Few Exemplars by Adaptive Multi-Extent Composition

Peizhao Li, Xuchao Zhang, Ziyu Yao, Wei Cheng, Haifeng Chen, and Hongfu Liu

In Deep Learning for Code Workshop at International Conference on Learning Representations, 2022

PDF
ACL’22

Synthetic Question Value Estimation for Domain Adaptation of Question Answering

Xiang Yue, Ziyu Yao, and Huan Sun

In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

PDF Code

2021

Dissertation

On Advancing Natural Language Interfaces: Data Collection, Model Development, and User Interaction

Ziyu Yao

2021

PDF
ICLR’21

Learning Structural Edits via Incremental Tree Transformations

Ziyu Yao, Frank F. Xu, Pengcheng Yin, Huan Sun, and Graham Neubig

In International Conference on Learning Representations, 2021

PDF Code
BIBM’21

CliniQG4QA: Generating diverse questions for domain adaptation of clinical question answering

(Best Paper Award)

Xiang Yue, Xinliang Frederick Zhang, Ziyu Yao, Simon Lin, and Huan Sun

In 2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2021

PDF Code

2020

EMNLP’20

An Imitation Game for Learning Semantic Parsers from User Interaction

Ziyu Yao, Yiqi Tang, Wen-tau Yih, Huan Sun, and Yu Su

In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020

PDF Code Slides

2019

EMNLP-IJCNLP’19

Model-based Interactive Semantic Parsing: A Unified Framework and A Text-to-SQL Case Study

Ziyu Yao, Yu Su, Huan Sun, and Wen-tau Yih

In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019

PDF Code
ACL’19

Reinforced Dynamic Reasoning for Conversational Question Generation

Boyuan Pan, Hao Li, Ziyu Yao, Deng Cai, and Huan Sun

In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 2019

PDF Code
WWW’19

CoaCor: Code annotation for code retrieval with reinforcement learning

Ziyu Yao, Jayavardhan Reddy Peddamail, and Huan Sun

In The World Wide Web Conference, 2019

PDF Code
AAAI’19

Interactive semantic parsing for if-then recipes via hierarchical reinforcement learning

Ziyu Yao, Xiujun Li, Jianfeng Gao, Brian Sadler, and Huan Sun

In Proceedings of the AAAI Conference on Artificial Intelligence, 2019

PDF Code Poster Slides

2018

KDD’18 DL Day

A comprehensive study of StaQC for deep code summarization

Jayavardhan Reddy Peddamail, Ziyu Yao, Zhen Wang, and Huan Sun

In Deep Learning Day at KDD, 2018

PDF
WWW’18

StaQC: A systematically mined question-code dataset from Stack Overflow

Ziyu Yao, Daniel S Weld, Wei-Peng Chen, and Huan Sun

In Proceedings of the 2018 World Wide Web Conference, 2018

PDF Code Slides

2016

AAAI’16

Semi-supervised multinomial Naive Bayes for text classification by leveraging word-level statistical constraint

Li Zhao, Minlie Huang, Ziyu Yao, Rongwei Su, Yingying Jiang, and Xiaoyan Zhu

In Proceedings of the AAAI Conference on Artificial Intelligence, 2016

PDF