News
- 🗓️ Oct 29, 2024 - Releasing dump-to-GPT: let GPT quickly read your entire codebase with one line of code! A small but exciting start of AI-wrench, a growing toolkit for efficient AI engineering. (A sketch of the idea appears after this list.)
- 🗓️ Sep 20, 2024 - Glad to announce that ESFT has been accepted to the EMNLP 2024 Main Conference! 🎉 Many thanks to all collaborators!
- 🗓️ Jul 4, 2024 - Thrilled to introduce our latest project at DeepSeek, Expert-Specialized Fine-Tuning (ESFT), for efficient and effective LLM customization by leveraging the highly specialized Mixture-of-Experts (MoE) architecture! 🤖✨
- 🗓️ Jun 2, 2024 - Grateful to be spotlighted by my alma mater RUC for my journey and achievements. (read blog)
- 🗓️ Feb 15, 2024 - Excited to join Northwestern as a PhD student! 🎓 Many thanks to my advisor Manling Li!
- 🗓️ Oct 19, 2023 - Honored to be awarded the Baosteel Outstanding Student Award 2023 🏅 as the ONLY undergraduate among science and technology departments at RUC! Special thanks to NLPIR lab! 🙏
- 🗓️ Jun 7, 2023 - Excited to share that I'll be joining UIUC Blender Lab 🔬 this summer as a student researcher!
- 🗓️ Mar 15, 2023 - My talk on LARGE language models at Capital of Statistics 📊 will take place at 7:00 PM, Mar 17, 2023 (BJT)! Click here for more details. (Update: slides, video)
- 🗓️ Jan 12, 2023 - I will give a talk on pre-trained models and their applications 📚 at 2:00 PM, Jan 13, 2023 (BJT) at Mingli College! For more information, click here. (Update: slides)
- 🗓️ Dec 12, 2022 - I posted an article introducing ChatGPT on Capital of Statistics 💡. Do not miss it if you want to know more about ChatGPT! (link)
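For the dump-to-GPT release above, here is a minimal sketch of the underlying idea: flatten a repository into a single LLM-ready string. The helper name and options below are illustrative assumptions, not the actual AI-wrench API.

```python
# Sketch of the "dump your codebase into one prompt" idea.
# Hypothetical helper, NOT the actual dump-to-GPT / AI-wrench API.
from pathlib import Path

def dump_codebase(root: str, exts=(".py", ".md"), max_chars=200_000) -> str:
    """Concatenate source files under `root` into one LLM-ready string."""
    chunks = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in exts:
            chunks.append(f"# ===== {path} =====\n{path.read_text(errors='ignore')}")
    return "\n\n".join(chunks)[:max_chars]  # crude truncation to fit a context window

# The advertised "one line of code":
prompt = dump_codebase("./my_project")
```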
Research Interest
I work on various topics regarding Large Language Models, including interaction, alignment, and long-context understanding (retrieval).
My representative works include (1) general interaction, e.g., the MINT interaction benchmark; (2) the cross-application of LLMs and IR, e.g., retrieval-augmented models (RetaLLM) and LM-based IR; and (3) efficient alignment of LLMs, e.g., expert-specialized fine-tuning.
Selected Publications
See full list on Semantic Scholar (Why I Love Semantic Scholar, and You Might Too)
Semantic Scholar uses AI-powered tools to summarize papers, highlight key phrases, and rank research by influence, which helps you find important studies faster. Its Semantic Reader helps you understand papers with skimming highlights and citation cards, and citation graphs show how papers connect. While Google Scholar is great for broad searches, Semantic Scholar is smarter for finding high-quality and impactful research!
[New] Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Architectural Large Language Models
Zihan Wang,
Deli Chen,
Damai Dai,
Runxin Xu,
Zhuoshu Li,
Yu Wu
EMNLP 2024
[paper]
[code]
We harness the specialized power of experts in MoE LLMs through ESFT: by fine-tuning as few as 5% of the experts in a layer, performance close to full-parameter fine-tuning can be achieved.
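A minimal sketch of that recipe as I read it: rank each layer's experts by a precomputed task-affinity score, then unfreeze only the top few. The model layout and score names below are assumptions, not the released ESFT code.

```python
import torch

def esft_mark_trainable(model, affinity, keep_ratio=0.05):
    """Freeze the full model, then unfreeze only the most task-relevant experts.

    Assumptions (illustrative, not the released ESFT code): `model.layers[i].experts`
    is an nn.ModuleList, and `affinity[i][j]` is a precomputed relevance score of
    expert j in layer i on downstream task data (e.g., its average gate weight).
    """
    for p in model.parameters():
        p.requires_grad = False                          # freeze everything
    for i, layer in enumerate(model.layers):
        k = max(1, int(len(layer.experts) * keep_ratio))  # e.g., top ~5% of experts
        top = torch.topk(torch.tensor(affinity[i]), k).indices
        for j in top:
            for p in layer.experts[int(j)].parameters():
                p.requires_grad = True                   # only these experts get gradients
```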
[Highlight] MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback
Xingyao Wang*,
Zihan Wang*,
Jiateng Liu,
Yangyi Chen,
Lifan Yuan,
Hao Peng,
Heng Ji
ICLR 2024
[paper]
[website]
[code]
We introduce MINT, a benchmark for evaluating LLMs in Multi-turn Interactions with tools and language feedback.
MINT reveals several limitations in existing RLHF and SIFT methods on multi-turn interaction.
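The evaluation protocol behind this is a budgeted interaction loop: the model acts, a tool executes the action, and an optional feedback model critiques the attempt. A schematic sketch with hypothetical helper names (not the MINT codebase):

```python
def evaluate_multiturn(llm, task, execute_tool, feedback_model=None, max_turns=5):
    """Schematic MINT-style loop: act -> observe -> (feedback) -> retry.
    All helper names here are hypothetical, not the MINT codebase."""
    history = [task.prompt]
    for _ in range(max_turns):
        action = llm.generate("\n".join(history))   # model proposes code / a tool call
        observation = execute_tool(action)          # e.g., run the code in a sandbox
        history += [action, f"Observation: {observation}"]
        if task.is_solved(observation):
            return True                             # solved within the turn budget
        if feedback_model is not None:              # optional natural-language feedback
            history.append(f"Feedback: {feedback_model.critique(history)}")
    return False                                    # exhausted max_turns
```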
[Highlight] DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
DeepSeek AI (157 authors including Zihan Wang)
[paper]
[code]
DeepSeek-V2 is a strong MoE model with 21B activated parameters (236B in total). Compared with DeepSeek 67B, it achieves stronger performance while saving 42.5% of training costs and boosting maximum generation throughput by up to 5.76x.
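For readers unfamiliar with "activated parameters": in an MoE layer each token is routed to only a few experts, so only a fraction of the weights run per token. A toy top-k MoE layer illustrating this (not DeepSeek-V2's actual architecture, which also uses shared experts and MLA):

```python
import torch
import torch.nn as nn

class ToyMoE(nn.Module):
    """Toy top-k MoE layer: each token activates only k of n experts."""
    def __init__(self, dim=64, n_experts=8, k=2):
        super().__init__()
        self.router = nn.Linear(dim, n_experts)
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_experts))
        self.k = k

    def forward(self, x):                       # x: (tokens, dim)
        gates = self.router(x).softmax(-1)      # routing probabilities per token
        w, idx = gates.topk(self.k, dim=-1)     # keep only the top-k experts per token
        out = torch.zeros_like(x)
        for t in range(x.size(0)):              # naive per-token loop, for clarity
            for weight, j in zip(w[t], idx[t]):
                out[t] += weight * self.experts[int(j)](x[t])
        return out                              # only ~k/n of expert params used per token
```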
NOVO: Learnable and Interpretable Document Identifiers for Model Based IR
Zihan Wang,
Yujia Zhou,
Yiteng Tu,
Zhicheng Dou
CIKM 2023, Oral Presentation
[paper]
[code]
We propose learnable NOVO document IDs for model-based IR. NOVO IDs consist of non-overlapping n-gram sets that identify documents, optimized through denoising queries and retrieval tasks.
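To make "n-gram set identifiers" concrete, here is a toy, non-learned illustration: each document's ID is a set of its n-grams, and retrieval scores documents by how many model-generated n-grams their set covers. NOVO learns and refines these sets; everything below is a simplification.

```python
def ngram_set(text, n=3):
    """Represent a document by its set of word n-grams (toy version)."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

docs = {
    "d1": "mixture of experts models route tokens to specialists",
    "d2": "dense retrieval encodes queries and documents into vectors",
}
doc_ids = {d: ngram_set(t) for d, t in docs.items()}   # n-gram-set "identifiers"

def retrieve(generated_ngrams):
    """Score docs by how many generated n-grams their ID set covers."""
    return max(doc_ids, key=lambda d: len(doc_ids[d] & generated_ngrams))

# A generative retriever would produce these n-grams from the query; hard-coded here:
print(retrieve({"mixture of experts", "route tokens to"}))  # -> "d1"
```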
RetaLLM: A Retrieval-Augmented Large Language Model Toolkit
Jiongnan Liu,
Jiajie Jin,
Zihan Wang,
Jiehan Cheng,
Zhicheng Dou,
Ji-Rong Wen
[paper]
[code]
We develop a Retrieval-Augmented LLM toolkit for better interaction between LLMs and retrieval systems.
Feature modules include request rewriting, passage extraction, and fact-checking.
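A schematic of how those modules could chain around a retriever; all helper names below are placeholders rather than RetaLLM's actual API:

```python
def retrieval_augmented_answer(llm, retriever, query):
    """Schematic RetaLLM-style pipeline: rewrite -> retrieve -> extract ->
    generate -> fact-check. Helper names are placeholders, not the real API."""
    rewritten = llm.generate(f"Rewrite as a search request: {query}")   # request rewriting
    passages = retriever.search(rewritten, top_k=5)                     # retrieval
    evidence = llm.generate(                                            # passage extraction
        f"Extract sentences relevant to '{query}':\n" + "\n".join(passages))
    answer = llm.generate(f"Answer '{query}' using:\n{evidence}")       # generation
    verdict = llm.generate(                                             # fact-checking
        f"Does the evidence support this answer? Evidence:\n{evidence}\nAnswer:\n{answer}")
    return answer if "yes" in verdict.lower() else "[unverified] " + answer
```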
Awards
- McCormick School of Engineering Fellowship, Northwestern, 2024
- Baosteel Outstanding Student Award (7 of 30,000+ students), Renmin Univ. of China, 2023
- First Class Academic Excellence Award (top 3% GPA), Renmin Univ. of China, 2021
- Provincial First Prize, Contemporary Undergraduate Mathematical Contest in Modeling, 2021
- Honorable Mention, Mathematical Contest in Modeling and Interdisciplinary Contest in Modeling, 2021
Invited Talks and Presentations
Professional Service
Misc
- I like to work and chat with people from diverse backgrounds (🌈), which I believe is the key to true innovation. Feel free to contact me.
- I love Sandbox games like Minecraft and Danmaku games like Touhou Project. I also loved designing RPG games when I was in primary school (with RMXP on Windows XP), although they can no longer be launched on Windows 10.
- My dream was to be a vlogger, and I posted videos on bilibili, including vlogs, gameplay recordings, and some parody videos.
- Besides Chinese and English, I can speak a little Japanese due to my passion for Anime in my childhood. My favorite Anime were ワンピース (One Piece) and Fate/stay night.