Peking University · CS PhD Candidate

Hongcheng Wang

I am a fourth-year CS PhD candidate at Peking University. My supervisor is Prof. Hao Dong.

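I expect to complete my PhD in June 2027 and am currently looking at research and algorithm positions in areas such as embodied brain / embodied intelligence VLA and VLM / LLM post-training. Feel free to get in touch.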

My current research interests fall into the following areas:

  • Chain-of-thought Generation: I study reinforcement-learning post-training for large models, especially how to stably train <think> to better guide <answer>.
  • Human-centered Robot Decision Making: I take a Cognitive Behavioral Theory perspective to enable robots to understand human cognition (e.g., latent demands and preferences) and behavior (e.g., observable habits and norms), so they can better serve people.
  • LLM-based Multi-Agent Reinforcement Learning (ongoing): I study reinforcement learning in which LLMs act as multiple agents, with a particular focus on credit assignment at the agent level and at the communication level.

Blog

Education

Research

My current work has two main directions: (1) chain-of-thought generation for large models, and (2) cognitively grounded, human-centered robot decision making in embodied settings, where robots model human demands, preferences, habits, and norms to provide better assistance.

Services

  • ICRA (2023), Reviewer
  • IROS (2024), Reviewer
  • NeurIPS (2024, 2025), Reviewer
  • ACM MM (2025), Reviewer
  • ICLR (2025, 2026), Reviewer
  • ICML (2025, 2026), Reviewer
  • RA-L (2026), Reviewer

Teaching Assistant

  • Fundamentals of Artificial Intelligence, Spring 2023–2026
  • Introduction to Computation (A), Fall 2025