Peking University · CS PhD Candidate
Hongcheng Wang
I am a fourth-year CS PhD candidate at Peking University. My supervisor is Prof. Hao Dong.
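I expect to complete my PhD in June 2027. I am currently looking for research and algorithm positions in areas such as embodied brain / embodied intelligence VLA and VLM / LLM post-training. Feel free to get in touch.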
My current research interests are organized into the following parts:
- Chain-of-thought Generation: I study reinforcement-learning post-training for large models, especially how to stably train the <think> segment so that it better guides the <answer>.
- Human-centered Robot Decision Making: I take the perspective of Cognitive Behavioral Theory to enable robots to understand human cognition (e.g., latent demands and preferences) and behavior (e.g., observable habits and norms), so that they can better serve people.
- LLM-based Multi-Agent Reinforcement Learning (ongoing): I study reinforcement learning in which LLMs act as multiple agents, with a particular focus on agent-level credit assignment and communication credit assignment.