Hongchen Wei

I am currently a final-year Ph.D. student at Wuhan University, under the supervision of Prof. Zhenzhong Chen.

I received my M.E. degree from Nanjing University of Science and Technology, China, in 2023.

I received my B.Sc. degree from Xi'an Shiyou University, China, in 2020.


Main Research Interests

My current research focuses on building executable virtual environments inspired by generative agents, which simulate real-world interaction dynamics for dynamic evaluation, synthetic data generation, and RL-based post-training. I also work on agentic multimodal understanding, especially tool-augmented long-document and long-video agents, as well as multimodal model merging for aligning perception and reasoning capabilities and enabling cross-modal knowledge transfer.

  • Executable generative-agent environments for dynamic evaluation, synthetic data generation, and agentic RL
  • Tool-augmented multimodal agents for long-document and long-video understanding
  • Multimodal model merging for perception-reasoning alignment and cross-modal knowledge transfer
News
  • [Dec. 2025 - Present] Research intern at Microsoft Research Asia (MSRA), working on executable multi-agent environments for realistic office workflows, agentic document understanding, agent evaluation, and data synthesis.
Pre-prints
PASA: Post-Merge Perception-Reasoning Asymmetry as a Self-Alignment Signal for MLLMs
Hongchen Wei, Zhenzhong Chen
NeurIPS 2026 (under review)
Training-Free Reasoning and Reflection in MLLMs
Hongchen Wei, Zhenzhong Chen
arXiv Preprint, 2025
LongCaptioning: Unlocking the Power of Long Caption Generation in Large Multimodal Models
Hongchen Wei, Zhihong Tan, Yaosi Hu, Chang Wen Chen, Zhenzhong Chen
arXiv Preprint, 2025
LOP: Learning Optimal Pruning for Efficient On-Demand MLLMs Scaling
Zhihan Zhang, Xiang Pan, Hongchen Wei, Zhenzhong Chen
arXiv Preprint, 2025
RSFAKE-1M: A Large-Scale Dataset for Detecting Diffusion-Generated Remote Sensing Forgeries
Zhihong Tan, Jiayi Wang, Huiying Shi, Binyuan Huang, Hongchen Wei, Zhenzhong Chen
arXiv Preprint, 2025
TDSAgent: A Task-Driven Sampling Agent for Long Video Question Answering
Author list includes Hongchen Wei
Under Review, 2026
GTC: Game-Theoretic Token Compression for Video Large Language Models
Author list includes Hongchen Wei
Under Review, 2026
ETC: Extreme Token Compression via Task-aware Visual Information Distillation in VLMs
Author list includes Hongchen Wei
Under Review, 2026
Publications
See What We Cannot See: A Geo-guided Reasoning Benchmark for Object Counting under Adverse Earth Observation Conditions
Author list includes Hongchen Wei
CVPR, 2026
Visual Context Window Extension: A New Perspective for Long Video Understanding
Hongchen Wei, Zhenzhong Chen
ACM MM (CCF-A Conference), 2025
Project page
RealVG: Unleashing MLLMs for Training-Free Spatio-Temporal Video Grounding in the Wild
Hongchen Wei, Zhenzhong Chen
ACM MM (CCF-A Conference), 2025
Remote Sensing Semantic Segmentation Quality Assessment based on Vision Language Model
Huiying Shi, Zhihong Tan, Zhihan Zhang, Hongchen Wei, Yaosi Hu, Yingxue Zhang, Zhenzhong Chen
TGRS (CCF-B Journal), 2025
Improving Generalization of Image Captioning with Unsupervised Prompt Learning
Hongchen Wei, Zhenzhong Chen
TOMM (CCF-B Journal), 2024
Exploiting Cross-Modal Prediction and Relation Consistency for Semisupervised Image Captioning
Yang Yang, Hongchen Wei, Hengshu Zhu, Dianhai Yu, Hui Xiong, Jian Yang
TCYB (CCF-B Journal), 2022 (student first author)
Code
S2OSC: A Holistic Semi-Supervised Approach for Open Set Classification
Yang Yang, Hongchen Wei, Zhenqiang Sun, Guangyu Li, Yuanchun Zhou, Hui Xiong, Jian Yang
TKDD (CCF-B Journal), 2021 (student first author)
Activities
  • Reviewer: ICLR25/26, CVPR25/26, NeurIPS25/26, ICML26, TNNLS

Last updated in Jun. 2026.

Homepage credits: Jon Barron.