😁Hi there, I’m Xiaofu
👨🏻💻I’m a first-year NLP PhD student at MBZUAI, supervised by Prof. Yova Kementchedjhieva. I also work closely with my friend Yaxin Luo. I am glad to work on research projects under the supervision of Prof. Dimitrios Papadopoulos and Prof. Weizhi Meng.
📔My research focuses on Multimodal Learning and Representation Learning, with a particular emphasis on fine-grained alignment and interpretable representations across images, text, and video. I aim to move beyond coarse understanding toward reliable distinction and verification of fine details, applying these capabilities to dense image/video captioning and temporal event understanding. I also focus on multimodal evaluation and factual consistency, developing detail-sensitive metrics and benchmarks.
Publications:
📄SPECS: Specificity-Enhanced CLIP-Score for Long Image Caption Evaluation (EMNLP 2025 Main)
Xiaofu Chen, Israfel Salazar, Yova Kementchedjhieva
📄DViN: Dynamic Visual Routing Network for Weakly Supervised Referring Expression Comprehension (CVPR 2025)
Xiaofu Chen, Yaxin Luo, Gen Luo, Jiayi Ji, Henghui Ding, Yiyi Zhou
📄APL: Anchor-based Prompt Learning for One-stage Weakly Supervised Referring Expression Comprehension (ECCV 2024)
Yaxin Luo, Jiayi Ji, Xiaofu Chen, Yuxin Zhang, Tianhe Ren, Gen Luo
