About Me

I am a PhD student at the National University of Singapore, fortunate to be supervised by Prof. HUANG Zhiyong. My research focuses on multimodality, autonomous agents, and code generation.

Previously, I was a research intern at Qwen, supervised by Junyang Lin and Binyuan Hui, where I contributed to the Qwen model series, including Qwen-VL, Qwen-3.5, and Qwen-Coder. I have also had the opportunity to work with Prof. Tat-Seng Chua, Prof. Junxian He, and Prof. Michael Qizhe Shieh.

Recent News

  • Apr 2026: One paper accepted to ICML 2026.
  • Feb 2026: Released Qwen-3.5 and Qwen 3 Coder Next.
  • Jan 2026: Three papers accepted to ICLR 2026.
  • Sep 2025: Released Qwen3-VL.
  • Sep 2025: One paper (SE-GUI) accepted to NeurIPS 2025.
  • Aug 2025: Two papers accepted to EMNLP 2025.
  • Jul 2025: ScreenSpot-Pro accepted to ACMMM 2025 as an Oral Presentation.
  • May 2025: Three papers accepted to ACL 2025.

Selected Publications

Agents & GUI Interaction

ScreenSpot-Pro: GUI Grounding for Professional High-Resolution Computer Use (Oral Presentation)
Kaixin Li, Ziyang Meng, Hongzhan Lin, Ziyang Luo, Yuchen Tian, Jing Ma, Zhiyong Huang, Tat-Seng Chua
ACMMM 2025 · ICLR 2025 Workshop on Reasoning and Planning for LLMs
Used by Google Gemini, GPT-5.2, the Qwen-VL series, Microsoft OmniParser, and Meta's Muse Spark.
65,000+ dataset downloads.
Enhancing Visual Grounding for GUI Agents via Self-Evolutionary Reinforcement Learning
Xinbin Yuan, Jian Zhang, Kaixin Li, Zhuoxuan Cai, Jie Chen, Lujian Yao, Enguang Wang, Qibin Hou, Jinwei Chen, Peng-Tao Jiang, Bo Li
NeurIPS 2025
Grounding Computer Use Agents on Human Demonstrations
Aarash Feizi, Shravan Nayak, Xiangru Jian, Kevin Qinghong Lin, Kaixin Li, Rabiul Awal, Xing Han Lù, Johan Obando-Ceron, Juan A. Rodriguez, Nicolas Chapados, David Vazquez, Adriana Romero-Soriano, Reihaneh Rabbany, Perouz Taslakian, Christopher Pal, Spandana Gella, Sai Rajeswar
ICLR 2026
HackWorld: Evaluating Computer-Use Agents on Exploiting Web Application Vulnerabilities
Xiaoxue Ren*, Penghao Jiang*, Kaixin Li*, Zhiyong Huang, Xiaoning Du, Jiaojiao Jiang, Zhenchang Xing, Jiamou Sun, Terry Yue Zhuo
ICLR 2026
Robi Butler: Remote Multimodal Interactions with Household Robot Assistant
Anxing Xiao, Nuwan Janaka, Tianrun Hu, Anshul Gupta, Kaixin Li, Cunjun Yu, David Hsu
ICRA 2025

Coding & Code Intelligence

MMCode: Evaluating Multi-Modal Code Large Language Models with Visually Rich Programming Problems
Kaixin Li, Yuchen Tian, Qisheng Hu, Ziyang Luo, Zhiyong Huang, Jing Ma
EMNLP 2024 Findings · ICLR 2025 Workshop on Reasoning and Planning for LLMs
Used by Qwen2.5-VL and Qwen3-VL.
BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution
Terry Yue Zhuo, Xiaolong Jin, Hange Liu... Kaixin Li, Yuhan Cao, Bo Liu, et al.
arXiv
Tree-of-Evolution: Tree-Structured Instruction Evolution for Code Generation in Large Language Models
Ziyang Luo, Kaixin Li, Hongzhan Lin, Yuchen Tian, Mohan Kankanhalli, Jing Ma
ACL 2025
MCTS-Judge: Test-Time Scaling in LLM-as-a-Judge for Code Correctness Evaluation
Yutong Wang, Pengliang Ji, Chaoqun Yang, Kaixin Li, Ming Hu, Jiaoyang Li, Guillaume Sartoretti
arXiv preprint (2502.12468)
InstructCoder: Empowering Language Models for Code Editing (Oral Presentation)
Kaixin Li*, Qisheng Hu*, Xu Zhao, Yuxi Xie, Tiedong Liu, Hui Chen, Qizhe Xie*, Junxian He*
ACL 2024 SRW

Open Source

  • TACO-verified: Verified code contest problems and solutions.
    5,000+ downloads · Widely used by the community and by DeepCoder.
    [Dataset]
  • IconStack-48M: The largest open-source icon dataset with 48 million images, SVGs, captions, and rich metadata.
    [Dataset]

Service

Peer Reviewer for:

  • ICLR (2024, 2025, 2026)
  • CVPR (2025)
  • ECCV (2026)
  • ACL (2025)
  • EMNLP (2025)
  • AAAI (2025)
  • ACMMM (2025)