Siyu Xu is a final-year M.Phil student at The University of Sydney, supervised by A/Prof. Chang Xu. His research focuses on large multimodal models (LMMs), particularly Vision-Language-Action (VLA) models. His goal is to develop efficient and general-purpose robotic systems capable of understanding and interacting with the world through vision, language, and actions, enabling them to seamlessly assist humans in various real-world tasks and everyday life.
🔥 News
- 2025.01: 🎉🎉 One paper accepted to NAACL 2025 Findings.
📝 Publications

Siyu Xu, Yunke Wang, Chenghao Xia, Dihao Zhu, Tao Huang, Chang Xu
[Under review] [Project]

CollagePrompt: A Benchmark for Budget-Friendly Visual Recognition with GPT-4V
Siyu Xu, Yunke Wang, Daochang Liu, Bo Du, Chang Xu
Conference of the Nations of the Americas Chapter of the ACL (NAACL), 2025
🎖 Honors and Awards
- 2022.05 Second Class Prize - ASC22 Student Supercomputer Challenge, Asia Supercomputer Community
- 2020.10 First Place Award - Pre-training for Video Captioning Challenge, ACM International Conference on Multimedia
- 2020.10 7th Place - Artificial Intelligence Competition for Video Generation Challenge, Zhejiang Lab
- 2020.05 9th Place - International Audio and Video Algorithm Optimization Competition, Mongo Media
🧑‍🏫 Teaching
- 2024 S1, Tutor of COMP5329, Deep Learning, USYD
- 2023 S2, Tutor of COMP5328, Advanced Machine Learning, USYD
📖 Education
- 2023.07 - 2025.06, M.Phil in Computer Vision, University of Sydney
- 2022.08 - 2023.06, Master of Information Technology, University of Sydney
💻 Internships
- 2021.03 - 2021.07, Matrixtime Robotics, China