GAO, Yuan

Adjunct Assistant Professor

Education Background

Ph.D. (Uppsala University)

Master (University of Helsinki)

Research Field

Robot Learning algorithms, heterogeneous multi-robot systems, machine behaviour

Academic Area

Artificial Intelligence and Robotics

Personal Website

https://scholar.google.com/citations?user=HgOAYUAAAAAJ&hl=en%20https%3A//gaoyua…

gaoyuan@cuhk.edu.cn

Biography

Prof. Yuan Gao currently serves as an Adjunct Assistant Professor and PhD supervisor at The Chinese University of Hong Kong, Shenzhen, and as an Associate Research Fellow at the Shenzhen Institute of Artificial Intelligence and Robotics for Society (AIRS), where he is also a principal investigator for the embodied heterogeneous multi-robot systems direction. Recognized as a Shenzhen overseas high-level talent, Prof. Gao has contributed to robotics research through international projects such as Sweden's SSF "Co-adaptive human-robot interactive systems" and the EU Horizon 2020 ANIMATAS project.

He plays key roles in major national initiatives, serving as sub-project leader/unit leader for the National Key R&D Program "Natural Human-Computer Interaction Hardware and Software Systems for Hybrid Intelligence" and as a core participant in the project "5G-Based Autonomous Collaborative Technology for Heterogeneous Multi-Robot Systems in Dynamic Open Environments". He is also project leader for the Guangdong Provincial Natural Science Fund project "Research on Large-Scale Intelligent Heterogeneous Multi-Robot Systems Based on Graph Representation and Multi-Agent Reinforcement Learning" and a core member and scientific metrics leader for the Shenzhen Peacock Team Project on Mobile Collaborative Robot Innovation and Entrepreneurship.

His research focuses on heterogeneous multi-robot systems, robot learning algorithms (especially multi-agent reinforcement learning), multi-robot collaboration strategies, multi-sensor fusion, and the application of graph representations and large models (LLMs/VLMs) in robotics. Prof. Gao has published over 40 papers in journals and conferences including IEEE T-RO, IEEE TITS, IEEE IoT-J, IEEE T-MECH, ACM IMWUT, ACL, ACM CHI, IEEE RA-L, NeurIPS, ICRA, and IROS. Recent highlights include work on asymmetric self-play for multi-robot catching and on dynamic self-organization using multi-modal large models.

---

"How do we understand human society: through a scientific method, a professionally designed system, or your own curiosity?" As we grow, peculiar questions make us pause and ponder life, and like many others, I have become a researcher of some of these unknowns. Seeking answers is a long process that involves detours, appreciating the scenery along the way, existential reflection, and hard work.

I am intrigued by heterogeneous multi-robot systems, both as a means of data interaction and as a lens on human social dynamics, much as an astronomer is interested in both telescopes and distant stars. This passion has led me to pursue an extensive, complex, intelligent, and interactive robotic society that mirrors human society and enhances our self-understanding.

This goal draws on robot learning and control, natural language processing, image processing, neuroscience, and computational psychology. I am particularly interested in deep learning, reinforcement learning, and neural methods for robot perception, control, and physical modeling of the environment. These approaches not only help us understand ourselves but also lay the groundwork for a unified learning framework for adaptive, efficient, and robust complex heterogeneous robot systems.

Academic Publications

# denotes a first or corresponding author

1. Changheng Cai, Fei Xiao, Marcellus Vanza, Jian Zhu# and Yuan Gao#, et al. (2025) Multimodal Deformation Estimation of Soft Pneumatic Gripper During Operation, IEEE IROS.

2. Tianqiang Yan, Ziqiao Lin, Lin Zhang, Zhenglong Sun, Yuan Gao#, et al. (2025) Entrospect: Information-Theoretic Self-Reflection Elicits Better Response Refinement of Small Language Models, ACL.

3. Ping Feng, Tingting Yang#, Mingyang Liang, Lin Wang, Yuan Gao#, et al. (2025) OC-HMAS: Dynamic Self-Organization and Self-Correction in Heterogeneous Multi-Agent Systems Using Multi-Modal Large Models, IEEE Internet of Things Journal.

4. Wenqiang Lai, Tianwei Zhang, Tin Lun Lam, Yuan Gao#, et al. Vision-language Model-based Physical Reasoning for Robot Liquid Perception

5. Chongyang Wang, Gao, Y.#, et al. (2024). PepperPose: Full-Body Pose Estimation with a Companion Robot. ACM CHI.

6. Gao, Y., Chen, J., Chen, X., Wang, C., Hu, J., Deng, F., & Lam, T. L. (2023). Asymmetric Self-Play-Enabled Intelligent Heterogeneous Multirobot Catching System Using Deep Multiagent Reinforcement Learning. IEEE Transactions on Robotics.

7. Guan, H., Gao, Y., Zhao, M., Yang, Y., Deng, F., & Lam, T. L. (2022). AB-Mapper: Attention and BicNet based Multi-agent Path Planning for Dynamic Environment. 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 13799–13806.

8. Gao, Y., Sibirtseva, E., Castellano, G., & Kragic, D. (2020). Fast adaptation with meta-reinforcement learning for trust modelling in human-robot interaction. 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 305–312.

9. Gao, Y., Yang, F., Frisk, M., Hernandez, D., Peters, C., & Castellano, G. (2019). Learning socially appropriate robot approaching behavior toward groups using deep reinforcement learning. 2019 28th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), 1–8.

10. Gao, Y., Barendregt, W., Obaid, M., & Castellano, G. (2018). When robot personalisation does not help: Insights from a robot-supported learning study. 2018 27th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), 705–712.

11. Gao, Y., Wallkötter, S., Obaid, M., & Castellano, G. (2018). Investigating deep learning approaches for human-robot proxemics. 2018 27th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), 1093–1098.

12. Gao, Y., & Glowacka, D. (2016). Deep gate recurrent neural network. Asian Conference on Machine Learning, 350–365.

13. Gao, Y., Ilves, K., & Głowacka, D. (2015). Officehours: A system for student supervisor matching through reinforcement learning. Proceedings of the 20th International Conference on Intelligent User Interfaces (IUI), 29–32.

14. Junjie Hu, Chenyou Fan, Mete Ozay, Hua Feng, Yuan Gao, Tin Lun Lam (2025), Unlocking Drone Perception in Low AGL Heights: Progressive Semi-Supervised Learning for Ground-to-Aerial Perception Knowledge Transfer, IEEE Transactions on Intelligent Transportation Systems (TITS).

15. Chen, J., Gao, Y., Hu, J., Deng, F., & Lam, T. L. (2024). Meta Reinforcement Learning Based Sensor Scanning in 3D Uncertain Environments for Heterogeneous Multi-Robot Systems. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

16. Berthouze, N., Wang, C., Gao, Y., Fan, C., Hu, J., Lam, T., & Lane, N. (2023). Learn2Agree: Fitting With Multiple Annotators Without Objective Ground Truth.

17. Chen, J., Deng, F., Gao, Y., Hu, J., Guo, X., Liang, G., & Lam, T. L. (2023). Multirobolearn: An open-source framework for multi-robot deep reinforcement learning. 2023 IEEE International Conference on Robotics and Biomimetics (ROBIO), 1–6.

18. Hu, J., Fan, C., Jiang, H., Guo, X., Gao, Y., Lu, X., & Lam, T. L. (2023). Boosting lightweight depth estimation via knowledge distillation. International Conference on Knowledge Science, Engineering and Management, 27–39.

19. Wang, C., Gao, Y., Fan, C., Hu, J., Lam, T. L., Lane, N. D., & Bianchi-Berthouze, N. (2023). Learn2agree: Fitting with multiple annotators without objective ground truth. International Workshop on Trustworthy Machine Learning for Healthcare, 147–162.

20. Wang, Y., Lin, M., Xie, X., Gao, Y., Deng, F., & Lam, T. L. (2023). Asymptotically Efficient Estimator for Range-Based Robot Relative Localization. IEEE/ASME Transactions on Mechatronics.

21. Zhang, H., Luo, J., Gao, Y., & Ma, W. (2023). An intention inference method for the space non-cooperative target based on BiGRU-Self Attention. Advances in Space Research.

22. Ahlberg, S., Axelsson, A., Yu, P., Cortez, W. S., Gao, Y., Ghadirzadeh, A., Castellano, G., Kragic, D., Skantze, G., & Dimarogonas, D. V. (2022). Co-adaptive Human–Robot Cooperation: Summary and Challenges. Unmanned Systems, 10(2), 187–203.

23. Ahmad, M. I., Gao, Y., Alnajjar, F., Shahid, S., & Mubin, O. (2022). Emotion and memory model for social robots: a reinforcement learning based behaviour selection. Behaviour & Information Technology, 41(15), 3210–3236.

24. Chen, X., Ghadirzadeh, A., Yu, T., Wang, J., Gao, A. Y., Li, W., Bin, L., Finn, C., & Zhang, C. (2022). Lapo: Latent-variable advantage-weighted policy optimization for offline reinforcement learning. Advances in Neural Information Processing Systems, 35, 36902–36913.

25. Deng, F., Feng, H., Liang, M., Feng, Q., Yi, N., Yang, Y., Gao, Y., Chen, J., & Lam, T. L. (2022). Abnormal Occupancy Grid Map Recognition using Attention Network. 2022 International Conference on Robotics and Automation (ICRA), 8666–8672.

26. Hu, J., Fan, C., Ozay, M., Feng, H., Gao, Y., & Lam, T. L. (2022). Progressive self-distillation for ground-to-aerial perception knowledge transfer. ArXiv Preprint ArXiv:2208.13404.

27. Peng, M., Wang, C., Gao, Y., Shi, Y., & Zhou, X.-D. (2022). Multilevel hierarchical network with multiscale sampling for video question answering. ArXiv Preprint ArXiv:2205.04061.

28. Tang, J., Gao, Y., & Lam, T. L. (2022). Learning to Coordinate for a Worker-Station Multi-robot System in Planar Coverage Tasks. IEEE Robotics and Automation Letters, 7(4), 12315–12322.

29. Deng, F., Feng, H., Liang, M., Wang, H., Yang, Y., Gao, Y., Chen, J., Hu, J., Guo, X., & Lam, T. L. (2021). FEANet: Feature-enhanced attention network for RGB-thermal real-time semantic segmentation. 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 4467–4473.

30. Guan, H., Gao, Y., Zhao, M., Yang, Y., Deng, F., & Lam, T. L. (2021). AB-Mapper: Attention and BicNet Based Multi-agent Path Finding for Dynamic Crowded Environment. ArXiv Preprint ArXiv:2110.00760.

31. Peng, M., Wang, C., Gao, Y., Shi, Y., & Zhou, X.-D. (2021). Temporal pyramid transformer with multimodal interaction for video question answering. ArXiv Preprint ArXiv:2109.04735.

32. Wang, C., Gao, Y., Fan, C., Hu, J., Lam, T. L., Lane, N. D., & Bianchi-Berthouze, N. (2021). AgreementLearning: An End-to-End Framework for Learning with Multiple Annotators without Groundtruth. ArXiv Preprint ArXiv:2109.03596.

33. Wang, C., Gao, Y., Mathur, A., De C. Williams, A. C., Lane, N. D., & Bianchi-Berthouze, N. (2021). Leveraging activity recognition to enable protective behavior detection in continuous data. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 5(2), 1–27.

34. Yang, F., Gao, Y., Ma, R., Zojaji, S., Castellano, G., & Peters, C. (2021). A dataset of human and robot approach behaviors into small free-standing conversational groups. PloS One, 16(2), e0247364.

35. Li, Chengxi, Castellano, G., & Gao, Y. (2020). Efficient Learning of Socially Aware Robot Approaching Behavior Toward Groups via Meta-Reinforcement Learning. IEEE/RSJ International Conference on Intelligent Robots and Systems, 12156–12159.

36. Chen, X., Gao, Y., Ghadirzadeh, A., Bjorkman, M., Castellano, G., & Jensfelt, P. (2019). Skew-explore: Learn faster in continuous spaces with sparse rewards.

37. Hernandez, D., Denamganaï, K., Gao, Y., York, P., Devlin, S., Samothrakis, S., & Walker, J. A. (2019). A generalized framework for self-play training. 2019 IEEE Conference on Games (CoG), 1–8.

38. Li, Chengjie, Androulakaki, T., Gao, A. Y., Yang, F., Saikia, H., Peters, C., & Skantze, G. (2018). Effects of posture and embodiment on social distance in human-agent interaction in mixed reality. Proceedings of the 18th International Conference on Intelligent Virtual Agents, 191–196.

39. Zhang, P., Gao, A. Y., & Theel, O. (2018). Bandit learning with concurrent transmissions for energy-efficient flooding in sensor networks. EAI Endorsed Transactions on Industrial Networks and Intelligent Systems, 4(13).

40. Gao, A. Y., Barendregt, W., & Castellano, G. (2017). Personalised human-robot co-adaptation in instructional settings using reinforcement learning. IVA Workshop on Persuasive Embodied Agents for Behavior Change: PEACH 2017, August 27, Stockholm, Sweden.

41. Obaid, M., Gao, Y., Barendregt, W., & Castellano, G. (2017). Exploring users’ reactions towards tangible implicit probes for measuring human-robot engagement. Social Robotics: 9th International Conference, ICSR 2017, Tsukuba, Japan, November 22-24, 2017, Proceedings 9, 402–412.

42. Zhang, P., Gao, A. Y., & Theel, O. (2017). Less is more: Learning more with concurrent transmissions for energy-efficient flooding. Proceedings of the 14th EAI International Conference on Mobile and Ubiquitous Systems: Computing, Networking and Services, 323–332.

43. Peng, M., Wang, C., Gao, Y., Bi, T., Chen, T., Shi, Y., & Zhou, X.-D. (2020). Recognizing micro-expression in video clip with adaptive key-frame mining. IJCAI