Xiaoshuai Hao

   Researcher at BAAI

    xshao@baai.ac.cn

        

I am currently a researcher at the Beijing Academy of Artificial Intelligence, focusing on embodied multimodal large models. Previously, I received my Ph.D. from the Institute of Information Engineering at the Chinese Academy of Sciences, advised by Prof. Bo Li.


We have several academic visitor and intern positions open at the Beijing Academy of Artificial Intelligence. We actively work on Multimodal Retrieval, Multi-Modal Learning, Automatic Driving Perception, and Embodied Intelligence. If you like what we do, don't hesitate to contact me.



Research Interests

  • Multimodal Retrieval
  • Multi-Modal Learning
  • Automatic Driving Perception
  • Embodied Intelligence

Industrial Experience

   

Amazon Web Services


2021.09-2023.01
Mentors: Yi Zhu, Mu Li
   

Samsung Research China - Beijing (SRC-B)

2023.01-2024.09
Mentors: Hui Zhang, Weiming Li
   

Beijing Academy of Artificial Intelligence

2024.09-Present
Mentor: Zhongyuan Wang
   

Recent Publications

* equal contributions     ‡ project lead     § corresponding author


A Hierarchical Reinforcement Learning Framework for Multi-UAV Combat Using Leader-Follower Strategy

Jinhui Pang, Jinglin He, Noureldin Mohamed Abdelaal Ahmed Mohamed, Changqing Lin, Zhihui Zhang, Xiaoshuai Hao§
PDF

RoboBrain: A Unified Brain Model for Robotic Manipulation from Abstract to Concrete

Yuheng Ji*, Huajie Tan*, Jiayu Shi*, Xiaoshuai Hao* ‡, Yuan Zhang, et al.
PDF   |   Home

TASAR: Transfer-Based Attack on Skeletal Action Recognition

Yunfeng Diao, Baiqi Wu§, Ruixuan Zhang, Ajian Liu, Xiaoshuai Hao, Xingxing Wei, Meng Wang, He Wang§
PDF   |   Code

AS-GCL: Asymmetric Spectral Augmentation on Graph Contrastive Learning

Ruyue Liu, Rong Yin§, Yong Liu, Xiaoshuai Hao, Haichao Shi, Can Ma, Weiping Wang
PDF

MapFusion: A novel BEV feature fusion network for multi-modal map construction

Xiaoshuai Hao, Yunfeng Diao§, Mengchuan Wei, Yifan Yang, Peng Hao, Rong Yin§, Hui Zhang, et al.
PDF

STViT+: improving self-supervised multi-camera depth estimation with spatial-temporal context and adversarial geometry regularization

Zhuo Chen*, Haimei Zhao*, Xiaoshuai Hao, Bo Yuan, Xiu Li
PDF

Is Your HD Map Constructor Reliable under Sensor Corruptions?

Xiaoshuai Hao, Mengchuan Wei, Yifan Yang, Haimei Zhao, et al.
PDF   |   Home

MapDistill: Boosting Efficient Camera-based HD Map Construction via Camera-LiDAR Fusion Model Distillation

Xiaoshuai Hao*, Ruikai Li*, Hui Zhang, Dingzhe Li, Rong Yin, et al.
PDF

KALAHash: Knowledge-Anchored Low-Resource Adaptation for Deep Hashing

Shu Zhao, Tan Yu, Xiaoshuai Hao, Wenchao Ma, Vijaykrishnan Narayanan
PDF   |   Code

FTF-ER: Feature-Topology Fusion-Based Experience Replay Method for Continual Graph Learning

Jinhui Pang, Changqing Lin, Xiaoshuai Hao§, Rong Yin, Zixuan Wang, et al.
PDF   |   Code

MBFusion: A New Multi-modal BEV Feature Fusion Method for HD Map Construction

Xiaoshuai Hao, Hui Zhang, Yifan Yang, Yi Zhou, Sangil Jung, et al.
PDF

Customized Treatment per Pixel for Blind Image Super-Resolution

Guanqun Liu, Xiaoshuai Hao§
PDF

Enhancing 3D Hand Pose Estimation via Dense Ordinal Regression Network

Yamin Mao, Zhihua Liu, Weiming Li, SoonYong Cho, Qiang Wang, Xiaoshuai Hao§
PDF

ESC-MISR: Enhancing Spatial Correlations for Multi-Image Super-Resolution in Remote Sensing

Zhihui Zhang, Jinhui Pang, Jianan Li, Xiaoshuai Hao
Best Paper Candidate Award
PDF

Dual Alignment Unsupervised Domain Adaptation for Video-Text Retrieval

Xiaoshuai Hao, Wanqian Zhang§, Dayan Wu, Fei Zhu, Bo Li
PDF

Uncertainty-Aware Alignment Network for Cross-Domain Video-Text Retrieval

Xiaoshuai Hao, Wanqian Zhang§
PDF

MixGen: A New Multi-Modal Data Augmentation

Xiaoshuai Hao*, Yi Zhu*, Srikar Appalaraju*, Aston Zhang, Wanqian Zhang, Bo Li, Mu Li
PDF

Listen and Look: Multi-Modal Aggregation and Co-Attention Network for Video-Audio Retrieval

Xiaoshuai Hao, Wanqian Zhang§, Dayan Wu, Fei Zhu, Bo Li
PDF

Multi-Feature Graph Attention Network for Cross-Modal Video-Text Retrieval

Xiaoshuai Hao, Yucan Zhou§, Dayan Wu, Wanqian Zhang, Bo Li, Weiping Wang
PDF

What Matters: Attentive and Relational Feature Aggregation Network for Video-Text Retrieval

Xiaoshuai Hao, Yucan Zhou§, Dayan Wu, Wanqian Zhang, Bo Li, Weiping Wang, Dan Meng
PDF

Unpublished Manuscripts

* equal contributions     ‡ project lead     § corresponding author


TLA: Tactile-Language-Action Model for Contact-Rich Manipulation

Peng Hao*, Chaofan Zhang*, Dingzhe Li, Xiaoge Cao, Xiaoshuai Hao, Shaowei Cui, Shuo Wang
arXiv
PDF

AffordGrasp: In-Context Affordance Reasoning for Open-Vocabulary Task-Oriented Grasping in Clutter

Yingbo Tang, Shuaike Zhang, Xiaoshuai Hao§ ‡, Pengwei Wang, Jianlong Wu, Zhongyuan Wang, Shanghang Zhang
arXiv
PDF

Enhancing Adversarial Robustness of Vision-Language Models through Low-Rank Adaptation

Yuheng Ji*, Yue Liu*, Zhicheng Zhang, Zhao Zhang, Yuting Zhao, Xiaoshuai Hao, Gang Zhou, Xingwei Zhang, Xiaolong Zheng§
arXiv
PDF

What Foundation Models can Bring for Robot Learning in Manipulation: A Survey

Dingzhe Li, Yixiang Jin, YuHao Sun, Yong A, Hongze Yu, Jun Shi, Xiaoshuai Hao, et al.
arXiv
PDF

BCTR: Bidirectional Conditioning Transformer for Scene Graph Generation

Peng Hao, Xiaobing Wang, Yingying Jiang, Hanchao Jia, Xiaoshuai Hao§
arXiv
PDF

Communication-Efficient Personalized Federal Graph Learning via Low-Rank Decomposition

Ruyue Liu, Rong Yin§, Xiangzhen Bo, Xiaoshuai Hao, Xingrui Zhou, Yong Liu, Can Ma, Weiping Wang
arXiv
PDF

DWCL: Dual-Weighted Contrastive Learning for Multi-View Clustering

Hanning Yuan*, Zhihui Zhang*, Lianhua Chi, Qi Guo, Sijie Ruan, Jinhui Pang§, Xiaoshuai Hao§
arXiv
PDF

MapNav: A Novel Memory Representation via Annotated Semantic Maps for VLM-based Vision-and-Language Navigation

Lingfeng Zhang*, Xiaoshuai Hao* ‡, Qinwen Xu, Qiang Zhang, Xinyao Zhang, Pengwei Wang, Jing Zhang, Zhongyuan Wang, Shanghang Zhang§, Renjing Xu§
arXiv
PDF

MSC-Bench: Benchmarking and Analyzing Multi-Sensor Corruption for Driving Perception

Xiaoshuai Hao, Guanqun Liu, Yuting Zhao, Yuheng Ji, Mengchuan Wei, Haimei Zhao, Lingdong Kong, Rong Yin, Yu Liu
arXiv
PDF   |   Home

International Competition


EPIC-Kitchens Dataset Challenges
Multi-Instance Action Retrieval Track 2021

Xiaoshuai Hao, Wanqian Zhang, Dejie Yang, Shu Zhao, Dayan Wu, Bo Li, Weiping Wang
First Place
IEEE/CVF Computer Vision and Pattern Recognition (CVPR)

EPIC-Kitchens Dataset Challenges
Interaction Recognition Track 2023

Yuqi Li, Yizhi Luo, Xiaoshuai Hao, Chuanguang Yang, Zhulin An, Dantong Song, Wei Yi
Third Place Award
IEEE/CVF Computer Vision and Pattern Recognition (CVPR)

EPIC-Kitchens Dataset Challenges
Multi-Instance Action Retrieval Track 2022

Xiaoshuai Hao, Yufan Liu, Wanqian Zhang, Dayan Wu, Bo Li
Third Place Award (Joint)
IEEE/CVF Computer Vision and Pattern Recognition (CVPR)

The RoboDrive Challenge
Track 2: Robust Map Segmentation

Xiaoshuai Hao, Yifan Yang, Hui Zhang, Mengchuan Wei, Yi Zhou, Haimei Zhao, Jing Zhang
The Innovative Solution (Honorable Mention)
IEEE International Conference on Robotics and Automation (ICRA)

The RoboDrive Challenge
Track 2: Robust Map Segmentation

Xiaoshuai Hao, Yifan Yang, Hui Zhang, Mengchuan Wei, Yi Zhou, Haimei Zhao, Jing Zhang
The 3rd place in the category
IEEE International Conference on Robotics and Automation (ICRA)

A Challenge for Out-of-Distribution Generalization in Computer Vision (OOD-CV)
OOD-CV: Classification Track (Self-Supervised)

Yuqi Li, Yizhi Luo, Chuanguang Yang, Zhulin An, Xiaoshuai Hao, Yihang Zhou
Third Place
IEEE/CVF International Conference on Computer Vision (ICCV)

Academic Services

Conference Reviewer

  • IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  • IEEE/CVF International Conference on Computer Vision (ICCV)
  • European Conference on Computer Vision (ECCV)
  • Conference on Neural Information Processing Systems (NeurIPS)
  • International Conference on Learning Representations (ICLR)
  • International Conference on Machine Learning (ICML)
  • AAAI Conference on Artificial Intelligence (AAAI)
  • IEEE International Conference on Robotics and Automation (ICRA)
  • IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

Journal Reviewer

  • International Journal of Computer Vision (IJCV)
  • IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
  • IEEE Transactions on Neural Networks and Learning Systems (TNNLS)
  • IEEE Transactions on Intelligent Vehicles (TIV)
  • IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)
  • IEEE Transactions on Multimedia (TMM)
  • IEEE Robotics and Automation Letters (RA-L)