HuB: Learning Extreme Humanoid Balance

1Tsinghua University, 2Shanghai Qi Zhi Institute, 3Shanghai Artificial Intelligence Laboratory, 4Tongji University, 5UC Berkeley, 6UC San Diego
*Equal Contribution

Extreme Humanoid Balance

(Played at the original speed)

Swallow Balance

Bruce Lee's Kick

Ne Zha Pose

High Knees

Single-Leg Stand

Deep Squat

External Perturbations

(Played at the original speed)

Swallow Balance

Bruce Lee's Kick

Ne Zha Pose

Single-Leg Stand

Deep Squat

Long-Horizon Execution

(Played at the original speed)

Bruce Lee's Kick (10+ consecutive executions)

Comparative Results

(Played at the original speed)

Evaluated on task

HuB

OmniH2O

Abstract

The human body demonstrates exceptional motor capabilities, such as standing steadily on one foot or performing a high kick with the leg raised over 1.5 meters, both of which require precise balance control. While recent research on humanoid control has leveraged reinforcement learning to track human motions for skill acquisition, applying this paradigm to balance-intensive tasks remains challenging.

In this work, we identify three key obstacles: instability from reference motion errors, learning difficulties due to morphological mismatch, and the sim-to-real gap caused by sensor noise and unmodeled dynamics. To address these challenges, we propose HuB (Humanoid Balance), a unified framework that integrates reference motion refinement, balance-aware policy learning, and sim-to-real robustness training, with each component targeting a specific challenge.

We validate our approach on the Unitree G1 humanoid robot across challenging quasi-static balance tasks, including extreme single-legged poses such as Swallow Balance and Bruce Lee's Kick. Our policy remains stable even under strong physical disturbances—such as a forceful soccer strike—while baseline methods consistently fail to complete these tasks.

Learning Framework for Extreme Humanoid Balance

HuB Overview. To tackle the challenges of extreme balance tasks on humanoids, HuB integrates three components: (a) a motion refinement process that improves the quality and feasibility of reference motions; (b) a balance-aware policy learning strategy that enables stable execution of challenging balance motions; and (c) a robustness training mechanism to improve sim-to-real consistency and deployment stability.
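
To make components (b) and (c) more concrete, the following is a minimal Python sketch of two generic ingredients often used for quasi-static balance and sim-to-real training. It is not the HuB implementation. The function names, the Gaussian reward shape, and the values of `sigma` and `noise_std` are all assumptions chosen for illustration. The first function rewards keeping the ground projection of the center of mass (CoM) close to the support foot, and the second injects zero-mean Gaussian noise into observations to mimic sensor noise during training.

```python
import math
import random


def com_support_distance(com_xy, foot_xy):
    """Horizontal distance (m) between the CoM ground projection and the support-foot center."""
    return math.hypot(com_xy[0] - foot_xy[0], com_xy[1] - foot_xy[1])


def balance_reward(com_xy, foot_xy, sigma=0.05):
    """Hypothetical shaping term in (0, 1].

    Equals 1.0 when the CoM projection sits exactly over the foot center and
    decays smoothly as it drifts away. `sigma` (m) is an assumed length scale.
    """
    d = com_support_distance(com_xy, foot_xy)
    return math.exp(-((d / sigma) ** 2))


def noisy_observation(obs, noise_std, rng):
    """Return a copy of `obs` with zero-mean Gaussian noise added to every entry.

    This mimics sensor noise during simulated training; `noise_std` is an
    assumed value, not one taken from the paper.
    """
    return [x + rng.gauss(0.0, noise_std) for x in obs]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the demo is repeatable
    print(balance_reward((0.00, 0.00), (0.0, 0.0)))  # CoM directly over the foot
    print(balance_reward((0.10, 0.00), (0.0, 0.0)))  # CoM 10 cm away, lower reward
    print(noisy_observation([0.1, -0.2, 0.3], noise_std=0.01, rng=rng))
```

In a real training setup, terms like these would be combined with motion-tracking rewards and other randomization (for example, random pushes), and the length scale and noise level would need tuning for the specific robot.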