End-to-end deployment pipeline: retarget → MuJoCo → real X2
The full deployment pipeline on a single canonical natural-walk motion. Left: the retargeted reference motion (mocap retargeted onto the X2 morphology) — the upstream input to training. Middle: the trained whole-body controller checkpoint replayed in MuJoCo. Right: the same checkpoint deployed on the AgiBot X2 Ultra hardware. Reference, simulator, real — three stages of one pipeline on the same motion.
1 Abstract
We deploy a learned whole-body controller on the AgiBot X2 Ultra (31 DoF, 14-DoF dexterous hands) — to our knowledge the first publicly documented Sonic-family policy on a non-G1 humanoid. Reaching the real robot from upstream Sonic required closing two sequential transfers: a sim-to-sim transfer from the training simulator (IsaacLab) to the deployment evaluation simulator (MuJoCo), and a sim-to-real transfer from MuJoCo to hardware through a closed-API vendor motion controller.
Adapting upstream Sonic surfaced gaps at every layer. Motion retargeting introduced arm-flip and wrist-clamp failures rooted in inverse-kinematics assumptions specific to the original platform. The sim-to-sim transfer was blocked by three independent observation- and encoding-layer bugs that masqueraded as physics-tuning problems: a foot-collision URDF mismatch in the training distribution; a 6D-rotation channel-order error that survived our tolerance-based parity checks because permutations preserve aggregate norms; and a tokenizer-layout error at the runtime-to-export adapter boundary.
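The norm-preservation point can be illustrated with a minimal sketch. It assumes two plausible flattenings of the 6D rotation representation (the first two columns of a rotation matrix): column-by-column and row-by-row. The source does not say which layouts were actually swapped, and the helper names below are hypothetical, not taken from the Sonic codebase.

```python
import math


def rot_z(theta):
    """Rotation matrix for a rotation of `theta` radians about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def six_d_column_major(R):
    # Assumed layout A: first column, then second column.
    return [R[0][0], R[1][0], R[2][0], R[0][1], R[1][1], R[2][1]]


def six_d_row_major(R):
    # Assumed layout B: the same six entries, read row by row.
    return [R[0][0], R[0][1], R[1][0], R[1][1], R[2][0], R[2][1]]


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def norm_parity(a, b, tol=1e-6):
    # Aggregate check: compares only the L2 norms of the two vectors.
    return abs(norm(a) - norm(b)) <= tol


def elementwise_parity(a, b, tol=1e-6):
    # Channel-by-channel check: sensitive to ordering.
    return all(abs(x - y) <= tol for x, y in zip(a, b))


R = rot_z(0.3)
a = six_d_column_major(R)
b = six_d_row_major(R)

print("norm parity:", norm_parity(a, b))
print("elementwise parity:", elementwise_parity(a, b))
```

Both layouts contain the same six numbers, so any norm-based comparison passes, while the channel-wise comparison fails. A parity check that compares channels individually against a reference with a known layout would catch this class of error.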
The sim-to-real handoff produced roughly 1.6 s of dual-publisher whir because the vendor controller exposed only coarse start/stop primitives and no graceful-overlap command; a persistent-client mode escalator with controller-reported mode as ground truth reduced measured dwell in the intermediate joint-default state to 0.20 s, and we compose a finite-state protocol around the same primitives so the operator hears a single continuous handoff. We close with a four-class failure-mode taxonomy — bug, physics-divergence, handoff, and tuning — intended as an organising principle for subsequent humanoid deployments.
2 Pipeline
3 Key Takeaways
- Two sequential transfers, not one
  We treat IsaacLab → MuJoCo as a first-class sim-to-sim bridge, then MuJoCo → hardware as sim-to-real. Each layer gets its own validation substrate; this separation is what made debugging tractable.
- Three observation/encoding bugs masqueraded as physics
  A foot-collision URDF mismatch, a 6D-rotation channel-order error (norms hide permutations), and a tokenizer-layout error at the export-adapter boundary. Each was confirmed by a controlled fine-tune validation rather than tolerance-based parity alone.
- Dual-publisher whir: 1.6 s → 0.20 s
  A persistent-client mode escalator using controller-reported mode as ground truth, wrapped in a finite-state protocol around the vendor's coarse start/stop primitives, collapses the audible handoff into a single continuous transition.
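The handoff idea in the last takeaway can be sketched as a small state machine. This is an illustration under stated assumptions, not the deployed protocol: the mode names, the `FakeController` stand-in, and its `request_mode` / `tick` / `reported_mode` methods are all hypothetical, since the vendor API is closed. The one property it tries to show is that the escalator advances only when the controller itself reports the requested mode, while reusing a single client object across the whole ladder.

```python
from enum import Enum


class Mode(Enum):
    # Hypothetical mode names; the vendor's real mode set is not public.
    VENDOR_MOTION = "vendor_motion"
    JOINT_DEFAULT = "joint_default"
    POLICY = "policy"


class FakeController:
    """Stand-in for the closed-API controller. A request takes effect only
    after `latency_polls` ticks; requests made while one is pending are ignored."""

    def __init__(self, latency_polls=2):
        self.mode = Mode.VENDOR_MOTION
        self._pending = None
        self._countdown = 0
        self._latency = latency_polls

    def request_mode(self, target):
        if self._pending is None and target is not self.mode:
            self._pending = target
            self._countdown = self._latency

    def tick(self):
        if self._pending is not None:
            self._countdown -= 1
            if self._countdown <= 0:
                self.mode, self._pending = self._pending, None

    def reported_mode(self):
        return self.mode


def escalate(controller, ladder, max_polls=50):
    """Walk the mode ladder, treating the reported mode as ground truth.

    Returns the sequence of reported modes, one entry per poll."""
    trace = []
    for target in ladder:
        for _ in range(max_polls):
            controller.request_mode(target)
            controller.tick()
            current = controller.reported_mode()
            trace.append(current)
            if current is target:
                break  # advance only once the controller confirms the mode
        else:
            raise TimeoutError(f"controller never reported {target.name}")
    return trace


controller = FakeController(latency_polls=2)
trace = escalate(controller, [Mode.JOINT_DEFAULT, Mode.POLICY])
print([m.name for m in trace])
```

In this toy model, dwell in the intermediate state is the number of polls that report `JOINT_DEFAULT`. On real hardware the equivalent quantity would be measured in seconds from controller-reported timestamps.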
Ablation: foot-collision URDF fix (run 1 vs run 2)
One ablation, six rollouts, same natural-walk motion. The top row shows the 2k, 6k, and 16k checkpoints from run 1 (original foot-collision URDF) — all three collapse in under 6 s. The bottom row shows the same checkpoints from run 2 with the foot-collision URDF fix — all three run the full length without failing. A single observation-layer fix swings the success rate from 0/3 to 3/3 across checkpoints, which is exactly why we treat this class of bug as "masquerading as physics" rather than a tuning problem.
4 Sim-to-Real Anchor Archive
What three matched recordings can — and cannot — prove
A committed archive of three matched simulator–hardware recordings from a single canonical checkpoint validates the URDF/MJCF kinematic chain and torso inertial model: per-DoF RMS error stays within roughly 5°, and base IMU angular velocity agrees to within roughly 5%. We explicitly scope what the archive does not cover: foot contact, friction, free-base dynamics, actuator saturation, and sensor noise are not, and cannot be, addressed by these recordings. We close with a set of policy-free bench tests that would convert this indirect closed-loop evidence into direct, component-level validation.
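The two headline metrics could be computed along the following lines. This is a sketch under assumptions the text does not confirm: that the real and simulated recordings are already time-aligned and sampled at the same rate, that joint angles are stored in radians, and that the percentage figure is an RMS error normalised by the RMS of the real signal. The function names are hypothetical.

```python
import math


def per_dof_rms_deg(real, sim):
    """RMS joint-angle error per DoF, in degrees.

    `real` and `sim` are equal-length lists of frames; each frame is a
    list of joint angles in radians (assumed already time-aligned)."""
    n_frames = len(real)
    n_dof = len(real[0])
    rms = []
    for j in range(n_dof):
        sq_sum = sum((real[t][j] - sim[t][j]) ** 2 for t in range(n_frames))
        rms.append(math.degrees(math.sqrt(sq_sum / n_frames)))
    return rms


def relative_rms_error(real, sim):
    """RMS of (real - sim) divided by RMS of real, for one scalar channel."""
    n = len(real)
    err = math.sqrt(sum((r - s) ** 2 for r, s in zip(real, sim)) / n)
    ref = math.sqrt(sum(r * r for r in real) / n)
    return err / ref


# Synthetic data: DoF 0 carries a constant 3 degree offset, DoF 1 matches exactly.
offset = math.radians(3.0)
real_q = [[0.0, 0.1], [0.2, 0.3], [0.4, 0.5]]
sim_q = [[q0 + offset, q1] for q0, q1 in real_q]
joint_rms = per_dof_rms_deg(real_q, sim_q)

# Synthetic angular-velocity channel: sim overshoots real by 5 percent.
real_w = [1.0, -2.0, 0.5]
sim_w = [1.05 * w for w in real_w]
imu_rel = relative_rms_error(real_w, sim_w)

print(joint_rms, imu_rel)
```

Metrics of this kind compare closed-loop trajectories, so they say nothing about contact, friction, or actuator limits, consistent with the scoping above.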
Real vs sim: side-by-side replay + lower-body joint plots
Left half: two robots walking side-by-side in MuJoCo — the solid robot replays the real-hardware recording, the shaded robot replays the sim rollout from the same canonical checkpoint. Right half: per-joint trajectories for the lower body, real overlaid on sim. This is the direct visual evidence behind the anchor-archive numbers above — kinematic chain and torso-inertial parity to within a few degrees and a few percent on base IMU angular velocity, while contact / friction / actuator-saturation divergences (out of scope for the archive) remain visible.
5 Failure-Mode Taxonomy
Four classes for organising the next deployment
The taxonomy has four classes: bug, physics-divergence, handoff, and tuning. It is intended as an organising principle for subsequent humanoid deployments. Each class implies a different debugging substrate and a different evidence bar before a fix can be considered complete.
6 More demos
Two additional clips of the same checkpoint family driving the AgiBot X2 Ultra hardware, going beyond the canonical natural-walk motion used in the abstract and ablation sections.
Single-leg balance
X2 holds a static single-leg pose under the deployed whole-body controller — a stress test for base-IMU feedback and ankle / hip authority outside the locomotion regime that dominates the training distribution.
Upper-body gestures
Whole-body policy driving coordinated arm, shoulder, and torso motions while the lower body holds a stable stance. Exercises the upper-body action channels that the natural-walk recordings touch only lightly.
7 Citation
If you find this work useful, please cite:
@misc{sonic2026agibotx2,
  title        = {Porting {NVIDIA} Sonic to the {AgiBot} {X2} Ultra:
                  A Sim-to-Sim-to-Real Bridge for a Non-{G1} Humanoid},
  author       = {Claude Opus and Sitarama Raju Chekuri and
                  Zeeshaan Mohammed and Dhruv Diddi and Samarth Shukla},
  year         = {2026},
  howpublished = {\url{https://sonic-agibot-x2.github.io/}},
}