Learning Locomotion for Quadruped Robots via Distributional Ensemble Actor-Critic
Release time: 2025-03-07

- Impact Factor:
- 4.6
- DOI number:
- 10.1109/LRA.2024.3349934
- Journal:
- IEEE Robotics and Automation Letters
- Place of Publication:
- United States
- Abstract:
- Domain randomization introduces perturbations in the simulation to make controllers less susceptible to the reality gap, which enables remarkable sim-to-real transfer on real quadruped robots. However, aleatoric uncertainty originating from perturbations could often lead to suboptimal controllers. In this work, we present a novel algorithm called Distributional Ensemble Actor-Critic (DEAC) that blends three ideas: distributional representation of a critic, lower bounds of the value distribution, and ensembling of multiple critics and actors. Distributional representation and ensembling provide reasonable uncertainty estimates, while lower bounds of the value distribution offer finer-grained error control. The simulation results show that the controller trained by DEAC outperforms the other baselines in the domain randomization setting. The trained controller is deployed on an A1-like robot, demonstrating high-speed running and the ability to traverse diverse terrains such as slippery plates, grassland, and wet dirt.
- Discipline:
- Engineering
- Volume:
- 9(2)
- Page Number:
- 1811-1818
- ISSN No.:
- 2377-3766
- Translation or Not:
- no
- Date of Publication:
- 2024
- Links to published journals:
- https://ieeexplore.ieee.org/document/10380686
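
The abstract names three ingredients: a distributional critic, a lower bound of the value distribution, and an ensemble of critics. As a rough illustration of how these could fit together, the sketch below assumes each critic outputs a few quantile estimates of the return for one state-action pair and forms a conservative value as the pooled mean minus a multiple of the pooled standard deviation. This is only one plausible reading and not the paper's formulation: the function name `lower_bound_value`, the parameter `beta`, and the numbers are all hypothetical.

```python
from statistics import fmean, pstdev


def lower_bound_value(quantiles, beta=1.0):
    """Conservative value estimate from an ensemble of distributional critics.

    quantiles: list of lists; each inner list holds one critic's quantile
               estimates of the return (hypothetical representation).
    beta:      how many standard deviations to subtract (assumed knob).
    """
    # Pool every quantile from every critic into one sample of returns.
    pooled = [q for critic in quantiles for q in critic]
    # Spread of the pooled sample reflects both the return distribution
    # and disagreement between critics.
    return fmean(pooled) - beta * pstdev(pooled)


# Two hypothetical critics, three quantiles each.
ensemble_quantiles = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
]

mean_estimate = lower_bound_value(ensemble_quantiles, beta=0.0)
conservative_estimate = lower_bound_value(ensemble_quantiles, beta=1.0)
```

With `beta=0.0` the function returns the plain pooled mean; a larger `beta` gives a more pessimistic estimate, which is the general sense in which a lower bound can limit overestimation under randomized dynamics.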


