Robot Learns to Walk Despite Faults and Unexpected Disturbances during Operation

Quantum Zeitgeist
⚡ Quantum Brief
Seoul National University researchers developed TOLEBI, a reinforcement learning framework that enables humanoid robots to adapt to hardware faults such as joint locking and power loss during operation. The system injects simulated faults during training and combines this with an online joint status module for real-time condition assessment, allowing the robot to adjust its movements dynamically. Validation tests on the TOCABI humanoid robot confirmed TOLEBI’s effectiveness in both simulation and real-world scenarios, including walking on flat surfaces and descending stairs. TOLEBI introduces a fallibility reward function that minimises foot-floor contact force to encourage stable locomotion under adverse conditions. The framework currently struggles with multiple simultaneous failures, and future work targets this case along with improved resilience in unstructured environments.

Researchers are increasingly focused on developing robust bipedal locomotion systems for humanoid robots, yet current reinforcement learning methods often lack resilience to real-world hardware failures and unexpected disturbances. A paper by Hokyun Lee, Woo-Jeong Baek, Junhyeok Cha, and Jaeheung Park, all from Seoul National University, introduces TOLEBI, a learning framework designed to address this gap.

The team demonstrate TOLEBI’s ability to learn fault-tolerant locomotion strategies by simulating joint locking, power loss, and external disturbances, and by incorporating an online joint status module for real-time condition assessment. The learned policies were transferred to a physical robot, and performance was validated in both simulation and real-world scenarios with the TOCABI humanoid. The work is presented as the first learning-based fault-tolerant framework for bipedal locomotion, a step towards more reliable and adaptable robotic systems.

Reinforcement learning for robust bipedal locomotion with simulated hardware faults

Scientists have developed TOLEBI, a learning framework that enables bipedal robots to maintain stable locomotion despite hardware faults and external disturbances.

This research addresses a gap in humanoid robotics: existing locomotion algorithms often lack the robustness to handle unexpected failures during operation. The work introduces a reinforcement learning approach that trains robots in advance to adapt to joint locking, power loss, and external disturbances. TOLEBI does this by injecting simulated faults during training, so that the policy learns fault-tolerant locomotion strategies.

The framework also incorporates an online joint status module that classifies joint conditions in real time by analysing observations during operation. This allows the robot to adjust its movements to its current physical state and so reduce the impact of detected faults. The learned policy is then transferred to a real humanoid robot, TOCABI, via sim-to-real transfer. Validation experiments, conducted both in simulation and with the physical robot on tasks including walking on flat surfaces and descending stairs, support the effectiveness of TOLEBI.

The main technical contribution is a joint status estimator that operates online and is trained on a curriculum of simulated motor failures, which improves robustness. The reward function is designed to minimise foot-floor contact force, encouraging stable and adaptable locomotion. The work is presented as the first learning-based fault-tolerant framework for bipedal locomotion, and it targets a central obstacle to deploying robots in real-world settings, where unexpected failures can occur.

Reinforcement learning of robust bipedal locomotion with simulated hardware faults

The research uses the humanoid robot TOCABI for both simulation and real-world validation. The study addresses a gap in the robotics literature concerning controllers that can handle hardware failures during bipedal movement, focusing on joint locking, power loss, and external disturbances. The researchers injected these fault conditions into a simulated environment during the training phase of a reinforcement learning algorithm to obtain a fault-tolerant locomotion policy.

TOLEBI employs phase modulation actions and a novel fallibility reward to guide learning. The fallibility reward is designed to minimise foot-floor contact force, encouraging stable locomotion under adverse conditions. An online joint status module classifies joint conditions in real time from data observed during operation.

The learned policy was then transferred to the physical TOCABI robot via sim-to-real transfer techniques. This transfer is supported by the online joint status module, which continuously assesses the robot’s condition and adjusts the control strategy accordingly. Validation experiments in both simulated and real environments demonstrated the applicability of the approach under hardware faults and environmental disturbances.
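As a rough illustration of what injecting joint locking and power loss into a simulated training episode might look like, here is a minimal Python sketch. It is not the authors' implementation. Every name in it (`JointStatus`, `FaultState`, `sample_fault`, `apply_fault`) is hypothetical, as are the gains `kp` and `kd`. It assumes that a locked joint can be approximated by a stiff hold at the angle where it froze, that power loss means zero applied torque, and that at most one joint fails per episode.

```python
import random
from dataclasses import dataclass
from enum import Enum


class JointStatus(Enum):
    NOMINAL = 0
    LOCKED = 1      # assumption: actuator held immobile at a frozen angle
    POWER_LOSS = 2  # assumption: motor command never reaches the joint


@dataclass
class FaultState:
    status: list           # one JointStatus per joint
    locked_position: list  # joint angles recorded when the fault was sampled


def sample_fault(joint_positions, fault_prob, rng):
    """Sample at most one faulty joint for an episode (single-fault assumption)."""
    n = len(joint_positions)
    status = [JointStatus.NOMINAL] * n
    if rng.random() < fault_prob:
        j = rng.randrange(n)
        status[j] = rng.choice([JointStatus.LOCKED, JointStatus.POWER_LOSS])
    return FaultState(status, list(joint_positions))


def apply_fault(torques, positions, velocities, fault, kp=500.0, kd=20.0):
    """Return the torques actually applied once the sampled faults are considered."""
    applied = []
    for tau, q, dq, s, q_lock in zip(
        torques, positions, velocities, fault.status, fault.locked_position
    ):
        if s is JointStatus.POWER_LOSS:
            applied.append(0.0)  # no actuation at all
        elif s is JointStatus.LOCKED:
            # Stiff PD hold at the frozen angle; the policy command is ignored.
            applied.append(kp * (q_lock - q) - kd * dq)
        else:
            applied.append(tau)  # healthy joint: pass the command through
    return applied
```

In a training loop, `sample_fault` would be called at reset and `apply_fault` at every control step before the torques are sent to the simulator. The per-joint `status` list could also serve as the label for a joint status classifier, although how TOLEBI actually trains its module is not specified here.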

Robust Bipedal Locomotion Through Concurrent Policy and Joint Status Estimation

The TOLEBI framework, a learning-based approach for bipedal locomotion, addresses operational challenges arising from hardware faults. Specifically, the research incorporates simulated joint locking, power loss, and external disturbances to cultivate robust locomotion strategies. The system integrates an online joint status module, enabling classification of joint conditions through real-time observation of the robot during operation. Validation experiments were conducted using the humanoid robot TOCABI, both in simulation and real-world conditions, demonstrating the applicability of the proposed approach.

The framework employs a curriculum learning approach alongside motor failure simulations to facilitate effective sim-to-real transfer of learned policies. A key component is the concurrent training of the policy and the online joint status estimator, eliminating the need for additional training phases to determine joint condition. The fallibility reward function within TOLEBI is designed to maintain nominal locomotion style even under motor failure conditions.

This work formulates the bipedal locomotion control problem as a Markov Decision Process, defined by a state space, action space, transition probability function, reward function, and discount factor. The agent learns a policy to maximize the expected discounted return over a finite horizon, calculated as the cumulative discounted reward across time steps. Two primary motor failure scenarios were investigated: joint locking, where actuators become immobile, and power loss, resulting in a disruption of motor command signals.
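To make the finite-horizon discounted return from the Markov Decision Process formulation concrete, the short Python sketch below computes the sum of gamma^t * r_t for a single episode. The function name `discounted_return` and the example values are illustrative assumptions, and the article gives neither the discount factor nor the reward terms used in TOLEBI.

```python
def discounted_return(rewards, gamma):
    """Return sum over t of gamma**t * rewards[t] for one finite episode."""
    g = 0.0
    # Accumulate backwards so each step costs one multiply and one add.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


# Three steps of reward 1.0 with gamma = 0.5: 1 + 0.5 + 0.25
print(discounted_return([1.0, 1.0, 1.0], 0.5))  # prints 1.75
```

The policy is trained to maximise the expectation of this quantity over episodes, so a smaller discount factor makes the agent weigh immediate rewards more heavily than later ones.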

Robust Bipedal Locomotion Through Curriculum Learning and Simulated Hardware Impairments

Researchers have developed TOLEBI, a learning framework designed to enable robust bipedal locomotion in humanoid robots despite potential hardware failures. The system addresses a gap in existing reinforcement learning methods, which often overlook real-world occurrences such as joint locking, power loss, and external disturbances. TOLEBI employs a curriculum learning strategy, progressively exposing the policy to nominal walking conditions, simulated motor failures, and external disruptions to enhance resilience.

Validation experiments, conducted both in simulation and with the TOCABI humanoid robot, demonstrate the framework’s ability to maintain stable locomotion when faced with unexpected motor failures. The learned policy also descended stairs without specific training for this task, indicating a capacity for generalisation beyond the initial training scenarios. The work is presented as the first learning-based framework for fault-tolerant bipedal locomotion, and it may advance the development of more reliable humanoid robots.

The authors acknowledge a limitation in the current system’s ability to handle multiple simultaneous failures. Future research will focus on extending TOLEBI to address this, alongside improving robustness in unstructured environments, with the aim of resilient locomotion strategies for humanoid robots operating in complex, real-world settings.

👉 More information
🗞 TOLEBI: Learning Fault-Tolerant Bipedal Locomotion via Online Status Estimation and Fallibility Rewards
🧠 ArXiv: https://arxiv.org/abs/2602.05596
