Application of bio-inspired reinforcement learning in the case study of a robot playing air hockey

Germani, Martina <1999>

Author

Germani, Martina <1999>

Date

2023-10-25

Available from

2023-11-02

Abstract

In recent years, robotics has made notable progress in interacting with dynamic environments, driven by technological and research breakthroughs that have enabled robots to operate effectively in environments that can change rapidly and unpredictably. Traditionally, robots were confined to static environments with predefined tasks; however, recent developments in perception, planning, control algorithms, and sensors have transformed their ability to handle dynamic environments. Machine learning, in particular reinforcement learning (RL) and deep RL, has played a crucial role in improving robot-environment interaction. This thesis focuses on applying deep RL to a planar robot that plays air hockey, an environment well suited to the autonomous training of RL agents, with the aim of having the robot learn to score and to defend in a simulated environment so as to ensure its safety. To this end, two learning tasks are identified: hitting and defending. The robot's goal is to find optimal trajectories that let the end-effector score or defend effectively. This thesis also adopts a biologically inspired RL approach, drawing on neuroscience to improve the performance of the RL agent by accelerating its learning. Although this approach has mainly been applied in discrete spaces, here it is applied in a novel way to the continuous space of air hockey, where the robot must move continuously. The results show that these tasks were learned successfully and that the defined sets of states are sufficient to ensure adequate learning in similar situations. In the future, the resulting policy will be applied to real air hockey tasks using a robot equipped with an event camera, a combination that is new in the field.

In recent years, robotics has seen notable advances in its ability to interact with dynamic environments. This progress has been driven by technological and research breakthroughs, which have enabled robots to operate effectively in environments that can change rapidly and unpredictably. Traditionally, robots have been confined to static environments with predefined tasks; however, recent developments in perception, planning, control algorithms, and sensor technologies have transformed their ability to handle dynamic environments. Machine learning, particularly reinforcement learning (RL) and deep RL, which combines deep neural networks with RL algorithms, has played a crucial role in improving robot-environment interaction. This thesis focuses on the application of deep RL to a planar robot playing air hockey, an environment well suited to the autonomous training of RL agents. The goal is for the robot to learn to score and defend in a simulated environment, which ensures its safety during training. To this end, two learning tasks are identified: hitting and defending. In both, the robot must find optimal trajectories for the end-effector to score or defend effectively. This thesis also adopts a biologically inspired RL approach, drawing on neuroscience to improve the performance of the RL agent by accelerating learning and thereby improving the safety of the robot. Although this approach has mainly been applied in discrete spaces, here it is applied in a novel way to the continuous space of air hockey, where the robot must move continuously. The results show that both tasks were learned successfully and that the defined sets of states are sufficient to ensure adequate learning in similar situations. In the future, the policy obtained from this RL training will be applied to real air hockey tasks using a robot equipped with an event camera, a combination that is new in the field.

Type

info:eu-repo/semantics/masterThesis