Optimizing the portability of a model for estimating the addressee of a conversation on the iCub social robot

Saade, Pia <1999>

View/Open

tesi26793514.pdf (5.664Mb)

Author

Saade, Pia <1999>

Date

2023-12-19

Available from

2023-12-21

Abstract

Face-to-face, multi-party interactions create a special and complex social environment with distinct roles: speaker, addressee(s), and other persons present in the scene. To be considered sociable, and thereafter effortlessly assimilated into our social environment, robots must understand the dynamics regulating these kinds of situations. This study explores the problem of addressee estimation, which is the ability to determine the addressee of an utterance (that is, the intended listener of a given speaker) by deciphering and using the body language of the speaker. We opted for a deep-learning approach. The model takes as input images of the speaker's face and pose vectors, which is why we chose to work on the Addressee Estimation problem in terms of pinpointing the addressee's location in space from the robot's egocentric position. My experiment has three main goals, which are explained in the upcoming sections. The first goal is to deploy on the iCub the hybrid deep-learning model developed in [1] to obtain a real-time estimate of the addressee. The second goal is to design and carry out a data collection with human participants to test the performance of the model. The third goal is to optimize the performance of the deep-learning model with the new data collected on the iCub robot.

Type

info:eu-repo/semantics/masterThesis