Comunicazione Non Verbale Culturalmente Competente Basata Su Generative Adversarial Networks

Gjaci, Ariel <1995>

dc.contributor.advisor	Recchiuto, Carmine <1984>
dc.contributor.advisor	Sgorbissa, Antonio <1970>
dc.contributor.author	Gjaci, Ariel <1995>
dc.date.accessioned	2021-06-17T14:03:47Z
dc.date.available	2021-06-17T14:03:47Z
dc.date.issued	2021-06-15
dc.identifier.uri	https://unire.unige.it/handle/123456789/3501
dc.description.abstract	I movimenti non verbali sono di solito usati dagli umani per interagire con altre persone, in quanto li aiutano ad accentuare il significato delle parole, a esprimere sentimenti e anche a far capire le intenzioni. Inoltre, alcuni studi hanno dimostrato che essi non dipendono solo dallo stile di una persona ma anche dalla cultura. Per queste ragioni può essere molto utile, nel campo della Robotica Sociale, avere robot umanoidi che usano gesti culturalmente dipendenti per intensificare l'espressività e l'interazione con le persone. In questo progetto verrà mostrata una nuova metodologia per raggiungere questo obiettivo usando le Reti Generative Avversarie (GAN), che hanno la capacità di imparare la struttura interna dei dati e di generare nuovi campioni. Nell'approccio proposto, il Dataset per allenare la Rete Neurale è stato costruito prendendo dati di persone diverse appartenenti alla stessa cultura. In particolare, un Dataset è stato creato usando video di YouTube che mostrano persone appartenenti alla stessa cultura: in questa tipologia di video è semplice rilevare le pose e la voce dell'interlocutore, che sono necessarie per l'allenamento. Però, dato che l'approccio è basato su feature dipendenti dalla frequenza dell'audio, ci si aspetta che voci diverse producano feature diverse anche se le persone stanno dicendo le stesse parole, facendo diventare la dipendenza culturale un elemento difficile da imparare. Per questa ragione è stata usata un'altra rete neurale per una conversione di voci da tutte a una sola. Per far vedere i risultati su un robot sociale, una terza rete neurale è stata usata per mappare le pose da uno spazio 2D a uno 3D e il risultato ottenuto è stato riprodotto dal robot umanoide Pepper. 
In conclusione, il Dataset è stato usato per generare movimenti non verbali associati a un set di frasi pronunciate dal robot e il risultato di questo approccio è stato comparato con le API per generare discorsi e con una procedura per generare gesti casuali.	it_IT
dc.description.abstract	Co-speech gestures are commonly used by humans when interacting with other people, since they help to emphasize the meaning of words, to express feelings, and even to show intentions. Moreover, some studies have demonstrated that they depend not only on a person's individual style but are also strongly connected with culture. For these reasons it can be very useful, in the field of Social Robotics, to have humanoid robots that use culture-dependent gestures to enhance expressiveness and interaction with people. This project presents a new method to achieve this goal by relying on Generative Adversarial Networks (GANs), which can learn the internal structure of data and generate new samples. In the proposed approach, the dataset for training the neural network is built from data of different people belonging to the same culture. In particular, a custom dataset was created from YouTube videos that show people belonging to the same culture: in this kind of video it is simple to detect both the poses and the voice of the main speakers, which are necessary for training. However, since the approach is based on features that depend on the frequency of the speech audio, different voices are expected to produce different features even when people are saying the same words, making the culture-aware element very difficult to learn. For this reason, another neural network was used for many-to-one voice conversion. To show the results on a real social robot, a third neural network was used to map the poses from 2-D to 3-D space, and the result was reproduced by the humanoid robot Pepper. In conclusion, the dataset has been used to generate co-speech gestures associated with a set of sentences pronounced by the robot, and the result of this approach has been compared with the embedded APIs for speech generation and with a procedure for generating random talking gestures.	en_UK
dc.language.iso	en
dc.rights	info:eu-repo/semantics/openAccess
dc.title	Comunicazione Non Verbale Culturalmente Competente Basata Su Generative Adversarial Networks	it_IT
dc.title.alternative	Culture-Aware Co-Speech Gestures Using Generative Adversarial Networks	en_UK
dc.type	info:eu-repo/semantics/masterThesis
dc.subject.miur	ING-INF/05 - SISTEMI DI ELABORAZIONE DELLE INFORMAZIONI
dc.publisher.name	Università degli studi di Genova
dc.date.academicyear	2019/2020
dc.description.corsolaurea	10635 - ROBOTICS ENGINEERING
dc.description.area	9 - INGEGNERIA
dc.description.department	100023 - DIPARTIMENTO DI INFORMATICA, BIOINGEGNERIA, ROBOTICA E INGEGNERIA DEI SISTEMI

Files in this item

Name: tesi16288368.pdf
Size: 3.667Mb
Format: PDF

This item appears in the following Collection(s)

Laurea Magistrale [6130]
