Eye Tracking with a Neuromorphic Camera
Author
Alessi, Roberta <1996>
Date
2023-10-25
Date available
2023-11-02
Abstract
The aim of this thesis is to tackle the challenge of real-time eye tracking with event cameras. Event cameras offer high temporal resolution, high dynamic range, and low power consumption, and they are less affected by motion blur than traditional cameras. Our goal was to demonstrate that these properties make them especially advantageous in a real-time scenario that requires immediate response times. With respect to the current state of the art, this work introduces the first eye-tracking system based entirely and exclusively on events that runs live and not only in simulation.
Our system consists of two components: a low-frequency Detection thread, which runs "globally" over the whole image and searches for the eyes with convolutional kernels, and a high-frequency Tracker thread, which is initialized from the detected eye position and runs within a smaller region of interest (ROI).
The two threads can run independently of each other.
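The two-thread layout described above can be illustrated with a minimal sketch. It is not the thesis implementation: the function names (`detect_eye`, `track_in_roi`, `run_pipeline`), the single box-like kernel, the centroid-based tracker, the static event-count frame, and all periods and sizes are assumptions chosen only to show a slow global detector and a fast ROI tracker sharing one position estimate.

```python
import threading
import time

import numpy as np


def detect_eye(event_frame, kernel):
    """Global step: correlate a kernel with the whole frame, return the best centre."""
    kh, kw = kernel.shape
    windows = np.lib.stride_tricks.sliding_window_view(event_frame, (kh, kw))
    response = np.einsum("ijkl,kl->ij", windows, kernel)
    r, c = np.unravel_index(np.argmax(response), response.shape)
    return int(r) + kh // 2, int(c) + kw // 2


def track_in_roi(event_frame, centre, half_size):
    """Local step: centroid of event activity inside a ROI around the last centre."""
    r, c = centre
    r0, c0 = max(r - half_size, 0), max(c - half_size, 0)
    roi = event_frame[r0:r + half_size + 1, c0:c + half_size + 1]
    total = roi.sum()
    if total == 0:
        return centre  # no events in the ROI: keep the previous estimate
    rows, cols = np.indices(roi.shape)
    new_r = int(round(float((rows * roi).sum() / total))) + r0
    new_c = int(round(float((cols * roi).sum() / total))) + c0
    return new_r, new_c


def run_pipeline(event_frame, kernel, duration=0.2,
                 detect_period=0.05, track_period=0.005, half_size=8):
    """Run a slow detector and a fast tracker concurrently on a (static) frame."""
    lock = threading.Lock()
    stop = threading.Event()
    state = {"position": None}

    def detector():
        while not stop.is_set():
            pos = detect_eye(event_frame, kernel)
            with lock:
                state["position"] = pos
            stop.wait(detect_period)  # low frequency

    def tracker():
        while not stop.is_set():
            with lock:
                pos = state["position"]
            if pos is not None:
                new_pos = track_in_roi(event_frame, pos, half_size)
                with lock:
                    state["position"] = new_pos
            stop.wait(track_period)  # high frequency

    threads = [threading.Thread(target=detector), threading.Thread(target=tracker)]
    for t in threads:
        t.start()
    time.sleep(duration)
    stop.set()
    for t in threads:
        t.join()
    return state["position"]
```

In a live system the frame would be refreshed continuously from the event stream; here it is held fixed so that the interaction between the two loops is easy to follow.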
The experiments were conducted both in real time and on a dataset recorded in-house. To demonstrate the performance of our algorithm, we also ran it on benchmark datasets. Our algorithm proved robust to head movements and worked well in real time, although our error, computed as the mean squared error (MSE) between detected and ground-truth positions, is slightly higher than the errors reported on those benchmarks.
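As an illustration of the error measure mentioned above, the sketch below computes an MSE between detected and ground-truth positions. The abstract does not give the exact definition, so the function name `position_mse` and the choice to average the squared Euclidean distance per sample (rather than per coordinate) are assumptions.

```python
import numpy as np


def position_mse(detected, ground_truth):
    """Mean of squared Euclidean distances between paired (x, y) positions."""
    detected = np.asarray(detected, dtype=float)
    ground_truth = np.asarray(ground_truth, dtype=float)
    if detected.shape != ground_truth.shape:
        raise ValueError("detected and ground_truth must have the same shape")
    squared_distances = np.sum((detected - ground_truth) ** 2, axis=1)
    return float(np.mean(squared_distances))
```

For example, `position_mse([[10, 10], [20, 20]], [[10, 12], [23, 24]])` averages the squared distances 4 and 25.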
This work also presents a detailed ablation study of the algorithm.
Finally, the last contribution is the "In-House" dataset, which lasts about 30 seconds, is fully labeled, and is publicly available.
This dataset was recorded with a third-generation ATIS event camera. The subjects are visible from head to shoulders while performing slight head and eye movements.
Type
info:eu-repo/semantics/masterThesis
Collections
- Laurea Magistrale [5082]