Tecniche per l'identificazione e l'inseguimento delle mani in video acquisiti in prima persona [Techniques for detecting and tracking hands in first-person videos]

Castro Cuenca, Andrew Steven <2000>

Date

2023-12-19

Date available

2023-12-21

Abstract

This thesis tests and measures the effectiveness of Detectron2DeepSortPlus, a program for detecting and tracking hands in first-person (egocentric) videos. The work relies on Python and several of its libraries, which are described later in the thesis, and on GitHub. Setting up the program required installing Anaconda and PyTorch, along with COCO, an image dataset for object detection and segmentation. Detectron2 provided the object detection algorithms. To test the program and measure its effectiveness, the data were stored in the MOT16 format defined by MOTChallenge, a benchmark for multiple object tracking. Results in MOT16 format were compared using the Python library py-motmetrics, and the ground truth was created with a tool called MOT16_Annotator. The test videos were taken from the GTEA (Georgia Tech Egocentric Activity) dataset. Finally, as a concluding test, I recorded a video of my own that simulates a first-person view of ordinary daily activities.
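
The comparison between tracker output and ground truth in MOT16 format can be illustrated with a minimal sketch. It is not the thesis code and it does not use py-motmetrics. It assumes only that each row starts with the fields frame, id, left, top, width, height, and it uses a simple greedy match at an assumed IoU threshold of 0.5. The function names `parse_mot_lines`, `iou`, and `count_errors` are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def parse_mot_lines(text):
    """Parse MOT16-style CSV rows into {frame: {track_id: (left, top, width, height)}}.

    Only the first six fields are read; any remaining fields are ignored.
    """
    frames = defaultdict(dict)
    for line in text.strip().splitlines():
        fields = line.split(",")
        frame, track_id = int(fields[0]), int(fields[1])
        left, top, width, height = map(float, fields[2:6])
        frames[frame][track_id] = (left, top, width, height)
    return frames


def iou(a, b):
    """Intersection over union of two (left, top, width, height) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    overlap_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    overlap_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = overlap_w * overlap_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def count_errors(gt, hyp, min_iou=0.5):
    """Greedily match boxes frame by frame.

    Returns (misses, false_positives, matches). The 0.5 threshold is an
    assumption made for this illustration, and identity switches are ignored.
    """
    misses = false_positives = matches = 0
    for frame in sorted(set(gt) | set(hyp)):
        gt_boxes = gt.get(frame, {})
        hyp_boxes = dict(hyp.get(frame, {}))  # copy so matched boxes can be removed
        for gt_box in gt_boxes.values():
            best = max(hyp_boxes, key=lambda h: iou(gt_box, hyp_boxes[h]), default=None)
            if best is not None and iou(gt_box, hyp_boxes[best]) >= min_iou:
                matches += 1
                del hyp_boxes[best]
            else:
                misses += 1
        false_positives += len(hyp_boxes)  # unmatched tracker boxes
    return misses, false_positives, matches
```

This sketch only counts misses and false positives per frame. It leaves out identity switches, which the MOTA metric reported by py-motmetrics also takes into account, so it should be read as an illustration of the comparison rather than a replacement for that library.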

Type

info:eu-repo/semantics/bachelorThesis