dc.contributor.advisor: Ricca, Filippo <1969>
dc.contributor.author: Boriassi, Tommaso <2001>
dc.date.accessioned: 2025-03-27T15:37:44Z
dc.date.available: 2025-03-27T15:37:44Z
dc.date.issued: 2025-03-25
dc.identifier.uri: https://unire.unige.it/handle/123456789/11669
dc.description.abstract: During my internship at STAM S.r.l., I designed and developed a complete backend infrastructure for the acquisition, management, and storage of data from various sources. This system is part of a broader business context aimed at developing methods to train models with limited data. The architecture I created features a modular structure with well-isolated components and clear interfaces. I implemented a multi-protocol download system that supports both HTTP/REST for Sentinel Hub APIs (satellite image acquisition) and SFTP for retrieving datasets with annotations. To optimize bandwidth usage and reduce processing times, I developed an efficient caching mechanism based on SHA-256 hashes to avoid redundant downloads. The system supports three different storage destinations: local filesystem for development and testing, Google Cloud Storage for long-term storage, and MinIO as an S3-compatible alternative. I designed the system to handle errors gracefully by implementing failure isolation strategies that allow partial operations to be completed even when some storage systems are unavailable. The orchestration of the entire workflow is managed by Apache Airflow, with a parameterized DAG that enables users to easily select the download protocol and storage systems to use. I containerized the entire environment with Docker. [en_UK]
dc.language.iso: it
dc.rights: info:eu-repo/semantics/closedAccess
dc.title: INTEGRAZIONE DI MODULI SOFTWARE DI BACKEND, COSTRUZIONE PIPELINE CON AIRFLOW E DOCKER [it_IT]
dc.title.alternative: INTEGRATION OF BACKEND SOFTWARE MODULES, PIPELINE CONSTRUCTION WITH AIRFLOW AND DOCKER [en_UK]
dc.type: info:eu-repo/semantics/bachelorThesis
dc.publisher.name: Università degli studi di Genova
dc.date.academicyear: 2023/2024
dc.description.corsolaurea: 8759 - INFORMATICA
dc.description.area: 7 - SCIENZE MAT.FIS.NAT.
dc.description.department: 100023 - DIPARTIMENTO DI INFORMATICA, BIOINGEGNERIA, ROBOTICA E INGEGNERIA DEI SISTEMI

