How Large Language Models are revolutionizing Binary Reverse Engineering

Giannini, Matteo <2001>

dc.contributor.advisor	Lagorio, Giovanni <1973>
dc.contributor.author	Giannini, Matteo <2001>
dc.date.accessioned	2025-10-23T14:30:03Z
dc.date.available	2025-10-23T14:30:03Z
dc.date.issued	2025-10-15
dc.identifier.uri	https://unire.unige.it/handle/123456789/13388
dc.description.abstract	L’ingegneria inversa binaria è una disciplina fondamentale della cybersicurezza, utilizzata per analizzare il software senza avere accesso al suo codice sorgente al fine di individuare vulnerabilità. La complessità di questo compito, aggravata dalla perdita di informazioni che avviene durante la compilazione, lo rende fortemente dipendente dall’esperienza dell’analista. Questa tesi esplora il potenziale dei Large Language Models (LLMs) nel supportare e ottimizzare tale processo, concentrandosi sulla loro integrazione con il framework Ghidra. Il lavoro valuta l’efficacia dei LLMs attraverso tre metodologie di integrazione progressive: un’analisi manuale del codice decompilato, un flusso di lavoro assistito da plugin e un approccio avanzato basato sul Model Context Protocol (MCP). All’interno di quest’ultimo ambito, la tesi presenta un contributo pratico originale allo sviluppo del plugin GhidraMCP. Gli approcci sono stati validati sperimentalmente su binari reali, confrontando le prestazioni di diversi modelli LLM. I risultati mettono in evidenza i punti di forza e le limitazioni di ciascuna strategia di integrazione, dimostrando come protocolli avanzati come MCP possano trasformare i LLMs in collaboratori efficaci nel campo dell’ingegneria inversa.	it_IT
dc.description.abstract	Binary reverse engineering is a critical discipline in cybersecurity for analyzing software without access to its source code to identify vulnerabilities. The complexity of this task, which is exacerbated by information loss during compilation, makes it heavily reliant on the analyst’s expertise. This thesis explores the potential of Large Language Models (LLMs) to assist and optimize this process, focusing on their integration with the Ghidra framework. The work evaluates the effectiveness of LLMs through three progressive integration methodologies: a manual analysis of decompiled code, a plugin-assisted workflow and an advanced approach based on the Model Context Protocol (MCP). Within this latter scope, the thesis presents an original practical contribution to the development of the GhidraMCP plugin. The approaches were experimentally validated on real-world binaries by comparing the performance of different LLM models. The results highlight the strengths and limitations of each integration strategy, demonstrating how advanced protocols like MCP can transform LLMs into effective collaborators in the reverse engineering field.	en_UK
dc.language.iso	en
dc.rights	info:eu-repo/semantics/restrictedAccess
dc.title	Come i Modelli Linguistici di grandi dimensioni stanno rivoluzionando il reverse engineering dei binari	it_IT
dc.title.alternative	How Large Language Models are revolutionizing Binary Reverse Engineering	en_UK
dc.type	info:eu-repo/semantics/masterThesis
dc.subject.miur	INF/01 - INFORMATICA
dc.publisher.name	Università degli studi di Genova
dc.date.academicyear	2024/2025
dc.description.corsolaurea	11160 - COMPUTER ENGINEERING
dc.description.area	9 - INGEGNERIA
dc.description.department	100023 - DIPARTIMENTO DI INFORMATICA, BIOINGEGNERIA, ROBOTICA E INGEGNERIA DEI SISTEMI

Files in this item

Name: tesi35178846.pdf
Size: 1.276Mb
Format: PDF

This item appears in the following Collection(s)

Laurea Magistrale [7415]
