Unsupervised Rule-Based Out-of-Distribution Detection

Author
Tewolde, Selam Gebrehiwot <1998>
Date
2025-03-24
Data available
2025-03-27
Abstract
Detecting anomalies and out-of-distribution (OoD) data in machine learning remains a critical challenge, particularly in real-world applications where anomalous data often lacks labels and can have severe consequences. This thesis develops an unsupervised, rule-based framework to address this problem, providing a scalable solution for detecting anomalies in dynamic and complex domains such as autonomous systems and healthcare. By perturbing image features, the model generates synthetic data that simulates realistic anomalies without relying on costly labeled data. This allows the system to identify deviations from the patterns seen in the in-distribution data. At the same time, histogram-based analysis monitors the effectiveness of the rules, enabling real-time detection of OoD data. The findings indicate that the method is effective at detecting unknown anomalies in dynamic and complex environments. This work contributes to the advancement of anomaly detection techniques and presents a practical solution for industries facing limited labeled data, with significant implications for improving the safety and reliability of machine learning systems in critical applications.
Keywords: Out-of-distribution, OoD, Out-of-distribution detection, ODD, synthetic data, data generation, rule-based methods
Type
info:eu-repo/semantics/masterThesis
Collections
- Laurea Magistrale [5638]