Anomaly Detection with Supervised/Unsupervised Machine Learning

SUNDARAVEL, MAGESH

2022/2023

Abstract

This thesis addresses anomaly detection in time series data by combining statistical and machine learning techniques. These techniques form the basis for training models that predict a signal's behavior from its historical patterns.

To make the models more robust, synthetic anomalies are introduced. This expands the training set, so that the model can identify and adapt to a broader spectrum of potential issues, particularly when real-world anomalies are scarce.

Anomaly detection is strengthened by integrating statistical methods such as standard deviation and interquartile range (IQR) analysis. The anomalies they detect serve as inputs for the machine learning models, which refines the detection process and improves the versatility and accuracy of the models when they are applied to the actual dataset.

The proposed method relies on segmentation and on adjustable parameters for the anomaly tests. Its effectiveness is evaluated with Precision, Recall, F1-Score, and ROC AUC Score. The method's performance is also analyzed with respect to the prevalence of anomalies and to variations in specific model parameters.

The methods are applied to a validation dataset of signals derived from HVAC systems. In the unsupervised setting, both the standard deviation and the interquartile range methods were effective for simple signals, but their performance decreased for more complex ones. Supervised learning with a support vector machine and a random forest, by contrast, performed strongly on both simple and complex signals. In specific instances, the evaluation metrics reach a perfect score of 100%.

Record

Faculty/Department: DIPARTIMENTO DI INGEGNERIA INDUSTRIALE E DELL'INFORMAZIONE
Degree programme: INDUSTRIAL AUTOMATION ENGINEERING - INGEGNERIA DELL'AUTOMAZIONE INDUSTRIALE [06417]
Academic year: 2022
English title: Anomaly Detection with Supervised/Unsupervised Machine Learning
Supervisor: FACCHINETTI, TULLIO
Appears in document types: Lauree Magistrali (Master's degree theses)

Files in this item:

There are no files associated with this item.

Users may download and share the full-text documents available in UNITESI UNIPV under the terms of the Creative Commons CC BY NC ND licence.
For more information, or to check whether the file is available, write to: [email protected].

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14239/17201