Deep Learning-Enhanced Near-Infrared Spectroscopy for Cross-Instrumental Petrolatum Discrimination

Near-Infrared Spectroscopy (NIRS) is a cornerstone for quality discrimination in the pharmaceutical industry, yet the high cost of laboratory equipment often limits its widespread field application [6, 14]. This thesis investigates the feasibility of replacing expensive, high-resolution instruments (INSTR1) with a discrete, low-cost portable prototype (INSTR2) for the classification of pharmaceutical-grade petrolatum. The central research question explores whether advanced computational pipelines can effectively compensate for the physical limitations and hardware discrepancies of a lower-resolution sensor.

To address this, an experimental campaign was conducted, evaluating a pipeline of 23 different architectures ranging from traditional statistical classifiers to modern Deep Learning (DL) models, including Transformers and 1D-Convolutional Neural Networks (CNNs). A critical focus was placed on evaluating how algorithmic stability and execution time are affected by sample rotation noise and spectral resolution.

The results reveal a Resolution Paradox: the high-resolution data from INSTR1 introduces significant multicollinearity and background noise, making generalization difficult for most architectures. Conversely, the discrete, 27-wavelength signature of INSTR2 proved to be intrinsically linearly separable, allowing a wide variety of models to achieve near-perfect classification.

Addressing the problem of algorithmic efficiency for industrial edge computing revealed that heavy architectures are fundamentally over-engineered for this task. The “Gold Standard” emerged as Linear Discriminant Analysis (LDA). By maximizing class separation, LDA achieved a deterministic 100.0% accuracy with zero variance on both instruments. Most importantly, it achieved this perfection directly on raw spectral data, completely eliminating the need for Principal Component Analysis (PCA) dimensionality reduction and minimizing computational overhead.

These findings demonstrate that a rigorous mathematical foundation can perfectly bridge the gap between low-cost hardware and laboratory-grade reliability, enabling real-time quality control.

GATTI, ASIA

2025/2026

Record

	Faculty/Department
	
				DIPARTIMENTO DI INGEGNERIA INDUSTRIALE E DELL'INFORMAZIONE
			
	Degree programme
	
				COMPUTER ENGINEERING [06415]
			
	Academic year
	
				2025
			
	English title
	
				Deep Learning-Enhanced Near-Infrared Spectroscopy for Cross-Instrumental Petrolatum Discrimination
			
	Supervisor
	
				FACCHINETTI, TULLIO
			
	Appears in categories:
	
				Master's degree theses

Files in this item:

File	Size	Format
thesis.pdf (embargoed until 02/11/2026)	5.69 MB	Adobe PDF

Users may download and share the full-text documents available in UNITESI UNIPV in accordance with the Creative Commons CC BY NC ND license.
For more information, and to check whether the file is available, write to: [email protected].

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14239/34975