Experimental analysis of an hardware accelerator for AI algorithms based on PCM devices

This thesis presents an in-depth study of Phase-Change Memory (PCM) devices as the core technology for Analog In-Memory Computing (AIMC) architectures, with a specific focus on their application to neural network inference for character recognition tasks. The increasing demand for energy-efficient, high-throughput artificial intelligence (AI) accelerators has highlighted the limitations of conventional von Neumann architectures, where data movement between memory and processing units constitutes a significant bottleneck. AIMC represents a paradigm shift: it executes matrix–vector multiplications (MVMs) directly within the memory itself, leveraging the physical properties of non-volatile memory technologies such as PCM.

The thesis is structured around three complementary pillars: theoretical foundations, physical modeling, and experimental validation. The first part revisits the mathematical and algorithmic principles underpinning neural networks, with particular emphasis on convolutional neural networks (CNNs) optimized for character recognition. This includes the formalization of the inference process and the implications of mapping trained models onto analog hardware.

The second part focuses on the physical and electrical characterization of PCM devices, analyzing their non-ideal behaviors, such as conductance drift over time, resistance variability, and crystallization kinetics under thermal stress. A formal model is developed to describe the temporal and thermal evolution of cell resistance, integrating the effects of drift and phase transitions and enabling predictive compensation strategies.

Experimentally, the work employs the MATRIX chip, featuring a large-scale 1024×4096 PCM array, to implement MVM operations through temporally encoded inputs. The analog outputs are generated by integrating the cell currents over time, effectively realizing a physical multiply-and-accumulate (MAC) operation governed by Ohm’s law and Kirchhoff’s current law. A two-phase encoding is introduced to handle the signed nature of the input data.

The experimental campaigns include extensive measurements before and after thermal baking processes to study the endurance and reliability of PCM-based computations. The thesis introduces corrective methodologies, including Output Data Rescaling (ODR) and temperature-aware calibration, to mitigate the impact of drift and programming inaccuracies on inference performance.

In the final stage, a comprehensive analytical framework is proposed to model the cumulative error along the entire MVM processing chain. The model accounts for three principal sources of uncertainty: (i) weight programming inaccuracies intrinsic to PCM variability, (ii) input encoding errors due to DAC quantization, and (iii) output errors stemming from ADC digitization. This unified error propagation model is validated against experimental data and provides theoretical insights into system-level trade-offs and design optimizations.

Finally, a brief analysis of possible future developments of the system is presented. Overall, this work advances the understanding and practical implementation of PCM-based AIMC systems, demonstrating their potential for enabling low-power, high-efficiency AI accelerators suitable for edge computing applications.
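
As an illustration of the MVM mechanism summarized in the abstract, the following is a minimal numerical sketch, not the thesis's implementation. It assumes a pulse-width (temporal) input encoding, one phase for positive and one for negative inputs, an ideal integrator per column, and non-negative conductances. The function names, the read voltage `v_read`, the maximum pulse width `t_max`, the drift exponent `nu`, and the reference time `t0` are all hypothetical values chosen for the example. The drift law used here is the commonly cited empirical power law, which may differ from the model developed in the thesis, and the scalar least-squares rescaling is only a stand-in for ODR, whose exact definition the abstract does not give.

```python
import numpy as np

def aimc_mvm(G, x, v_read=0.2, t_max=1e-6):
    """Two-phase, pulse-width-encoded MVM (illustrative assumption).

    G: (rows, cols) cell conductances in siemens (non-negative).
    x: (rows,) signed inputs normalized to [-1, 1].
    Returns the net integrated charge per column, in coulombs.
    """
    G = np.asarray(G, dtype=float)
    x = np.asarray(x, dtype=float)
    # Pulse width is proportional to |x|; the sign selects the phase.
    t_pos = np.where(x > 0, x, 0.0) * t_max
    t_neg = np.where(x < 0, -x, 0.0) * t_max
    # Ohm's law gives each cell current G_ij * v_read; Kirchhoff's current
    # law sums the currents on a column; integrating over the pulse width
    # yields Q_j = v_read * sum_i t_i * G_ij.
    q_pos = v_read * (t_pos @ G)
    q_neg = v_read * (t_neg @ G)
    return q_pos - q_neg

def apply_drift(G0, t, t0=1.0, nu=0.05):
    """Empirical power-law drift: G(t) = G0 * (t / t0) ** (-nu)."""
    return np.asarray(G0, dtype=float) * (t / t0) ** (-nu)

def fit_rescale(q_ref, q_meas):
    """Scalar alpha minimizing ||alpha * q_meas - q_ref|| (least squares)."""
    q_ref = np.asarray(q_ref, dtype=float)
    q_meas = np.asarray(q_meas, dtype=float)
    return float(q_ref @ q_meas) / float(q_meas @ q_meas)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    G = rng.uniform(1e-6, 1e-5, size=(8, 4))   # hypothetical conductances
    x = rng.uniform(-1.0, 1.0, size=8)         # hypothetical signed inputs
    q_fresh = aimc_mvm(G, x)
    q_drifted = aimc_mvm(apply_drift(G, t=1e4), x)
    alpha = fit_rescale(q_fresh, q_drifted)
    print("rescaling factor:", alpha)
    print("max residual after rescaling:",
          np.max(np.abs(alpha * q_drifted - q_fresh)))
```

With a single drift exponent shared by all cells, drift reduces to a global gain and one scalar factor compensates it exactly. Cell-to-cell variability of the exponent, which the sketch ignores, would leave a residual error that a single factor cannot remove.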

FERRARI, FILIPPO MARIA

2024/2025


Faculty/Department	DIPARTIMENTO DI INGEGNERIA INDUSTRIALE E DELL'INFORMAZIONE
Degree programme	ELECTRONIC ENGINEERING [06416]
Academic year	2024
Supervisor	CABRINI, ALESSANDRO
Co-supervisor	ZURLA, RICCARDO
Appears in categories	Master's degree theses (Lauree Magistrali)

Files in this item:

File	Size	Format
Thesis_Ferrari_Filippo_Maria_PDF_A.pdf (embargoed until 06/04/2027)	17.57 MB	Adobe PDF

Users may download and share the full-text documents available in UNITESI UNIPV in compliance with the Creative Commons CC BY NC ND license.
For further information, and to check whether the file is available, write to: unitesi@unipv.it.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14239/33542