Experimental analysis of a hardware accelerator for AI algorithms based on PCM devices

FERRARI, FILIPPO MARIA
2024/2025

Abstract

This thesis presents an in-depth study of Phase-Change Memory (PCM) devices as the core technology for Analog In-Memory Computing (AIMC) architectures, focusing on their application to neural network inference for character recognition. The growing demand for energy-efficient, high-throughput artificial intelligence (AI) accelerators has exposed the limitations of conventional von Neumann architectures, in which data movement between memory and processing units is a significant bottleneck. AIMC addresses this bottleneck by executing matrix–vector multiplications (MVMs) directly within the memory, exploiting the physical properties of non-volatile memory technologies such as PCM. The thesis is structured around three complementary pillars: theoretical foundations, physical modeling, and experimental validation. The first part revisits the mathematical and algorithmic principles underpinning neural networks, with particular emphasis on convolutional neural networks (CNNs) optimized for character recognition; it formalizes the inference process and examines the implications of mapping trained models onto analog hardware. The second part covers the physical and electrical characterization of PCM devices, analyzing non-ideal behaviors such as conductance drift over time, resistance variability, and crystallization kinetics under thermal stress. A formal model describes the temporal and thermal evolution of cell resistance, integrating the effects of drift and phase transitions and enabling predictive compensation strategies. On the experimental side, the work uses the MATRIX chip, which features a large-scale 1024×4096 PCM array, to implement MVM operations with temporally encoded inputs. The analog outputs are obtained by integrating the cell currents over time, which realizes a physical multiply-and-accumulate (MAC) operation governed by Ohm’s law and Kirchhoff’s current law.
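The MAC operation described above can be illustrated with a minimal numerical sketch. It is not the MATRIX chip's actual signal chain: the function name `time_encoded_mvm`, the read voltage, the conductance values, and the choice of driving rows and summing along columns are all assumptions made for illustration. Each cell draws a current I = V·G (Ohm's law) for as long as its input pulse lasts, and the charges of the cells sharing a column add up (Kirchhoff's current law).

```python
import numpy as np

def time_encoded_mvm(conductance, pulse_widths, v_read=0.2):
    """Charge collected on each column after integrating the cell currents.

    conductance  : (rows, cols) cell conductances in siemens
    pulse_widths : (rows,) input pulse durations in seconds, non-negative
    v_read       : read voltage in volts (illustrative value, an assumption)
    """
    conductance = np.asarray(conductance, dtype=float)
    pulse_widths = np.asarray(pulse_widths, dtype=float)
    if np.any(pulse_widths < 0):
        raise ValueError("pulse widths must be non-negative")
    # Ohm's law: each cell draws I = V * G while its row pulse is high.
    cell_currents = v_read * conductance
    # Integrating over the pulse gives a charge I * t per cell;
    # Kirchhoff's current law sums the cells that share a column.
    return pulse_widths @ cell_currents

# Hypothetical 2x2 example: conductances in siemens, pulses in seconds.
G = np.array([[10e-6, 20e-6],
              [30e-6, 5e-6]])
t = np.array([100e-9, 50e-9])
q = time_encoded_mvm(G, t)
print(q)  # column charges in coulombs
```

In this idealized picture the column charge is proportional to the dot product between the input vector (pulse widths) and the weight column (conductances), so a whole matrix–vector product is obtained in a single read step.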
A two-phase encoding is introduced to handle signed input data. The experimental campaigns include extensive measurements before and after thermal baking to study the endurance and reliability of PCM-based computations. The thesis introduces corrective methodologies, including Output Data Rescaling (ODR) and temperature-aware calibration, to mitigate the impact of drift and programming inaccuracies on inference performance. In the final stage, a comprehensive analytical framework is proposed to model the cumulative error along the entire MVM processing chain. The model accounts for three principal sources of uncertainty: (i) weight programming inaccuracies intrinsic to PCM variability, (ii) input encoding errors due to DAC quantization, and (iii) output errors stemming from ADC digitization. This unified error propagation model is validated against experimental data and provides theoretical insight into system-level trade-offs and design optimizations. The thesis closes with a brief analysis of a possible future development of the system. Overall, this work advances the understanding and practical implementation of PCM-based AIMC systems, demonstrating their potential to enable low-power, high-efficiency AI accelerators suitable for edge computing applications.
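The abstract does not detail the two-phase encoding, so the following is only one plausible reading, given as a hedged sketch: because a pulse duration cannot be negative, the positive inputs are applied in a first pass, the magnitudes of the negative inputs in a second pass, and the two results are subtracted. The function name `two_phase_mvm` and the use of signed weights (which in hardware would need their own mapping onto conductances) are assumptions, not the thesis's implementation.

```python
import numpy as np

def two_phase_mvm(weights, x):
    """Signed-input MVM built from two passes with non-negative inputs.

    weights : (rows, cols) weight matrix (idealized, noise-free)
    x       : (rows,) signed input vector
    """
    weights = np.asarray(weights, dtype=float)
    x = np.asarray(x, dtype=float)
    x_pos = np.maximum(x, 0.0)    # phase 1: positive inputs only
    x_neg = np.maximum(-x, 0.0)   # phase 2: magnitudes of negative inputs
    y_pos = x_pos @ weights       # first analog pass
    y_neg = x_neg @ weights       # second analog pass
    # Subtracting the two passes restores the sign of the inputs.
    return y_pos - y_neg

# Hypothetical example values.
W = np.array([[1.0, -2.0],
              [0.5, 4.0],
              [3.0, 0.0]])
x = np.array([2.0, -1.0, 0.5])
y = two_phase_mvm(W, x)
print(y)  # matches the direct signed product x @ W
```

Since x = x_pos - x_neg and the product is linear, the subtraction reproduces the signed result exactly in the ideal case; on real hardware each pass carries its own programming, DAC, and ADC errors, which is where an error model of the kind described above becomes relevant.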
2024
Files in this item:
Thesis_Ferrari_Filippo_Maria_PDF_A.pdf (Adobe PDF, 17.57 MB), under embargo until 06/04/2027

Users may download and share the full-text documents available in UNITESI UNIPV in accordance with the Creative Commons CC BY NC ND license.
For further information, or to check whether the file is available, write to: unitesi@unipv.it.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14239/33542