Using the MuZero Reinforcement Learning for Stand Allocation at the Orio al Serio Airport

NURKOO, ASHINA
2023/2024

Abstract

This thesis presents the development and integration of a reinforcement learning framework for optimizing airport stand allocation, with a specific focus on the turnaround process at Bergamo Orio al Serio Airport (BGY). The work has two main components: an event-driven simulator that reproduces aircraft operations and ground handling processes using real-world flight schedules and stochastic delay models, and an adaptation of MuZero, a model-based reinforcement learning algorithm, that interacts with and learns from this asynchronous environment. MuZero, developed by DeepMind, achieves superhuman performance in board games such as Go, chess, and shogi by learning a model of the environment without prior access to its dynamics. Because MuZero is typically applied to synchronous, fully observable domains, deploying it in an event-driven airport simulation required substantial architectural adaptations. A key challenge was building the simulator from the ground up to model the airport’s operational dynamics accurately. Further challenges included integrating MuZero with a dynamic, partially observable state space, generating valid actions in real time, and synchronizing MuZero’s planning process with the simulator’s event-based execution. Training ran over multiple self-play iterations using actual operational data from four months of airport activity. The agent was evaluated by pitting it against previous versions of its policy, and it showed progressive improvements in efficiency and decision quality. The integration was validated through standalone testing of the simulator and through a complete reinforcement learning loop. This research demonstrates the feasibility and potential of applying advanced reinforcement learning algorithms such as MuZero to real-world, stochastic, asynchronous logistics environments. It lays a foundation for future applications in airport operations, logistics, and broader multi-agent coordination scenarios.
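
As a rough illustration of the event-driven setting described in the abstract, the following minimal Python sketch shows how a simulator might ask for a decision only when an arrival event fires, offering just the currently free stands as valid actions. It is not the thesis's implementation: every name here (`Event`, `valid_actions`, `run_simulation`, `first_free`) is hypothetical, the stochastic delay models are omitted, and the `policy` callback is a simple stand-in for the point where a MuZero-style planner would be consulted.

```python
import heapq
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(order=True)
class Event:
    """A timestamped event; ordering uses (time, seq) only."""
    time: float
    seq: int
    kind: str = field(compare=False)        # "arrival" or "departure"
    flight: str = field(compare=False)
    turnaround: float = field(compare=False, default=0.0)


def valid_actions(stands: Dict[str, Optional[str]]) -> List[str]:
    """Return the free stands; these are the only legal actions right now."""
    return sorted(s for s, occupant in stands.items() if occupant is None)


def run_simulation(
    schedule: List[Tuple[str, float, float]],   # (flight, arrival time, turnaround)
    stand_ids: List[str],
    policy: Callable[[str, List[str]], str],    # hypothetical stand-in for the agent
) -> List[Tuple[float, str, Optional[str]]]:
    stands: Dict[str, Optional[str]] = {s: None for s in stand_ids}
    queue: List[Event] = []
    seq = 0
    for flight, arrival, turnaround in schedule:
        heapq.heappush(queue, Event(arrival, seq, "arrival", flight, turnaround))
        seq += 1

    log: List[Tuple[float, str, Optional[str]]] = []
    while queue:
        ev = heapq.heappop(queue)               # jump straight to the next event
        if ev.kind == "departure":
            for stand, occupant in stands.items():
                if occupant == ev.flight:
                    stands[stand] = None        # the stand becomes free again
            continue

        actions = valid_actions(stands)
        if not actions:
            # Assumption of this sketch: a flight with no free stand is
            # simply recorded as unallocated.
            log.append((ev.time, ev.flight, None))
            continue

        stand = policy(ev.flight, actions)      # the agent acts only at arrivals
        if stand not in actions:
            raise ValueError(f"policy chose an invalid stand: {stand}")
        stands[stand] = ev.flight
        log.append((ev.time, ev.flight, stand))
        heapq.heappush(
            queue, Event(ev.time + ev.turnaround, seq, "departure", ev.flight)
        )
        seq += 1
    return log


def first_free(flight: str, actions: List[str]) -> str:
    """Trivial baseline policy: take the first free stand."""
    return actions[0]
```

Calling `run_simulation(schedule, ["S1", "S2"], first_free)` with a list of `(flight, arrival, turnaround)` tuples returns one log entry per arrival. The time between decisions is irregular because it is set by the event queue rather than by a fixed step, which is the asynchrony the abstract refers to.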
2023
Using the MuZero Reinforcement Learning for Stand Allocation at the Orio al Serio Airport
Files in this item:
File: NurkooAshinaThesis.pdf
Access: open access
Description: This thesis presents the development and integration of MuZero, a reinforcement learning framework, for optimizing airport stand allocation, with a specific focus on the turnaround process at Bergamo Orio al Serio Airport (BGY).
Size: 2.45 MB
Format: Adobe PDF

Users may download and share the full-text documents available in UNITESI UNIPV in compliance with the Creative Commons CC BY NC ND license.
For more information, and to check whether the file is available, write to: unitesi@unipv.it.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14239/33416