Applicazione del diffusion model alla generazione di disegni
Application of the diffusion model to sketch generation
STELLA, ANDREA
2022/2023
Abstract
Sketch generation is a challenging problem in image processing because it requires both emulating human creativity and representing visual content accurately. This thesis explores the application of the diffusion model, a generative model based on stochastic diffusion processes, to this problem. The diffusion model progressively removes noise from random data to obtain sketches consistent with the provided labels. This reverse process is a sequence of steps; at each step a U-Net, a convolutional neural network known for its ability to capture both local and global details in images, predicts the noise to remove, bringing the generated data closer to the desired sample. Two approaches to sketch generation are examined. The first generates sketches directly as images, learning from the images in the training set; the second generates trajectories defined by a series of points, each represented by offsets and pen-states. Both use the diffusion model for the generation process. The experiments show that the trajectory-based approach produces lower-quality results than the image-based one: many of its generated images contain more noise, especially for the more complex drawing classes. Some samples generated with the trajectory-based approach nevertheless show high creative potential. Overall, the experiments indicate both the challenges and the opportunities of sketch generation with the diffusion model.
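The reverse process described above can be illustrated with a minimal sketch. It assumes a DDPM-style parameterization with a linear noise schedule, which this record does not confirm as the thesis's choice. The names `linear_beta_schedule`, `cumulative_alpha_bars`, `reverse_step`, `sample` and `dummy_noise_predictor` are hypothetical, and the predictor is a placeholder that stands in for the trained, label-conditioned U-Net.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (an assumed schedule, not the thesis's)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alpha_bars(betas):
    """Running product of (1 - beta_t), used to scale the predicted noise."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def dummy_noise_predictor(x_t, t, label):
    """Hypothetical stand-in for the U-Net: always predicts zero noise."""
    return [0.0 for _ in x_t]


def reverse_step(x_t, t, label, betas, alpha_bars, predict_noise, rng):
    """One denoising step: subtract the predicted noise, rescale, re-add noise."""
    beta_t = betas[t]
    eps = predict_noise(x_t, t, label)
    coef = beta_t / math.sqrt(1.0 - alpha_bars[t])
    scale = 1.0 / math.sqrt(1.0 - beta_t)
    mean = [scale * (x - coef * e) for x, e in zip(x_t, eps)]
    if t == 0:
        return mean  # the final step adds no fresh noise
    sigma = math.sqrt(beta_t)
    return [m + sigma * rng.gauss(0.0, 1.0) for m in mean]


def sample(num_values, label, num_steps=50,
           predict_noise=dummy_noise_predictor, seed=0):
    """Start from Gaussian noise and apply the reverse steps from last to first."""
    rng = random.Random(seed)
    betas = linear_beta_schedule(num_steps)
    alpha_bars = cumulative_alpha_bars(betas)
    x = [rng.gauss(0.0, 1.0) for _ in range(num_values)]
    for t in reversed(range(num_steps)):
        x = reverse_step(x, t, label, betas, alpha_bars, predict_noise, rng)
    return x


if __name__ == "__main__":
    # A flattened 4x4 "image"; the label value is an arbitrary illustration.
    print(sample(16, label=3)[:4])
```

With the placeholder predictor the output is still noise. A recognizable sketch would only emerge once `predict_noise` is replaced by a network trained to predict the noise for the given label.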
Future research directions include improving the pre-processing of the training data by detecting and removing incomplete sketches, and improving the trajectory-based approach by optimizing the model architecture and automating the management of the pen-state threshold. With further development, these techniques could enable the generation of creative, high-quality sketches in a variety of contexts and practical applications.
Users may download and share the full-text documents available in UNITESI UNIPV in accordance with the Creative Commons CC BY NC ND license.
For more information, and to check whether the file is available, write to: unitesi@unipv.it.
https://hdl.handle.net/20.500.14239/16826