Analysis of the evolution of existential themes over the course of a psychotherapy through topic modeling techniques: a contribution to Process Outcome Research

This thesis aims to contribute to the existing literature on Process Outcome Research (POR), the line of research that investigates processes and outcomes in order to understand the effectiveness of the treatments under study. For the work to be readily understood, it is therefore necessary to introduce POR by placing it in its scientific context, illustrating its history and evolution, and highlighting its advantages and contributions as well as its limitations. Because this is research in psychotherapy, it is also necessary to introduce and describe the key concepts and basic assumptions of Artificial Intelligence, given the increasingly widespread application of Machine Learning and Natural Language Processing models. The specific objective of this work is to investigate two models in particular, LDA (Latent Dirichlet Allocation) and BERTopic (a topic modeling technique built on BERT, Bidirectional Encoder Representations from Transformers), and to compare them in order to assess their effectiveness in the analysis of psychotherapy transcripts. The case analyzed is the psychotherapy of I., consisting of 28 sessions conducted by a specialist in the cognitive neuropsychological approach; the sessions were audio and video recorded and then transcribed in accordance with the standards proposed by Mergenthaler and Stinson (1992). A Microsoft Excel worksheet was subsequently created in which the topics were labeled; the methodological analyses on the dataset were conducted following Giorgi's phenomenological method (1985). The hypotheses underlying this research are that the two models are able to extract topics consistent with the labels developed by the researchers and to identify new ones. Furthermore, on the basis of how it works and of previous implementations, BERTopic is expected to offer more precise and interpretable results.
The results extracted by the two algorithms made it possible to analyze the dataset with a mixed methodology, both qualitative and quantitative. Specifically, substantial coherence emerged between the labels identified in previous work and the topics extracted by the models; furthermore, BERTopic was able to extract very specific topics and to identify two that had never been encountered before, one concerning the temporal dimension and the other referring to geographical connotations.
