Proactive behaviours in mixed-initiative task-oriented dialogues: corpus annotation and analysis for computational purposes.

Proactivity is a characteristic phenomenon of collaborative human-human interaction: it is regarded as the ability to provide the addressee with useful information that was not explicitly requested. In task-oriented dialogues, that is, dialogues that aim at achieving one or more particular goals (booking a train ticket, selecting a fitting job offer, and so on), the term proactivity refers to the loci in the dialogue where a participant attempts to achieve the selected goal more quickly and efficiently by exhibiting a proactive behaviour, which by definition: (i) is self-prompted and not simply reactive, that is, the speaker does not act merely in response to requests made by the other participant; (ii) is in some way effective for achieving the dialogue goal, since the speaker’s behaviour is long-term and goal-directed, anticipating future states and needs. This research project introduces the research field of proactivity in dialogue and then offers a corpus-based analysis of proactive behaviour in several mixed-initiative, task-oriented, human-human dialogue corpora. The analysed datasets, namely the Italian NESPOLE! Corpus (2003), the Italian Ubuntu Chat Corpus (2012), MultiWOZ 2.2 (2021) and the JILDA Corpus (2021), were collected through various methodologies, and their dialogues cover different domains. From these datasets a significant sample was extracted and annotated, both to recognise proactive behaviour and to classify it by assigning a communicative function to the dialogue act associated with the utterance that holds the proactive content. The annotation relies on an essential schema and is carried out at the utterance level, in dialogue turns produced by either human participant in the conversation. 
The annotation schema is available to anyone wishing to examine it thoroughly or replicate the labelling: it is attached to the thesis as Appendix A, ‘Guidelines for the annotation of proactivity in task-oriented dialogues’. Our annotation work can support linguistic research on proactive behaviour and its occurrences in task-oriented dialogues, subjects that are also of interest because the scientific community has only recently turned its attention to proactivity, so that dedicated studies remain scarce. The annotation can also offer input, in the applied and computational field, to the implementation of utterance-level proactive behaviours in dialogue systems aimed at human-machine collaborative interaction: artificial models indeed lack the cooperative strategies that humans regularly employ, such as grounding phenomena, clarifying questions or reformulations and, of greatest interest to us, proactive behaviour.

Proactivity is a typical feature of collaborative conversation between humans: it corresponds to the ability to provide the interlocutor with useful information that has not previously been requested. In the context of task-oriented dialogues, that is, dialogues aimed at achieving a goal (booking a dinner at a restaurant, selecting a suitable job offer, and so on), the term proactivity is used to indicate, in general, the points in the dialogue where an interlocutor, in order to reach the set goal sooner and better, shows a proactive behaviour which by definition: (i) originates on the initiative of the speaker who offers it; (ii) contributes new information, that is, information neither previously introduced nor requested by the interlocutor; (iii) is in some way useful for achieving the goal of the dialogue. This thesis offers a theoretical introduction to the study of proactivity in dialogue and a corpus-based analysis of the proactive behaviour found in several dialogue datasets of mixed-initiative, task-oriented conversations between humans. The datasets examined differ in collection methodology and domain of interest; they are the English-language corpus MultiWOZ 2.2 (2021) and the Italian-language corpora NESPOLE! (2003), Ubuntu Chat Corpus (2012) and JILDA (2021). A significant sample of dialogues was annotated, following an essential schema, to identify proactivity and to classify it on the basis of the communicative function of the dialogue act in which the proactive content was observed. The annotation is placed at the utterance level, in the dialogue turns attributable to each human participant in the conversation. The document ‘Guidelines for the annotation of proactivity in task-oriented dialogues’ is available as an appendix to the thesis and can be consulted to examine the annotation schema and replicate its use. 
The annotation proves effective in supporting linguistic research on the phenomenon of proactivity and its occurrences in task-oriented dialogue contexts, which are also of interest because dedicated studies are scarce, the scientific community having only recently turned its attention to the topic. It can also offer input in a more strictly applied and computational setting, for implementing utterance-level proactive behaviours in dialogue systems for collaborative human-machine interaction, where there is a shortage of cooperative strategies such as grounding phenomena, clarifying questions and reformulations or, of greatest interest to us, proactive phenomena.