This work is concerned with estimating and forecasting the return volatility of the SPDR S&P 500 Trust (SPY) using high-frequency data, paying particular attention to information flow. The first chapter introduces the two main theories that aim to explain the mechanism by which new information affects price variability: the Mixture of Distributions Hypothesis (MDH) and the Sequential Information Arrival Hypothesis (SIAH). We estimate volatility by means of the realized variance (RV) estimator, which allows the quadratic variation (QV) of the return process to be inferred with precision. To capture the long memory of the RV series, we employ a collection of Heterogeneous Autoregressive (HAR) models. The “OLS nature” of the HAR specification makes it easy to incorporate trading volume as an additional explanatory variable that proxies information flow. We use the Empirical Mode Decomposition (EMD) method to decompose trading volume into a long-run and a short-run component and test whether the resulting series reveal additional information that is useful in forecasting volatility. The predictions of all the models considered are then compared by means of the Model Confidence Set (MCS) procedure. No significant differences in prediction accuracy are detected, suggesting that the contribution of volume is negligible. Therefore, we find evidence in favor of the MDH.
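To make the RV and HAR steps concrete, the following is a minimal sketch, not the thesis's actual implementation. It assumes the conventional HAR horizons of 1, 5 and 22 trading days (the abstract does not state the windows used), a one-day-ahead target, and an optional extra regressor such as a volume series. All function names are hypothetical.

```python
import numpy as np

def realized_variance(intraday_returns):
    """Daily RV: sum of squared intraday returns (one row per day)."""
    r = np.asarray(intraday_returns, dtype=float)
    return np.sum(r ** 2, axis=1)

def har_design(rv, extra=None):
    """Build HAR regressors for the one-day-ahead target rv[t + 1].

    Columns: constant, daily RV, 5-day average RV, 22-day average RV,
    and (optionally) one extra regressor observed at day t, e.g. volume.
    The 1/5/22 windows are an assumed convention, not taken from the source.
    """
    rv = np.asarray(rv, dtype=float)
    rows, targets = [], []
    for t in range(21, len(rv) - 1):
        row = [
            1.0,
            rv[t],                      # daily component
            rv[t - 4:t + 1].mean(),     # weekly component (5 days)
            rv[t - 21:t + 1].mean(),    # monthly component (22 days)
        ]
        if extra is not None:
            row.append(float(extra[t]))
        rows.append(row)
        targets.append(rv[t + 1])
    return np.array(rows), np.array(targets)

def fit_har(rv, extra=None):
    """Estimate the HAR coefficients by ordinary least squares."""
    X, y = har_design(rv, extra)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta
```

Because the model is linear in its regressors, adding a volume proxy only appends a column to the design matrix, which is the sense in which the "OLS nature" of HAR makes the extension straightforward.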
Trading Volume and Realized Variance Forecasting: an Empirical Analysis of SPY
MORELLI, ALESSANDRO
2021/2022
Users may download and share the full-text documents available in UNITESI UNIPV in accordance with the Creative Commons CC BY NC ND license.
For more information, or to check whether the file is available, write to: unitesi@unipv.it.
https://hdl.handle.net/20.500.14239/2838