Design and Implementation of an Automated ETL Workflow for EGMS Ground Motion Data Using Python and QGIS

HAJIZADEHIGDIR, SINA
2024/2025

Abstract

The European Ground Motion Service (EGMS) provides ground-motion measurements derived from Sentinel-1 InSAR data over Europe, which are particularly useful for geospatial analysis in landslide-prone regions. However, EGMS datasets are very large and are organized into bursts and swaths that overlap to varying degrees. The overlap produces duplicate and overlapping entries, making it difficult to use EGMS ground-motion data directly in GIS software. This thesis proposes and applies an automated, repeatable Extract, Transform, and Load (ETL) process for EGMS point datasets that produces cleaned and normalized datasets from the raw EGMS files. The approach combines Python-based data-engineering tools (file organization, naming, burst-to-swath fusion, elimination of repeated and overlapping entries, and export of cleaned datasets) with QGIS-based geoprocessing (clipping the datasets to predefined study-area polygons and overlaying the results on the SIFRAP landslide map). The methodology is applied to several study areas in Northern Italy. To improve usability for non-programmers, the Python processing pipeline was also compiled as an executable (.exe) with a graphical user interface. This tool merges the datasets and filters duplicates automatically, without requiring a Python installation, and writes processing logs. The main outcome is a scalable workflow that takes raw EGMS download files and produces processed, spatially filtered datasets ready for integration into QGIS and for use in future studies (e.g., deformation or machine learning analyses). Although it is not intended to analyze EGMS deformation data itself, this work provides a starting point and framework for future EGMS-related geospatial analyses.
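The merge-and-deduplicate step described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the column names (`pid`, `easting`, `northing`, `mean_velocity`), the helper names, and the rule of treating points with identical rounded coordinates as duplicates are all assumptions made for illustration, and the sample rows are invented.

```python
import csv
import io


def read_csv_text(text):
    """Parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def merge_and_deduplicate(tables, key_fields=("easting", "northing"), precision=1):
    """Merge rows from several tables, keeping the first row seen per location.

    Assumption: two points are duplicates when their coordinates agree
    after rounding to `precision` decimal places.
    """
    seen = set()
    merged = []
    for rows in tables:
        for row in rows:
            key = tuple(round(float(row[f]), precision) for f in key_fields)
            if key in seen:
                continue  # same location already kept from an earlier burst
            seen.add(key)
            merged.append(row)
    return merged


# Two hypothetical overlapping bursts (invented values).
burst_a = read_csv_text(
    "pid,easting,northing,mean_velocity\n"
    "A1,100.0,200.0,-1.2\n"
    "A2,110.0,200.0,0.4\n"
)
burst_b = read_csv_text(
    "pid,easting,northing,mean_velocity\n"
    "B1,110.0,200.0,0.5\n"
    "B2,120.0,200.0,2.0\n"
)

cleaned = merge_and_deduplicate([burst_a, burst_b])
```

Here the point `B1` shares its location with `A2`, so only `A1`, `A2`, and `B2` remain in `cleaned`. A real pipeline would need an explicit policy for choosing which overlapping measurement to keep.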
2024
Files in this record:
File: Tesi Sina.pdf
Access: open access
Description: Tesi Sina
Size: 19.05 MB
Format: Adobe PDF

Users may download and share the full-text documents available in UNITESI UNIPV in accordance with the Creative Commons CC BY NC ND license.
For more information, and to check whether the file is available, write to: unitesi@unipv.it.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14239/34082