Analysis of alternative splicing events affecting the subcellular localization of transcripts
MURESAN, DIANA ANDREEA
2021/2022
Abstract
Splicing is the main post-transcriptional regulatory process. It removes introns from pre-mRNA transcripts and joins exons to form a mature mRNA molecule, which can be exported to the cytoplasm or retained in the nucleus. Transcripts produced from protein-coding genes are mainly exported to the cytoplasm, where they are translated into proteins. In contrast, long non-coding RNAs (lncRNAs) act as RNA molecules and are frequently retained in the nucleus. Splicing can be constitutive (i.e., always producing the same assortment of exons) or alternative (i.e., combining different exons or combining exons in different arrangements). There are seven types of alternative splicing events: exon skipping, alternative 5'-splice site, alternative 3'-splice site, mutually exclusive exons, intron retention, alternative first exon and alternative last exon. It has been estimated that 90% to 95% of human genes undergo alternative splicing, which thus represents a fundamental mechanism for increasing proteome diversity. Many alternative splicing events alter the properties of the encoded proteins by affecting functional domains or altering localization signals; the same has been described for lncRNAs, which undergo alternative splicing to a similar extent as protein-coding genes and for which subcellular localization is essential to their function. The aim of this study is to identify alternative splicing events that alter the subcellular localization of specific transcripts. To this end, we analysed RNA-sequencing data obtained from the nuclear and cytosolic fractions of seven different cell lines, available in the ArrayExpress database under accession number E-GEOD-30567.
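To make the idea of a splicing event that differs between fractions concrete, the following is a minimal sketch rather than the method used in this study. It assumes that an exon-skipping event is summarised by the "percent spliced in" (PSI) value, a commonly used metric, and that the nuclear and cytosolic fractions are compared through the difference in PSI. The abstract does not name a metric, tool or threshold, so the names `JunctionCounts`, `psi`, `delta_psi` and `classify`, the read counts and the 0.1 cut-off are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class JunctionCounts:
    """Read counts for one exon-skipping event in one cellular fraction (hypothetical)."""
    inclusion: int  # reads supporting the isoform that includes the exon
    skipping: int   # reads supporting the isoform that skips the exon


def psi(counts: JunctionCounts) -> Optional[float]:
    """Return inclusion / (inclusion + skipping), or None if there are no reads."""
    total = counts.inclusion + counts.skipping
    if total == 0:
        return None
    return counts.inclusion / total


def delta_psi(nuclear: JunctionCounts, cytosolic: JunctionCounts) -> Optional[float]:
    """Return PSI(nuclear) - PSI(cytosolic), or None if either PSI is undefined."""
    psi_nuclear = psi(nuclear)
    psi_cytosolic = psi(cytosolic)
    if psi_nuclear is None or psi_cytosolic is None:
        return None
    return psi_nuclear - psi_cytosolic


def classify(dpsi: Optional[float], threshold: float = 0.1) -> str:
    """Label the direction of the shift; the threshold is an illustrative assumption."""
    if dpsi is None:
        return "not quantifiable"
    if dpsi >= threshold:
        return "inclusion isoform enriched in nucleus"
    if dpsi <= -threshold:
        return "inclusion isoform enriched in cytosol"
    return "no clear shift"


if __name__ == "__main__":
    # Made-up counts for a single event, for illustration only.
    nuclear = JunctionCounts(inclusion=75, skipping=25)
    cytosolic = JunctionCounts(inclusion=25, skipping=75)
    print(classify(delta_psi(nuclear, cytosolic)))
```

A real analysis would also use replicates and a statistical test before calling an event significant; the fixed cut-off here only illustrates the direction of the comparison.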
After applying a standard bioinformatics pipeline for quality control, genome alignment and transcript quantification, we performed a differential alternative splicing analysis, which allowed us to identify alternative splicing events that alter the subcellular localization of the corresponding transcript. The analysis was conducted separately for protein-coding genes and long non-coding RNAs because of their overall differences in expression levels. We obtained several significant events in each of the seven cell lines, which we further characterized to determine whether they were enriched in the nuclear or the cytoplasmic fraction. Furthermore, we investigated in more depth the alternative splicing events that were specific to the pathological status of each cell line.
Users may download and share the full-text documents available in UNITESI UNIPV in compliance with the Creative Commons CC BY NC ND license.
For further information, or to check whether the file is available, write to: unitesi@unipv.it.
https://hdl.handle.net/20.500.14239/15596