MiRNAs are small non-coding RNA molecules that act as negative regulators of gene expression. Conserved in many organisms, miRNAs are involved in numerous biological processes, and their deregulation has been observed in a large number of human diseases. Their discovery, which dates back to 1993, revolutionized molecular biology. The phrase "man is what he eats" could take on new meaning in light of studies by a group of Chinese researchers, who recently verified that miRNAs contained in plants and assimilated through food can accumulate in the tissues and fluids of the individual who consumes them and interact with the expression of that individual's genes. This is evident in the case of a rice miRNA, which binds to receptors that control the removal of LDL from the bloodstream and blocks their activity. With the advent of new technologies, several miRNAs have been discovered in foods of both plant origin (rice, apples) and animal origin (bovine milk). However, no existing miRNA database (e.g. GEO, TCGA) reports specific information on their presence in the human diet. For example, miRBase reports 808 bovine miRNAs, while only 243 of them have been found in cow's milk. Such information could represent a new horizon in miRNA research. My thesis, which is part of a much wider project carried out by the Laboratory of Bioinformatics and Synthetic Biology (Dept. of Industrial and Information Engineering) in collaboration with the Dept. of Biology and Biotechnology, aimed to study the inter-species regulation potential of miRNAs of a model legume (Medicago truncatula) towards human genes. In particular, for each M. truncatula miRNA, the work compared the predicted human and plant target genes in terms of DNA coding sequence (CDS), protein sequence, and the protein domains associated with each protein, in order to evaluate the genetic and functional similarities between the inter-species targets.
A total of 372 miRNAs expressed by M. truncatula were examined, and their predicted targets in plant and human were identified using bioinformatics tools such as psRNATarget and RNAhybrid. The analysis pipeline took as input the lists of miRNAs, predicted plant targets, and predicted human targets, in order to obtain a similarity score between the targets. The pipeline was implemented in Matlab (MathWorks). Starting from the target lists (gene codes for M. truncatula and NM codes for H. sapiens), CDS and protein sequences were retrieved from two FASTA files containing the transcriptome and proteome of M. truncatula, or with the getgenbank command for H. sapiens. The retrieved sequences were organized in a cell array in which each miRNA was associated with its targets and their CDS and protein sequences in the two species. Finally, the Smith-Waterman algorithm was used to align each individual plant target gene and protein against all the human target genes and proteins of the same miRNA; the score (p-value) of each alignment was calculated with a random permutation method. Using a threshold of p-value < 0.05, 309 possible similarities between human and plant target sequences were identified, for both CDS and proteins, and these are currently under study. These similarities were further confirmed by analyzing the domains shared by the proteins, obtained through Pfam. This bioinformatics study represents the starting point for investigating the potential for miRNA-mediated regulation between the two species, although the results will still have to be validated experimentally to assess the actual regulation exerted by the miRNAs under study.
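The alignment and permutation step described in the abstract can be illustrated with a minimal Python sketch. The thesis pipeline was written in Matlab and the abstract does not give its scoring scheme, so the function names, the match/mismatch/gap values, the number of permutations, and the choice to shuffle the human sequence are all assumptions made for illustration, not the author's implementation.

```python
import random


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Uses a simple identity scoring scheme with a linear gap penalty
    (illustrative values, not taken from the thesis).
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def permutation_p_value(plant_seq, human_seq, n_perm=200, seed=0):
    """Estimate a p-value for the observed score by shuffling human_seq.

    The p-value is the fraction of shuffled sequences that score at least
    as well as the real one against plant_seq.
    """
    rng = random.Random(seed)
    observed = smith_waterman_score(plant_seq, human_seq)
    residues = list(human_seq)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(residues)
        if smith_waterman_score(plant_seq, "".join(residues)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)
```

In a pipeline like the one described, each plant target of a miRNA would be compared against every human target of the same miRNA, and pairs with a p-value below 0.05 would be kept as candidate similarities. The same two functions work for both CDS and protein strings, although a real protein comparison would normally use a substitution matrix instead of plain identity scoring.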
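Sviluppo di una pipeline bioinformatica per lo studio della potenziale regolazione genica nell’uomo da parte di miRNA di pianta (Development of a bioinformatics pipeline for studying the potential regulation of human genes by plant miRNAs)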
BILELLO, VITO
2017/2018
Users may download and share the full-text documents available in UNITESI UNIPV in accordance with the Creative Commons CC BY NC ND licence.
For further information and to check whether the file is available, write to: unitesi@unipv.it.
https://hdl.handle.net/20.500.14239/25970