[PDF] DEPCOD: a tool to detect and visualize co-evolution of protein








SVT TB TP 5.1. - Animal Diversity / Phylogeny - T. JEAN

Lycée Valentine Labbé (59) • TB preparatory class • SVT • Part 5 • TP 5.1. Building phylogenetic trees with Phylogène ... Second edition.



TP2: Kinship among living organisms

The Phylogène software and its user guide are available. 1 – The theory! The similarities among Vertebrates show that they are related



Theme 1: The Earth in the Universe, life and the evolution of living things

TP no. 7, seconde: using the database of the Phylogène software. ... To compare living organisms with Phylogène, you must create a table called ...



tp-phylogene-complet.pdf

Working with the "Phylogène" software. The kinship relationships between humans and the other Vertebrates can be clarified by



Answer key FA7/TP7: A look at human evolution. Homo

Answer key TP7: Building a phylogenetic tree with software: Phylogène. The sequences of an enzyme were chosen: NAD (in the TP: globins; textbook ...







Engaging middle-school pupils in activities to train and assess them

use of a computerized database (such as « Phylogène collège ») 12 Practical work and the diversification of teaching approaches, IGEN report



Terminale S - Kinship among living organisms

Phylogène and can be processed using functions specific to this software, such as tree construction, for example.



Academy of: XXXXX

Seconde: 3 2nde. 35. 80. 34. 35. Student feedback on: a. Their involvement ... TP. Following or proposing a protocol. Differentiated activities.



THE BASICS OF CLASSIFICATION

• Open the Phylogène software, then open the DNA sequence file • Select in the matrix (lower part of the page): o 3 Homo sapiens (Italien1 Français Néerlandais) o 3 Neanderthals (NEANDERTHAL_CROATIE VINDIJA ELSIDRON) o 1 Denisovan (DENISOVA) and o 1 bonobo (PAN_PANISCUS)



TP No. 2: Comparing present-day biodiversity with that of the past

TP2 and the information provided by the Phylogène software to reconstruct the forest that gave rise to the main coal deposits of this period (300 million years ago) 1 Lepidodendron 2 Sigillaria 3 Cordaites 4 Calamites 5 Tree fern. Using the information provided, complete the table below. Species



The place of humans among the Vertebrates

Working with the "Phylogène" software. The kinship relationships between humans and the other Vertebrates can be clarified using anatomical, morphological, embryological, chromosomal and molecular data on currently living organisms. Objectives:

What are the principles of phylogenetic classification?

Linnaeus, 1761. To classify these species, phylogenetic classification relies on certain principles. One of the first is the economy of hypotheses, also called the principle of parsimony: to best link species to one another, the proposed kinship must be probable.

How does phylogeny work?

Phylogeny works over a much longer time span, which can reach several hundred million years, using indirect traces (characters inherited from ancestors) and incomplete, rare traces (fossils).

What types of data does Phylogène contain?

For this purpose, Phylogène provides a large body of data on many species: way of life, morphology, anatomy, as well as molecular data (gene and protein sequences).

What is molecular phylogeny?

Molecular phylogeny uses the genes of living organisms to build phylogenetic trees by comparing molecular characters (nucleotide or protein sequences). It attempts to trace the accumulation of mutations in genomes over the course of species evolution.

Nucleic Acids Research, 2022, Vol. 50, Web Server issue, W246-W253
Published online 10 May 2022
https://doi.org/10.1093/nar/gkac349

DEPCOD: a tool to detect and visualize co-evolution of protein domains

Fei Ji 1,2,†, Gracia Bonilla 1,2,†, Rustem Krykbaev 1, Gary Ruvkun 1,2, Yuval Tabach 3 and Ruslan I. Sadreyev 1,4,*

1 Department of Molecular Biology, Massachusetts General Hospital, Boston, MA, USA
2 Department of Genetics, Harvard Medical School, Boston, MA, USA
3 Department of Developmental Biology and Cancer Research, Faculty of Medicine, The Hebrew University of Jerusalem, Ein Kerem 9112102, Israel
4 Department of Pathology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA

Received March 08, 2022; Revised April 13, 2022; Editorial Decision April 21, 2022; Accepted April 26, 2022

ABSTRACT

Proteins with similar phylogenetic patterns of conservation or loss across evolutionary taxa are strong candidates to work in the same cellular pathways or engage in physical or functional interactions. Our previously published tools implemented our method of normalized phylogenetic sequence profiling to detect functional associations between non-homologous proteins. However, many proteins consist of multiple protein domains subjected to different selective pressures, so using the protein domain as the unit of analysis improves the detection of similar phylogenetic patterns. Here we analyze sequence conservation patterns across the whole tree of life for every protein domain from a set of widely studied organisms. The resulting new interactive webserver, DEPCOD (DEtection of Phylogenetically COrrelated Domains), uses a selected pre-defined protein domain or a user-supplied sequence as a query to detect other domains from the same organism that have similar conservation patterns. Top similarities on two evolutionary scales (the whole tree of life or eukaryotic genomes) are displayed along with known protein interactions and shared complexes, pathway enrichment among the hits, and detailed visualization of sources of detected similarities. DEPCOD reveals functional relationships between often non-homologous domains that could not be detected using whole-protein sequences. The web server is accessible at http://genetics.mgh.harvard.edu/DEPCOD.

* To whom correspondence should be addressed. Tel: +1 617 643 5697; Email: sadreyev@molbio.mgh.harvard.edu

† The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First Authors.

© The Author(s) 2022. Published by Oxford University Press on behalf of Nucleic Acids Research.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

Downloaded from https://academic.oup.com/nar/article/50/W1/W246/6583236 by guest on 22 September 2023

INTRODUCTION

A shared evolutionary history of two proteins across various organisms may suggest similar functions, shared cellular pathways and protein complexes, or functional interactions between these proteins, regardless of whether they are homologous to each other (1-6). Our first-generation PhyloGene webserver (7), publicly available since 2015, implemented normalized phylogenetic profiling (NPP) of whole-protein sequences, which has been used to detect protein functional associations and predict the function of previously uncharacterized proteins, identify new members of metabolic and regulatory pathways, and reveal protein and pathway adaptations of specific organisms (8-17). Proteins that act in the same pathway often have very similar patterns of conservation, retention, or loss of their homologs in particular taxa of organisms, which can be represented in the form of their phylogenetic profiles. As an intensively studied example, a query of one electron transport complex protein will reveal many other complex I proteins with no homology to the query protein but a similar phylogenetic pattern of sequence conservation, instantly re[...] can be applied to less studied pathways as well.

[...] between individual representatives of the same sequence family in taxonomically distant organisms. However, different domains within the same protein often evolve under different evolutionary constraints and occur in various combinations in different species, especially between taxa at higher levels of the taxonomic hierarchy. Protein domains are the functional modules of proteins that can fold, function, and evolve, often independently of other domains in the same protein. Variation of particular protein domains, or even abrupt changes of domain architecture during evolution, may often be due to the relaxation of past functional requirements and changing evolutionary pressures on domain function as organisms specialize for new niches or evolve displacing pathways. These differences in conservation patterns among different domains of the same protein reduce the level of sequence similarity in analyses of conservation at the level of the whole protein. Therefore, focusing on protein domains as independent evolutionary units should bring more biological relevance and clearer correlations in detecting similar patterns of sequence conservation. Every protein domain in a given genome can be assigned a phylogenetic profile of its relative conservation, variation, or loss based on its sequence similarity to the homologs across hundreds of diverse eukaryotic and prokaryotic species. This phylogenetic pattern can be used to search for other domains with similar patterns of conservation or loss. This general approach has been discussed and implemented previously, but mainly in the context of domain identification within protein sequences (19-21) and prediction of protein-protein interactions (22). To our knowledge, there are no publicly available web server tools for the similarity detection and visualization of phylogenetic profiles of individual protein domains.

Here, we developed a new interactive web server, DEPCOD (DEtection of Phylogenetically COrrelated Domains), which allows the user to submit a query protein domain from an individual protein in a selected organism and (a) instantly identify the taxa of organisms that have conserved, varied, or lost this domain; (b) detect other protein domains in the same organism that have a correlated pattern of sequence conservation across a wide range of taxonomically diverse species; (c) understand the functions of these domains, known physical interactions and shared protein complexes with the query and (d) inspect possible sources and evolutionary relevance of the similarity between their conservation patterns.

This new web server reveals functional relationships between individual domains beyond the detection based on [...] a combination of methodological and functional features: (i) [...] profiles at two scales: 244 eukaryotic genomes or 506 genomes from all three domains of life: Eukaryota, Bacteria, and Archaea; (ii) incorporation of the phylogeny of the searched genomes into the correlation of conservation patterns; (iii) [...] interactions and shared protein complexes between the query and the detected hits; (iv) analysis of GO (25,26), KEGG (27,28) and Reactome (29) pathway enrichment among the detected hits and (v) visualization of details and sources of detected pattern similarities (conservation values for individual species, taxonomic trees, links to the information about detected domains and domain families, etc.).
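Feature (iv) above refers to pathway enrichment among the detected hits. The text does not say which statistical test DEPCOD applies, so the following is only a minimal sketch of one common choice, a one-sided hypergeometric test. The function name, its parameters, and the toy numbers are all hypothetical and are not taken from the DEPCOD implementation.

```python
from math import comb


def hypergeom_enrichment_p(population: int, annotated: int,
                           hits: int, annotated_hits: int) -> float:
    """One-sided hypergeometric p-value, P(X >= annotated_hits).

    population     : number of domains that could have been returned as hits
    annotated      : how many of those belong to the pathway of interest
    hits           : number of domains actually returned as hits
    annotated_hits : how many of the hits belong to the pathway
    """
    upper = min(annotated, hits)
    total = comb(population, hits)  # all ways to draw `hits` domains
    tail = sum(
        comb(annotated, k) * comb(population - annotated, hits - k)
        for k in range(annotated_hits, upper + 1)
    )
    return tail / total


# Toy numbers (made up): 10 domains, 5 in the pathway, 2 hits, both in the pathway.
p_value = hypergeom_enrichment_p(population=10, annotated=5, hits=2, annotated_hits=2)
print(f"enrichment p-value: {p_value:.4f}")
```

In practice such p-values would be computed for every GO term or pathway and then corrected for multiple testing; that step is omitted here.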

MATERIALS AND METHODS

Using PFAM (30) domain annotation within protein sequences in the genomes of widely studied organisms (H. sapiens, M. musculus, D. melanogaster, C. elegans, S. cerevisiae, A. thaliana, E. coli, B. subtilis), we split the whole sequence of each protein into PFAM domains and parts without detected PFAM homologs.

The resulting sequences were used to generate a domain-based normalized phylogenetic profile (NPP) for each domain, similar to our previously described approach (7,9). In brief, each domain was used as a query for a BLASTP search in our comprehensive database compiled from all proteins of 506 representative genomes from all three domains of life.
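The splitting step at the start of this section can be illustrated with a short sketch. It assumes that domain hits arrive as non-overlapping (name, start, end) tuples with 1-based inclusive coordinates. The function name, the `no_domain` label, and the example sequence and coordinates are hypothetical and only serve the illustration.

```python
from typing import List, Tuple

# (label, start, end, subsequence) with 1-based inclusive coordinates
Segment = Tuple[str, int, int, str]


def split_into_segments(sequence: str,
                        domains: List[Tuple[str, int, int]]) -> List[Segment]:
    """Split a protein into annotated domains and the stretches without a domain.

    `domains` is assumed to be non-overlapping (an assumption of this sketch).
    """
    segments: List[Segment] = []
    cursor = 1  # first residue not yet assigned to any segment
    for name, start, end in sorted(domains, key=lambda d: d[1]):
        if start > cursor:  # stretch without a detected domain before this hit
            segments.append(("no_domain", cursor, start - 1,
                             sequence[cursor - 1:start - 1]))
        segments.append((name, start, end, sequence[start - 1:end]))
        cursor = end + 1
    if cursor <= len(sequence):  # trailing stretch without a detected domain
        segments.append(("no_domain", cursor, len(sequence),
                         sequence[cursor - 1:]))
    return segments


protein = "MKTAYIAKQRQISFVKSHFSRQ"           # made-up 22-residue sequence
hits = [("PF_A", 3, 8), ("PF_B", 12, 18)]   # hypothetical domain coordinates
segments = split_into_segments(protein, hits)
for segment in segments:
    print(segment)
```

Each returned segment, whether a domain or an unannotated stretch, could then serve as a separate BLASTP query.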

Scores of top BLAST hits with moderate to high significance were normalized by the BLAST score of the query to itself, and then transformed into genome-specific Z-scores based on the distributions of normalized BLAST scores across a given representative genome. As a result, each domain from the query genomes of model organisms was assigned an NPP, a vector of Z-scores for its closest homolog in each of 506 representative target genomes.

To assess the similarity between evolutionary conservation patterns of two domains from the same query genome, their NPPs were compared to each other. As a measure of similarity between profiles, we used the Pearson correlation coefficient R [...]. We chose Pearson R over the alternative Spearman correlation coefficient (a more robust but less sensitive measure of correlation), since in our tests Pearson R produced more accurate and functionally relevant results. As DEPCOD profiles are based on a larger number of target genomes, we introduced a new modification into the calculation of the Pearson correlation coefficient, which down-weights closely related species among the 506 genomes by weighting each target genome proportionally to the number of genomes sampled from the same taxonomic clade in the NCBI taxonomy (31,32). To estimate the statistical significance of the resulting profile similarity, we calculated a Z-score using the empirical distribution generated by random shuffling of weighted Pearson coefficients across target genomes. Based on extensive manual inspection of DEPCOD hits for multiple queries, we suggest the approximate cutoffs of Pearson R > 0.6 and significance Z-score > 4.0 as a combined criterion of a confident profile similarity to the query. To highlight these confident hits, we introduced a separate column in the output heatmap, 'Correlated and significant'. However, the user is encouraged to inspect the hits beyond this combined criterion, as these domains can sometimes also show functional associations with the query.
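The profile construction and comparison described above can be sketched in plain Python. This is an illustration under several assumptions, not the authors' code. A missing BLAST hit is treated as a score of 0. Each genome's weight is taken as 1 / (number of sampled genomes in its clade), which is one reading of the down-weighting of closely related species, since the text gives no explicit formula. The null distribution is built by shuffling one profile across target genomes and recomputing the weighted R. All function names are hypothetical.

```python
import math
import random
import statistics
from typing import Dict, List, Sequence, Tuple


def build_npps(raw: Dict[str, Dict[str, float]], self_score: Dict[str, float],
               genomes: Sequence[str]) -> Dict[str, List[float]]:
    """Turn best-hit BLAST scores into per-domain vectors of genome-specific Z-scores."""
    # Step 1: normalize by the query's score against itself (missing hit -> 0.0, assumed).
    norm = {d: [raw[d].get(g, 0.0) / self_score[d] for g in genomes] for d in raw}
    npps: Dict[str, List[float]] = {d: [] for d in raw}
    # Step 2: within each target genome, Z-score across all query domains.
    for j in range(len(genomes)):
        column = [norm[d][j] for d in raw]
        mean, sd = statistics.fmean(column), statistics.pstdev(column)
        for d in raw:
            npps[d].append((norm[d][j] - mean) / sd if sd > 0 else 0.0)
    return npps


def clade_weights(clade_of: Sequence[str]) -> List[float]:
    """Assumed weighting: 1 / (number of sampled genomes in the same clade)."""
    counts: Dict[str, int] = {}
    for clade in clade_of:
        counts[clade] = counts.get(clade, 0) + 1
    return [1.0 / counts[clade] for clade in clade_of]


def weighted_pearson(x: Sequence[float], y: Sequence[float],
                     w: Sequence[float]) -> float:
    """Pearson R with per-genome weights; returns 0.0 for a constant profile."""
    total = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / total
    my = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    vx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    vy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    return cov / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def shuffle_zscore(x: Sequence[float], y: Sequence[float], w: Sequence[float],
                   n_shuffles: int = 1000, seed: int = 0) -> Tuple[float, float]:
    """Observed weighted R and its Z-score against a shuffled-profile null."""
    rng = random.Random(seed)
    observed = weighted_pearson(x, y, w)
    permuted = list(y)
    null = []
    for _ in range(n_shuffles):
        rng.shuffle(permuted)  # break the genome-to-genome correspondence
        null.append(weighted_pearson(x, permuted, w))
    sd = statistics.pstdev(null)
    z = (observed - statistics.fmean(null)) / sd if sd > 0 else 0.0
    return observed, z


def is_confident(r: float, z: float, r_cut: float = 0.6, z_cut: float = 4.0) -> bool:
    """Combined criterion suggested in the text: R > 0.6 and Z-score > 4.0."""
    return r > r_cut and z > z_cut
```

A caller would build NPPs once per query genome, then score every pair of domains with `shuffle_zscore` and flag pairs passing `is_confident`. The number of shuffles and the random seed are arbitrary choices of this sketch.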

For each pre-defined standard domain from a model genome, the NPP is pre-computed and stored along with [...] the corresponding Pearson correlation coefficients and their statistical significance. If the user selects a query from the list of pre-defined domains in the model genome of choice, then the query NPP and top hits are quickly retrieved from the pre-computed set of domain NPPs. If the user chooses to submit an amino acid sequence as a query, then the query NPP is calculated by first running BLASTP with the user-supplied sequence against the set of proteins in all target genomes. Depending on the user's choice, these calculations can be performed across 244 eukaryotic genomes or the full set of 506 species including Archaea and Bacteria.
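The two query modes and the two evolutionary scales described above could be organized roughly as follows. This is a hypothetical sketch: the function names, the `precomputed` dictionary, the `compute_from_sequence` callback (a stand-in for the BLASTP-based pipeline), and the scale labels are all assumptions, not part of the DEPCOD code or interface.

```python
from typing import Callable, Dict, List, Optional, Sequence

Profile = List[float]


def get_query_npp(precomputed: Dict[str, Profile],
                  compute_from_sequence: Callable[[str], Profile],
                  domain_id: Optional[str] = None,
                  sequence: Optional[str] = None) -> Profile:
    """Return the query profile from the pre-computed store or compute it on the fly."""
    if domain_id is not None:
        return precomputed[domain_id]      # fast path: pre-defined domain
    if sequence is None:
        raise ValueError("provide either a pre-defined domain ID or a sequence")
    return compute_from_sequence(sequence)  # slow path: user-supplied sequence


def restrict_to_scale(profile: Sequence[float], superkingdoms: Sequence[str],
                      scale: str) -> Profile:
    """Keep all target genomes ("all") or only the eukaryotic ones ("eukaryota")."""
    if scale == "all":
        return list(profile)
    if scale == "eukaryota":
        return [z for z, k in zip(profile, superkingdoms) if k == "Eukaryota"]
    raise ValueError(f"unknown scale: {scale}")


# Toy data: three target genomes and one stored profile (values made up).
superkingdoms = ["Eukaryota", "Bacteria", "Eukaryota"]
store = {"domain_1": [1.2, -0.4, 0.8]}
fake_pipeline = lambda seq: [0.0] * len(superkingdoms)  # placeholder for the BLASTP step

profile = get_query_npp(store, fake_pipeline, domain_id="domain_1")
print(restrict_to_scale(profile, superkingdoms, "eukaryota"))
```

Keeping the scale restriction separate from profile retrieval means the same stored profile can serve both the eukaryote-only and the full comparison.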

To provide information about known physical protein-protein interactions of each hit with the query, we used the confidence values for interactions from BioGRID (23) and Hu.Map 2.0 (24), as well as predictions of protein complexes from Hu.Map 2.0.
[PDF] evolutionary innovation phylogenetic tree

[PDF] common ancestor of vertebrates

[PDF] the place of humans in evolution

[PDF] humans descend from apes SVT

[PDF] the place of humans in evolution SVT 3ème

[PDF] Darwin's theory pdf

[PDF] phylogenetic tree exercise

[PDF] probability tree seconde

[PDF] conditional probability course

[PDF] weighted tree 1ère ES

[PDF] math exercises terminale STMG

[PDF] syntax tree complex sentence

[PDF] linguistic syntax tree

[PDF] phrase-structure tree of sentences exercises pdf

[PDF] syntax tree exercises with answers pdf