BIRS Workshop Lecture Videos
Graph Theory in Orthology Detection Hernández Rosales, Maribel
During evolution genes go throw many events, such as duplication, speciation, loss, horizontal gene transfer, among others. Two genes are said to be paralogs if they are the product of a duplication event, and orthologs if they are the product of a speciation event. The distinction of paralogs and orthologs is an important problem in comparative and evolutionary genomics. Moreover, orthology detection is a first step towards any functional annotation study.
The evolutionary history of a set of genes can be represented as a phylogenetic tree where leaves represent genes and internal nodes evolutionary events. In this work, we investigate a graph theory-based method for the prediction of large-scale orthologous genes and the reconstruction of their evolutionary history [1,2,3]. We represent genes as vertices of a graph and place an edge between two genes if their sequence similarity is high. We characterize mathematically the topological properties of this graph in order to have only valid orthology relations. Surprisingly, graphs that represent valid orthology relations are P4-free, i.e. graphs which do not contain induced paths of length four. These graphs have been studied earlier and have been characterized as cographs . We further investigate a set of induced subgraphs that give us evidence of noise in the data set or of wrong orthology predictions. In order to remove those induced subgraphs, we need to come up with a solution for the cograph editing problem, which has been found to be NP-complete . Here we also present a work-in-process heuristic for the cograph edting problem that will help us to induce valid orthology relations.
Acknowledgments: This research was supported by Conacyt Mexico and DAAD Germany.
 Marcus Lechner, Maribel Hernandez-Rosales, Daniel Doerr, Nicolas Wieseke, An- nelyse Thevenin, Jens Stoye, Sonja J. Prohaska and Peter F. Stadler. Orthology Detection Combining Clustering and Synteny for Very Large Data Sets. PlosONE, 9(8):e105015, (2014).
 Marc Hellmuth, Maribel Hernandez-Rosales, Katharina T. Huber, Vincent Moul- ton, Peter F. Stadler, and Nicolas Wieseke. Orthology relations, symbolic ul- trametrics, and cographs. J. Math. Biol. 66(1-2):399-420, (2013).
 Maribel Hernandez-Rosales, Marc Hellmuth, Nicolas Wieseke, Katharina Huber, Vincent Moulton, and Peter F. Stadler. From event-labeled gene trees to species trees. BMC Bioinformatics 13(Suppl. 19):S6, (2012)
 Corneil DG, Lerchs H, Stewart Burlingham LK. Complement reducible graphs. Discrete Appl Math 3:163â 174, (1981).
 Liu Y, Wang J, Guo J, Chen J. Cographs editing: complexity and parametrized algorithms. In: Fu B, Du DZ (eds) COCOON 2011. Lecture notes computer science, vol 6842. Springer, Berlin, pp 110â 121, (2011).
Item Citations and Data
Attribution-NonCommercial-NoDerivatives 4.0 International