An Efficient Algorithm for Identifying Mutated Subnetworks Associated with Survival in Cancer

Sarkar A., Atay Y., Erickson A. L., Arisi I., Saltini C., Kahveci T.

IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, vol.17, no.5, pp.1582-1594, 2020 (Journal Indexed in SCI)

  • Publication Type: Article
  • Volume: 17 Issue: 5
  • Publication Date: 2020
  • Doi Number: 10.1109/tcbb.2019.2911069
  • Page Numbers: pp.1582-1594


A protein-protein interaction (PPI) network models the interconnections between protein-encoding genes. Proteins that perform similar functions are often connected to each other in the PPI network, and the corresponding genes form pathways or functional modules. Mutations in protein-encoding genes affect the behavior of these pathways, which contributes to the initiation, progression, and severity of diseases that propagate through pathways. In this work, we integrate mutation data, patient survival information, and the PPI network to identify connected subnetworks associated with survival. We define the computational problem using the log-rank statistic, which compares survival between two populations, as the fitness function for scoring subnetworks. We propose a novel method, Survival Associated Mutated Subnetwork (SAMS), that adopts a genetic algorithm strategy to find the connected subnetwork within the PPI network whose mutation yields the highest log-rank statistic. We test SAMS on real cancer datasets and on synthetic datasets. SAMS generates solutions in negligible time, whereas the state-of-the-art method in the literature takes exponential time. The log-rank statistics of the mutated subnetworks selected by SAMS are comparable to those of that method. Our resulting gene sets show significant overlap with well-known cancer driver genes derived from curated datasets and studies in the literature, display high text-mining scores in terms of the number of citations combined with disease-specific keywords in PubMed, and identify pathways of high biological relevance.
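To make the fitness function concrete, the following is a minimal sketch of a standard two-sample log-rank statistic applied to a candidate gene set. It is not taken from the paper: the function names `log_rank_statistic` and `subnetwork_score` are hypothetical, and the grouping rule (a patient counts as "mutated" if any gene in the subnetwork is mutated in that patient) is an assumption about how the two populations might be formed. It also does not check that the gene set is connected in the PPI network, and it does not implement the genetic algorithm search.

```python
import math


def log_rank_statistic(times, events, in_group):
    """Standardized two-sample log-rank statistic (Z score).

    times    : survival or censoring time for each patient
    events   : 1 if the death was observed, 0 if the patient was censored
    in_group : True if the patient belongs to group 1
    """
    observed_minus_expected = 0.0
    variance = 0.0
    # Iterate over the distinct times at which at least one event occurred.
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        at_risk = [i for i, ti in enumerate(times) if ti >= t]
        n = len(at_risk)                                  # at risk overall
        n1 = sum(1 for i in at_risk if in_group[i])       # at risk in group 1
        dying = [i for i in at_risk if times[i] == t and events[i]]
        d = len(dying)                                    # events overall
        d1 = sum(1 for i in dying if in_group[i])         # events in group 1
        # Observed minus expected events in group 1 (hypergeometric mean).
        observed_minus_expected += d1 - d * n1 / n
        # Hypergeometric variance; undefined when one patient remains.
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0
    return observed_minus_expected / math.sqrt(variance)


def subnetwork_score(subnetwork, mutations, times, events):
    """Score a gene set by splitting patients on mutation status.

    Assumption (not from the paper): a patient is in the mutated group
    if at least one gene of the subnetwork is mutated in that patient.
    subnetwork : set of gene names
    mutations  : list with one set of mutated gene names per patient
    """
    mutated = [bool(subnetwork & genes) for genes in mutations]
    return log_rank_statistic(times, events, mutated)
```

Under this sign convention, a positive score means the mutated group experienced more events than expected under the null hypothesis of equal survival. A search procedure such as a genetic algorithm would call a scoring function of this kind on each candidate subnetwork.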