Estimating support for protein-protein interaction data with applications to function prediction. Article

Zeng, E, Ding, C, Narasimhan, G et al. (2008). Estimating support for protein-protein interaction data with applications to function prediction. . 7 73-84. 10.1142/9781848162648_0007

cited authors

  • Zeng, E; Ding, C; Narasimhan, G; Holbrook, SR

abstract

  • Almost every cellular process requires the interactions of pairs or larger complexes of proteins. High throughput protein-protein interaction (PPI) data have been generated using techniques such as yeast two-hybrid systems, mass spectrometry, and many others. Such data provide a new perspective for predicting protein functions and generating protein-protein interaction networks, and many recent algorithms have been developed for this purpose. However, PPI data generated using high throughput techniques contain a large number of false positives. In this paper, we propose a novel method to evaluate the support for PPI data based on gene ontology information. When the semantic similarity between genes is computed from gene ontology information using Resnik's formula, our results show that the PPI data can be modeled as a mixture model, predicated on the assumption that true protein-protein interactions will have higher support than the false positives in the data. Thus semantic similarity between genes serves as a metric of support for PPI data. Taking it one step further, we also propose new function prediction approaches that use this metric of support for the PPI data. These new function prediction approaches outperform their conventional counterparts. New evaluation methods are also proposed.
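
The abstract names Resnik's formula as the basis of the support metric. As a rough illustration only, the sketch below computes Resnik similarity between ontology terms as the largest information content, -log p(t), among their shared ancestors, and scores a gene pair by its best-matching pair of annotated terms. The toy ontology `PARENTS`, the probabilities `PROB`, the function names, and the max-over-term-pairs rule for genes are all assumptions made for this example, not details taken from the paper, and the mixture model itself is not shown.

```python
import math

# Hypothetical toy ontology (a DAG): term -> set of parent terms.
PARENTS = {
    "root": set(),
    "A": {"root"},
    "B": {"root"},
    "A1": {"A"},
    "A2": {"A"},
}

# Hypothetical p(t): fraction of genes annotated to t or to any descendant of t.
PROB = {"root": 1.0, "A": 0.5, "B": 0.5, "A1": 0.25, "A2": 0.25}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content(term):
    """Information content of a term: log(1 / p(t)) = -log p(t)."""
    return math.log(1.0 / PROB[term])


def resnik_term_similarity(t1, t2):
    """Resnik similarity: highest information content among shared ancestors."""
    common = ancestors(t1) & ancestors(t2)
    return max(information_content(t) for t in common)


def gene_support(terms1, terms2):
    """Assumed gene-level rule: best Resnik score over all pairs of annotated terms."""
    return max(resnik_term_similarity(a, b) for a in terms1 for b in terms2)
```

Under these toy values, two genes annotated with sibling terms `A1` and `A2` share the informative ancestor `A` and receive a higher score than a pair whose only common ancestor is the root, which is the sense in which a higher score indicates stronger support for an interaction.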

publication date

  • January 1, 2008

Digital Object Identifier (DOI)

  • 10.1142/9781848162648_0007

start page

  • 73

end page

  • 84

volume

  • 7