Detection of Distant Structural Similarities in a Set of Proteins Using a Fast Graph-based Method

Ina Koch and Thomas Lengauer

We introduce a method for finding weak structural similarities in a set of protein structures. Proteins are considered at their secondary structure level. The method uses a rigorous graph-theoretical algorithm which finds all structural similarities. Protein structures are modelled as undirected labelled graphs, the so-called protein graphs. We suggest that for detecting the similarities between two protein structures it is sufficient to find similarities in the protein core which consists of tightly packed secondary structure elements. Therefore, we can restrict ourselves to solving the maximal common connected subgraph problem instead of the maximal common subgraph problem. We have modified the algorithm by Bron and Kerbosch for solving that problem. The speed of the algorithm increases drastically. After calculating all maximal common connected substructures for all pairwise comparisons in a set of protein graphs the common substructure in all proteins can be calculated by intersecting them. In this paper we characterize the method briefly and explain the modelling of the protein structure in detail. For the pairwise alignment the similarity of porin (1OMF) with bacteriochlorophyll a (3BCL) and BirA protein (1BIB) with DNA polymerase III (2POL) will be discussed. In the case of the multiple structure alignment the similarity in variants of four phosphatases and in subtilisin Carlsberg, carboxypeptidase, elongation factor Tu, and flavodoxin will be represented. Our first experiments show that the method works correctly and fast. The method can be used for arbitrary graphs. Thus, different graph-theoretical models of protein structures can be examined.


This page is copyrighted by AAAI. All rights reserved. Your use of this site constitutes acceptance of all of AAAI's terms and conditions and privacy policy.