PROBCONS: Probabilistic Consistency-Based Multiple Alignment of Amino Acid Sequences

Chuong B. Do, Michael Brudno, and Serafim Batzoglou

Obtaining an accurate multiple alignment of protein sequences is a difficult computational problem for which many heuristic techniques sacrifice optimality to achieve reasonable running times. The most commonly used heuristic is progressive alignment, which merges sequences into a multiple alignment by pairwise comparisons along the nodes of a guide tree. To improve accuracy, consistency-based methods take advantage of conservation across many sequences to provide a stronger signal for pairwise comparisons. In this paper, we introduce the concept of probabilistic consistency for multiple sequence alignments. We also present PROBCONS, an HMM-based protein multiple sequence aligner, based on an approximation of the probabilistic consistency objective function. On the BAliBASE benchmark alignment database, PROBCONS demonstrates a statistically significant improvement in accuracy compared to several leading alignment programs while maintaining practical running times.
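As a rough illustration of how a third sequence can strengthen a pairwise signal, the sketch below re-estimates the match probabilities between sequences x and y by averaging over alignments that pass through intermediate sequences z. It is a minimal sketch under stated assumptions, not the PROBCONS implementation. The abstract does not give the formula, so the averaging rule is an assumed form of a consistency transformation, and the function names and matrix layout are hypothetical.

```python
def matmul(a, b):
    """Plain-Python matrix product of a (n x m) and b (m x p)."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def consistency_transform(p_xy, via):
    """Re-estimate pairwise match probabilities using third sequences.

    p_xy : |x| by |y| matrix; p_xy[i][j] is the assumed probability that
           residue i of x aligns with residue j of y.
    via  : list of (p_xz, p_zy) matrix pairs, one per intermediate sequence z.

    Assumed rule (illustrative): average over all sequences z, where z = x
    and z = y each contribute p_xy itself, and every other z contributes
    the product p_xz * p_zy.
    """
    n_seqs = len(via) + 2  # x, y, and each intermediate sequence
    total = [[2.0 * v for v in row] for row in p_xy]
    for p_xz, p_zy in via:
        through_z = matmul(p_xz, p_zy)
        for i, row in enumerate(through_z):
            for j, v in enumerate(row):
                total[i][j] += v
    return [[v / n_seqs for v in row] for row in total]


# Toy data (invented for illustration): x and y align weakly on the diagonal,
# but both align strongly to a third sequence z.
p_xy = [[0.5, 0.0], [0.0, 0.5]]
p_xz = [[0.9, 0.0], [0.0, 0.9]]
p_zy = [[0.8, 0.0], [0.0, 0.8]]
relaxed = consistency_transform(p_xy, [(p_xz, p_zy)])
```

In this toy case the diagonal entries rise above 0.5 because z supports the same residue pairings, while the off-diagonal entries stay at zero.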

