A Divide and Conquer Approach to Multiple Alignment

Authors

Andreas Dress

Sören Perrey

Georg Füllen

Proceedings:

Proceedings of the Twentieth International Conference on Machine Learning, 1995

Abstract:

We present a report on work in progress on a divide and conquer approach to multiple alignment. The algorithm makes use of the costs calculated by applying the standard dynamic programming scheme to all pairs of sequences. The resulting cost matrices for pairwise alignment give rise to secondary matrices containing the additional costs imposed by fixing the path through the dynamic programming graph at a particular vertex. Such a constraint corresponds to a division of the problem obtained by slicing both sequences between two particular positions and aligning the two left parts and the two right parts separately, charging for gaps introduced at the slicing point. To estimate the additional cost imposed by forcing the multiple alignment through a particular vertex in the whole hypercube, we take a (weighted) sum of secondary costs over all pairwise projections of the division of the problem defined by this vertex, that is, by slicing all sequences at the points suggested by the vertex. We then choose the partition of every sequence under consideration into two "halves" that imposes a minimal (weighted) sum of pairwise additional costs, making sure that one of the sequences is divided somewhere close to its midpoint. Hence, each iteration can cut the problem size in half. As the enumeration of all possible partitions may restrict this approach to small-size problems, we eliminate futile partitions and organize their enumeration so that it starts with the most promising ones. Comparing our approach for the case of 3 sequences with a) structurally verified alignments and b) alignments from the literature indicates high-quality alignments, with roughly the same number of errors as the "optimal" (in the dynamic programming framework) solution in a), and as close as the "optimal" solution to a maximum weight trace computed by Kececioglu, using 6 sequences altogether.
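The secondary matrices and the choice of slicing points described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions not stated in the abstract: a linear gap cost, a unit mismatch cost, equal pair weights by default, and a plain exhaustive search over cut positions with no pruning or ordering of candidates. The function names `forward_costs`, `additional_costs`, and `best_cut` are hypothetical. Under these assumptions, the additional cost of forcing a pairwise alignment through vertex (i, j) is the optimal prefix cost plus the optimal suffix cost minus the overall optimal cost.

```python
from itertools import product


def forward_costs(a, b, mismatch=1, gap=1):
    """Standard dynamic programming matrix: D[i][j] is the minimal cost
    of aligning the prefixes a[:i] and b[:j] (linear gap cost assumed)."""
    n, m = len(a), len(b)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        D[i][0] = i * gap
    for j in range(1, m + 1):
        D[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = D[i - 1][j - 1] + (0 if a[i - 1] == b[j - 1] else mismatch)
            D[i][j] = min(sub, D[i - 1][j] + gap, D[i][j - 1] + gap)
    return D


def additional_costs(a, b, mismatch=1, gap=1):
    """Secondary matrix: C[i][j] is the extra cost of forcing the pairwise
    alignment path through vertex (i, j), i.e. of slicing a at i and b at j."""
    n, m = len(a), len(b)
    F = forward_costs(a, b, mismatch, gap)
    # Aligning the reversed sequences yields the suffix costs:
    # R[n - i][m - j] is the minimal cost of aligning a[i:] with b[j:].
    R = forward_costs(a[::-1], b[::-1], mismatch, gap)
    opt = F[n][m]
    return [[F[i][j] + R[n - i][m - j] - opt for j in range(m + 1)]
            for i in range(n + 1)]


def best_cut(seqs, weights=None):
    """Cut the first sequence at its midpoint and search all cut positions
    of the other sequences for the minimal weighted sum of pairwise
    additional costs. Exhaustive search; no pruning (an assumption)."""
    k = len(seqs)
    pairs = [(p, q) for p in range(k) for q in range(p + 1, k)]
    C = {(p, q): additional_costs(seqs[p], seqs[q]) for p, q in pairs}
    w = weights or {pq: 1.0 for pq in pairs}
    mid = len(seqs[0]) // 2
    best = None
    for rest in product(*(range(len(s) + 1) for s in seqs[1:])):
        cut = (mid,) + rest
        total = sum(w[p, q] * C[p, q][cut[p]][cut[q]] for p, q in pairs)
        if best is None or total < best[0]:
            best = (total, cut)
    return best
```

Calling `best_cut(["ACGT", "ACGT", "ACGT"])` returns a pair consisting of the minimal summed additional cost and the tuple of cut positions, one per sequence. Each sequence would then be split at its cut position and the two resulting subproblems handled recursively. The exhaustive loop grows with the product of the sequence lengths, which is the reason the abstract mentions eliminating futile partitions and trying the most promising ones first.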

ISMB
