Using Dirichlet Mixture Priors to Derive Hidden Markov Models for Protein Families

M. Brown, R. Hughey, A. Krogh, I. S. Mian, K. Sjölander, and D. Haussler

A Bayesian method for estimating the amino acid distributions in the states of a hidden Markov model (HMM) for a protein family or the columns of a multiple alignment of that family is introduced. This method uses Dirichlet mixture densities as priors over amino acid distributions. These mixture densities are determined from examination of previously constructed HMMs or multiple alignments. It is shown that this Bayesian method can improve the quality of HMMs produced from small training sets. Specific experiments on the EF-hand motif are reported, for which these priors are shown to produce HMMs with higher likelihood on unseen data, and fewer false positives and false negatives in a database search task.
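
The abstract does not spell out the estimator, so the following is only a minimal sketch of the standard posterior-mean estimate under a Dirichlet mixture prior, not necessarily the exact procedure used by the authors. It assumes a toy three-letter alphabet instead of the 20 amino acids, and the mixture weights and Dirichlet parameters (`weights`, `alphas`) are made-up illustrative values, as are the function names. Given the observed counts in one column or HMM state, each component is weighted by how well it explains the counts, and the per-component pseudocount estimates are then averaged.

```python
import math


def log_marginal(counts, alpha):
    """Log probability of the counts under one Dirichlet component.

    The multinomial coefficient is dropped because it is identical for
    every component and cancels when the responsibilities are normalized.
    """
    n = sum(counts)
    a = sum(alpha)
    out = math.lgamma(a) - math.lgamma(n + a)
    for c, al in zip(counts, alpha):
        out += math.lgamma(c + al) - math.lgamma(al)
    return out


def posterior_mean(counts, weights, alphas):
    """Return (estimated distribution, component responsibilities)."""
    # Posterior weight of each component given the counts, in log space.
    logs = [math.log(q) + log_marginal(counts, a)
            for q, a in zip(weights, alphas)]
    top = max(logs)
    unnorm = [math.exp(v - top) for v in logs]
    total = sum(unnorm)
    resp = [u / total for u in unnorm]

    # Average the per-component pseudocount estimates.
    n = sum(counts)
    est = [0.0] * len(counts)
    for r, a in zip(resp, alphas):
        denom = n + sum(a)
        for i, (c, al) in enumerate(zip(counts, a)):
            est[i] += r * (c + al) / denom
    return est, resp


# Hypothetical two-component prior over a three-letter alphabet.
weights = [0.5, 0.5]
alphas = [[5.0, 1.0, 1.0], [1.0, 1.0, 5.0]]

# A very small sample: three observations of the first letter only.
est, resp = posterior_mean([3, 0, 0], weights, alphas)

# With no observations the estimate falls back to the prior mean.
est0, resp0 = posterior_mean([0, 0, 0], weights, alphas)
```

With only three observations, the component that favors the first letter receives most of the posterior weight, so the estimate leans toward that letter while still leaving nonzero probability for the unseen letters. This is the behavior that makes such priors useful for small training sets.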

