Proceedings:
Proceedings of the Twentieth International Conference on Machine Learning, 1995
Track:
Contents
Abstract:
We have developed a method for predicting the common secondary structure of large RNA multiple alignments using only the information in the alignment. The method iteratively applies a series of progressively more sensitive searches of the data to discover regions of base pairing; the first pass examines the entire multiple alignment. Each search uses two measures to find base pairings: mutual information measures covariation between pairs of columns in the multiple alignment, and a minimum length encoding method detects column pairs with high potential to base pair. Dynamic programming is then used to recover the optimal tree made up of the best potential base pairs and to create a stochastic context-free grammar. The information in the tree guides the next iteration of searching. The method is similar to the traditional comparative sequence analysis technique. It correctly identifies most of the common secondary structure in 16S and 23S rRNA.
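As an illustration of two ingredients named in the abstract, the following minimal sketch computes mutual information between alignment columns and then uses a simple dynamic program to pick the highest-scoring set of nested (non-crossing) column pairs. It is not the authors' implementation: the function names, the `min_score` threshold, the toy alignment, and the Nussinov-style recurrence are all assumptions made for illustration, and the sketch omits the minimum length encoding score, the stochastic context-free grammar, gap handling, and the iterative refinement.

```python
import math
from collections import Counter


def mutual_information(col_i, col_j):
    """Mutual information (in bits) between two equal-length alignment columns."""
    n = len(col_i)
    freq_i = Counter(col_i)
    freq_j = Counter(col_j)
    freq_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / ((freq_i[a] / n) * (freq_j[b] / n)))
    return mi


def mi_matrix(alignment):
    """Symmetric matrix of mutual information for every pair of columns."""
    length = len(alignment[0])
    cols = [[seq[k] for seq in alignment] for k in range(length)]
    m = [[0.0] * length for _ in range(length)]
    for i in range(length):
        for j in range(i + 1, length):
            m[i][j] = m[j][i] = mutual_information(cols[i], cols[j])
    return m


def _split_score(best, score, i, k, j):
    """Score of pairing columns k and j inside the interval [i, j]."""
    left = best[i][k - 1] if k > i else 0.0
    inner = best[k + 1][j - 1] if k + 1 <= j - 1 else 0.0
    return left + score[k][j] + inner


def best_nested_pairs(score, min_score=0.5):
    """Maximize the total score over non-crossing column pairs.

    min_score is a hypothetical cutoff: pairs scoring below it are ignored.
    """
    n = len(score)
    best = [[0.0] * n for _ in range(n)]
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            value = best[i][j - 1]  # case: column j left unpaired
            for k in range(i, j):   # case: column j paired with column k
                if score[k][j] >= min_score:
                    value = max(value, _split_score(best, score, i, k, j))
            best[i][j] = value

    # Trace back through the table to recover the chosen pairs.
    pairs = []
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if i >= j:
            continue
        if best[i][j] == best[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j):
            if score[k][j] >= min_score and best[i][j] == _split_score(best, score, i, k, j):
                pairs.append((k, j))
                stack.append((i, k - 1))
                stack.append((k + 1, j - 1))
                break
    return sorted(pairs)


# Toy alignment (made up): columns 0 and 3 covary, columns 1 and 2 are constant.
alignment = ["AGAU", "GGAC", "CGAG", "UGAA"]
scores = mi_matrix(alignment)
pairs = best_nested_pairs(scores)
print(pairs)
```

In this sketch the nested set of pairs returned by `best_nested_pairs` plays the role of the tree of potential base pairs; a fuller treatment would combine the covariation score with a pairing-potential score before the dynamic programming step.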
ISMB