Proceedings:
Proceedings of the Twentieth International Conference on Machine Learning, 2000
Abstract:
Untranslated regions (UTRs) play important roles in the post-transcriptional regulation of mRNA processing. There is a wealth of UTR-related information to be mined from the rapidly accumulating EST collections. A computational tool, UTR-extender, has been developed to infer UTR sequences from genomically aligned ESTs. When tested on 908 functionally cloned transcripts, it completely and accurately reconstructs 72% of the 3' UTRs and 15% of the 5' UTRs. In addition, it predicts extensions for 11% of the 5' UTRs and 28% of the 3' UTRs. These extension regions are validated by examining splicing frequencies and conservation levels. We also developed a method called polyadenylation site scan (PASS) to precisely map polyadenylation sites in human genomic sequences. A PASS analysis of 908 genic regions estimates that 40-50% of human genes undergo alternative polyadenylation. Using EST redundancy to assess expression levels, we also find that genes with short 3' UTRs tend to be highly expressed.
ISMB