The Ribosome Scanning Model for Translation Initiation: Implications for Gene Prediction and Full-Length cDNA Detection

Pankaj Agarwal and Vineet Bafna

Biological signals, such as the start of protein translation in eukaryotic mRNA, are stretches of nucleotides recognized by cellular machinery. A variety of techniques exist for modeling and identifying them. Most of these techniques either assume that the bases at each position of the signal are independently distributed, or allow only limited dependencies among positions. In previous work, we provided a statistical model that generalizes earlier methods and captures all significant high-order dependencies among base positions. In this paper, we use a set of experimentally verified translation initiation (TI) sites (provided by Amos Bairoch) from eukaryotic sequences to train a range of methods, and then compare them. None of the methods, taken alone, is effective at predicting TI sites. We take advantage of the ribosome scanning model (Cigan et al., 1988) to significantly improve prediction accuracy for full-length mRNAs. The ribosome scanning model suggests scanning from the 5' end of the capped mRNA and initiating translation at the first AUG in good context. Restricting attention to that first well-placed AUG reduces the search space dramatically, which accounts for the model's effectiveness. The success of this approach illustrates how biological ideas can illuminate and help solve challenging problems in computational biology.
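The scanning rule described above (walk from the 5' end and stop at the first AUG in good context) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names are hypothetical, and "good context" is approximated here by a simple Kozak-style check (a purine at position -3 and a G at position +4, counting the A of AUG as +1), which is an assumption standing in for the trained statistical models the abstract refers to.

```python
from typing import Callable, Optional


def has_good_context(mrna: str, i: int) -> bool:
    """Hypothetical stand-in for a trained context model.

    Assumption: a context counts as "good" when there is a purine (A or G)
    three bases upstream of the AUG and a G immediately after it.
    `i` is the 0-based index of the A in the AUG.
    """
    minus3 = mrna[i - 3] if i >= 3 else ""
    plus4 = mrna[i + 3] if i + 3 < len(mrna) else ""
    return minus3 in ("A", "G") and plus4 == "G"


def first_aug_in_good_context(
    mrna: str,
    context_ok: Callable[[str, int], bool] = has_good_context,
) -> Optional[int]:
    """Scan from the 5' end; return the 0-based index of the first AUG
    whose context passes `context_ok`, or None if there is none."""
    # Accept cDNA-style input by upper-casing and mapping T to U.
    seq = mrna.upper().replace("T", "U")
    pos = seq.find("AUG")
    while pos != -1:
        if context_ok(seq, pos):
            return pos
        # Weak context: keep scanning downstream (overlapping hits allowed).
        pos = seq.find("AUG", pos + 1)
    return None


if __name__ == "__main__":
    # The upstream AUG at index 6 has a U at -3, so scanning continues
    # to the AUG at index 15, which has A at -3 and G at +4.
    print(first_aug_in_good_context("UUUUUUAUGUCCACCAUGG"))
```

Because the scan stops at the first acceptable AUG, only one candidate per transcript is ever reported, which is the search-space reduction the abstract describes. The `context_ok` parameter is there so that any scoring model can replace the simple rule used in this sketch.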
