Proceedings:
Proceedings Of The Fifth International Conference On Intelligent Systems For Molecular Biology
Track:
Contents
Abstract:
Transcription factors, proteins required for the regulation of gene expression, recognize and bind short stretches of DNA roughly 4 to 10 bases long. In general, each factor recognizes a family of "similar" sequences rather than a single unique sequence. Ultimately, the transcriptional state of a gene is determined by the cooperative interaction of several bound factors. We have developed a method that uses Gibbs sampling and the Minimum Description Length principle to create weight matrix models of binding sites automatically and reliably from a database (Transfac) of known binding site sequences. Determining the relationship between sequence and binding affinity for a particular factor is an important first step in predicting whether a given uncharacterized sequence is part of a promoter site or other control region. Here we describe the foundation for the methods we will use to develop weight matrix models for transcription factor binding sites.
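To make the term "weight matrix model" concrete, the following is a minimal sketch of how such a matrix could be built from binding sites that are already aligned, and then used to score a candidate sequence. It does not implement the Gibbs sampling or Minimum Description Length steps of the method described above. The function names, the pseudocount, the uniform background frequencies, and the example sites are all illustrative assumptions, not taken from the paper or from Transfac.

```python
import math

BASES = "ACGT"


def build_weight_matrix(sites, pseudocount=1.0, background=None):
    """Build a log-odds weight matrix from aligned, equal-length sites.

    Assumptions (not from the paper): a flat pseudocount per base and,
    by default, a uniform background frequency of 0.25 per base.
    """
    if background is None:
        background = {b: 0.25 for b in BASES}
    width = len(sites[0])
    if any(len(s) != width for s in sites):
        raise ValueError("all sites must have the same length")

    matrix = []
    for pos in range(width):
        # Count each base in this column, starting from the pseudocount.
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        # Log-odds of the observed frequency against the background.
        matrix.append(
            {b: math.log2((counts[b] / total) / background[b]) for b in BASES}
        )
    return matrix


def score(matrix, window):
    """Sum the per-position weights for a window as wide as the matrix."""
    return sum(column[base] for column, base in zip(matrix, window))


def best_window(matrix, sequence):
    """Return (score, offset) of the highest-scoring window in sequence."""
    width = len(matrix)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the matrix")
    return max(
        (score(matrix, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )


# Hypothetical aligned sites, invented for illustration only.
example_sites = ["TATAAA", "TATAAT", "TATATA", "TTTAAA"]
example_matrix = build_weight_matrix(example_sites)
example_hit = best_window(example_matrix, "GCGCTATAAAGCGC")
```

Here `example_hit` holds the score and offset of the window that best matches the family of example sites. A window scores highly when it resembles the family as a whole, even if it is not identical to any single site, which is the sense in which a factor recognizes a family of similar sequences.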
ISMB