Proceedings:
Proceedings Of The Seventh International Conference On Intelligent Systems For Molecular Biology
Abstract:
Simulated data sets have been found to be useful in developing software systems because (1) they allow one to study the effect of a particular phenomenon in isolation, and (2) one has complete information about the true solution against which to measure the results of the software. In developing a software suite for assembling a whole human genome shotgun data set, we have developed a simulator, celsim, that permits one to describe and stochastically generate a target DNA sequence with a variety of repeat structures, to further generate polymorphic variants if desired, and to generate a shotgun data set that might be sampled from the target sequence(s). We have found the tool invaluable and quite powerful, yet the design is extremely simple, employing a special type of stochastic grammar.
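The three stages named in the abstract (generate a target with repeats, derive a polymorphic variant, sample shotgun reads) can be illustrated with a minimal Python sketch. This is not celsim and does not reproduce its stochastic grammar or input format; every function name and parameter below is hypothetical. It assumes a single repeat family, substitution-only mutation, uniform read placement, and forward-strand reads.

```python
import random

BASES = "ACGT"

def random_dna(length, rng):
    """Return a uniformly random DNA string of the given length."""
    return "".join(rng.choice(BASES) for _ in range(length))

def mutate(seq, rate, rng):
    """Substitute each base independently with probability `rate`.

    Substitutions only (an assumption); a substituted base always differs
    from the original base.
    """
    out = []
    for base in seq:
        if rng.random() < rate:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)

def build_target(length, repeat_len, copies, divergence, rng):
    """Insert `copies` diverged copies of one repeat element into a random
    background of `length` bases (hypothetical single-family repeat model)."""
    background = random_dna(length, rng)
    element = random_dna(repeat_len, rng)
    positions = sorted(rng.randrange(length + 1) for _ in range(copies))
    pieces, prev = [], 0
    for pos in positions:
        pieces.append(background[prev:pos])
        pieces.append(mutate(element, divergence, rng))
        prev = pos
    pieces.append(background[prev:])
    return "".join(pieces)

def sample_reads(target, coverage, read_len, error_rate, rng):
    """Sample fixed-length reads at uniform positions until the requested
    coverage is reached; return (true_start, read) pairs so the true
    layout is known."""
    n_reads = (coverage * len(target)) // read_len
    reads = []
    for _ in range(n_reads):
        start = rng.randrange(len(target) - read_len + 1)
        fragment = target[start:start + read_len]
        reads.append((start, mutate(fragment, error_rate, rng)))
    return reads

rng = random.Random(0)  # fixed seed so a data set can be regenerated
target = build_target(length=1000, repeat_len=50, copies=3,
                      divergence=0.02, rng=rng)
variant = mutate(target, rate=0.001, rng=rng)  # a polymorphic variant
reads = sample_reads(target, coverage=5, read_len=100,
                     error_rate=0.01, rng=rng)
```

Because each read carries its true start position, an assembler's output can be checked against the known layout, which is the second benefit of simulated data that the abstract describes.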
ISMB