Occurrence Probability of Structured Motifs in Random Sequences,


Robin, S., Daudin, J.-J., Richard, H., Sagot, M.-F. and Schbath, S.,


Submitted.


Abstract

The problem of extracting, from a set of nucleic acid sequences, motifs that may have a biological function is increasingly important. In this paper, we focus on particular motifs that may be implicated in the transcription process. These motifs, called structured motifs, are composed of two ordered parts separated by a variable distance and allow for substitutions. To assess their statistical significance, we propose approximations of the probability that such a structured motif occurs in a given sequence. We apply the method to the evaluation of candidate promoters in E. coli and B. subtilis. Simulations show that the approximations are accurate.
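As a rough illustration of the quantity discussed above, the following sketch estimates by simulation the probability that a structured motif occurs at least once in a random sequence. It is not the approximation proposed in the paper. It assumes independent, uniformly distributed letters instead of a Markov model, and it assumes that substitutions are counted separately in each of the two parts. The function names, the parameters (d_min, d_max, max_subs, n_sim) and the example boxes are hypothetical choices made for this illustration only.

```python
import random


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def has_structured_motif(seq, box1, box2, d_min, d_max, max_subs=0):
    """Return True if box1 is followed by box2 with a gap of d_min..d_max
    letters, each box matching with at most max_subs substitutions.
    Counting substitutions per box is an assumption of this sketch."""
    n, len1, len2 = len(seq), len(box1), len(box2)
    for i in range(n - len1 + 1):
        if hamming(seq[i:i + len1], box1) > max_subs:
            continue
        for d in range(d_min, d_max + 1):
            j = i + len1 + d  # start of the second box
            if j + len2 > n:
                break  # larger gaps would run past the end of the sequence
            if hamming(seq[j:j + len2], box2) <= max_subs:
                return True
    return False


def estimate_occurrence_probability(box1, box2, d_min, d_max, seq_len,
                                    max_subs=0, n_sim=2000, seed=0):
    """Monte Carlo estimate of P(at least one occurrence) in a sequence of
    length seq_len, under an i.i.d. uniform model on A, C, G, T
    (a simplification; the paper works with Markov models)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        seq = "".join(rng.choice("ACGT") for _ in range(seq_len))
        if has_structured_motif(seq, box1, box2, d_min, d_max, max_subs):
            hits += 1
    return hits / n_sim


if __name__ == "__main__":
    # Illustrative boxes only: promoter-like consensus hexamers with a
    # 16 to 18 letter spacer, one substitution allowed per box.
    p = estimate_occurrence_probability("TTGACA", "TATAAT", 16, 18,
                                        seq_len=500, max_subs=1)
    print(f"estimated occurrence probability: {p:.4f}")
```

A simulation of this kind becomes slow and imprecise when the probability is very small, which is one reason analytical approximations are useful for assessing significance.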

Key words and phrases: Markov models, motif occurrences, promoters, structured motifs.