Compound Poisson and Poisson Process Approximations
for Occurrences of Multiple Words
in Markov Chains

Gesine REINERT and Sophie SCHBATH

J. Comp. Biol., vol. 5, 223-254, 1998.

Abstract

We derive a Poisson process approximation for the occurrences of clumps of multiple words, and a compound Poisson process approximation for the number of occurrences of multiple words in a sequence of letters generated by a stationary Markov chain. Using the Chen-Stein method, we provide a bound on the error in each approximation. For rare words, these errors tend to zero as the length of the sequence increases to infinity. Modeling a DNA sequence as a stationary Markov chain, we show as an application that the compound Poisson approximation is efficient for the number of occurrences of rare stem-loop motifs.

Key words and phrases: Chen-Stein method, stem-loop motifs, compound Poisson approximation, Poisson process approximation, occurrences of multiple words.
