An improved approximation for assessing the statistical significance of molecular sequence features

Sabine MERCIER, Dominique CELLIER and François CHARLOT

Journal of Applied Probability, 40, 2003.


Using random walk theory, we first establish explicitly the exact distribution of the maximal partial sum of a sequence of independent and identically distributed random variables. This result allows us to obtain a new approximation of the distribution of the local score of one sequence. This approximation improves on the one given by Karlin et al., which can be deduced from the new formula. We obtain a more accurate asymptotic expression with additional terms. Examples of application are given.
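
For readers unfamiliar with the two quantities named above, the following is a minimal illustrative sketch and is not taken from the paper. It assumes the usual conventions: the maximal partial sum is the largest value reached by the cumulative sums S_0 = 0, S_1, ..., S_n, and the local score is the largest sum over any contiguous segment, with the empty segment scoring 0. The function names and the example scores are hypothetical.

```python
def max_partial_sum(scores):
    """Largest cumulative sum S_k for 0 <= k <= n, with S_0 = 0 (assumed convention)."""
    total = 0
    best = 0
    for x in scores:
        total += x
        best = max(best, total)
    return best


def local_score(scores):
    """Largest sum over any contiguous segment; an empty segment scores 0 (assumed convention).

    Uses the recursion U_0 = 0, U_k = max(U_{k-1} + X_k, 0) and returns max_k U_k.
    """
    current = 0
    best = 0
    for x in scores:
        current = max(current + x, 0)  # restart the segment when the running sum drops below 0
        best = max(best, current)
    return best


# Hypothetical per-letter scores for a short sequence.
example_scores = [1, -2, 3, 1, -5, 2]
print(max_partial_sum(example_scores))  # 3
print(local_score(example_scores))      # 4
```

In this example the two values differ because the best segment (3, 1) starts after the walk has already dropped below zero, which is the distinction between the maximal partial sum and the local score.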

Key words and phrases: Statistical significance, sequence analysis, local score, random walks.

