Sophie's Home Page
Sophie Schbath

Publications



[52] Mariadassou, M., Nouvel, L.-X., Constant, F., Morgavi, D., Rault, L., Barbey, S., Helloin, E., Rue, O., Schbath, S., Launay, F., Sandra, O., Lefebvre, R., Le Loir, Y., Germon, P., Citti, C. and Even, S. (2023). Microbiota members from body sites of dairy cows are largely shared within individual hosts throughout lactation but sharing is limited in the herd. Animal Microbiome. 5. [ DOI ]
[51] Bize, A., Midoux, C., Mariadassou, M., Schbath, S., Forterre, P. and da Cunha, V. (2021). Exploring short k-mer profiles in cells and mobile elements from archaea highlights the major influence of both the ecological niche and evolutionary history. BMC Genomics. 22. [ DOI ]
[50] Aubert, J., Schbath, S. and Robin, S. (2021). Model-based biclustering for overdispersed count data with application in microbial ecology. Methods in Ecology and Evolution. 12 1050-1061. [ DOI ]
[49] Sultan, I., Fromion, V., Schbath, S. and Nicolas, P. (2020). Statistical modelling of bacterial promoter sequences for regulatory motif discovery with the help of transcriptome data: application to Listeria monocytogenes. Journal of the Royal Society Interface. 17. [ DOI ]
[48] Hurel, J., Schbath, S., Bougeard, S., Rolland, M., Petrillo, M. and Touzain, F. (2020). DUGMO: tool for the detection of unknown genetically modified organisms with high-throughput sequencing data for pure bacterial samples. BMC Bioinformatics. 21. [ DOI ]
[47] Benoit, G., Mariadassou, M., Robin, S., Schbath, S., Peterlongo, P. and Lemaitre, C. (2019). SimkaMin: fast and resource frugal de novo comparative metagenomics. Bioinformatics. 36. [ DOI ]
[46] Benoit, G., Peterlongo, P., Mariadassou, M., Drezen, E., Schbath, S., Lavenier, D. and Lemaitre, C. (2016). Multiple comparative metagenomics using multiset k-mer counting. PeerJ Computer Science. e94. [ DOI ]
[45] Massip, F., Sheinman, M., Schbath, S. and Arndt, P.F. (2016). Comparing the Statistical Fate of Paralogous and Orthologous Sequences. Genetics. 204. [ DOI ]
[44] Massip, F., Sheinman, M., Schbath, S. and Arndt, P.F. (2015). How Evolution of Genomes Is Reflected in Exact DNA Sequence Match Statistics. Molecular Biology and Evolution. 32 524-535. [ DOI ]
[43] De Paepe, M., Hutinet, G., Son, O., Amarir-Bouhram, J., Schbath, S. and Petit, M.-A. (2014). Temperate phages acquire DNA from defective prophages by relaxed homologous recombination: The role of Rad52-like recombinases. PLOS Genetics. 10(3) e1004181. [ DOI ]
[42] Dumazert, J., Stephan, J.-Y., Petit, M.-A. and Schbath, S. (2013). Assessing the enrichment significance of a position weight matrix (PWM) along a DNA sequence. In Journees Ouvertes Biologie Informatique Mathematiques (JOBIM), (C. Gaspin and N. Lindley, ed.), 25-34, Toulouse, France. Selected long paper.
[41] Schbath, S., Martin, V., Zytnicki, M., Fayolle, J., Loux, V. and Gibrat, J.-F. (2012). Mapping reads on a genomic sequence: an algorithmic overview and a practical comparative analysis. Journal of Computational Biology. 19 796-813. [ DOI ]
[40] Fayyaz, A., Launay, G., Schbath, S., Gibrat, J.-F. and Rodolphe, F. (2012). Statistical significance of threading scores. Journal of Computational Biology. 19 13-29. [ DOI ]
[39] Devillers, H. and Schbath, S. (2012). Separating significant matches from spurious matches in DNA sequences. Journal of Computational Biology. 19 1-12. [ DOI ]
[38] Schbath, S. (2011). Statistiques de motifs. Gazette des mathematiciens. 130 60-65.
[37] Devillers, H., Chiapello, H., Schbath, S. and El Karoui, M. (2011). Robustness assessment of whole bacterial genome segmentations. Journal of Computational Biology. 18 1155-1165. [ DOI ]
[36] Stefanov, V., Robin, S. and Schbath, S. (2011). Occurrence of structured motifs in random sequences: Arbitrary number of boxes. Discrete Applied Mathematics. 159 826-831. [ DOI ]
[35] Schbath, S. and Hoebeke, M. (2011). Advances in genomic sequence analysis and pattern discovery. (L. Elnitski, O. Piontkivska, and L. Welch, ed.), chapter R'MES: a tool to find motifs with a significantly unexpected frequency in biological sequences. Science, Engineering, and Biology Informatics, vol. 7. World Scientific.
[34] Touzain, F., Petit, M.-A., Schbath, S. and El Karoui, M. (2011). DNA motifs that sculpt the bacterial chromosome. Nature Reviews Microbiology. 9 15-26. [ DOI ]
[33] Devillers, H., Chiapello, H., Schbath, S. and El Karoui, M. (2010). Assessing the robustness of complete bacterial genome segmentations. In RECOMB-CG 2010, (E. Tannier, ed.). Lecture Notes in Bioinformatics. 6398 173-187.
[32] Reynaud-Bouret, P. and Schbath, S. (2010). Adaptive estimation for Hawkes' processes; Application to genome analysis. Annals of Statistics. 38 (5) 2781-2822. [ DOI | http ]
[31] Schbath, S. and Robin, S. (2009). Scan Statistics - Methods and Applications. (J. Glaz, I. Pozdnyakov, and S. Wallenstein, ed.), chapter How pattern statistics can be useful for DNA motif discovery? Statistics for Industry and Technology. Birkhauser.
[30] Schbath, S., Lacroix, V. and Sagot, M.-F. (2009). Assessing the exceptionality of coloured motifs in networks. EURASIP Journal on Bioinformatics and Systems Biology. ID 616234 1-9. [ DOI ]
[29] Mercier, R., Petit, M.-A., Schbath, S., Robin, S., El Karoui, M., Boccard, F. and Espeli, O. (2008). The MatP/matS site specific system organizes the Terminus region of the E. coli chromosome into a Macrodomain. Cell. 135 475-485. [ DOI ]
[28] Touyar, N., Schbath, S., Cellier, D. and Dauchel, H. (2008). Poisson approximation for the number of repeats in a Markov chain model. J. Appl. Prob. 45 440-455.
[27] Touzain, F., Schbath, S., Debled-Rennesson, I., Aigle, B., Leblond, P. and Kucherov, G. (2008). SIGffRid: a tool to search for σ factor binding sites in bacterial genomes using comparative approach and biologically driven statistics. BMC Bioinformatics. 9:73 1-23. [ http ]
[26] Picard, F., Daudin, J.-J., Koskas, M., Schbath, S. and Robin, S. (2008). Assessing the exceptionality of network motifs. J. Comp. Biol. 15:1 1-20. [ http | .pdf ]
[25] Halpern, D., Chiapello, H., Schbath, S., Robin, S., Hennequet-Antier, C., Gruss, A. and El Karoui, M. (2007). Identification of DNA motifs implicated in maintenance of bacterial core genomes by predictive modelling. PLoS Genetics. 3(9) e153. [ DOI ]
[24] Robin, S., Schbath, S. and Vandewalle, V. (2007). Statistical tests to compare motif count exceptionalities. BMC Bioinformatics. 8:84 1-20. [ http ]
[23] Roquain, E. and Schbath, S. (2007). Improved compound Poisson approximation for the number of occurrences of multiple words in a stationary Markov chain. Adv. Appl. Prob. 39 1-13. [ .ps ]
[22] Stefanov, V., Robin, S. and Schbath, S. (2007). Waiting times for clumps of patterns and for structured motifs in random sequences. Discrete Applied Mathematics. 155 868-880. [ DOI | .ps ]
[21] Matias, C., Schbath, S., Birmelé, E., Daudin, J.-J. and Robin, S. (2006). Network motifs: mean and variance for the count. REVSTAT. 4 31-51. [ .pdf ]
[20] Robin, S., Rodolphe, F. and Schbath, S. (2005). DNA, Words and Models. Cambridge University Press, English version of ADN, mots et modèles, BELIN 2003. [ http ]
[19] Gusto, G. and Schbath, S. (2005). FADO: a statistical method to detect favored or avoided distances between motif occurrences using the Hawkes' model. Statistical Applications in Genetics and Molecular Biology. 4 1. Article 24. [ .ps ]
[18] Reinert, G., Schbath, S. and Waterman, M. (2005). Applied Combinatorics on Words. volume 105 of Encyclopedia of Mathematics and its Applications, chapter Statistics on Words with Applications to Biological Sequences. Cambridge University Press. [ http ]
[17] Schbath, S. (2004). A la recherche de mots de fréquence exceptionnelle dans les génomes. In Images des Mathématiques, volume 3, 100-106. CNRS.
[16] Robin, S., Rodolphe, F. and Schbath, S. (2003). ADN, mots et modèles. BELIN.
[15] Schbath, S. (2003). Statistical methods in physical mapping. In Encyclopedia of the Human Genome, number 434 in Mathematical genetics. Nature Publishing Group.
[14] Robin, S., Daudin, J.-J., Richard, H., Sagot, M.-F. and Schbath, S. (2002). Occurrence probability of structured motifs in random sequences. J. Comp. Biol. 9 761-773. [ .ps ]
[13] Robin, S. and Schbath, S. (2001). Numerical comparison of several approximations of the word count distribution in random sequences. J. Comp. Biol. 8 349-359. [ .ps ]
[12] Reinert, G., Schbath, S. and Waterman, M. (2000). Probabilistic and statistical properties of words. J. Comp. Biol. 7 1-46. [ .ps ]
[11] Schbath, S., Bossard, N. and Tavaré, S. (2000). The effect of non-homogeneous clone length distribution on the progress of an STS mapping project. J. Comp. Biol. 7 47-58. [ .ps ]
[10] Schbath, S. (2000). An overview on the distribution of word counts in Markov chains. J. Comp. Biol. 7 193-201. [ .ps ]
[9] El Karoui, M., Biaudet, V., Schbath, S. and Gruss, A. (1999). Characteristics of Chi distribution on several bacterial genomes. Research in Microbiology. 150 579-587.
[8] Reinert, G. and Schbath, S. (1999). Compound Poisson approximations for occurrences of multiple words. In Statistics in Genetics and Molecular Biology, (F. Seillier, ed.). IMS Lecture Notes-Monograph Series. Vol. 33. [ .ps ]
[7] Schbath, S. and Bouvier, A. (1998). Finding words with unexpected frequencies in DNA sequences. In Explorapedia of Statistical and Mathematical Techniques for use in Research and Technology. http://www.bioss.sari.ac.uk/smart/unix/intro/slides/home.htm. [ http ]
[6] Reinert, G. and Schbath, S. (1998). Compound Poisson and Poisson process approximations for occurrences of multiple words in Markov chains. J. Comp. Biol. 5 223-254. [ .ps ]
[5] Schbath, S. (1997b). An efficient statistic to detect over- and under-represented words in DNA sequences. J. Comp. Biol. 4 189-192. [ .ps ]
[4] Schbath, S. (1997a). Coverage processes in physical mapping by anchoring random clones. J. Comp. Biol. 4 61-82. [ .ps ]
[3] Schbath, S. (1995a). Compound Poisson approximation of word counts in DNA sequences. ESAIM: Probability and Statistics. 1 1-16. (http://www.emath.fr/ps/). [ .ps ]
[2] Schbath, S. (1995b). Étude asymptotique du nombre d'occurrences d'un mot dans une chaîne de Markov et application à la recherche de mots de fréquence exceptionnelle dans les séquences d'ADN. PhD thesis, Université René Descartes, Paris V. [ .ps ]
[1] Schbath, S., Prum, B. and Turckheim, E. d. (1995). Exceptional motifs in different Markov chain models for a statistical analysis of DNA sequences. J. Comp. Biol. 2 417-437. [ .ps ]