SSB : Articles 1995

2006
[1]	Same, A., Ambroise, C. and Govaert, G. (2006). A classification EM algorithm for binned data. Computational Statistics and Data Analysis. 51 (2) 466-480.
[2]	Schibler, L., Roig, A., Mahe, M.-F., Laurent, P., Hayes, H., Rodolphe, F. and Cribiu, E.-P. (2006). High-resolution comparative mapping among man, cattle and mouse suggests a role for repeat sequences in mammalian genome evolution. BMC Genomics. 7 194.
[3]	Nuel, G. (2006c). Pattern statistics on Markov chains and sensitivity to parameter estimation. Algo. Mol. Biol. 1 Article 17.
[4]	Nuel, G. (2006a). Effective p-value computations using Finite Markov Chain Imbedding (FMCI): application to local score and to pattern statistics. Algo. Mol. Biol. 1 Article 5.
[5]	Martin, J., Gibrat, J.-F. and Rodolphe, F. (2006). Analysis of an optimal hidden Markov model for secondary structure prediction. BMC Structural Biology. 6 25.
[6]	Regad, L., Martin, J. and Camproux, A.-C. (2006). Identification of non random motifs in loops using a structural alphabet. In Proceedings of IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, 1-9, Toronto, Ontario.
[7]	van de Guchte, M., Penaud, S., Grimaldi, C., Barbe, V., Bryson, K., Nicolas, P., Robert, C., Oztas, S., Mangenot, S., Couloux, A., Loux, V., Dervyn, R., Bossy, R., Bolotin, A., Batto, J., Walunas, T., Gibrat, J., Bessieres, P., Weissenbach, J., Ehrlich, S. and Maguin, E. (2006). The complete genome sequence of Lactobacillus bulgaricus reveals extensive and ongoing reductive evolution. Proc Natl Acad Sci U S A. 103 9274-9279.
[8]	Nicolas, P., Sun, F. and Li, L. M. (2006b). A model-based approach to selection of tag SNPs. BMC Bioinformatics. 7 303. [ DOI ]
[9]	Bryson, K., Loux, V., Bossy, R., Nicolas, P., Chaillou, S., van de Guchte, M., Penaud, S., Maguin, E., Hoebeke, M., Bessieres, P. and Gibrat, J. (2006). Agmial: implementing an annotation strategy for prokaryote genomes as a distributed system. Nucleic Acids Res. 34 3533-3545.
[10]	Zhu, X., Ambroise, C. and McLachlan, G. (2006). Selection bias in working with the top genes in supervised classification of tissue samples. Statistical Methodology. 3 29-41.
[11]	Cord, A., Ambroise, C. and Cocquerez, J.-P. (2006). Feature selection in robust clustering based on Laplace mixture. Pattern Recognition Letters. 27 (6) 627-635.
[12]	Arribas-Gil, A., Gassiat, E. and Matias, C. (2006). Parameter estimation in pair hidden Markov models. Scand. J. Statist. 33 (4) 651-671. [ http ]
[13]	Chich, J.-F., David, O., Villers, F., Schaeffer, B., Lutomski, D. and Huet, S. (2006). Statistics for proteomics: Experimental design and 2-DE differential analysis. Journal of Chromatography B. In Press ?
[14]	Guedj, M., Wojcik, J., Della-Chiesa, E., Nuel, G. and Forner, K. (2006b). A fast, unbiased and exact allelic test for case-control association studies. Human Heredity. 61 210-221.
[15]	Guedj, M., Robelin, D., Hoebeke, M., Lamarine, M., Wojcik, J. and Nuel, G. (2006a). Detecting local high-scoring segments: a first-stage approach for genome-wide association studies. SAGMB. 5.
[16]	Nuel, G. (2006b). Numerical Solutions for Patterns Statistics on Markov Chains. Stat. App. Gen. Mol. Biol. 5 (1) Article 26.
[17]	Nicolas, P., Tocquet, A.-S., Miele, V. and Muri, F. (2006a). A reversible jump Monte-Carlo Markov chain algorithm for bacterial promoter discovery. J. Comp. Biol. 13 651-667.
[18]	Lepage, E., Brinster, S., Caron, C., Ducroix-Crepy, C., Rigottier-Gois, L., Dunny, G., Hennequet-Antier, C. and Serror, P. (2006). Comparative genomic hybridization analysis of Enterococcus faecalis: Identification of genes absent from food strains. Journal of Bacteriology. 188 6858-6868.
[19]	Matias, C., Schbath, S., Birmelé, E., Daudin, J.-J. and Robin, S. (2006). Network motifs: mean and variance for the count. REVSTAT. 4 31-51.
2005
[20]	Martin, J., Gibrat, J.-F. and Rodolphe, F. (2005a). How to choose the optimal hidden Markov model for protein secondary structure prediction. IEEE Intelligent Systems. 60 (6) 19-25.
[21]	Martin, J., Letellier, G., Marin, A., Taly, J.-F., de Brevern, A. and Gibrat, J.-F. (2005b). Protein secondary structure assignment revisited: a detailed analysis of different assignment methods. BMC Struct Biol. 5 17.
[22]	Miele, V., Bourguignon, P., Robelin, D., Nuel, G. and Richard, H. (2005). seq++: a package for biological sequences analysis with a range of Markov-related models. Bioinformatics. 21 (11) 2783-2784.
[23]	Nuel, G. (2005). S-SPatt: simple statistics for patterns on Markov chains. Bioinformatics. 21 (13) 3051-3052.
[24]	Robin, S., Rodolphe, F. and Schbath, S. (2005). DNA, Words and Models. Cambridge University Press, English version of ADN, mots et modèles, BELIN 2003.
[25]	Lebarbier, E. (2005). Detecting multiple change-points in the mean of Gaussian process by model selection. Signal Processing. 85 717-736.
[26]	Hennequet-Antier, C., Chiapello, H., Piot, K., Degrelle, S., Hue, I., Renard, J., Rodolphe, F. and Robin, S. (2005). Anovarray: a set of SAS macros for the analysis of variance of gene expression data. BMC Bioinformatics. 6 150.
[27]	Reinert, G., Schbath, S. and Waterman, M. (2005). Applied Combinatorics on Words. volume 105 of Encyclopedia of Mathematics and its Applications, chapter Statistics on Words with Applications to Biological Sequences. Cambridge University Press.
[28]	Gusto, G. and Schbath, S. (2005). FADO: a statistical method to detect favored or avoided distances between motif occurrences using the Hawkes' model. Statistical Applications in Genetics and Molecular Biology. 4 Article 24.
2004
[29]	Marin, A., Malliavin, T., Nicolas, P. and Delsuc, M.-A. (2004). Prediction of the amino-acid type from the chemical shift values: investigation of the information carried by NMR experiments. J Biomol NMR. 30 (1) 47-60.
[30]	Mary-Huard, T., Daudin, J.-J., Robin, S., Bitton, F., Cabannes, E. and Hilson, P. (2004). Spotting effect in microarray experiments. BMC Bioinformatics. 0. [ .pdf ]
[31]	Nuel, G. (2004). LD-SPatt: Large Deviations Statistics for Patterns on Markov Chains. J. Comput. Biol. 11 (6) 1023-1033.
[32]	Aubert, J., Bar-Hen, A., Daudin, J.-J. and Robin, S. (2004). Determination of the differentially expressed genes in microarray experiments using local FDR. BMC Bioinformatics. 5 125. [ http ]
[33]	Delmar, P., Robin, S., Le Roux, D. and Daudin, J.-J. (2004). Mixture model on the variance for the differential analysis of gene expression. J. R. Statist. Soc. B. 0 ??-??
[34]	Schbath, S. (2004). A la recherche de mots de fréquence exceptionnelle dans les génomes. In Images des Mathématiques, volume 3, 100-106. CNRS. [ .ps ]
2003
[35]	Bar-Hen, A. and Robin, S. (2003). An iterative procedure for differential analysis of gene expression. CRAS. 343-346.
[36]	Richard, H. and Nuel, G. (2003). SPA: Simple web tool to assess statistical significance of DNA patterns. Nucleic Acids Research. 31 (13) 3679-3681.
[37]	Hassenforder, C. and Mercier, S. (2003). Distribution exacte du score local, cas Markovien. C.R.A.S. 336 863-868. [ .html ]
[38]	Bourgait, I., Chiapello, H., Hennequet-Antier, C., Robin, S., Schbath, S., Gruss, A. and El Karoui, M. (2003). Genomic distribution of short motifs involved in DNA repair in pathogenic and non pathogenic E. coli. In Second European Conference on Computational Biology (ECCB). Paris, France. September, 27-30 (selected short paper, 7-9).
[39]	Robelin, D. and Prum, B. (2003). Detecting short inverted segments in a biological sequence. In Second European Conference on Computational Biology (ECCB). Paris, France. September, 27-30 (selected short paper, 41-43).
[40]	Robelin, D., Richard, H. and Prum, B. (2003). SIC: a tool to detect short inverted segments in a biological sequence. Nucl. Acids. Res. 31 (13) 3669-3671. [ .html \| http ]
[41]	Daudin, J.-J., Etienne, M.-P. and Vallois, P. (2003). Asymptotic behavior of the local score of independent and identically distributed random sequences. Stochastic Processes and their Applications. 107 1-28. [ http ]
[42]	Robin, S. (2003). Between Data Science And Applied Data Analysis. (S. Martin, W. Gaul, and M. Vichi, ed.), chapter Some Statistical Issues in Microarray Data Analysis, 337-347. Springer.
[43]	Hoebeke, M., Nicolas, P. and Bessières, P. (2003). MuGeN: simultaneous exploration of multiple genomes and computer analysis results. Bioinformatics. 19 859-864.
[44]	Mercier, S., Cellier, D. and Charlot, F. (2003). An improved approximation for assessing the statistical significance of molecular sequence features. J. Appl. Prob. 40 ? [ .html ]
[45]	Robin, S., Rodolphe, F. and Schbath, S. (2003). ADN, mots et modèles. BELIN.
[46]	Schbath, S. (2003). Statistical methods in physical mapping. In Encyclopedia of the Human Genome, number 434 in Mathematical genetics. Nature Publishing Group. [ .ps ]
2002
[47]	Nicolas, P., Bize, L., Muri, F., Hoebeke, M., Rodolphe, F., Ehrlich, S., Prum, B. and Bessières, P. (2002). Mining Bacillus subtilis chromosome heterogeneities using hidden Markov models. Nucl. Acids Res. 30 1418-1426.
[48]	Bacro, J.-N., Daudin, J.-J., Mercier, S. and Robin, S. (2002). Back to the local score in the logarithmic case: a direct and simple proof. Ann. Inst. Statist. Math. 54 748-757. [ .html \| .ps ]
[49]	Robin, S., Daudin, J.-J., Richard, H., Sagot, M.-F. and Schbath, S. (2002). Occurrence probability of structured motifs in random sequences. J. Comp. Biol. 9 761-773. [ .html \| .ps ]
[50]	Robin, S. (2002). A compound Poisson model for words occurrences in DNA sequences. J. Royal Statist. Soc., C series. 51 437-451. [ .html \| .ps ]
[51]	Daudin, J., Ghachem, S., Descender, M., Robin, S., Hénaut, A., Sekowska, A. and Danchin, A. (2002). Comparison of statistical methods for differential analysis of macroarray data. preprint. 0.
[52]	Nicodème, P., Salvy, B. and Flajolet, P. (2002). Motif statistics. Theoretical Computer Science. ? Extended version of an article published in the proceedings of 7th Annual European Symposium on Algorithms ESA'99, Prague, July 1999. [ .html ]
[53]	Prum, B. (2002). Mathématiques et biologie. Bulletin de l'APMEP. 0.
2001
[54]	Lavielle, M. and Lebarbier, E. (2001). An application of MCMC methods for the multiple change-points problem. Signal Processing. 81 39-53.
[55]	Mercier, S. and Daudin, J. (2001). Exact distribution for the local score of one i.i.d. random sequence. J. Comp. Biol. 8 373-380. [ .html ]
[56]	Mercier, S., Cellier, D., Charlot, F. and Daudin, J. (2001). Exact and asymptotic distribution of the local score of one i.i.d. random sequence. Lecture Notes in Comp. Science, volume for JOBIM 2000. 74-85.
[57]	Daudin, J. and Mercier, S. (2001). Distribution exacte du score local d'une suite de variables indépendantes et identiquement distribuées. C. R. Acad. Sci. Paris. 329 (I) 815-820. [ .html ]
[58]	Sekowska, A., Robin, S., Daudin, J.-J., Hénaut, A. and Danchin, A. (2001). Extracting biological information from DNA arrays: an unexpected link between arginine and methionine metabolism in Bacillus subtilis. Genome Biology. 2 (6) (http://genomebiology.com/2001/2/6/research/0019.1).
[59]	Robin, S. and Daudin, J.-J. (2001). Exact distribution of the distances between any occurrences of a set of words. Ann. Inst. Statist. Math. 36 (4) 895-905. [ .html \| .ps ]
[60]	Robin, S. and Schbath, S. (2001). Numerical comparison of several approximations of the word count distribution in random sequences. J. Comp. Biol. 8 349-359. [ .ps ]
[61]	Prum, B., Turckheim, É. and Vingron, M. (2001). Identifying repeat-containing protein sequences. ESAIM: Probability and Statistics. 0.
[62]	Muri-Majoube, F. and Prum, B. (2001). Une approche statistique de l'analyse des génomes. La Gazette des Mathématiciens. 89 63-98.
[63]	Prum, B. (2001a). La recherche automatique des gènes. La Recherche. 1108 86-88.
[64]	Prum, B. (2001b). Probabilités, statistique et génomes. Matapli. 64 0.
2000
[65]	Bacro, J. and Comet, J. (2000). Sequence alignment: an approximation law for the z-value with applications to databank scanning. Computers and Chemistry. 401-410. [ .html ]
[66]	Goldstein, D., Muri-Majoube, F., Saragueta, P. and Prum, B. (2000). Inverse complementary homologues of cysteine signatures. C. R. Acad. Sci. 323 167-172.
[67]	Prum, B. (2000b). Une approche statistique de l'analyse des séquences génétiques. La Revue du Palais de la Découverte. 276 56-65.
[68]	Prum, B. (2000a). Les chaînes de Markov dans l'analyse des génomes. Matapli. 62 17-24.
[69]	Reinert, G., Schbath, S. and Waterman, M. (2000). Probabilistic and statistical properties of words: an overview. J. Comp. Biol. 7 1-46. [ .html \| .ps ]
[70]	Schbath, S., Bossard, N. and Tavaré, S. (2000). The effect of non-homogeneous clone length distribution on the progress of an STS mapping project. J. Comp. Biol. 7 47-58. [ .html \| .ps ]
[71]	Schbath, S. (2000). An overview on the distribution of word counts in Markov chains. J. Comp. Biol. 7 193-201. [ .html \| .ps ]
[72]	Mathé, C. and Rodolphe, F. (2000). Translation conditional models for protein coding sequences. J. Comp. Biol. 7 249-260.
1999
[73]	Krause, A., Nicodème, P., Bornberg-Bauer, E., Rehmsmeier, M. and Vingron, M. (1999). WWW-access to the SYSTERS protein sequence cluster set. Bioinformatics. 15 262-263. Application Note accepted for the GCB Special Issue of Bioinformatics. [ .html ]
[74]	Robin, S. and Daudin, J.-J. (1999). Exact distribution of word occurrences in a random sequence of letters. J. Appl. Prob. 36 179-193. [ .html ]
[75]	El Karoui, M., Biaudet, V., Schbath, S. and Gruss, A. (1999). Characteristics of Chi distribution on several bacterial genomes. Research in Microbiology. 150 579-587. [ .html ]
[76]	Bouvier, A., Gélis, F. and Schbath, S. (1999), R'MES : Recherche de Mots Exceptionnels dans les Séquences d'ADN - version 2. Technical report, Guide de l'utilisateur. INRA, Biométrie, F78352 Jouy-en-Josas. [ http \| .ps ]
[77]	Reinert, G. and Schbath, S. (1999). Compound Poisson approximations for occurrences of multiple words. In Statistics in Genetics and Molecular Biology, (F. Seillier, ed.). IMS Lecture Notes-Monograph Series. 33 257-275. [ .html \| .ps ]
1998
[78]	Krause, A., Nicodème, P., Rehmsmeier, M. and Vingron, M. (1998). Automatic clustering of large sequence databases. In Proceedings of the German Conference on Bioinformatics GCB98, Köln.
[79]	Nicodème, P. (1998). SSMAL: similarity searching with alignment graphs. Bioinformatics. 14 (6) 508-515. [ .html ]
[80]	Schbath, S. and Bouvier, A. (1998). Finding words with unexpected frequencies in DNA sequences. In Explorapedia of Statistical and Mathematical Techniques for use in Research and Technology. http://www.bioss.sari.ac.uk/smart/unix/intro/slides/home.htm. [ http ]
[81]	Reinert, G. and Schbath, S. (1998). Compound Poisson and Poisson process approximations for occurrences of multiple words in Markov chains. J. Comp. Biol. 5 223-254. [ .html \| .ps ]
1997
[82]	Nicodème, P. and Steyaert, J. (1997). Selecting optimal oligonucleotide primers for multiplex PCR. In Fifth International Conference on Intelligent Systems for Molecular Biology, 210-213. AAAI Press. [ .html ]
[83]	Schbath, S. (1997b). An efficient statistic to detect over- and under-represented words in DNA sequences. J. Comp. Biol. 4 189-192. [ .html \| .ps ]
[84]	Schbath, S. (1997a). Coverage processes in physical mapping by anchoring random clones. J. Comp. Biol. 4 61-82. [ .html \| .ps ]
1996
[85]	Schbath, S. (1996), Using non-homogeneous processes in physical mapping by anchoring random clones: Mathematical analysis and application to hotspots. Technical Report #96-6, Center for Applied Mathematical Sciences, University of Southern California, Los Angeles.
[86]	Gélis, F. and Schbath, S. (1996), RMES : Recherche de Mots Exceptionnels dans les Séquences d'ADN - version 1. Technical report, Notice d'utilisation. INRA, Biométrie, F78352 Jouy-en-Josas.
1995
[87]	Schbath, S. (1995a). Compound Poisson approximation of word counts in DNA sequences. ESAIM: Probability and Statistics. 1 1-16. (http://www.emath.fr/ps/). [ .html \| .ps ]
[88]	Schbath, S. (1995b). Étude asymptotique du nombre d'occurrences d'un mot dans une chaîne de Markov et application à la recherche de mots de fréquence exceptionnelle dans les séquences d'ADN. PhD thesis, Université René Descartes, Paris V. [ .ps ]
[89]	Schbath, S., Prum, B. and Turckheim, E. d. (1995). Exceptional motifs in different Markov chain models for a statistical analysis of DNA sequences. J. Comp. Biol. 2 417-437. [ .html \| .ps ]
[90]	Prum, B., Rodolphe, F. and Turckheim, É. (1995). Finding words with unexpected frequencies in DNA sequences. J. R. Statist. Soc. B. 57 205-220. [ .html ]

Copyright: © 2007 SSB (all rights reserved)
Author:    Sophie Schbath <schbath@ jouy.inra.fr>
Modified:  2007-09-17