SSB group

Welcome to the home page of

Sophie Schbath

Institut National de la Recherche Agronomique
Unité Mathématiques et Informatique Appliquées, du Génome à l'Environnement
Domaine de Vilvert
F-78352 Jouy-en-Josas Cedex
Sophie.Schbath at
Fax : +

I received a PhD in Statistics from the University of Paris V on October 25, 1995. I did my thesis work in the Biometrics laboratory of INRA (French National Institute of Agronomic Research) at Jouy-en-Josas, where I obtained a permanent research position in August 1996. In 1996, I was a one-year post-doctoral fellow in Los Angeles, in the team of Simon Tavaré and Michael Waterman. In 2000, I moved to a new multidisciplinary INRA laboratory called "Mathématique, Informatique & Génome" (MIG), located in Jouy-en-Josas. I defended my habilitation on September 22, 2003. I am now Director of Research; I was the head of the MIG lab from 2012 to 2014. Since 2015, I have been the head of the new MaIAGE lab, which resulted from the merger of the MIG and MIAJ labs.

My main interest is the statistical analysis of word or motif occurrences in biological sequences. In particular, I have worked extensively on statistical methods to detect words that are significantly over- or under-represented with respect to what would be expected under a Markovian model. I am still involved in the development of the R'MES software, which I initiated. I also study the locations of words along genomes; more precisely, I am interested in the detection of "co-located" motifs that may be involved in a common biological mechanism; such dependence may suggest a possible protein interaction.
This work naturally led me to the field of comparative genomics, and more precisely to the comparison of complete bacterial genomes, which uses Maximum Exact Matches between sequences.
During my post-doc at the University of Southern California, I investigated another area: physical mapping. I studied the effect of two kinds of data heterogeneity on the progress of a physical mapping project by anchoring (number of islands, proportion of genome covered, average length of islands). These heterogeneities can relate to the anchor/clone locations and to the clone lengths. Physical mapping is no longer of interest because of next-generation sequencing, but the statistical techniques used there can be helpful in the metagenomics studies I am now investigating.
Moreover, in 2005, I started to work on the detection of network motifs, which are subgraphs with a significantly high frequency in biological networks.

I have also been the president of the French Society of BioInformatics (300 members) since 2010, and I co-headed the French Research Group "Molecular BioInformatics" (1000 members) from 2006 to 2014. (February 18, 2015).