Server of the Mathématiques Informatique et Génome unit - INRA Jouy. Micado
documentation





1 Getting Started with Micado
Micado (MICrobial Advanced Database Organization) is a relational database dedicated to microbial genomes [Biaudet, 1997] and to the functional analysis of Bacillus subtilis [Samson, 2000]. The database is accessible on the web at: http://genome.jouy.inra.fr/cgi-bin/micado/index.cgi
1.1 Micado content
1.1.1 Sequence data
Micado provides easy access to GenBank and NCBI sequence annotations.
  • GenBank entries (primary data). GenBank contains all public nucleic acid sequence data. Access to and distribution of these sequences are free. Information about GenBank is available at:
    ftp://ncbi.nlm.nih.gov/genbank/gbrel.txt
    Micado includes microbial primary sequences from GenBank.

  • NCBI entries (complete genomes, checked annotation). This collection contains all the complete genomes from prokaryotes (bacteria and archaea) in GenBank format [Perrière, 2000]. It also provides sequence annotations improved by the addition of data on codon usage, gene orientation on the chromosome and gene families. NCBI sequences are available at:
    ftp://ftp.ncbi.nih.gov/genomes/Bacteria/

1.1.2 Bacillus subtilis functional analysis data
The goal of functional genomics is to unravel the physiological functions of genes. The Bacillus subtilis genome contains 4100 protein-coding genes. Among these, 1200 genes of unknown function were chosen with the aim of elucidating their function.
These 1200 genes have been specifically and systematically disrupted, each in a mutant strain of B. subtilis. The strains have been studied by a European consortium of 17 laboratories. The physiological data collected are: growth and reporter gene activity (beta-galactosidase), mRNA transcripts and phenotypic tests. All these data are linked, so it is possible to navigate between the different parts of the database. The database content can be accessed in several ways.
1.1.3 Bacillus subtilis genetic map data
This part contains the genetic map data produced in 1993 for B. subtilis [Anagnos, 1993]. It describes gene names and synonyms, gene positions on the genetic map and the associated bibliographic references.
1.2 Access to the database
Information on sequences, genes and mutants is searched through hyperlinked menu pages, composed of selectable lists and options, text input, and graphical navigation on clickable images.
1.2.1 Access to sequence data and their annotation
Two forms are available from the Micado main page:
  • One to get all information about a GenBank or NCBI complete genome accession number. The result page contains three kinds of results: the physical map of the sequence held under that accession number, sequence extraction facilities for that accession number (DNA sequence, locations...) and a table listing all its features and qualifiers.
  • One to get all information about a gene name in a species whose complete genome is in the Micado database. The results include every GenBank or NCBI complete genome accession number that cites the gene name in the qualifier 'gene'. If this query returns no result, the search is repeated on the qualifier 'note', but the results are then more approximate. Links to PubMed and Swiss-Prot are systematically included. Additional links to SubtiList, JAFAN and the Functional Analysis page of Micado are included only for Bacillus subtilis genes.
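
The two-step gene name search described above (exact match on the qualifier 'gene', then a looser search on the qualifier 'note') can be sketched as follows. This is only an illustration in Python over an in-memory list, not Micado's actual code, which consists of Perl/CGI scripts querying PostgreSQL. The record layout, the function name find_gene, the accession numbers and the substring rule used for 'note' are all assumptions made for the example.

```python
# Hypothetical records: (accession, qualifier name, qualifier value).
# Accessions and values are made up for illustration.
QUALIFIERS = [
    ("ACC0001", "gene", "dnaA"),
    ("ACC0001", "note", "similar to yaaA of another species"),
    ("ACC0002", "gene", "recA"),
]


def find_gene(name, qualifiers=QUALIFIERS):
    """Return (matched_qualifier, sorted accessions) for a gene name.

    First look for an exact, case-insensitive match in 'gene' qualifiers.
    Only if that finds nothing, fall back to a substring search in 'note'
    qualifiers, which is less precise.
    """
    wanted = name.lower()
    exact = {acc for acc, qual, value in qualifiers
             if qual == "gene" and value.lower() == wanted}
    if exact:
        return "gene", sorted(exact)
    approx = {acc for acc, qual, value in qualifiers
              if qual == "note" and wanted in value.lower()}
    return "note", sorted(approx)


print(find_gene("dnaA"))  # found through the 'gene' qualifier
print(find_gene("yaaA"))  # found only through the 'note' fallback
```

Returning the name of the qualifier that matched lets a caller flag 'note' hits as approximate, in line with the caveat stated above.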
1.2.2 Bacillus subtilis functional analysis project

In this part, information is accessible by gene name, either through a form or through three lists of genes (BSFA, JAFAN or essential genes). Three kinds of results are available:

  • growth curves in rich and minimal medium
  • reporter gene activity of strains
  • systematic determination of phenotype
More details are available on the protocols, phenotypes and laboratories involved in the BSFA project.

1.2.3 Explore the Micado Database
  • BLAST/FASTA against the Micado Database or one of the complete genomes
    This lets you compare your own sequence against a set of Micado sequences with three programs: FASTA, BLAST and PSI-BLAST. The query sequence may be a protein or a DNA sequence.
  • Search pattern in Micado sequences
    The pattern matching software used in this section is Patscan [Dsouza, 1997]. A pattern in a DNA or protein sequence may be searched in three kinds of Micado sequences: a single sequence, a set of sequences or a complete genome. To define the pattern, see the link "Look at the rules for pattern definition". Example: TATAA[1,0,0] means 'match TATAA allowing 1 mismatch, 0 deletion, 0 insertion'.
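
To make the TATAA[1,0,0] example concrete, here is a minimal Python sketch of mismatch-tolerant matching. It is not Patscan and does not reproduce its pattern language: it handles a plain motif with substitutions only, rejects patterns that allow deletions or insertions, and searches one strand. The function names parse_pattern and find_matches and the test sequence are assumptions made for the illustration.

```python
import re


def parse_pattern(spec):
    """Split 'TATAA[1,0,0]' into ('TATAA', 1, 0, 0)."""
    m = re.fullmatch(r"([A-Za-z]+)\[(\d+),(\d+),(\d+)\]", spec)
    if m is None:
        raise ValueError(f"unsupported pattern: {spec!r}")
    return m.group(1).upper(), int(m.group(2)), int(m.group(3)), int(m.group(4))


def find_matches(spec, sequence):
    """Return (start, matched_text) for each window within the mismatch limit.

    Only substitutions are handled; patterns that allow deletions or
    insertions are rejected rather than silently mishandled.
    """
    motif, mismatches, deletions, insertions = parse_pattern(spec)
    if deletions or insertions:
        raise NotImplementedError("this sketch handles mismatches only")
    sequence = sequence.upper()
    hits = []
    for start in range(len(sequence) - len(motif) + 1):
        window = sequence[start:start + len(motif)]
        # Count positions where the motif and the window differ.
        diff = sum(1 for a, b in zip(motif, window) if a != b)
        if diff <= mismatches:
            hits.append((start, window))
    return hits


print(find_matches("TATAA[1,0,0]", "GGTATAAGGTACAAGG"))
# → [(2, 'TATAA'), (9, 'TACAA')]
```

Positions are 0-based. With one mismatch allowed, both the exact site and the TACAA variant are reported; with TATAA[0,0,0] only the exact site at position 2 would be.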
1.3 How to contact Micado
Every Micado page has a link for contacting the database team to report a problem or send a comment. The mail addresses are: Helene.Chiapello @ jouy.inra.fr or Annie.Gendrault-Jacquemard @ jouy.inra.fr
1.4 Releases
1.5 History
The Micado database was built in the "Génétique Microbienne" laboratory at INRA (Jouy-en-Josas) to replace the earlier database MadBase [Biaudet, 1995]. Its purpose was to store the growing volume of data (from genome sequencing projects, their analysis and a new functional analysis program for genes of unknown function), to manage these data and to make them available for consultation.
  • 1994-1999: Micado was created from the work of Véronique Biaudet-Brunaud, who worked mainly on the structure and content of Micado [Biaudet-Brunaud, 1997]. The interface was developed by Franck Samson. The group was managed by Philippe Bessières.
  • From February 2000: Hélène Chiapello (structure, content and interfaces) and Mark Hoebeke (graphical interfaces) took over this work in the MIG laboratory, a new bioinformatics lab at INRA. Philippe Bessières continued to manage the group. A new major release of Micado is in preparation.

2 Micado technical documentation
2.1 Database structure
The Relational DataBase Management System used is PostgreSQL 7.3.3.
2.1.1 Relational model
Sequence
[Figure: Micado 'sequences' relational model]
Mutant
[Figure: Micado 'mutants' relational model, part 1]
[Figure: Micado 'mutants' relational model, part 2]
2.1.2 Tables and description
Sequence
Table name          Number of fields   Content
ACCESSIONS          2                  Sequences with a changed accession number
ARTICLES            7                  Bibliographic references of the GenBank entry
COMMENTS            2                  Comments on accession numbers
COMPLETE_GENOMES    3                  Description of complete genome entries from NCBI
DNA_LOC             5                  DNA sequence of each location site
DNA_SEQ             2                  DNA sequence of each GenBank entry
FEATURES            7                  Description of the features of GenBank/NCBI complete genome entries
KEYWORDS            2                  Keywords used in GenBank sequences
LOCATIONS           9                  Description of each location site of each feature
PROT_FEAT           2                  Protein sequences derived from DNA sequences (CDS)
QUALIFIERS          4                  Qualifiers describing feature properties
SEQUENCES           17                 Main description of the GenBank entry
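
As a rough illustration of how the FEATURES and QUALIFIERS tables listed above might relate, here is a small Python sketch. Everything beyond the two table names is an assumption: the column names are invented (the real tables have 7 and 4 fields, which are not listed here), the rows are made up, and SQLite's in-memory engine stands in for the PostgreSQL server.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL; all column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE features (
        feature_id INTEGER PRIMARY KEY,
        accession  TEXT NOT NULL,
        feat_key   TEXT NOT NULL          -- e.g. 'CDS', 'gene'
    );
    CREATE TABLE qualifiers (
        feature_id INTEGER NOT NULL REFERENCES features(feature_id),
        qual_name  TEXT NOT NULL,         -- e.g. 'gene', 'note'
        qual_value TEXT
    );
""")
# Made-up rows: one entry with two CDS features.
conn.executemany("INSERT INTO features VALUES (?, ?, ?)",
                 [(1, "ACC0001", "CDS"), (2, "ACC0001", "CDS")])
conn.executemany("INSERT INTO qualifiers VALUES (?, ?, ?)",
                 [(1, "gene", "dnaA"), (1, "note", "made-up note"),
                  (2, "gene", "dnaN")])


def gene_names(accession):
    """List the 'gene' qualifier values attached to CDS features of an entry."""
    rows = conn.execute(
        """SELECT q.qual_value
           FROM features AS f
           JOIN qualifiers AS q ON q.feature_id = f.feature_id
           WHERE f.accession = ? AND f.feat_key = 'CDS' AND q.qual_name = 'gene'
           ORDER BY q.qual_value""",
        (accession,),
    ).fetchall()
    return [value for (value,) in rows]


print(gene_names("ACC0001"))  # → ['dnaA', 'dnaN']
```

Keeping qualifiers in their own table, one row per name/value pair, mirrors the GenBank feature table layout, where a feature can carry any number of qualifiers.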
Mutant
Table name             Number of fields   Content
BSFA_ARTICLEMUTANT     10                 Bibliographic references about mutants
BSFA_CHECKMETHOD       14                 Description of controls
BSFA_GROWTHCURVE       14                 Growth curves of strains and reporter gene activities
BSFA_INFOREGION        4                  Information (features) on a given BSFA region
BSFA_LABOPHENO         3                  List of laboratories involved in a phenotype study
BSFA_LSLABO            3                  List of all laboratories
BSFA_LSPHENO           2                  List of phenotypes
BSFA_PROTOCOLCOLUMN    5                  All column titles for result tables
BSFA_PROTOCOLCURVE     11                 All values of curve results
BSFA_PROTOCOLDESC      6                  List of protocols used in functional analysis
BSFA_PROTOCOLTABLE     7                  Results of experiments (without curves)
BSFA_REFART            3                  Association table articles/mutant strains
BSFA_REGIONS           5                  Description of BSFA regions
BSFA_STRAINMUTANT      21                 Description of mutant strains
BSSFA_STRAINPROTOCOL   23                 Summary of protocol results for each gene
2.2 Web interfaces
All web interfaces are Perl/CGI scripts. The Perl libraries used are CGI and DBI/DBD.
2.3 Micado bibliography
  • [Anagnos, 1993] Anagnostopoulos C., P. J. Piggot & J. A. Hoch (1993). The Genetic Map of Bacillus subtilis. In: Sonenshein A. L., J. A. Hoch & R. Losick (eds.). Bacillus subtilis and other Gram-Positive Bacteria. American Society for Microbiology, Washington, D.C. 425-461.

  • [Biaudet, 1995] Véronique Biaudet, Franck Samson & Philippe Bessières. MadBase : a Networked Relational Database for Microbial Genomes. Second Meeting on the Interconnection of Molecular Biology Databases Cambridge, United Kingdom - July 20-22, 1995.

  • [Biaudet, 1997] Véronique Biaudet, Franck Samson, Philippe Bessières. Micado a network oriented database for microbial genomes. CABIOS, 1997.

  • [Biaudet-Brunaud, 1997] Thesis of Véronique Biaudet-Brunaud. Développement d'une base de données de génomes microbiens, et étude des séquences Chi recombinogènes. Université Paris VI, 1997.

  • [Dsouza, 1997] Dsouza, M., Larsen, N., Overbeek, R. Searching for patterns in genomic data. Trends in Genetics, 1997.

  • [Hart, 1994] Hart KW, Searls DB, Overton GC. SORTEZ: a relational translator for NCBI's ASN.1 database. Comput Appl Biosci, Jul 1994.

  • [Kroeger, 1997] Kroeger, M. and Wahl, R. Compilation of DNA sequences of Escherichia coli K12; description of the interactive databases ECD and ECDC (update 1996). Nucleic Acids Research, 1997.

  • [Kunst, 1997] Kunst F, Ogasawara N, Moszer I, Albertini AM, Alloni G, Azevedo V, Bertero MG, Bessières P, Bolotin A, Borschert S, Borriss R, Boursier L, Brans A, Braun M, Brignell SC, Bron S, Brouillet S, Bruschi CV, Caldwell B, Capuano V, Carter NM, Choi SK, Codani JJ, Connerton IF, Danchin A, et al. The complete genome sequence of the gram-positive bacterium Bacillus subtilis. Nature, Nov 1997.

  • [Moszer, 1995] Moszer I, Glaser P, Danchin A. SubtiList : a relational database for the Bacillus subtilis genome. Microbiology, Feb 1995.

  • [Perrière, 2000] Perrière G, Bessières P, Labedan B. EMGLib: The enhanced microbial genomes library (update 2000). Nucleic Acids Res, Jan 2000.

  • [Samson, 2000] Franck Samson, Véronique Biaudet-Brunaud, Shahinaz Gas, Etienne Dervin, Gabriel Gallezot, Sandrine Duchet, Jean-Michel Batto, S. Dusko Ehrlich and Philippe Bessières. Micado, an Integrative Database Dedicated to the Functional Analysis of Bacillus subtilis and Microbial Genomics. Functional Analysis of Bacterial Genes, 2000.


Bioinformatics
Hélène Chiapello (Helene.Chiapello @jouy.inra.fr)
Annie Gendrault-Jacquemard (Annie.Gendrault @jouy.inra.fr)