1  Getting Started with Micado

     1.1  Micado content
          1.1.1  Sequence data
          1.1.2  Bacillus subtilis functional analysis data
     1.2  Access to the database
          1.2.1  Access to sequence data and their annotations
          1.2.2  Bacillus subtilis functional analysis project
          1.2.3  Explore the Micado Database
     1.3  How to contact Micado
     1.4  Releases
     1.5  History

2 Micado technical documentation

     2.1 Detailed description of Micado database structure
          2.1.1  General structure
          2.1.2  Tables and description
     2.2 Technical information about interfaces
     2.3 Micado bibliography

Micado (MICrobial Advanced Database Organization) is a relational database dedicated
to microbial genomes [Biaudet, 1997] and functional analysis of Bacillus subtilis [Samson, 2000].
The database is accessible on the web at the URL :

	1.1  Micado content

		1.1.1  Sequence data
Micado provides easy access to GenBank and EMGLib sequence annotations.

Sequences from GenBank

GenBank contains all public nucleic acid sequence data. Access to and distribution of these sequences are free. Information about GenBank is available at:

Micado includes microbial primary sequences from GenBank. At present, Micado contains the 118777 microbial sequences of GenBank Release 126.

Sequences from EMGLib (Enhanced Microbial Genomes Library)

This database contains all the complete genomes of prokaryotes (bacteria and archaea) in GenBank format [Perrière, 2000]. It also provides sequence annotations enriched with data on codon usage, gene orientation on the chromosome and gene families. The current release of Micado contains 55 complete genomes from EMGLib.

EMGLib sequences are available at the PBIL website at :

		1.1.2  Bacillus subtilis functional analysis data

The goal of functional genomics is to unravel the physiological functions of genes. The Bacillus subtilis genome contains 4100 protein-coding genes. Among these, 1200 genes of unknown function were selected in order to elucidate their function.

Each of these 1200 genes has been specifically and systematically disrupted in a mutant strain of B. subtilis. These strains have been studied by a European consortium of 17 laboratories. The physiological data collected are: growth and reporter gene activity (beta-galactosidase), mRNA transcripts and phenotypic tests.

All these data are linked, so it is possible to navigate between the different parts of the database. There are several ways to access the database content.

	1.2  Access to the database

Information on sequences, genes and mutants is searched through hyperlinked menu pages, composed of selectable lists and options, text input, and graphical navigation on clickable images.

		1.2.1  Access to sequence data and their annotations

Two forms are available from the Micado main page:
  • One to get all information about a GenBank or EMGLib accession number. After this query, the web page contains three kinds of results: the physical map of the sequence contained in the accession number, sequence extraction facilities for that accession number (DNA sequence, locations...) and a table including all the features/qualifiers of that accession number.
  • One to get all information about a gene name in a species for which the complete genome is in the Micado database. The results include all the GenBank or EMGLib accession numbers that cite the gene name in the qualifier 'gene'. If this query does not give any result, the search is performed on the qualifier 'note', but the results are more approximate. Links to PubMed and Swiss-Prot are systematically included. Additional links to SubtiList, JAFAN and the Functional Analysis page of Micado are included only for Bacillus subtilis genes.
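
The two-step gene name lookup described above (first the 'gene' qualifier, then the 'note' qualifier as a fallback) can be sketched as follows. This is only an illustration under stated assumptions: the table `qualifiers`, its three columns, the accession numbers and the gene names are all hypothetical stand-ins, and SQLite is used in place of the actual Micado database so that the example is self-contained.

```python
import sqlite3


def find_entries_by_gene(conn, gene_name):
    """Return (accession, qualifier_type) rows that cite a gene name.

    Step 1: exact match on the 'gene' qualifier.
    Step 2: only if step 1 finds nothing, substring match on 'note'.
    """
    exact = conn.execute(
        "SELECT DISTINCT accession, qualifier_type FROM qualifiers "
        "WHERE qualifier_type = 'gene' AND value = ? ORDER BY accession",
        (gene_name,),
    ).fetchall()
    if exact:
        return exact
    # Fallback: more approximate, since a note may only mention the name.
    return conn.execute(
        "SELECT DISTINCT accession, qualifier_type FROM qualifiers "
        "WHERE qualifier_type = 'note' AND value LIKE ? ORDER BY accession",
        (f"%{gene_name}%",),
    ).fetchall()


def build_demo_db():
    """Create an in-memory table with hypothetical demonstration rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE qualifiers (accession TEXT, qualifier_type TEXT, value TEXT)"
    )
    conn.executemany(
        "INSERT INTO qualifiers VALUES (?, ?, ?)",
        [
            ("ACC0001", "gene", "geneA"),                    # hypothetical entry
            ("ACC0002", "note", "similar to geneA"),         # hypothetical entry
            ("ACC0003", "note", "possible operon with geneB"),
        ],
    )
    return conn


if __name__ == "__main__":
    db = build_demo_db()
    print(find_entries_by_gene(db, "geneA"))  # found through the 'gene' qualifier
    print(find_entries_by_gene(db, "geneB"))  # found through the 'note' fallback
```

In this sketch the 'note' search is never run when the 'gene' search succeeds, which is why the entry that only mentions geneA in a note is not returned for that query.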
		1.2.2  Bacillus subtilis functional analysis project

In this part, information is accessible by gene name, either through a form or via three lists of genes (BSFA, JAFAN or essential genes). You will find three kinds of results in this part:

  • growth curves in rich and minimal medium
  • reporter gene activity of strains
  • systematic determination of phenotype

Get more details on protocols, phenotypes and laboratories involved in the BSFA project.

		1.2.3  Explore the Micado Database

BLAST/FASTA against the Micado Database or one of the complete genomes

In this way, you can compare your own sequence to a set of Micado sequences with three programs: FASTA, BLAST and PSI-BLAST. The query may be a protein or a DNA sequence.

Search pattern in Micado sequences

The pattern matching software used in this section is Patscan [Dsouza, 1997]. A pattern in a DNA or protein sequence may be searched in three kinds of Micado sequences: a single sequence, a set of sequences or a complete genome. To define the pattern, see the link ``Look at the rules for pattern definition''. Example: TATAA[1,0,0] means 'match TATAA allowing 1 mismatch, 0 deletions, 0 insertions'.

	1.3  How to contact Micado

On every Micado page, there is a link to contact the database team in case of a problem or to send comments. The mail addresses are: chiapell@jouy.inra.fr or hoebeke@jouy.inra.fr

	1.4  Releases

	1.5  History

The Micado database was built in the ``Génétique Microbienne'' laboratory at INRA (Jouy-en-Josas) to replace the first database, MadBase [Biaudet, 1995], in order to store the increasing amount of data (resulting from genome sequencing projects, their analysis and a new functional analysis program for unknown genes), to manage them and to allow their consultation.
  • 1994-1999: Micado was created from the work of Véronique Biaudet-Brunaud, who worked in particular on the Micado structure and content [Biaudet-Brunaud, 1997]. The interface development was performed by Franck Samson. This group was managed by Philippe Bessières.
  • From February 2000: Hélène Chiapello (structure, content and interfaces) and Mark Hoebeke (graphical interfaces) took over this activity in the MIG laboratory, a new bioinformatics lab at INRA. Philippe Bessières continued to manage this group. A new major release of Micado is in preparation.

2  Micado Documentation

	2.1  Detailed description of Micado database structure

The Relational DataBase Management System used is PostgreSQL 7.1.2.

		2.1.1  General structure

Sequence
Mutant

		2.1.2  Tables and description

Sequence
Table name          Number of fields  Content
ACCESSIONS          2                 Sequences with a changed accession number
ARTICLES            7                 All bibliographic references
ARTICLE_SEQ         2                 Association accession-articles
AUTHOR_SEQ          4                 Authors of a given sequence (by accession number)
COMMENTS            2                 Comments on accession numbers
COMPLETE_GENOMES    3                 Description of complete genome entries from EMGLib
DNA_LOC             5                 DNA sequence of each location site
DNA_SEQ             2                 DNA sequence of each accession number
FEATURES            7                 Feature descriptions of GenBank/EMGLib entries
JOURNALS            7                 List of journals of published articles
KEYWORDS            2                 Keywords used in GenBank sequences
LOCATIONS           9                 Description of each location site of each feature
LST_AUTHOR          2                 List of authors of bibliographic references
LST_KEYWORD         1                 List of keywords
LST_TAXON           2                 Taxonomic classification
LST_TYPE_FEATURE    1                 List of features used in GenBank and EMGLib sequences
LST_TYPE_QUALIFIER  1                 List of qualifiers used in GenBank sequences
PROT_FEAT           2                 Protein sequences derived from DNA sequences (CDS)
QUALIFIERS          4                 Qualifiers describing feature properties
SEQUENCES           17                Sequences from GenBank

Mutant
Table name            Number of fields  Content
BSFA_ARTICLEMUTANT    10                Bibliographic references about mutants
BSFA_CHECKMETHOD      14                Description of controls
BSFA_GROWTHCURVE      14                Growth curves of strains and reporter gene activities
BSFA_INFOREGION       4                 Information (features) on a given BSFA region
BSFA_LABOPHENO        3                 List of laboratories involved in a phenotype study
BSFA_LSLABO           3                 List of all laboratories
BSFA_LSPHENO          2                 List of phenotypes
BSFA_PROTOCOLCOLUMN   5                 All column titles for result tables
BSFA_PROTOCOLCURVE    11                All values of curve results
BSFA_PROTOCOLDESC     6                 List of protocols used in functional analysis
BSFA_PROTOCOLTABLE    7                 Results of experiments (without curves)
BSFA_REFART           3                 Association table articles/mutant strains
BSFA_REGIONS          5                 Description of BSFA regions
BSFA_STRAINMUTANT     21                Description of mutant strains
BSSFA_STRAINPROTOCOL  23                Summary of protocol results for each gene

	2.2  Technical information about interfaces

Programming and libraries: Perl-CGI scripts which provide SQL access to the database.....
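
As an illustration of the pattern search interface (section 1.2.3), the following sketch shows what a pattern such as TATAA[1,0,0] means in the mismatch-only case. It is not Patscan and does not reproduce its implementation or its full pattern rules: the function names `parse_pattern` and `find_matches` are hypothetical, only patterns of the form MOTIF[mismatches,deletions,insertions] are parsed, patterns with non-zero deletions or insertions are rejected, and the test sequence is made up.

```python
import re


def parse_pattern(pattern):
    """Split a pattern such as 'TATAA[1,0,0]' into ('TATAA', 1, 0, 0)."""
    m = re.fullmatch(r"([A-Za-z]+)\[(\d+),(\d+),(\d+)\]", pattern)
    if m is None:
        raise ValueError(f"unsupported pattern: {pattern!r}")
    motif = m.group(1).upper()
    mismatches, deletions, insertions = (int(g) for g in m.group(2, 3, 4))
    return motif, mismatches, deletions, insertions


def find_matches(pattern, sequence):
    """Return (0-based start, matched text) for every window of the sequence
    that differs from the motif in at most the allowed number of positions."""
    motif, mismatches, deletions, insertions = parse_pattern(pattern)
    if deletions or insertions:
        # Assumption of this sketch: gaps are out of scope.
        raise NotImplementedError("this sketch handles mismatches only")
    sequence = sequence.upper()
    hits = []
    for start in range(len(sequence) - len(motif) + 1):
        window = sequence[start:start + len(motif)]
        # Count positions where motif and window disagree (Hamming distance).
        distance = sum(a != b for a, b in zip(motif, window))
        if distance <= mismatches:
            hits.append((start, window))
    return hits


if __name__ == "__main__":
    demo = "GGTATAAGGTACAAGG"  # made-up sequence for demonstration
    print(find_matches("TATAA[1,0,0]", demo))  # exact hit plus one 1-mismatch hit
    print(find_matches("TATAA[0,0,0]", demo))  # exact hit only
```

With one mismatch allowed, the made-up sequence yields the exact occurrence TATAA and the near match TACAA; with zero mismatches only the exact occurrence remains.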

