ISLAND:
Program to simulate the progress of an STS mapping project
This program calculates or estimates the progress of a mapping project that uses the anchoring approach. It reports the (mean) number of anchored islands, the mean length of an anchored island, and the (mean) proportion of the genome covered by anchored islands.
The clone and anchor (STS) locations along the genome can be
- either read from previously generated input files,
- or simulated according to a specified model.
Clones that share at least one anchor are assembled into anchored islands.
This program is written in C++.
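
The assembly rule stated above (clones sharing at least one anchor form one anchored island) can be illustrated with a minimal C++ sketch. It is not taken from the ISLAND sources. The Clone and Island types and all function names are hypothetical, and two conventions are assumptions made for illustration: an island spans from its leftmost to its rightmost clone end, and a length is the right end minus the left end. The genome length is assumed to be positive.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <numeric>
#include <vector>

// Hypothetical types for this sketch; they are not ISLAND's own data structures.
struct Clone  { long left; long right; };
struct Island { long left; long right; };

// Union-find lookup with path halving.
static std::size_t find_root(std::vector<std::size_t>& parent, std::size_t i) {
    while (parent[i] != i) {
        parent[i] = parent[parent[i]];
        i = parent[i];
    }
    return i;
}

// Groups clones that share at least one anchor; clones holding no anchor are ignored.
std::vector<Island> assemble_islands(const std::vector<Clone>& clones,
                                     const std::vector<long>& anchors) {
    std::vector<std::size_t> parent(clones.size());
    std::iota(parent.begin(), parent.end(), std::size_t{0});
    std::vector<bool> anchored(clones.size(), false);

    for (long a : anchors) {
        bool found = false;
        std::size_t first = 0;
        for (std::size_t i = 0; i < clones.size(); ++i) {
            if (a < clones[i].left || a > clones[i].right) continue;
            anchored[i] = true;
            if (found) {
                // Merge this clone's group with the group of the first clone holding anchor a.
                const std::size_t ri = find_root(parent, i);
                const std::size_t rf = find_root(parent, first);
                parent[ri] = rf;
            } else {
                first = i;
                found = true;
            }
        }
    }

    // One island per union-find root, spanning all of its clones.
    std::map<std::size_t, Island> by_root;
    for (std::size_t i = 0; i < clones.size(); ++i) {
        if (!anchored[i]) continue;
        const std::size_t root = find_root(parent, i);
        auto it = by_root.find(root);
        if (it == by_root.end()) {
            by_root.emplace(root, Island{clones[i].left, clones[i].right});
        } else {
            it->second.left  = std::min(it->second.left,  clones[i].left);
            it->second.right = std::max(it->second.right, clones[i].right);
        }
    }
    std::vector<Island> islands;
    for (const auto& entry : by_root) islands.push_back(entry.second);
    return islands;
}

// Mean island length, with length taken as right - left (an assumed convention).
double mean_island_length(const std::vector<Island>& islands) {
    if (islands.empty()) return 0.0;
    double total = 0.0;
    for (const Island& is : islands) total += static_cast<double>(is.right - is.left);
    return total / static_cast<double>(islands.size());
}

// Fraction of the genome covered by the union of the islands (overlaps counted once).
double covered_fraction(std::vector<Island> islands, long genome_length) {
    std::sort(islands.begin(), islands.end(),
              [](const Island& a, const Island& b) { return a.left < b.left; });
    long covered = 0;
    std::size_t i = 0;
    while (i < islands.size()) {
        const long left = islands[i].left;
        long right = islands[i].right;
        // Extend the current stretch while the next island overlaps it.
        while (++i < islands.size() && islands[i].left <= right) {
            right = std::max(right, islands[i].right);
        }
        covered += right - left;
    }
    return static_cast<double>(covered) / static_cast<double>(genome_length);
}
```

The pairwise scan over clones and anchors keeps the sketch short; it does not exploit the sorted order that the input files are required to have. Two islands may overlap along the genome without sharing an anchor, which is why the coverage function merges overlapping stretches before summing.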
Usage :
- When the clone and anchor locations are read from input files
- When the clone and anchor locations are simulated

When the clone and anchor locations are read from input files

Input files

The user provides the following 2 input files.

The clone locations file contains:
- the genome length,
- and for each clone, starting from the right-hand end of the genome, the location of the left-hand end of the clone (along the genome) followed by the location of its right-hand end.
Warning: this file has to be sorted by decreasing right-hand ends of clones. For that purpose, the shell script sort.clones is provided in the package; it creates this file from an unsorted one.

The anchor locations file contains:
- the genome length (the same as in the clone locations file),
- the anchor locations (along the genome), starting from the right-hand end of the genome.
Warning: this file has to be sorted by decreasing anchor locations. For that purpose, the shell script sort.anchors is provided in the package; it creates this file from an unsorted one.
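
The layout of the clone locations file described above can be read and checked with a short C++ sketch. It assumes whitespace-separated integers with no comments in the file. The CloneLocations type and both function names are hypothetical, and the sorting helper only mirrors what sort.clones is described as doing; it is not that script's implementation.

```cpp
#include <algorithm>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical container for this sketch: genome length plus (left, right) clone ends.
struct CloneLocations {
    long genome_length = 0;
    std::vector<std::pair<long, long>> clones;
};

// Reads the genome length, then one "left right" pair per clone, and rejects
// input that is not sorted by decreasing right-hand end.
CloneLocations read_clone_locations(std::istream& in) {
    CloneLocations data;
    if (!(in >> data.genome_length)) {
        throw std::runtime_error("missing genome length");
    }
    long left = 0, right = 0;
    while (in >> left >> right) {
        if (left > right) {
            throw std::runtime_error("clone left end exceeds right end");
        }
        if (!data.clones.empty() && right > data.clones.back().second) {
            throw std::runtime_error("clones are not sorted by decreasing right-hand end");
        }
        data.clones.emplace_back(left, right);
    }
    return data;
}

// Puts clones in the order the file requires: decreasing right-hand end.
void sort_by_decreasing_right_end(std::vector<std::pair<long, long>>& clones) {
    std::sort(clones.begin(), clones.end(),
              [](const std::pair<long, long>& a, const std::pair<long, long>& b) {
                  return a.second > b.second;
              });
}
```

The reader takes any std::istream, so it works with a std::ifstream opened on a file as well as with a std::istringstream in a test. An anchor locations file could be checked the same way, with one value per record instead of two.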
Usage

With a Unix command

island -c[lones] filename-for-clones -a[nchors] filename-for-anchors [-f[rench]]

The clone locations file and the anchor locations file are both input files; they are not modified.
The results are printed on the standard output.
When the -french option is chosen, all messages are in French; by default, they are in English.

From a program

It is possible to call island from a program or a host system, after compiling the C++ source files provided in the package and linking the resulting object files with the calling program or system.
There are two possibilities:
- to call the program that prints the results;
- to directly call the computation program; in that case, the results are returned through arguments.
In both cases, a global variable must be declared and initialized in the calling program:
int french=0;
The value 0 means that messages will be in English. For French messages, french has to be set to 1.

Results printed:
The declaration of the program to be called is the following one:
void main_lecture(char* fileclone, char * fileanchor);
The fileclone and fileanchor arguments respectively contain the pathname of the clone locations file and the pathname of the anchor locations file.

Results returned through arguments:
The declaration of the program to be called is the following one:
void commun(int SIZE, int M, int N, int inter, double Ginit, double Np, double max, double min, ifstream& ifileclone, ifstream& ifileancre, ofstream& ofile, double& NbMoy, double& LgMoy, double& OceanMoy, double& VarNbIle, double& VarLgIle, double& VarOcean);
Remark: The standard deviations calculated when island is called by the Unix command or by the program that prints the results are actually equal to sqrt(Variance/SIZE).
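
The remark above suggests that a caller of commun, which apparently receives variances (VarNbIle, VarLgIle, VarOcean) rather than the printed +/- values, can recover those values itself. A minimal sketch of that conversion follows. The function name reported_deviation is hypothetical, and reading SIZE as the number of simulations is an assumption that the documentation does not state explicitly.

```cpp
#include <cmath>
#include <stdexcept>

// Applies the formula from the remark: sqrt(Variance / SIZE).
// The argument checks are an addition of this sketch, not ISLAND behavior.
double reported_deviation(double variance, int size) {
    if (size <= 0) {
        throw std::invalid_argument("SIZE must be positive");
    }
    if (variance < 0.0) {
        throw std::invalid_argument("variance must be non-negative");
    }
    return std::sqrt(variance / static_cast<double>(size));
}
```

For instance, a variance of 400 over SIZE = 100 gives a reported value of 2.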
Example

First lines of the clone locations file, named CLONES (comments are not part of the file):
100000        <----- genome length in basepairs
99805 99920   <----- left-hand end followed by right-hand end locations
99749 99894   <----- of clones ....
99762 99877   <----- .... sorted according to decreasing right end

First lines of the anchor locations file, named ANCHORS (comments are not part of the file):
100000   <----- genome length in basepairs
99927    <----- anchor locations
99865    <----- ....
99563    <----- .... sorted by decreasing order

Unix command:
island -clones CLONES -anchors ANCHORS

What appears on the standard output:
This is a program to calculate some properties of the physical
map of a genome of length 100000 constructed by the anchoring approach.
Clone and anchor locations are respectively read in the files
CLONES and ANCHORS.
Number of clones taken into account: 2334
Number of anchors taken into account: 514
Here are the results:
One obtains 193 anchored islands,
with an average length of 519.47 bases
and covering 88.07 percent of the genome.

When the clone and anchor locations are simulated

Input parameters

A dialogue is established with the user to set the following input parameters:
- the number of simulations,
- the genome length (in basepairs),
- the mean number of anchors,
- the mean number of clones,
- the number of regions to consider along the genome,
- the mean length of long clones (in basepairs),
- the mean length of small clones (in basepairs),
- the variability allowed for the clone lengths (in basepairs).
Usage

With a Unix command

island [-d[etail] detail-file] [-f[rench]]

The -detail option writes to the associated file the value of the three quantities of interest obtained at each iteration.
When the -french option is chosen, all messages are in French; by default, they are in English.
Results are printed on the standard output.

From a program

As in the case where the clone and anchor locations are read from input files, it is possible to call island from a program or a host system: see paragraph A.3.2 ("From a program" for that case).
The declaration of the program to be called is the following one:
void main_simul(char* fic);
The fic argument contains the pathname of the file in which to store the three quantities of interest at each iteration. If these intermediate results are not to be stored, fic has to be set to NULL.
Remark: fic corresponds to the detail-file of the -detail option of the island command.
Example

Unix command:
island

What appears on the standard output:
This is a program to calculate some properties of the physical
map of a genome constructed by the anchoring approach from simulated data.
Type in the data required for these simulations:
How many simulations do you want to do (>0)? 100
What is the genome length (in basepairs)? 100000000
What is the mean number of anchors (>0)? 500
What is the mean number of clones (>0)? 2300
How many regions do you want to consider along the genome (>0)? 20
What is the mean length of long clones (in basepairs)? 350000
What is the mean length of small clones (in basepairs)? 150000
How much variability do you allow for the clone lengths (in basepairs)? 100000
Genome length (bp): 1e+08
Mean number of anchors: 500
Mean number of clones: 2300
Number of regions: 20
Mean length of long clones (bp): 350000
Mean length of small clones (bp): 150000
Variability of clone lengths (bp): 100000
RESULTS:
Mean number of anchored islands: 175.02 (+/- 0.7970)
Mean length of an anchored island (bp): 540575.02 (+/- 1934.8311)
Mean proportion of oceans: 0.16 (+/- 0.0018)