
Sorting the input files of island



The input files of island must be sorted: the clone locations file by decreasing right-hand end of the clones, and the anchor locations file by decreasing anchor location.

The shell scripts sort.clones and sort.anchors, included in the package, produce such files from unsorted ones.


The shell-script sort.clones


Function:

Sorts the lines of a file, from the second line onward, by decreasing value of the second column. The first line stays in place.

Syntax:
sort.clones < name-of-the-file-to-be-sorted > name-of-the-sorted-file

Usage restrictions:

The input file must have the following structure:
- the first line contains the genome length in basepairs,
- each of the following lines is composed of:

  1. the left-hand end location of a clone,
  2. one or more whitespace characters,
  3. the right-hand end location of the clone,
  4. optionally, a comment introduced by the character #, itself preceded by one or more whitespace characters.

The numbers must be written in "fixed point" notation (i.e. non-exponential).
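The structure required above can be checked before sorting. The following is a minimal sketch, not part of the island package: the function name check_clones_format is hypothetical, and it assumes that locations are unsigned numbers with an optional decimal part and that blank lines are not allowed.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with island): checks that standard input
# follows the clone locations layout described above.
# Assumptions: unsigned numbers, optional decimal part, no blank lines.
check_clones_format() {
    awk '
        BEGIN { bad = 0 }
        # Drop an optional comment (whitespace, then "#" up to end of line)
        # and any trailing whitespace.
        { sub(/[ \t]+#.*$/, ""); sub(/[ \t]+$/, "") }
        # Line 1: the genome length alone.
        NR == 1 {
            if ($0 !~ /^[ \t]*[0-9]+(\.[0-9]+)?$/) {
                print "line 1: expected the genome length"
                bad = 1
            }
            next
        }
        # Other lines: left-hand end, whitespace, right-hand end.
        # Exponential notation such as 3.35e2 does not match.
        $0 !~ /^[ \t]*[0-9]+(\.[0-9]+)?[ \t]+[0-9]+(\.[0-9]+)?$/ {
            printf "line %d: expected two fixed-point numbers\n", NR
            bad = 1
        }
        END { exit bad }
    '
}

# Usage: check_clones_format < CLONES && echo "format looks valid"
```

The function exits with status 0 when every line matches and 1 otherwise, printing one message per offending line.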

Example:

 sort.clones < CLONES > SORTED-CLONES

The input file, the clone locations file named CLONES, is sorted by increasing right-hand end of the clones, so it cannot be used as input to island. The output file, named SORTED-CLONES, is sorted by decreasing right-hand end and can be used as input to island.

First lines of the file named CLONES:
(The arrows and the text after them are annotations. In the file itself, only the comment introduced by the character # is present.)

100000          # genome length in basepairs
320 335         <----  left-hand end followed by right-hand end locations
350 365         <----     ....
425 440         <----     .... sorted according to increasing right end

First lines of the file named SORTED-CLONES:

100000          # genome length in basepairs
425 440         
350 365         
320 335         
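For readers who want to see what such a sort involves, the behaviour described in this section can be approximated with standard tools. This is a sketch only: the function name sort_clones_sketch is hypothetical, and the sort.clones script shipped in the package may be written differently. It relies on the numeric mode of sort, which does not interpret exponents, which is consistent with the fixed-point restriction stated above.

```shell
#!/bin/sh
# Hypothetical stand-in for sort.clones; the packaged script may differ.
sort_clones_sketch() {
    # Pass the first line (genome length) through unchanged.
    IFS= read -r header || return 1
    printf '%s\n' "$header"
    # Sort the remaining lines on column 2, numerically, largest first.
    sort -k2,2nr
}

# Usage, mirroring the syntax of sort.clones:
#   sort_clones_sketch < CLONES > SORTED-CLONES
```

Reading the header with the shell builtin read leaves the rest of standard input for sort, so the sketch works with redirected files and with pipes alike.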


The shell-script sort.anchors


Function:

Sorts the lines of a file, from the second line onward, by decreasing value of the first column. The first line stays in place.

Syntax:
sort.anchors < name-of-the-file-to-be-sorted > name-of-the-sorted-file

Usage restrictions:

The input file must have the following structure:
- the first line contains the genome length in basepairs,
- each of the following lines is composed of:

  1. the location of an anchor,
  2. optionally, a comment introduced by the character #, itself preceded by one or more whitespace characters.

The numbers must be written in "fixed point" notation (i.e. non-exponential).

Example:

 sort.anchors < ANCHORS > SORTED-ANCHORS

The input file, the anchor locations file named ANCHORS, is sorted by increasing anchor location, so it cannot be used as input to island. The output file, named SORTED-ANCHORS, is sorted by decreasing anchor location and can be used as input to island.

First lines of the file named ANCHORS:
(The arrows and the text after them are annotations. In the file itself, only the comment introduced by the character # is present.)

100000          # genome length in basepairs
   150          <---- locations of anchors
   294          <----    ....
   663          <----    .... sorted according to increasing values

First lines of the file named SORTED-ANCHORS:

100000          # genome length in basepairs
   663     
   294     
   150     
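The anchor case can be sketched in the same spirit, together with a check that a file really is in decreasing order before it is handed to island. Both function names below are hypothetical and neither is part of the package. The sort.anchors script itself may be implemented differently.

```shell
#!/bin/sh
# Hypothetical stand-in for sort.anchors; the packaged script may differ.
sort_anchors_sketch() {
    # Pass the first line (genome length) through unchanged.
    IFS= read -r header || return 1
    printf '%s\n' "$header"
    # Sort the remaining lines on column 1, numerically, largest first.
    sort -k1,1nr
}

# Hypothetical check: succeeds when column 1 never increases from line 2
# onward (equal neighbouring values are accepted).
anchors_are_decreasing() {
    awk '
        NR > 2 && ($1 + 0) > prev { exit 1 }
        NR > 1 { prev = $1 + 0 }
    '
}

# Usage:
#   sort_anchors_sketch < ANCHORS > SORTED-ANCHORS
#   anchors_are_decreasing < SORTED-ANCHORS && echo "ready for island"
```

The check skips the first line because it holds the genome length, not an anchor location.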



Last release: June 26, 1998