Sophie Schbath
INRA, Unité de Biométrie, F78352
Jouy-en-Josas Cedex, France
A complete physical map of the DNA of an organism consists of ordered overlapping clones spanning the entire genome. Such a map is a useful tool for further genetic analyses, such as locating genes or sequencing specific regions of the genome. A large number of clones are chosen at random from a library, and several approaches can be used to infer overlaps between clones. For our purpose, we have an additional library of random anchors, small DNA sequences that occur exactly once in the genome, and two clones are inferred to overlap if they contain a common anchor. The anchored clones are then linked into islands.
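The linking step can be illustrated with a minimal sketch. It assumes that clones are given as closed intervals `(start, end)` on the genome and anchors as point positions; the function name `build_islands` and this data layout are hypothetical choices made for illustration, not notation from the work summarized here. Clones that contain no anchor are left out, and the remaining clones are grouped with a union-find structure whenever they share an anchor.

```python
from bisect import bisect_left, bisect_right


def build_islands(clones, anchors):
    """Group anchored clones into islands.

    clones  : list of (start, end) closed intervals (illustrative layout)
    anchors : list of anchor positions, assumed distinct
    Returns a sorted list of islands, each a sorted list of clone indices.
    """
    anchors = sorted(anchors)
    parent = list(range(len(clones)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_clone_with_anchor = {}  # anchor index -> first clone seen containing it
    anchored = set()
    for idx, (start, end) in enumerate(clones):
        lo = bisect_left(anchors, start)   # first anchor >= start
        hi = bisect_right(anchors, end)    # one past the last anchor <= end
        for a in range(lo, hi):
            anchored.add(idx)
            if a in first_clone_with_anchor:
                # Shared anchor: merge the two clones into one island.
                parent[find(idx)] = find(first_clone_with_anchor[a])
            else:
                first_clone_with_anchor[a] = idx

    islands = {}
    for idx in sorted(anchored):
        islands.setdefault(find(idx), []).append(idx)
    return sorted(islands.values())


if __name__ == "__main__":
    clones = [(0, 10), (5, 15), (12, 20), (30, 40), (50, 60)]
    anchors = [7, 13, 35]
    print(build_islands(clones, anchors))
```

In this toy input the first three clones are chained through the anchors at 7 and 13, the fourth clone forms an island on its own, and the last clone contains no anchor and so belongs to no island.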
To plan a physical mapping project, it is important to study the distribution of islands with respect to the number of clones and anchors used. The main quantities of interest are the mean number of islands, the mean length of islands and the mean proportion of the genome covered by islands. In previous work, Arratia et al. (1991) gave such an analysis, modeling the clone locations and anchor locations by stationary processes. Because of the inhomogeneities involved in this problem, due for instance to cloning bias, we provide general results that make it possible to predict progress in such a mapping project when clones and anchors are not homogeneously distributed along the genome. An application of our results to two simple non-homogeneous models reveals that using homogeneous processes for clones and anchors provides an overly optimistic assessment of the progress of the mapping project.
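One of these quantities, the mean number of islands, can also be explored by simulation. The sketch below is only a Monte Carlo illustration under simplifying assumptions that are not taken from the work summarized here: clones have a fixed length, the numbers of clones and anchors are fixed rather than Poisson distributed, and the non-homogeneous case is mimicked by an arbitrary Beta(2, 5) density. The names `count_islands`, `mean_islands`, `uniform` and `biased` are hypothetical.

```python
import random
from bisect import bisect_left, bisect_right


def count_islands(clone_starts, clone_len, anchors):
    """Count islands of anchored clones (clones sharing an anchor are linked)."""
    anchors = sorted(anchors)
    spans = []
    for s in clone_starts:
        lo = bisect_left(anchors, s)                   # first anchor index in the clone
        hi = bisect_right(anchors, s + clone_len) - 1  # last anchor index in the clone
        if lo <= hi:                                   # keep anchored clones only
            spans.append((lo, hi))
    spans.sort()
    islands, reach = 0, -1
    for lo, hi in spans:
        if lo > reach:          # no anchor shared with the island built so far
            islands += 1
        reach = max(reach, hi)
    return islands


def mean_islands(n_clones, n_anchors, clone_len, genome_len,
                 draw_clone, draw_anchor, reps=200, seed=0):
    """Average island count over `reps` simulated maps.

    draw_clone / draw_anchor take a random.Random and return a value in [0, 1],
    which is rescaled to a position on the genome.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(reps):
        starts = [draw_clone(rng) * (genome_len - clone_len) for _ in range(n_clones)]
        anchors = [draw_anchor(rng) * genome_len for _ in range(n_anchors)]
        total += count_islands(starts, clone_len, anchors)
    return total / reps


def uniform(rng):
    # Homogeneous placement along the genome.
    return rng.random()


def biased(rng):
    # Illustrative non-homogeneous placement; an arbitrary choice for this sketch.
    return rng.betavariate(2.0, 5.0)


if __name__ == "__main__":
    args = dict(n_clones=300, n_anchors=300, clone_len=1.0, genome_len=100.0)
    print("homogeneous    :", mean_islands(draw_clone=uniform, draw_anchor=uniform, **args))
    print("non-homogeneous:", mean_islands(draw_clone=biased, draw_anchor=biased, **args))
```

Such a simulation can serve as a rough check on analytical formulas for a chosen pair of clone and anchor densities; it does not reproduce the two non-homogeneous models studied in this work.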
References
Arratia, R., Lander, E. S., Tavaré, S. and Waterman, M. S. (1991). Genomic mapping by anchoring random clones: A mathematical analysis. Genomics 11 806-827.