Sequencing a Genome by Walking with Clone-end Sequences: A Mathematical Analysis

Serafim Batzoglou, Bonnie Berger, Jill Mesirov, and Eric S. Lander

One approach to sequencing a large genome is (1) to sequence a
collection of nonoverlapping "seeds" chosen from a genomic library
of large-insert clones [such as bacterial artificial chromosomes
(BACs)] and then (2) to take successive "walking" steps by
selecting and sequencing minimally overlapping clones, using
information such as clone-end sequences to identify the
overlaps. In this paper we analyze the strategic issues involved
in using this approach. We derive formulas showing how two key
factors, the initial density of seed clones and the depth of the
genomic library used for walking, affect the cost and time of a
sequencing project, that is, the amount of redundant sequencing and
the number of steps needed to cover the vast majority of the genome. We
also discuss a variant strategy in which a second genomic library
with clones having a somewhat smaller insert size is used to close
gaps. This approach can dramatically decrease the amount of
redundant sequencing, without affecting the rate at which the
genome is covered.
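
The seed-and-walk strategy described above can be illustrated with a small Monte Carlo sketch. This is not the authors' analysis, which derives closed-form formulas; it is a toy simulation under simplifying assumptions that are ours: a linear genome, fixed-length clones with uniformly random start positions, perfect overlap detection, and one walking clone chosen per contig end per step. All names (simulate_seed_and_walk, merge, genome_len, clone_len, n_seeds, depth) are hypothetical.

```python
import random

def merge(intervals):
    """Merge overlapping or touching half-open intervals [a, b)."""
    merged = []
    for a, b in sorted(intervals):
        if merged and a <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    return merged

def simulate_seed_and_walk(genome_len, clone_len, n_seeds, depth, seed=0):
    """Toy model: returns (walking steps, fraction covered, redundancy)."""
    rng = random.Random(seed)
    # Assumption: a library of the given depth, clone starts uniform at random.
    n_clones = int(depth * genome_len / clone_len)
    library = [rng.randrange(genome_len - clone_len + 1) for _ in range(n_clones)]

    # (1) Choose nonoverlapping seed clones from the library in random order.
    contigs = []
    for s in rng.sample(library, len(library)):
        if len(contigs) == n_seeds:
            break
        if all(s + clone_len <= a or s >= b for a, b in contigs):
            contigs.append((s, s + clone_len))
    sequenced = len(contigs) * clone_len
    contigs = merge(contigs)

    # (2) Walk: at each contig end, pick the clone that overlaps the end
    # minimally, i.e. the one that extends furthest beyond it.
    steps = 0
    while True:
        picked = set()
        for a, b in contigs:
            right = [s for s in library if s < b < s + clone_len]
            left = [s for s in library if s < a < s + clone_len]
            if right:
                picked.add(max(right))  # largest start = smallest overlap
            if left:
                picked.add(min(left))   # smallest start = smallest overlap
        if not picked:
            break  # no clone straddles any contig end; walking has stalled
        steps += 1
        sequenced += len(picked) * clone_len
        contigs = merge(contigs + [(s, s + clone_len) for s in picked])

    covered = sum(b - a for a, b in contigs)
    # Redundancy: bases sequenced beyond those needed to cover the contigs once.
    return steps, covered / genome_len, sequenced / covered - 1.0

if __name__ == "__main__":
    for depth in (5, 10, 20):
        steps, cov, red = simulate_seed_and_walk(20000, 100, 20, depth)
        print(f"depth={depth}: steps={steps}, covered={cov:.3f}, redundancy={red:.3f}")
```

Varying n_seeds and depth in this sketch gives a rough feel for the two factors the abstract names, though the numbers it prints depend on the toy assumptions and should not be read as the paper's results.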

http://www.genome.org/cgi/content/full/9/12/1163