Chemistry: molecular biology and microbiology – Measuring or testing process involving enzymes or... – Involving nucleic acid
Reexamination Certificate
1996-07-19
2002-02-19
Marschel, Ardin H. (Department: 1631)
C436S094000
active
06348313
ABSTRACT:
This invention relates to new techniques for the sequencing of nucleic acids based upon a general approach in which labelled adaptor molecules are employed. The invention facilitates the large scale analysis of populations of nucleic acids, for example populations of sequences as produced in the Human Genome Project (HGP). Its applicability is, of course, not limited to HGP or its like.
Conventional analysis of nucleic acid sequences has hitherto depended largely on the base specific fragmentation of the original nucleic acid sample into two or more parts differing in size by one or more bases. Sequencing is effected by separation of the resultant fragments followed by their analysis.
In relatively low throughput sequence analysis of RNA, base specific fragmentation has been effected by ribonucleases with base specific activities, followed by thin layer chromatographic separation of the products. Higher throughput sequence analysis, especially of DNA, generates the fragments to be analyzed by base specific chemical cleavage (Maxam, A. M. and Gilbert, W., Proc. Natl. Acad. Sci. 74 p560 (1977)) or by terminating, in a base specific manner, synthesis catalysed by a suitable nucleic acid polymerase (Sanger, F., Nicklen, S. and Coulson, A. R., Proc. Natl. Acad. Sci. 74 p5463 (1977)). Separation of the resultant fragments is achieved by denaturing gel electrophoresis through ultra thin slabs or capillaries containing a suitable polymer such as polyacrylamide. This can resolve of the order of a thousand bases per suitably prepared sample at a resolution of one base, and can handle tens of samples simultaneously. Detection (Smith, L. M. and Youvan, D. C., Biotechnology 7 p576-580 (1989); Yang, M. M. and Youvan, D. C., Biotechnology 7 p576-580 (1989)) has been direct or indirect through radioactive, chemiluminescent or fluorescent labelling or by stable isotopes (Human Genome 1991-1992 Program Report p18 and p22, U.S. Department of Energy (1992)).
There is a great deal of interest in achieving greater rates of sequencing at reduced cost. It will then be feasible to analyze completely the genomes of organisms, in particular those of higher eucaryotes, which are commonly over 3,000,000,000 bases in size per haploid genome. Furthermore, methods which are suitable for such analysis will also make it possible to perform high resolution linkage analysis on many individuals in a population. This will be important for identifying the phenotypes, especially common diseases, associated with genes, and for tracing gene flow in humans. Analyzing the expressed sequences in a population of cDNAs or mRNAs would also become possible. It would also be possible to sequence the same region, or multiple regions, repeatedly from many different individuals for comparisons related to, for example, diagnosis.
Very high throughput methods of sequence analysis are therefore being investigated (desirably one or more orders of magnitude greater than achievable with current, conventional, commercially available sequencing apparatus, such as the ABI 373 DNA Sequencing System, which can read not more than 1000 bases a day from 72 samples). Scanning tunnelling electron microscopy can directly visualise the bases in individual molecules. Lasers might also be usable to sort individual molecules, which can then be analyzed by degrading them from one end, a base at a time (Harding, J. D. and Keller, R. A., Trends in Biotech 10 p55-58 (1992)).
However, there is a further problem when it is desired to conduct sequence analysis at a rate adequate for analyzing whole genomes or adequate for comparing many selected sequences from many individuals (for example, when using family studies to identify the locus of an inherited trait), namely many samples need to be simultaneously analyzed. This is currently being approached through sequencing by hybridisation.
There are two formats for sequencing analysis by hybridisation. One format (Drmanac, R., et al Genomics 4 p114-128 (1989) and Stretzoska, Z., et al Proc. Natl. Acad. Sci. USA 88 p10,089-10,093 (1991)) immobilises many samples (perhaps numbering hundreds of thousands) separately on a large array. The array is probed in turn by each of many different labelled oligonucleotides of known sequence. Identification of the samples which have hybridised to each of the probes indicates those which have sequences complementary to the probe. Use of multiple probes covering all possible sequences allows the complete sequences of the samples to be assembled. This method is, however, limited by the requirement for oligonucleotides of at least 5 bases to achieve specific hybridisation, which in turn dictates that large numbers of probes (4^n, where n is the length of the oligonucleotide) are required to cover all possible sequence combinations.
The alternative format (Fodor, S. et al Nature 364 p555-556 (1993), Kharpko, K. R., et al DNA Sequence 1 p375-388 (1991), Southern, E. M., Maskos, U., and Elder, J. K. Genomics 13 p1008-1017 (1992)) requires many thousands of different oligonucleotides, each with a different known sequence and together covering all possibilities, to be immobilised on a suitable array. Immobilisation is usually achieved through synthesis of the oligonucleotides in situ with masking, for example by a lithograph, of those not requiring the specific base being added at any given time. The nucleic acid sample whose sequence is to be analyzed is labelled and hybridised to the array. The positions of hybridisation indicate where sequence homologies are shared between the sample and the detected oligonucleotides. The sequence of the sample can therefore be deduced from those of the detected oligonucleotides.
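As an illustration of this second format, the following is a minimal sketch, not taken from the patent, of which immobilised oligonucleotides would be detected under idealised conditions. It assumes perfect-match hybridisation only, so that an oligonucleotide is detected exactly when it is the reverse complement of some stretch of the sample. The function names and the example sample are hypothetical.

```python
# Idealised model of probing an oligonucleotide array with a labelled sample.
# Assumption: only perfect Watson-Crick matches hybridise (no mismatches, no noise).

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the strand that base-pairs with seq, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def detected_oligos(sample: str, k: int) -> set[str]:
    """Array oligonucleotides of length k that pair with some stretch of the sample."""
    return {reverse_complement(sample[i:i + k]) for i in range(len(sample) - k + 1)}


def sample_spectrum(detected: set[str]) -> set[str]:
    """Stretches of the sample implied by the detected oligonucleotides."""
    return {reverse_complement(oligo) for oligo in detected}


hits = detected_oligos("ATGGC", 3)
print(sorted(hits))
print(sorted(sample_spectrum(hits)))
```

The set returned by `sample_spectrum` is the collection of short overlapping sequences from which the sample sequence would then have to be assembled.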
In either format for sequencing by hybridisation, it is difficult in practice to synthesise oligonucleotides of adequate length. When oligonucleotides are immobilised and probed with sample, in practice only short oligonucleotides can be synthesised on arrays of necessarily limited size.
Alternatively, and as mentioned above, when oligonucleotides are synthesised independently to probe an array of samples, the number required to cover all sequence possibilities is 4^n, where n is the length of each oligonucleotide. It is logistically challenging both to produce and to use the number required to accurately detect all possible sequences. For example, the number required to make all possible 5-mers is 1024.
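The growth in probe numbers with oligonucleotide length can be checked with a short calculation. The sketch below is illustrative only; the function names are hypothetical, and it simply counts and enumerates all sequences of length n over the four bases.

```python
from itertools import product

BASES = "ACGT"


def probe_count(n: int) -> int:
    """Number of distinct oligonucleotides of length n (four choices per position)."""
    return len(BASES) ** n


def all_probes(n: int) -> list[str]:
    """Enumerate every oligonucleotide of length n; practical only for small n."""
    return ["".join(p) for p in product(BASES, repeat=n)]


# Each added base multiplies the required number of probes by four.
for n in range(5, 11):
    print(n, probe_count(n))
```

For n = 5 this gives the 1024 probes quoted above; each further base multiplies the total by four.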
The length of the oligonucleotides determines their fidelity of hybridisation, and also the ease with which full sequence can be assembled from the component oligonucleotide sequences. In each case longer oligonucleotides are better. Greater fidelity of hybridisation is achieved the longer the oligonucleotides used since more stringent washing can be performed when the oligonucleotides are as long as possible. When full length sequence is being assembled from overlapping component sequences, the longer the component sequences the fewer possible “solutions” that there are likely to be.
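The effect of component length on the number of possible "solutions" can be illustrated with a small sketch. It is not the method of the patent: it assumes error-free data and, as a simplification, that the number of times each component sequence occurs is known (real hybridisation data would indicate only presence or absence). The function names and the example sequence are hypothetical.

```python
from collections import Counter


def kmers(seq: str, k: int) -> list[str]:
    """All overlapping component sequences of length k, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def reconstructions(spectrum: Counter, k: int) -> set[str]:
    """All sequences that use every k-mer in the spectrum exactly as often as given.

    Exhaustive search over overlaps of k-1 bases; requires k >= 2 and is
    suitable only for short examples.
    """
    total = sum(spectrum.values())
    results: set[str] = set()

    def extend(seq: str, remaining: Counter, used: int) -> None:
        if used == total:
            results.add(seq)
            return
        suffix = seq[-(k - 1):]
        for base in "ACGT":
            nxt = suffix + base
            if remaining[nxt] > 0:
                remaining[nxt] -= 1
                extend(seq + base, remaining, used + 1)
                remaining[nxt] += 1  # backtrack

    for start in list(spectrum):
        remaining = Counter(spectrum)
        remaining[start] -= 1
        extend(start, remaining, 1)
    return results


target = "GACGACTACC"
for k in (3, 4):
    solutions = reconstructions(Counter(kmers(target, k)), k)
    print(k, sorted(solutions))
```

With components of length 3 the example sequence cannot be distinguished from "GACTACGACC", whereas components of length 4 leave only the original sequence.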
A further problem associated with the sequencing by hybridisation format where probe oligonucleotides are immobilised is that, as the size of the target increases, the proportion of the target represented by any given region decreases. This reduces signal to noise, and therefore has the effect of limiting the size of target which can be analyzed.
Hybridisation used alone is in general not a good means of analyzing sequences, because not all oligonucleotides hybridise with equal efficiency or specificity under a given range of conditions. There are therefore associated interpretational and/or practical difficulties.
The possibility of enzymatic sequencing in situ on arrays of immobilised samples has also been reported (Rosenthal, A. and Brenner, S. 1993 Meeting on Genome Mapping and Sequencing page 222 Cold Spring Harbor Laboratory Press (1993)). Each base is labelled differently and added to the samples such that extension is terminated at a given base. The number and type of added bases is recorded
Medical Research Council
Millen White Zelano & Branigan P.C.