Feature extraction and normalization algorithms for...

Image analysis – Applications – Biomedical applications

Reexamination Certificate

Details

active

06571005

ABSTRACT:

TECHNICAL FIELD
The invention relates to the analysis of gene probe microarrays and, in particular, to the analysis of image data produced by such gene probe microarrays.
BACKGROUND
Monitoring gene expression using high-density microarrays is a technique used in the study of cell functions and their associated biochemical pathways, candidate gene identification, cellular response to drug compounds, and classification of disease states. For example, see:
Alon, U. et al. Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues by oligonucleotide arrays. Proc. Natl. Acad. Sci. USA 96, 6745-6750 (1999).
Zhu, H. et al. Cellular gene expression altered by human cytomegalovirus: global monitoring with oligonucleotide arrays. Proc. Natl. Acad. Sci. USA 95, 14470-14475 (1998).
Wodicka, L. et al. Genome-wide expression monitoring in Saccharomyces cerevisiae. Nature Biotechnology 15, 1359-1366 (1997).
Eisen, M. B. et al. Cluster analysis and display of genome-wide expression patterns. Proc. Natl. Acad. Sci. USA 95, 14863-14868 (1998).
Tamayo, P. et al. Interpreting patterns of gene expression with self-organizing maps: Methods and application to hematopoietic differentiation. Proc. Natl. Acad. Sci. USA 96, 2907-2912 (1999).
Golub, T. R. et al. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 286, 531-537 (1999).
It appears that recent research has largely focused on enhancing the microarray technology itself and the corresponding experimental protocols. For example, see:
Lockhart, D. J. et al. Expression monitoring by hybridization to high-density oligonucleotide arrays. Nature Biotechnology 14, 1675-1680 (1996).
Schena, M. et al. Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science 270, 467-470 (1995).
Shalon, D. et al. A DNA microarray system for analyzing complex DNA samples using two-color fluorescent probe hybridization. Genome Research 6, 639-645 (1996).
Mahadevappa, M. & Wodicka, L. A high-density probe array sample preparation method using 10- to 100-fold fewer cells. Nature Biotechnology 17, 1134-1136 (1999).
Other research has focused on developing higher-level analysis methods such as clustering and classification. For example, see:
Chen, Y. et al. Ratio-based decisions and the quantitative analysis of cDNA microarray images. Journal of Biomedical Optics 2, 364-374 (1997).
Chen et al. detailed algorithms for image segmentation and confidence intervals for expression ratios for cDNA microarray data.
The fundamentals of oligonucleotide expression array technology are described, for example, in the Lockhart paper cited above and are well-known in the art. The oligonucleotide expression array technology is broadly discussed here to provide a frame of reference for discussion of the invention. In particular, genes are represented on a probe array by some number of sequences of a particular length that uniquely identify the genes and that ostensibly have optimal hybridization characteristics. Each oligonucleotide (probe) is synthesized in a small cell that contains a large number (typically between 10^6 and 10^7) of copies of a given probe.
A mismatch (MM) oligonucleotide is designed to correspond to a perfect match (PM) oligonucleotide pulled from a gene sequence. In an MM oligonucleotide, typically the center base position of the oligonucleotide has been mutated. The MM probes give some estimate of the random hybridization and cross hybridization signals.
To use an oligonucleotide array, RNA samples are prepared and fluorescently labeled according to a particular protocol (e.g., the protocol set forth by Lockhart et al. in the article cited above), and then the labeled RNA sample is hybridized to the corresponding probes on the array. The array then goes through an automated staining/washing process (e.g., using an Affymetrix fluidics station), and the array is scanned using a confocal laser. The scanner generates an image of the array by exciting each cell with its laser, detects the resulting photon emissions from the fluorescently labeled RNA that has hybridized to the probes in the cell, and then converts the detected photon emissions into a raw intensity value for each cell. “Features” (composed of groups of cells) are “extracted” based on the images, and characteristic feature intensities are computed from the raw cell intensities. It can be determined from the features' “characteristic intensity” whether a particular gene is present in the array, and the quantity at which the gene is present.
Conventional feature extraction is now discussed in greater detail. For example, as discussed by Wodicka et al. (1997), the raw oligonucleotide array image has recognizable patterns at each corner that allow the positions of the corners of the array to be determined. The number of features in each row and column is known. Once the corners are determined, the positions of each feature in the array are computed.
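The grid computation described above can be sketched as follows. This is only a minimal illustration, not the procedure of Wodicka et al.: it assumes the four corner coordinates have already been located, that features are evenly spaced, and that bilinear interpolation between the corners is an adequate model of the grid. The function name feature_centers and the corner keys are hypothetical.

```python
def feature_centers(corners, n_rows, n_cols):
    """Estimate the pixel center of every feature from the four array corners.

    corners: dict with keys 'tl', 'tr', 'bl', 'br' (hypothetical format),
             each an (x, y) pixel coordinate of one corner of the array.
    Returns a list of n_rows rows, each a list of n_cols (x, y) centers.
    """
    tl, tr, bl, br = (corners[k] for k in ("tl", "tr", "bl", "br"))
    centers = []
    for r in range(n_rows):
        v = (r + 0.5) / n_rows          # fractional position down the array
        row = []
        for c in range(n_cols):
            u = (c + 0.5) / n_cols      # fractional position across the array
            # Bilinear blend of the four corners (assumed grid model).
            x = ((1 - u) * (1 - v) * tl[0] + u * (1 - v) * tr[0]
                 + (1 - u) * v * bl[0] + u * v * br[0])
            y = ((1 - u) * (1 - v) * tl[1] + u * (1 - v) * tr[1]
                 + (1 - u) * v * bl[1] + u * v * br[1])
            row.append((x, y))
        centers.append(row)
    return centers
```

Because every center is derived from the corners alone, any error in locating a corner propagates to the whole grid, which is the kind of misalignment the discussion of FIG. 1B is concerned with.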
As can be seen from FIG. 1A, the boundary pixels of a feature are typically distorted by blurring (i.e., their levels are “pulled” towards the level of a neighboring feature) and do not faithfully represent the true intensity of the feature. Therefore, the boundary pixels are conventionally removed before the characteristic feature intensity is computed. That is, the intensities of the boundary pixels of a feature are not considered in determining a characteristic intensity value for the feature. In most cases, after removing the boundary pixels from a feature, the feature is represented by a 6×6 block of pixels that remain.
Then, the characteristic intensity for the feature is determined, for example, by computing an average intensity of the remaining pixels. It can be seen from FIG. 1A that determining the median of the remaining 6×6 pixels often results in determining the median value from a more variable region than, say, the most homogeneous block of pixels (e.g., a 4×4 pixel block) within the 6×6 pixel block. This can result in a downward bias from the “true” characteristic feature intensity.
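The conventional computation, and the comparison with a more homogeneous sub-block, can be illustrated with a short sketch. It assumes an 8×8 feature with a one-pixel boundary (so that a 6×6 block remains) and uses the lowest-variance 4×4 window as one possible reading of “most homogeneous”. All function names are hypothetical, and the sketch is not presented as the method claimed by the invention.

```python
from statistics import mean, median, pvariance


def trim_boundary(feature, border=1):
    """Drop `border` pixels from every edge of a 2-D list of intensities."""
    return [row[border:len(row) - border]
            for row in feature[border:len(feature) - border]]


def characteristic_intensity(block, stat=median):
    """Summarize a pixel block with `stat` (median by default, or mean)."""
    return stat([p for row in block for p in row])


def most_homogeneous_block(block, size=4):
    """Return the size x size window with the lowest intensity variance.

    Lowest variance is an assumed definition of 'most homogeneous'.
    """
    best_var, best_sub = None, None
    for r in range(len(block) - size + 1):
        for c in range(len(block[0]) - size + 1):
            sub = [row[c:c + size] for row in block[r:r + size]]
            var = pvariance([p for row in sub for p in row])
            if best_var is None or var < best_var:
                best_var, best_sub = var, sub
    return best_sub
```

For a feature whose blurred edge extends one pixel into the retained 6×6 block, characteristic_intensity over the whole block is pulled down by the blurred pixels, while the value taken from most_homogeneous_block is not.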
Furthermore, FIG. 1B illustrates how a misalignment of the basic grid can result in a failure to extract the central part of the true feature.
What is desired is a feature extraction method that more robustly and reliably extracts the “useful” portion of a true feature for determining characteristic feature intensity.
Furthermore, it is well known that the comparison of gene expression results across experiments is enhanced when the results of the experiments are normalized to a single scale. Normalizing multiple probe arrays to allow direct array-to-array comparisons has presented a great challenge. Conventional normalization methods include 1) linear normalization and nonlinear regression, and 2) methods using housekeeping genes or staggered spike-in controls.
With linear normalization, it is assumed that the intensities between two or more arrays are related as a straight line with a zero y-intercept. Its use leads to multiplication by a scaling factor (the slope of the line) to make the mean of the “experiment” chip the same as that of the baseline chip. A description of this technique applied to Affymetrix probe arrays is given by Alon et al. (1999). For example, see page 6746, lines 2-4, which state:
“To compensate for possible variations between arrays, the intensity of each EST on an array was divided by the mean intensities of all ESTs on that array and multiplied by a nominal average intensity of 50.”
Ignoring the slight differences of the number of retained probe pairs per gene (due to outlier probe removal), the essential effect of these operations is equivalent to the multiplication of each probe pair difference by a constant scaling factor.
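The linear normalization described above amounts to a single multiplicative factor per array, as the following minimal sketch shows. The first function follows the operation quoted from Alon et al. (divide by the array mean, multiply by a nominal 50); the second scales an “experiment” array so that its mean matches a baseline array. The function names are hypothetical, and plain lists of intensities stand in for real array data.

```python
def scale_to_nominal(intensities, nominal=50.0):
    """Divide each intensity by the array mean, then multiply by `nominal`."""
    array_mean = sum(intensities) / len(intensities)
    if array_mean == 0:
        raise ValueError("array mean is zero; cannot normalize")
    return [x / array_mean * nominal for x in intensities]


def scale_to_baseline(experiment, baseline):
    """Scale `experiment` so that its mean equals the mean of `baseline`.

    Returns (factor, scaled_values); the factor is the slope of the assumed
    straight line through the origin relating the two arrays.
    """
    exp_mean = sum(experiment) / len(experiment)
    base_mean = sum(baseline) / len(baseline)
    if exp_mean == 0:
        raise ValueError("experiment mean is zero; cannot normalize")
    factor = base_mean / exp_mean
    return factor, [x * factor for x in experiment]
```

Either way, every value on an array is multiplied by the same constant, so the method cannot correct an offset or a nonlinear relationship between arrays.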
Chen et al. (1997) describe an application of the linear normalization technique to cDNA spotted arrays, where one intensity channel is normalized against another on the same array. For example, on page 371, formulae (12) & (13) represent a linear
