Reexamination Certificate
2000-04-06
2003-11-11
Woodward, Michael P. (Department: 1631)
Data processing: measuring, calibrating, or testing
Measurement system in a specific environment
Biological or biochemical
C702S020000, C435S006120
active
06647341
ABSTRACT:
BACKGROUND OF THE INVENTION
Classification of biological samples from individuals is not an exact science. In many instances, accurate diagnosis and safe and effective treatment of a disorder depend on being able to discern biological distinctions among morphologically similar samples, such as tumor samples. The classification of a sample from an individual into particular disease classes has typically been difficult and often incorrect or inconclusive. Using traditional methods, such as histochemical analyses, immunophenotyping and cytogenetic analyses, often only one or two characteristics of the sample are analyzed to determine the sample's classification, resulting in inconsistent and sometimes inaccurate results. Such results can lead to incorrect diagnoses and potentially ineffective or harmful treatment.
For example, acute leukemia was first successfully treated by Farber and colleagues in the 1940s, and it was recognized that treatment responses were variable (Farber, et al., NEJM 238:787-793 (1948)). Subtle differences in nuclear shape and granularity were suggestive of distinct subtypes of acute leukemia, but such morphological distinctions were difficult to reproduce (C. E. Forkner, Leukemia and Allied Disorders, (New York, Macmillan) (1938); E. Frei et al., Blood 18:431-54 (1961); Medical Research Council, Br Med J 1:7-14 (1963)). By the 1960s, these distinctions were further strengthened by enzyme-based histochemical analyses which demonstrated that some leukemias were periodic acid-Schiff (PAS) positive, whereas others were myeloperoxidase positive. This was the basis of the first attempts to classify the acute leukemias into those arising from lymphoid precursors (acute lymphoblastic leukemia, ALL) and those arising from myeloid precursors (acute myeloid leukemia, AML). This classification was further solidified by the development in the 1970s of antibodies recognizing either lymphoid or myeloid cell surface molecules. Most recently, particular subtypes of acute leukemia have been found to be associated with specific chromosomal translocations; for example, the t(12;21)(p13;q22) translocation occurs in 25% of patients with ALL, whereas the t(8;21)(q22;q22) occurs in 15% of patients with AML.
No single test is currently sufficient to establish the diagnosis of AML vs. ALL. Rather, current clinical practice involves an experienced hematopathologist's interpretation of the tumor's morphology, histochemistry, immunophenotyping and cytogenetic analysis, each of which is performed in a separate, highly specialized laboratory. Correct distinction of ALL from AML is critical for successful treatment: chemotherapy regimens for ALL generally contain corticosteroids, vincristine, methotrexate, and L-asparaginase, whereas most AML regimens rely on a backbone of daunorubicin and cytarabine. While remissions can be achieved using ALL therapy for AML (and vice versa), cure rates are markedly diminished, and unwarranted toxicities are encountered. Thus, the ability to accurately classify a biological sample as an AML sample or an ALL sample is quite important.
Furthermore, important biological distinctions are likely to exist which have yet to be identified due to the lack of systematic and unbiased approaches for identifying or recognizing such classes. Thus, a need exists for an accurate and efficient method for identifying biological classes and classifying samples.
SUMMARY OF THE INVENTION
The present invention relates to a method of identifying a set of informative genes whose expression correlates with a class distinction between samples, comprising sorting genes by the degree to which their expression in the samples correlates with a class distinction, and determining whether said correlation is stronger than expected by chance. A gene whose expression correlates with a class distinction more strongly than expected by chance is an informative gene. A set of informative genes is identified. In one embodiment, the class distinction is a known class, and in one embodiment the class distinction is a disease class distinction. In particular, the disease class distinction can be a cancer class distinction, such as a leukemia class distinction (e.g., Acute Lymphoblastic Leukemia (ALL) or Acute Myeloid Leukemia (AML)). In another embodiment, the class distinction is a brain tumor class distinction (e.g., medulloblastoma or glioblastoma). In a further embodiment, the class distinction is a lymphoma class distinction, such as a Non-Hodgkin's lymphoma class distinction (e.g., follicular lymphoma (FL) or diffuse large B cell lymphoma (DLBCL)). The known class can also be a class of individuals who respond well to chemotherapy or a class of individuals who do not respond well to chemotherapy.
Sorting genes by the degree to which their expression in the sample correlates with a class distinction can be carried out by neighborhood analysis (e.g., a signal to noise routine, a Pearson correlation routine, or a Euclidean distance routine) that comprises defining an idealized expression pattern corresponding to a gene, wherein said idealized expression pattern is expression of said gene that is uniformly high in a first class and uniformly low in a second class; and determining whether there is a high density of genes having an expression pattern similar to the idealized expression pattern, as compared to an equivalent random expression pattern. The signal to noise routine is:
$$P(g,c) = \frac{\mu_1(g) - \mu_2(g)}{\sigma_1(g) + \sigma_2(g)},$$
wherein g is the gene expression value; c is the class distinction; $\mu_1(g)$ is the mean of the expression levels for g for the first class; $\mu_2(g)$ is the mean of the expression levels for g for the second class; $\sigma_1(g)$ is the standard deviation for the first class; and $\sigma_2(g)$ is the standard deviation for the second class.
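As an illustration only, the signal to noise routine and the comparison against random expression patterns might be sketched in Python as follows. The function names, the dictionary layout of the expression data, the use of the population standard deviation, and the choice to model "an equivalent random expression pattern" by shuffling the class labels are all assumptions made for this sketch; the text above does not specify them.

```python
import random
from statistics import mean, pstdev


def signal_to_noise(first, second):
    """P(g,c) for one gene: (mu1 - mu2) / (sigma1 + sigma2)."""
    # Assumption: population standard deviation; the text does not say which.
    spread = pstdev(first) + pstdev(second)
    if spread == 0:
        raise ValueError("zero spread in both classes; P(g,c) is undefined")
    return (mean(first) - mean(second)) / spread


def scores(expression, labels):
    """Map each gene to P(g,c); labels[i] is 1 (first class) or 2 (second)."""
    result = {}
    for gene, values in expression.items():
        first = [v for v, c in zip(values, labels) if c == 1]
        second = [v for v, c in zip(values, labels) if c == 2]
        result[gene] = signal_to_noise(first, second)
    return result


def neighborhood_size(expression, labels, cutoff):
    """Count genes whose P(g,c) is at least `cutoff`."""
    return sum(1 for p in scores(expression, labels).values() if p >= cutoff)


def permuted_sizes(expression, labels, cutoff, n_perm=200, seed=0):
    """Neighborhood sizes under shuffled labels (hypothetical chance baseline)."""
    rng = random.Random(seed)
    sizes = []
    for _ in range(n_perm):
        shuffled = list(labels)
        rng.shuffle(shuffled)
        sizes.append(neighborhood_size(expression, shuffled, cutoff))
    return sizes


if __name__ == "__main__":
    # Made-up toy values, not data from the patent.
    labels = [1, 1, 1, 2, 2, 2]
    expression = {
        "geneA": [9.0, 10.0, 11.0, 1.0, 2.0, 3.0],
        "geneB": [5.0, 1.0, 3.0, 4.0, 2.0, 6.0],
    }
    observed = neighborhood_size(expression, labels, cutoff=2.0)
    baseline = permuted_sizes(expression, labels, cutoff=2.0)
    print("observed:", observed, "mean under shuffling:", mean(baseline))
```

In this sketch, an observed neighborhood that is clearly larger than the sizes obtained under shuffled labels would suggest a higher density of correlated genes than chance alone produces.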
Another aspect of the present invention is a method of assigning a sample to a known or putative class, comprising determining a weighted vote of one or more informative genes (e.g., greater than 50, 100, 150) for one of the classes in the sample in accordance with a model built with a weighted voting scheme, wherein the magnitude of each vote depends on the expression level of the gene in the sample and on the degree of correlation of the gene's expression with class distinction; and summing the votes to determine the winning class. The weighted voting scheme is:
$$V_g = a_g (x_g - b_g),$$
wherein $V_g$ is the weighted vote of the gene, g; $a_g$ is the correlation between gene expression values and class distinction, P(g,c), as defined herein; $b_g = (\mu_1(g) + \mu_2(g))/2$, which is the average of the mean $\log_{10}$ expression value in a first class and a second class; $x_g$ is the $\log_{10}$ gene expression value in the sample to be tested; and wherein a positive V value indicates a vote for the first class, and a negative V value indicates a vote for the second class. A prediction strength can also be determined, wherein the sample is assigned to the winning class if the prediction strength is greater than a particular threshold, e.g., 0.3. The prediction strength is determined by:
$$\frac{V_{\mathrm{win}} - V_{\mathrm{lose}}}{V_{\mathrm{win}} + V_{\mathrm{lose}}},$$
wherein $V_{\mathrm{win}}$ and $V_{\mathrm{lose}}$ are the vote totals for the winning and losing classes, respectively. When classifying a sample into an ALL disease class or an AML disease class, the informative genes can be, for example, C-myb, Proteasome iota, MB-1, Cyclin, Myosin light chain, Rb Ap48, SNF2, HkrT-1, E2A, Inducible protein, Dynein light chain, Topoisomerase IIβ, IRF2, TFIIEβ, Acyl-Coenzyme A, dehydrogenase, SNF2, ATPase, SRP9, MCM3, Deoxyhypusine synthase, Op 18, Rabaptin-5, Heterochromatin protein p25, IL-7 receptor, Adenosine deaminase, Fumarylacetoacetate, Zyxin, LTC4 synthase, LYN, HoxA9, CD33, Adipsin, Leptin receptor, Cystatin C, Proteoglycan 1, IL-8 precursor, Azurocidin, p62, CyP3, MCL1, ATPase, IL-8, Cathepsin D, Lectin, MAD-3, CD11c, Ebp72, Lysozyme, Properdin and/or Catalase.
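A minimal Python sketch of the weighted voting scheme and the prediction strength is given here for illustration. It assumes that each class's vote total is the sum of the magnitudes of the votes cast for that class, that the population standard deviation is used for P(g,c), and that a model is stored as a mapping from gene name to the pair (a_g, b_g); the names `gene_parameters` and `classify` are hypothetical and the numbers in the usage example are made up.

```python
from statistics import mean, pstdev


def gene_parameters(first, second):
    """Return (a_g, b_g) for one gene from log10 expression values per class."""
    mu1, mu2 = mean(first), mean(second)
    # a_g = P(g,c); assumption: population standard deviation.
    a = (mu1 - mu2) / (pstdev(first) + pstdev(second))
    # b_g = midpoint of the two class means.
    b = (mu1 + mu2) / 2
    return a, b


def classify(sample, model, threshold=0.3):
    """Sum weighted votes and return (call, prediction_strength).

    sample: gene -> x_g (log10 expression in the sample to be tested)
    model:  gene -> (a_g, b_g)
    call is "first", "second", or None when the strength is not above threshold.
    """
    v_first = 0.0
    v_second = 0.0
    for gene, (a, b) in model.items():
        vote = a * (sample[gene] - b)  # V_g = a_g (x_g - b_g)
        if vote > 0:
            v_first += vote            # positive vote: first class
        else:
            v_second += -vote          # negative vote: second class (magnitude)
    v_win = max(v_first, v_second)
    v_lose = min(v_first, v_second)
    total = v_win + v_lose
    strength = (v_win - v_lose) / total if total else 0.0
    winner = "first" if v_first >= v_second else "second"
    return (winner if strength > threshold else None), strength


if __name__ == "__main__":
    # Made-up model with two genes, not values from the patent.
    model = {"g1": (2.0, 1.0), "g2": (-1.0, 2.0)}
    print(classify({"g1": 2.0, "g2": 3.0}, model))
```

Returning `None` when the prediction strength does not exceed the threshold is one way to represent a sample that is left unassigned; the threshold of 0.3 is the example value given in the text.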
The invention also encom
Golub Todd R.
Lander Eric S.
Mesirov Jill
Slonim Donna
Tamayo Pablo
Clow Lori A.
Hamilton Brook Smith & Reynolds P.C.
Whitehead Institute for Biomedical Research
Woodward Michael P.