Methods, software and apparati for identifying genomic...

Chemistry: molecular biology and microbiology – Measuring or testing process involving enzymes or... – Involving nucleic acid

Reexamination Certificate

C702S027000

active

06291182

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates to methods and apparati using nucleic acid markers having a statistical association with a detectable trait to identify one or more genes responsible for the trait or for a predisposition for expressing the trait.
BACKGROUND OF THE INVENTION
Recent advances in genetic engineering and bioinformatics have enabled the manipulation and characterization of large portions of the human genome. While efforts to obtain the full sequence of the human genome are rapidly progressing, there are many practical uses for genetic information which can be implemented with partial knowledge of the sequence of the human genome.
As the full sequence of the human genome is assembled, the partial sequence information available can be used to identify genes responsible for detectable human traits, such as genes associated with human diseases, and to develop diagnostic tests capable of identifying individuals who express a detectable trait as the result of a specific genotype or individuals whose genotype places them at risk of developing a detectable trait at a subsequent time. Each of these applications for partial genomic sequence information is based upon the assembly of genetic and physical maps which order the known genomic sequences along the human chromosomes.
SUMMARY OF THE INVENTION
The present invention relates to methods and apparati for identifying one or more genes associated with a detectable phenotype. As described in more detail below, the present invention involves the use of biallelic markers, which are polymorphic nucleic acid sequences which differ from one another at a single nucleotide. The allelic frequencies of the biallelic markers are compared in nucleic acid samples derived from individuals expressing the detectable trait and individuals who do not express the detectable trait. In this manner, candidate genomic regions suspected of harboring a gene associated with the detectable trait under investigation are identified.
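The allele-frequency comparison described above can be sketched as follows. This is a minimal illustration, not the procedure of the invention: the function name, the genotype-count inputs, and the numbers are all hypothetical.

```python
def allele_frequency(n_hom_ref, n_het, n_hom_alt):
    """Frequency of the alternate allele of one biallelic marker.

    Inputs are counts of individuals carrying 0, 1 or 2 copies of the
    alternate allele (hypothetical encoding chosen for this sketch).
    """
    n_individuals = n_hom_ref + n_het + n_hom_alt
    if n_individuals == 0:
        raise ValueError("no genotyped individuals")
    # Each individual carries two chromosomes, hence the factor of 2.
    return (n_het + 2 * n_hom_alt) / (2 * n_individuals)


# Made-up genotype counts for one marker.
case_freq = allele_frequency(40, 45, 15)     # individuals expressing the trait
control_freq = allele_frequency(60, 35, 5)   # individuals not expressing it
difference = case_freq - control_freq
```

A large difference between the two frequencies is only a starting point; whether it is significant is a separate statistical question.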
The existence of one or more genes associated with the detectable trait within the candidate region is confirmed by identifying more biallelic markers lying in the candidate region. A first haplotype analysis is performed for each possible combination of groups of biallelic markers within the genomic region suspected of harboring a trait-associated gene. For example, each group may comprise three biallelic markers. For each of the groups of markers, the frequency of each possible haplotype (for groups of three markers there are 8 possible haplotypes) in individuals expressing the trait and individuals who do not express the trait is estimated. For example, the haplotype frequencies may be estimated using the Expectation-Maximization method of Excoffier L and Slatkin M, Mol. Biol. Evol. 12:921-927 (1995), the disclosure of which is incorporated herein by reference and which is described in more detail below. In some embodiments, the Expectation-Maximization method may be performed using the EM-HAPLO program (Hawley M E, Pakstis A J & Kidd K K, Am. J. Phys. Anthropol. 18:104 (1994), the disclosure of which is incorporated herein by reference). Alternatively, the frequency of each allele of individual biallelic markers may be determined in nucleic acid samples from individuals who express the trait under investigation and control individuals who do not express the trait.
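As a rough illustration of the estimation step, the sketch below runs a basic Expectation-Maximization loop over unphased genotypes. It is not the EM-HAPLO program and may differ in detail from the cited method. The genotype encoding (0, 1 or 2 copies of one allele per marker), the uniform starting frequencies, the random-mating weighting of haplotype pairs, and all function names are assumptions made for this example.

```python
from itertools import product


def compatible_pairs(genotype):
    """All ordered haplotype pairs whose allele counts add up to the genotype."""
    per_site = []
    for g in genotype:
        if g == 0:
            per_site.append([(0, 0)])
        elif g == 2:
            per_site.append([(1, 1)])
        else:  # heterozygous site: phase is unknown, keep both orientations
            per_site.append([(0, 1), (1, 0)])
    pairs = []
    for combo in product(*per_site):
        h1 = tuple(a for a, _ in combo)
        h2 = tuple(b for _, b in combo)
        pairs.append((h1, h2))
    return pairs


def em_haplotype_frequencies(genotypes, n_iter=200, tol=1e-10):
    """Estimate haplotype frequencies from unphased genotypes by EM."""
    n_markers = len(genotypes[0])
    # For three biallelic markers this enumerates the 2**3 = 8 haplotypes.
    haplotypes = list(product((0, 1), repeat=n_markers))
    freq = {h: 1.0 / len(haplotypes) for h in haplotypes}
    for _ in range(n_iter):
        counts = {h: 0.0 for h in haplotypes}
        for g in genotypes:
            pairs = compatible_pairs(g)
            # E-step: weight each phase by the product of current frequencies.
            weights = [freq[h1] * freq[h2] for h1, h2 in pairs]
            total = sum(weights)
            for (h1, h2), w in zip(pairs, weights):
                counts[h1] += w / total
                counts[h2] += w / total
        # M-step: expected haplotype counts divided by the chromosome count.
        new_freq = {h: c / (2 * len(genotypes)) for h, c in counts.items()}
        delta = max(abs(new_freq[h] - freq[h]) for h in haplotypes)
        freq = new_freq
        if delta < tol:
            break
    return freq


# Toy data: three individuals homozygous for haplotype 000, one for 111.
toy_freq = em_haplotype_frequencies([(0, 0, 0)] * 3 + [(2, 2, 2)])
```

In practice the estimation would be run separately on the trait-expressing and non-expressing samples, giving one frequency table per group of markers and per population.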
The frequencies of each of the possible haplotypes of the grouped markers (or each allele of individual markers) in individuals expressing the trait and individuals who do not express the trait are compared. For example, the frequencies may be compared by performing a chi-squared analysis. Within each group, the haplotype (or the allele of each individual marker) having the greatest association with the trait is selected. This process is repeated for each group of biallelic markers (or each allele of the individual markers) to generate a distribution of association values, which will be referred to herein as the “candidate region” distribution.
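The comparison and selection step for a single group of markers might look like the sketch below. The text only says a chi-squared analysis may be used; testing each haplotype against all other haplotypes pooled in a 2x2 table, without continuity correction, is an assumption of this example, as are the function names and the counts.

```python
from math import erfc, sqrt


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared statistic for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # a row or column is empty: no evidence either way
    return n * (a * d - b * c) ** 2 / denom


def chi2_p_value_1df(stat):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return erfc(sqrt(stat / 2.0))


def best_association(case_counts, control_counts):
    """Return (haplotype, chi2) for the haplotype most associated with the trait.

    Both arguments map haplotype labels to (possibly estimated) chromosome
    counts within one group of markers.
    """
    n_case = sum(case_counts.values())
    n_ctrl = sum(control_counts.values())
    best = (None, -1.0)
    for hap in sorted(set(case_counts) | set(control_counts)):
        a = case_counts.get(hap, 0)
        c = control_counts.get(hap, 0)
        stat = chi2_2x2(a, n_case - a, c, n_ctrl - c)
        if stat > best[1]:
            best = (hap, stat)
    return best


# Made-up chromosome counts for one group of three markers.
cases = {"ACG": 60, "ATG": 30, "GCA": 10}
controls = {"ACG": 30, "ATG": 40, "GCA": 30}
top_haplotype, top_chi2 = best_association(cases, controls)
```

Collecting one such maximum per group of markers yields the distribution of association values referred to in the text.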
A second haplotype analysis is performed for each possible combination of groups of biallelic markers within random genomic regions. For example, each group may comprise three biallelic markers. For each of the groups of markers, the frequency of each possible haplotype (for groups of three markers there are 8 possible haplotypes) in individuals expressing the trait and individuals who do not express the trait is estimated. For example, the haplotype frequencies may be estimated using the Expectation-Maximization method of Excoffier L and Slatkin M, as described above. In some embodiments, the Expectation-Maximization method may be performed using the EM-HAPLO program as described above. Alternatively, the frequency of each allele of individual biallelic markers may be determined in nucleic acid samples from individuals who express the trait under investigation and control individuals who do not express the trait.
The frequencies of each of the possible haplotypes of the grouped markers (or each allele of individual markers) in individuals expressing the trait and individuals who do not express the trait are compared. For example, the frequencies may be compared by performing a chi-squared analysis. Within each group, the haplotype (or the allele of each individual marker) having the greatest association with the trait is selected. This process is repeated for each group of biallelic markers (or each allele of the individual markers) to generate a distribution of association values, which will be referred to herein as the “random region” distribution.
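Building a distribution of association values for a region can be sketched as a loop over marker groups. The example assumes that "each possible combination" means every subset of three markers in the region. The scoring function is a deliberate placeholder standing in for the haplotype comparison, and the marker names and values are invented.

```python
from itertools import combinations


def association_distribution(markers, score_group, group_size=3):
    """Score every group of `group_size` markers in a region.

    `score_group` is any callable that returns the strongest association
    value found for one group of markers (hypothetical interface).
    """
    return [score_group(group) for group in combinations(markers, group_size)]


# Placeholder per-marker values; a real scorer would compare haplotype
# frequencies between trait-expressing and non-expressing individuals.
toy_values = {"m1": 0.4, "m2": 2.1, "m3": 0.9, "m4": 1.3}


def toy_score(group):
    return max(toy_values[m] for m in group)


random_region_distribution = association_distribution(list(toy_values), toy_score)
```

The same function could serve for the candidate region and for random regions, so that the two distributions are built in an identical way from the same individuals.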
The “candidate region” distribution and the “random region” distribution are then compared to one another to determine whether there are significant differences between them. For example, the candidate region distribution and the random region distribution can be compared using the Wilcoxon rank test (Noether, G. E. (1991) “Introduction to Statistics: The Nonparametric Way”, Springer-Verlag, New York, Berlin, the disclosure of which is incorporated herein by reference), the Kolmogorov-Smirnov test (Saporta, G. (1990) “Probabilités, analyse des données et statistique”, Editions Technip, Paris, the disclosure of which is incorporated herein by reference), or both tests.
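A minimal sketch of the two comparisons, using only the Python standard library, is given here. The rank test uses a normal approximation without tie correction, and the Kolmogorov-Smirnov function returns only the statistic, not a p-value; both simplifications, and the function names, are assumptions of this example rather than details taken from the cited references.

```python
from bisect import bisect_right
from math import erfc, sqrt


def midranks(values):
    """1-based ranks of the values, with ties given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Wilcoxon rank-sum test; returns (U, z, two-sided p) by normal approximation."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    w = sum(ranks[:n1])                  # rank sum of the first sample
    u = w - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # no tie correction
    z = (u - mean_u) / sd_u
    return u, z, erfc(abs(z) / sqrt(2))


def ks_statistic(x, y):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    xs, ys = sorted(x), sorted(y)
    d = 0.0
    for v in xs + ys:
        fx = bisect_right(xs, v) / len(xs)
        fy = bisect_right(ys, v) / len(ys)
        d = max(d, abs(fx - fy))
    return d


# Made-up association values for a candidate region and for random regions.
candidate = [18.2, 12.5, 9.7, 15.1, 11.3]
random_regions = [2.2, 3.9, 1.1, 4.6, 0.8]
u_stat, z_score, p_value = rank_sum_test(candidate, random_regions)
d_stat = ks_statistic(candidate, random_regions)
```

For real analyses an established statistics package would normally be preferred, since it handles ties and exact small-sample distributions.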
If the candidate region distribution and the random region distribution are found to be significantly different, the candidate genomic region is highly likely to contain a gene associated with the detectable trait. Accordingly, the candidate genomic region is evaluated more fully to isolate the trait-associated gene. Alternatively, if the candidate region distribution and the random region distribution are not found to be significantly different using the above analyses, the candidate genomic region is unlikely to contain a gene associated with the detectable trait. Accordingly, no further analysis of the candidate genomic region is performed.
The present invention solves the need for empirical assessments of the statistical significance of the association of biallelic markers with detectable traits. The present invention considers the trait being investigated as well as the populations of individuals utilized to determine the significance of the association. In particular, the present invention allows the reference points (i.e. the controls) for evaluating significance to be derived from the same populations as those used to detect the association between the biallelic markers and the trait. In addition, in some embodiments, the present invention allows all the data available for candidate genomic regions suspected of harboring a gene associated with a detectable trait to be utilized in the determination of whether the candidate region does in fact harbor such a gene. Accordingly, the present invention avoids the risk of failin
