Reexamination Certificate
2001-12-21
2004-09-14
Martinell, James (Department: 1631)
Data processing: measuring, calibrating, or testing
Measurement system in a specific environment
Biological or biochemical
C702S020000
active
06792355
ABSTRACT:
BACKGROUND OF THE INVENTION
The present invention relates generally to classifying and identifying polypeptides having similar structure or function based on comparative amino acid sequence analysis and more specifically to determining structure-related properties of a ligand when bound to a polypeptide of known amino acid sequence.
Structure determination plays a central role in chemistry and biology due to the correlation between the structure of a molecule and its function. In particular, a three-dimensional model of a therapeutic target polypeptide can be of valuable assistance in the design or discovery of therapeutic drugs. The structure of a ligand bound to a polypeptide as observed in a three-dimensional model can be used as a template for identifying structural properties to be incorporated into candidate drugs. Alternatively, using computer-assisted methods, a candidate drug can be identified based on structural properties that allow docking to a binding site in the three-dimensional model of the target polypeptide, much as a key fits a lock. By structure-based methods such as these, lead compounds can be identified for further development.
Although methods for structure determination are evolving, it is currently difficult, costly and time-consuming to empirically determine the three-dimensional structure of a polypeptide. In general, determining such structures for polypeptides complexed with ligands is even more difficult. One approach to circumventing this difficulty is theoretical modeling of polypeptide structures with or without a bound ligand based on more readily available structural and functional information. Such theoretical modeling approaches are based on the tenet that the three-dimensional structure and function of a polypeptide are imparted by its amino acid sequence and the corollary that polypeptides with similar amino acid sequences have similar structure and function.
Theoretical determination of a three-dimensional model for a polypeptide by ab initio methods is relatively undeveloped. However, another theoretical approach, referred to as homology modeling, has been used to infer structure for a particular polypeptide by threading its amino acid sequence through or overlaying the sequence upon a three-dimensional model of a homologous polypeptide. The successful application of homology modeling to determining polypeptide structure relies upon choosing a correct polypeptide template for comparison. In most cases criteria for comparison are unavailable or unreliable.
Thus, there exists a need for efficient methods to identify homologous amino acid sequences and to identify structural or functional characteristics of a polypeptide based on its amino acid sequence. A need also exists for methods to determine ligand binding properties of polypeptides based on sequence information. The present invention satisfies these needs and provides related advantages as well.
SUMMARY OF THE INVENTION
The invention provides a method for separating two or more subsets of polypeptides within a set of polypeptides. The method includes the steps of: (a) determining a sequence comparison signature for each amino acid sequence in a set of amino acid sequences, wherein the sequence comparison signature includes pairwise comparison scores for the amino acid sequence compared to each of the other amino acid sequences in the set; (b) constructing a distance arrangement including the sequence comparison signatures related according to the distance between each of the sequence comparison signatures; and (c) identifying a first and second cluster of sequence comparison signatures in the distance arrangement, wherein the first cluster includes sequence comparison signatures for polypeptides having a similar protein fold or biological function, the protein fold or function being different compared to a protein fold or function of polypeptides having sequence comparison signatures in the second cluster.
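Steps (a) through (c) can be illustrated with a minimal sketch. It is not the patented implementation, and it rests on several assumptions: `pairwise_score` is a hypothetical stand-in that uses Python's `difflib` similarity ratio in place of a real alignment score; each signature also includes the self-comparison so that all vectors share the same positions; the distance is Euclidean; and the clusters are found by simple single-linkage merging. All function names are illustrative.

```python
from difflib import SequenceMatcher
from itertools import combinations
from math import dist


def pairwise_score(a: str, b: str) -> float:
    """Hypothetical stand-in score in [0, 1]; a real pipeline would use an alignment score."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def signatures(seqs):
    """Step (a): one score vector per sequence, scored against every sequence in the set.

    Assumption: the self-comparison is kept so every vector has the same length and order.
    """
    return [[pairwise_score(s, t) for t in seqs] for s in seqs]


def distance_matrix(sigs):
    """Step (b): Euclidean distance between every pair of signatures (an assumed metric)."""
    return [[dist(u, v) for v in sigs] for u in sigs]


def two_clusters(dmat):
    """Step (c): single-linkage agglomerative merging until two clusters remain."""
    clusters = [{i} for i in range(len(dmat))]
    while len(clusters) > 2:
        # Pick the pair of clusters whose closest members are nearest to each other.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: min(
                dmat[i][j] for i in clusters[p[0]] for j in clusters[p[1]]
            ),
        )
        clusters[a] |= clusters[b]
        del clusters[b]
    return clusters


# Toy sequences (made up for illustration): two visibly different groups.
toy_seqs = [
    "AAAAAAAAAA", "AAAAAAAAAC", "AAAAAAAACA",
    "WWWWWWWWWW", "WWWWWWWWWY", "WWWWWWWYWW",
]
toy_clusters = two_clusters(distance_matrix(signatures(toy_seqs)))
print([sorted(c) for c in toy_clusters])
```

Clustering the signatures rather than the raw pairwise scores means two sequences are grouped when they relate to the rest of the set in a similar way, even if their direct score against each other is modest.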
The invention also provides a method for identifying a member of a polypeptide family. The method includes the steps of: (a) determining a query sequence comparison signature for an amino acid sequence, wherein the query sequence comparison signature includes pairwise comparison scores for the amino acid sequence compared to each amino acid sequence in a set; (b) comparing the distance between the query sequence comparison signature and the sequence comparison signatures for other amino acid sequences in the set, wherein the sequence comparison signatures for other amino acid sequences in the set are clustered into polypeptide families; and (c) identifying a proximal cluster having one or more sequence comparison signatures that have a closer distance to the query sequence comparison signature than the sequence comparison signatures of a distal cluster, thereby identifying the polypeptide having the query sequence comparison signature as being a member of the polypeptide family for the proximal cluster.
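The family-assignment steps can be sketched in the same spirit. This is one possible reading, not the patented implementation, and it assumes the following: `pairwise_score` is again a hypothetical `difflib`-based stand-in for an alignment score; the distance is Euclidean; the family labels in `family_of` are given in advance; and the proximal cluster is taken to be the one holding the single nearest signature. All names are illustrative.

```python
from difflib import SequenceMatcher
from math import dist


def pairwise_score(a: str, b: str) -> float:
    """Hypothetical stand-in score in [0, 1]; a real pipeline would use an alignment score."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def signature(seq, reference_set):
    """Step (a): scores of one sequence against each sequence in the reference set."""
    return [pairwise_score(seq, ref) for ref in reference_set]


def assign_family(query, reference_set, family_of):
    """Steps (b) and (c): return the family whose cluster holds the nearest signature.

    family_of[i] is the (assumed, pre-computed) family label of reference_set[i].
    """
    query_sig = signature(query, reference_set)
    nearest = min(
        range(len(reference_set)),
        key=lambda i: dist(query_sig, signature(reference_set[i], reference_set)),
    )
    return family_of[nearest]


# Toy reference set and labels (made up for illustration).
reference_set = ["AAAAAAAAAA", "AAAAAAAAAC", "WWWWWWWWWW", "WWWWWWWWWY"]
family_of = ["fam_A", "fam_A", "fam_W", "fam_W"]
print(assign_family("AAAAACAAAA", reference_set, family_of))
```

Using the nearest single signature is only one way to define "closer distance"; comparing the query against a cluster centroid or an average linkage distance would be equally consistent with the wording above.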
Hansen Mark R.
Kho Richard
Villar Hugo O.
Martinell James
McDermott Will & Emery LLP
Triad Therapeutics, Inc.