Reexamination Certificate
2000-03-20
2003-09-30
Brusca, John S. (Department: 1631)
Data processing: measuring, calibrating, or testing
Measurement system in a specific environment
Chemical analysis
C702S019000, C250S281000, C530S300000
active
06629040
ABSTRACT:
BACKGROUND OF THE INVENTION
Traditionally, protein sequences were determined by stepwise chemical degradation of purified proteins or fragments thereof. With the advent of sequence databases that contain complete genomic sequences or large numbers of complete or partial expressed gene sequences (expressed sequence tags, ESTs) (Goffeau et al. (1996) Science 274:546-549; Fraser et al. (1997) Nature 390:580-586; Neubauer et al. (1998) Nature Genetics 20:46-50), the sequences of most proteins can be determined by correlating experimental data extracted from the protein with sequence databases (Henzel et al. (1993) Proc. Natl. Acad. Sci. USA 90:5011-5015; Eng et al. (1994) J. Am. Soc. Mass. Spectrom. 5:976-989). The many implemented sequence database searching strategies have in common the use of a combination of specific constraints to narrow down a candidate list of matching proteins in a database to a single protein (Patterson et al. (1995) Electrophoresis 16:1791-1814). Currently, the most restrictive constraints are generated by mass spectrometric (MS) or tandem mass spectrometric (MS/MS) analysis of peptide mixtures after proteolysis of a purified protein or protein mixture with a specific protease.
The constraints provided by collision-induced dissociation (CID) of selected peptides are highly discriminating because CID spectra reflect the amino acid sequence of the peptide analyzed. MS/MS is generally practiced with peptides separated by capillary HPLC or capillary electrophoresis (CE) connected on-line to an electrospray ionization (ESI) MS/MS instrument. Peptides eluting from the separation system are detected by the first-stage mass analyzer, which also automatically selects peptide ions for CID; the resulting fragments are then analyzed in a second mass analyzer. Observed spectra are used to identify the protein from which the peptide originated, either by automated correlation of uninterpreted CID spectra with a sequence database or by searching sequence databases with complete or partial peptide sequences obtained by manual or computer-assisted interpretation of CID spectra (Eng et al. (1994) J. Am. Soc. Mass. Spectrom. 5:976-989; Mann et al. (1994) Anal. Chem. 66:4390-4399, each incorporated herein by reference). The method has the significant advantage that a CID spectrum from a single peptide is sufficient to conclusively identify a protein (Susin et al. (1999) Nature 397:441-446, incorporated herein by reference in its entirety). Consequently, proteins can be identified by correlating CID spectra with databases containing incomplete gene sequences as found in EST databases. Components of protein mixtures can be identified without the need for purification, and proteins can be identified across species, provided that the peptide segment analyzed is conserved between species. The method has the disadvantage that peptide ions need to be sequentially selected for CID out of a mixture of analytes (Ducret et al. (1998) Protein Science 7:706-719). The number of peptides present in a mixture may significantly exceed the number of CID spectra generated in the time available for analysis.
For automated MS/MS operation the mass spectrometer is generally programmed to give highest priority for CID selection to ions with the highest ion current (Ducret et al. supra). Therefore, if complex peptide mixtures are analyzed, lower intensity peptide ions will not be selected for CID. This results in an apparent compression of the dynamic range that can be somewhat alleviated, but not eliminated, by extending the peptide analysis time (Goodlett et al. (1993) J. Microcolumn Separations 5:57-62; Davis et al. (1996) J. Am. Soc. Mass. Spectrom. 9:194-201, each incorporated herein by reference in their entirety).
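The intensity-ranked selection described above can be illustrated with a minimal sketch. It assumes a simple "top N by ion current" rule; the function name `select_precursors`, the `top_n` parameter, and the survey-scan values are hypothetical and are not taken from any instrument's control software.

```python
from typing import List, Tuple

Ion = Tuple[float, float]  # (m/z, ion current in arbitrary units)


def select_precursors(ions: List[Ion], top_n: int) -> List[Ion]:
    """Return the top_n ions ranked by ion current, highest first."""
    ranked = sorted(ions, key=lambda ion: ion[1], reverse=True)
    return ranked[:top_n]


# Hypothetical survey scan with five co-eluting peptide ions.
survey_scan = [
    (445.12, 9.0e5),
    (512.30, 1.2e4),
    (623.85, 4.5e5),
    (701.40, 3.0e3),
    (856.97, 7.7e5),
]

# Assume there is only time for three CID spectra in this elution window.
selected = select_precursors(survey_scan, top_n=3)
skipped = [ion for ion in survey_scan if ion not in selected]

print("selected for CID:", [mz for mz, _ in selected])
print("never fragmented:", [mz for mz, _ in skipped])
```

In this sketch the two weakest ions are never selected, which is the dynamic range compression the text refers to: a longer analysis time raises `top_n` but does not change the ranking rule.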
The accurately measured masses of peptides in a protein digest represent a different type of constraint for database searching. Such peptide mass profiles or fingerprints are determined in a single stage of mass spectrometry without CID. The list of observed peptide masses, together with auxiliary constraints including the estimated molecular weight of the unfragmented parent protein and the cleavage specificity of the protease used, is then searched against sequence databases using any one of a number of available algorithms (Henzel et al. (1993) Proc. Natl. Acad. Sci. USA 90:5011-5015; Patterson et al. (1995) Electrophoresis 16:1791-1814, each incorporated herein). Peptide mass mapping identifies proteins without sequence-specific information because the subset of peptide masses created by digestion of a protein with a specific protease defines the N- or C-terminal boundary of each fragment and thus provides a set of constraints unique to a given protein. The more accurately peptide masses are measured and the more peptide masses are detected from the same protein, the more conclusively the protein identity can be determined (Fenyö et al. (1998) Electrophoresis 19:998-1005, incorporated herein by reference). The peptide mass mapping approach has the advantage over the MS/MS strategy that the mass spectrometer operates in full scan mode (i.e., in a single stage) for the duration of the experiment, and should generally provide greater sensitivity. However, the method generally fails to identify the components of protein mixtures because it cannot be determined from which parent protein a specific peptide or set of peptides originated. Peptide mass fingerprinting is also incompatible with searching EST databases because it is unlikely that a sufficient number of peptide masses will match a single EST to provide an unambiguous correlation.
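As a rough illustration of peptide mass mapping, the sketch below digests each database entry in silico and counts how many observed masses fall within a ppm tolerance of a theoretical peptide mass. It is a simplified assumption-laden example, not one of the published search algorithms: trypsin is assumed to cleave after K or R except before P, missed cleavages and modifications are ignored, and the two database sequences and all function names are hypothetical.

```python
# Monoisotopic residue masses in Da (standard values, rounded to 5 decimals).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini


def tryptic_digest(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        next_aa = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified, neutral peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def count_matches(observed, sequence, tol_ppm=10.0):
    """Number of observed masses matching any tryptic peptide of the sequence."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(sequence)]
    return sum(
        any(abs(obs - theo) / theo * 1e6 <= tol_ppm for theo in theoretical)
        for obs in observed
    )


# Hypothetical two-entry database and a simulated, error-free measurement.
database = {
    "protein_A": "MKWVTFISLLRAGCDEK",
    "protein_B": "MSTNPKPQRLLGACYK",
}
observed_masses = [peptide_mass(p) for p in tryptic_digest(database["protein_A"])]

scores = {name: count_matches(observed_masses, seq) for name, seq in database.items()}
best_match = max(scores, key=scores.get)
print(scores, best_match)
```

Because the score is a count of matching masses per database entry, the sketch also shows why the approach struggles with mixtures and ESTs: peptides from several parent proteins, or a short EST covering only one or two peptides, give no entry a clearly dominant count.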
The present invention describes a class of reagents designated Isotope Distribution Encoded Tags (IDEnTs) and a method using the IDEnT concept for protein identification by accurate mass measurement of a single peptide, combining the strengths of the CID and peptide mass mapping approaches. Recent calculations for proteins expressed by the genomes of E. coli and S. cerevisiae indicate that at 0.1 ppm mass accuracy 96% of the proteins will generate tryptic peptides with a unique mass, suggesting the feasibility of protein identification based on the mass of a single peptide. Inclusion of additional constraints such as the estimated molecular weight of the parent protein, the cleavage specificity of the protease used to digest the parent protein, and the presence of an uncommon amino acid such as cysteine, methionine or tryptophan in the peptide sequence further enhances the stringency of the database search. Among these constraints the presence of cysteine in a peptide sequence is particularly attractive because the sulfhydryl side chain of cysteine residues is chemically distinct among amino acid residues and its presence significantly constrains the database search while still covering 92% of the open reading frames in yeast (Sechi and Chait (1998) Anal. Chem. 70:5150-5158). To employ this cysteine constraint for protein identification, it is essential that the cysteine-containing peptides be recognized in a peptide mixture. To this end a cysteine-specific alkylating reagent was synthesized which allows mass spectrometric identification of cysteine-containing peptides by the covalent addition of an isotope-distribution encoded tag or IDEnT (Lundell and Schreitmuller (1999) Anal. Biochem. 266:31-47).
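The combination of an accurate-mass window with a cysteine constraint can be sketched as follows. For scale, a 0.1 ppm tolerance on a 1500 Da peptide corresponds to a window of about ±0.00015 Da. The example is illustrative only: the candidate peptides are hypothetical, the mass shift contributed by the tag itself is ignored, and the observed mass is simulated rather than measured. The pair LCVTK / LAMTK was chosen because Cys + Val and Ala + Met have the same elemental composition, so mass alone cannot separate them at any accuracy.

```python
# Monoisotopic residue masses in Da for the residues used below.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "K": 128.09496,
    "M": 131.04049, "F": 147.06841, "R": 156.10111,
}
WATER = 18.01056


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified, neutral peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def candidates(observed_mass, peptides, tol_ppm, require_cys=False):
    """Peptides within tol_ppm of the observed mass, optionally Cys-containing only."""
    hits = []
    for pep in peptides:
        if abs(ppm_error(observed_mass, peptide_mass(pep))) > tol_ppm:
            continue
        if require_cys and "C" not in pep:
            continue
        hits.append(pep)
    return hits


# Hypothetical tryptic peptides from a sequence database.
database_peptides = ["LCVTK", "LAMTK", "LCVSK", "GGFLR"]

# Simulated error-free measurement of the peptide LCVTK.
observed = peptide_mass("LCVTK")

mass_only = candidates(observed, database_peptides, tol_ppm=0.1)
mass_and_cys = candidates(observed, database_peptides, tol_ppm=0.1, require_cys=True)
print(mass_only)     # two isobaric candidates remain
print(mass_and_cys)  # the cysteine constraint leaves one
```

The design point is that the two filters are independent: the ppm window removes peptides of different composition, and the sequence constraint removes isobaric peptides that the window cannot.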
SUMMARY OF THE INVENTION
The present invention describes an analytical strategy and the basic chemical concepts necessary to identify proteins in a sequence database from the accurately measured mass/charge of a single peptide using high-resolution mass spectrometry and a sequence constraint. This was achieved by covalently modifying peptides with a reagent that is specific for cysteine-containing peptides and that incorporates a non-native chemical element into the peptide, such that the normal or expected isotope pattern for the peptide was changed. The process encodes the peptide with an isotope-distribution encoded tag (IDEnT) that can be decoded by high-resolution mass spectrometry. Once the IDEnT labeled peptide is decoded by visual inspection or computer
Aebersold Ruedi
Bruce James E.
Goodlett David R.
Rist Beate
Smith Richard D.
Brusca John S.
Greenlee Winner and Sullivan P.C.
University of Washington
Zhou Shubo