Reexamination Certificate
2000-04-06
2002-12-03
Jones, W. Gary (Department: 1655)
Chemistry: molecular biology and microbiology
Measuring or testing process involving enzymes or...
Involving antigen-antibody binding, specific binding protein...
C250S281000, C250S282000, C250S287000, 36, C702S022000
active
06489121
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to methods of identifying a protein, polypeptide or peptide by means of mass spectrometry and especially by tandem mass spectrometry (MS/MS). Preferred methods relate to the use of mass spectral data to identify an unknown protein whose sequence is at least partially present in an existing database.
2. Discussion of the Prior Art
Although several well-established chemical methods for the sequencing of peptides, polypeptides and proteins are known (for example, the Edman degradation), mass spectrometric methods are becoming increasingly important in view of their speed and ease of use. Mass spectrometric methods have been developed to the point at which they are capable of sequencing peptides in a mixture without any prior chemical purification or separation, typically using electrospray ionization and tandem mass spectrometry (MS/MS). For example, see Yates III (J. Mass Spectrom, 1998 vol. 33 pp. 1-19), Papayannopoulos (Mass Spectrom. Rev. 1995, vol. 14 pp. 49-73), and Yates III, McCormack, and Eng (Anal. Chem. 1996 vol. 68 (17) pp. 534A-540A). Thus, in a typical MS/MS sequencing experiment, molecular ions of a particular peptide are selected by the first mass analyzer and fragmented by collisions with neutral gas molecules in a collision cell. The second mass analyzer is then used to record the fragment ion spectrum that generally contains enough information to allow at least a partial, and often the complete, sequence to be determined.
Unfortunately, however, the interpretation of the fragment spectra is not straightforward. Manual interpretation (see, for example, Hunt, Yates III, et al, Proc. Nat. Acad. Sci. USA, 1986, vol. 83 pp 6233-6237 and Papayannopoulos, ibid) requires considerable experience and is time consuming. Consequently, many workers have developed algorithms and computer programs to automate the process, at least in part. The nature of the problem, however, is such that none of those so far developed are able to provide in reasonable time complete sequence information without either requiring some prior knowledge of the chemical structure of the peptide or merely identifying likely candidate sequences in existing protein structure databases. The reason for this will be understood from the following discussion of the nature of the fragment spectra produced.
Typically, the fragment spectrum of a peptide comprises peaks belonging to about half a dozen different ion series, each of which corresponds to a different mode of fragmentation of the peptide parent ion. Each series typically (but not invariably) comprises peaks representing the loss of successive amino acid residues from the original peptide ion. Because all but two of the 20 amino acids of which most naturally occurring proteins are composed have different masses, it is possible to establish the sequence of amino acids from the differences in mass between peaks in any given series that correspond to the successive loss of an amino acid residue from the original peptide. However, difficulties arise in identifying the series to which an ion belongs, and from a variety of ambiguities that can arise in assigning the peaks, particularly when certain peaks are either missing or unrecognized. Moreover, other peaks are typically present in a spectrum due to various more complicated fragmentation or rearrangement routes, so that direct assignment of ions is fraught with difficulty. Further, electrospray ionization tends to produce multiply charged ions that appear at correspondingly rescaled masses, which further complicates the interpretation of the spectra. Isotopic clusters also lead to a proliferation of peaks in the observed spectra. Thus, the direct transformation of a mass spectrum to a sequence is only possible for trivially small peptides.
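As an illustration of the ion-series idea described above, the following minimal sketch predicts two commonly discussed series (b-type and y-type ions) for a trial sequence and then reads residues back from the mass differences of an idealized, complete, singly charged ladder. The function names, the tolerance, and the choice of these two series are assumptions made for illustration and are not taken from the patent; the residue masses are standard monoisotopic values. Leucine and isoleucine share one mass, which is the ambiguity referred to above.

```python
# Standard monoisotopic residue masses in daltons (illustrative table).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # mass of H2O
PROTON = 1.00728   # mass of the charge-carrying proton


def b_ions(sequence, charge=1):
    """m/z values of N-terminal (b-type) fragments b1 .. b(n-1)."""
    total, peaks = 0.0, []
    for residue in sequence[:-1]:
        total += RESIDUE_MASS[residue]
        peaks.append((total + charge * PROTON) / charge)
    return peaks


def y_ions(sequence, charge=1):
    """m/z values of C-terminal (y-type) fragments y1 .. y(n-1)."""
    total, peaks = WATER, []
    for residue in reversed(sequence[1:]):
        total += RESIDUE_MASS[residue]
        peaks.append((total + charge * PROTON) / charge)
    return peaks


def read_ladder(peaks, tolerance=0.01):
    """Map successive differences of a singly charged ladder to residues.

    Returns "?" where no residue mass fits, and "L/I" style strings where
    more than one residue fits within the tolerance.
    """
    letters = []
    for low, high in zip(peaks, peaks[1:]):
        diff = high - low
        hits = [aa for aa, m in RESIDUE_MASS.items() if abs(m - diff) <= tolerance]
        letters.append("/".join(hits) if hits else "?")
    return letters


if __name__ == "__main__":
    ladder = b_ions("GASLK")
    print(ladder)
    print(read_ladder(ladder))
    print(b_ions("GASLK", charge=2))  # the same fragments at rescaled m/z
```

A real spectrum mixes several series, charge states and isotope peaks and usually has gaps, so this direct read-out only works on the idealized ladder generated here.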
The reverse route, transforming trial sequences to predicted spectra for comparison with the observed spectrum, should be easier, but has not been fully developed. The number of possible sequences for any peptide (20^n, where n is the number of amino acids comprised in the peptide) is very large, so the difficulty of finding the correct sequence for, say, a peptide of a mere 10 amino acids (20^10 ≈ 10^13 possible sequences) will be appreciated. The number of potential sequences increases very rapidly both with the size of the peptide and with the number (at least 20) of the residues being considered.
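The growth described above can be checked with a few lines of arithmetic. This is only a sketch; the function name and the choice of example lengths are illustrative assumptions.

```python
def sequence_count(length, alphabet_size=20):
    """Number of distinct linear sequences of the given length."""
    return alphabet_size ** length


if __name__ == "__main__":
    for n in (5, 10, 15):
        print(n, sequence_count(n))
    # Allowing more residue types (for example, modified residues)
    # enlarges the search space further at the same length.
    print(sequence_count(10, alphabet_size=25))
```

For n = 10 and 20 residue types the count is 10,240,000,000,000, which is the figure of roughly 10^13 quoted in the text.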
Details of the first computer programs for predicting probable amino acid sequences from mass spectral data appeared in 1984 (Sakurai, Matsuo, Matsuda, Katakuse, Biomed. Mass Spectrom, 1984, vol. 11 (8) pp 397-399). This program (PAAS3) searched through all the amino acid sequences whose molecular weights coincided with that of the peptide being examined and identified the most probable sequences by comparison with the experimentally observed spectra. Hamm, Wilson and Harvan (CABIOS, 1986 vol. 2 (2) pp 115-118) also developed a similar program.
However, as pointed out by Ishikawa and Niwa (Biomed. and Environ. Mass Spectrom. 1986, vol. 13 pp 373-380), this approach is limited to peptides not exceeding 800 daltons in view of the computer time required to carry out the search. Parekh et al in UK patent application 2,325,465 (published November 1998) have resurrected this idea and give an example of the sequencing of a peptide of 1000 daltons which required 2×10^6 possible sequences to be searched, but do not specify the computer time required. Nevertheless, despite the increase in the processing speed of computers between 1984 and 1999, a simple search of all possible sequences for a peptide of molecular weight greater than 1200 daltons is still impractical in a reasonable time using the personal computer typically supplied for data processing with most commercial mass spectrometers.
This problem has long been recognized and several approaches to rendering the problem more tractable have been described. One of the most successful has been to correlate the mass spectral data with the known amino acid sequences comprised in a protein database rather than with every possible sequence. In the prior method known as peptide mass mapping, a protein may be identified by merely determining the molecular weights of the peptides produced by digesting it with a site-specific protease and comparing the molecular weights with those predicted from known proteins in a database. (See, for example, Yates, Speicher, et al in Analytical Biochemistry, 1993 vol 214 pp 397-408). However, mass mapping is ineffective if a protein or peptide comprises only a small number of amino acid residues or possible fragments, and is inapplicable if information about the actual amino acid sequences is required. As explained, tandem mass spectrometry (MS/MS) can be used to provide such sequence information. MS/MS spectra usually contain enough detail to allow a peptide to be at least partially, and often completely, sequenced without reference to any database of known sequences (see copending application GB 9907810.7, filed Apr. 6, 1999). There are, however, many circumstances where it is adequate, or even preferred, to establish sequences by reference to an existing database. Such methods were pioneered by Yates, et al, see, for example, PCT application 95/25281, Yates (J. Mass Spectrom 1998 vol 33 pp 1-19), Yates, Eng et al (Anal. Chem. 1995 vol 67 pp 1426-33). Other workers, including Mortz et al (Proc. Nat. Acad. Sci. USA, 1996 vol 93 pp 8264-7), Figeys, et al (Rapid Commun. Mass Spectrom. 1998 vol 12 pp 1435-44), Jaffe, et al, (Biochemistry, 1998 vol 37 pp 16211-24), Amot et al (Electrophoresis, 1998 vol 19 pp 968-980) and Shevchenko et al (J. Protein Chem. 1997 vol 16 (5) pp 481-490) report similar approaches.
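The peptide mass mapping idea described above can be sketched as follows. This is a toy illustration under stated assumptions, not the procedure of any cited work: the cleavage rule is a common simplification of trypsin specificity (cut after K or R unless the next residue is P), the two database entries are invented sequences, and the function names, tolerance and match-count score are hypothetical.

```python
# Standard monoisotopic residue masses in daltons (illustrative table).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def tryptic_digest(protein):
    """Split after K or R unless followed by P (simplified trypsin rule)."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        following = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and following != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def rank_proteins(observed_masses, database, tolerance=0.05):
    """Rank database entries by how many observed masses they explain."""
    scores = []
    for name, sequence in database.items():
        predicted = [peptide_mass(p) for p in tryptic_digest(sequence)]
        matched = sum(
            any(abs(obs - pred) <= tolerance for pred in predicted)
            for obs in observed_masses
        )
        scores.append((name, matched))
    return sorted(scores, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Invented sequences standing in for a protein database.
    database = {"protA": "GASKLLER", "protB": "MWWKFFR"}
    observed = [peptide_mass(p) for p in tryptic_digest(database["protA"])]
    print(rank_proteins(observed, database))
```

Because only whole-peptide masses are compared, a match says nothing about residue order within a peptide, which is why the text notes that mass mapping is inapplicable when the sequence itself is required.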
As explained, it is generally easier to predict a fragmentation mass spectrum from a given amino acid sequence than to carry out the reverse procedure when comparing experimental MS data with sequence databases. A “fragmentation model” that describes the various ways in which a given amino acid sequence may fragment is therefore
Chakrabarti Arun K.
Diederiks & Whitelaw PLC
Jones W. Gary
Micromass Limited