Method and apparatus for discovering patterns in a set of...

Data processing: measuring, calibrating, or testing – Measurement system in a specific environment – Biological or biochemical

Reexamination Certificate


Details

C702S019000

Reexamination Certificate

active

09712638

ABSTRACT:
Generally, the present invention provides a way of determining in an unsupervised manner additional members for a family that is defined initially through exemplar sequences. The present invention is unsupervised in that it proceeds without any information related to the exemplar sequences defining the family, without aligning the sequences, without prior knowledge of any patterns in the exemplar sequences, and without knowledge of the cardinality or characteristics of any features that may be present in the exemplar sequences. In one aspect of the invention, a method is used to take a set of unaligned sequences and discover several or many patterns common to some or all of the sequences. These patterns can then be used to determine if candidate sequences are members of the family. In another aspect of the invention, a method is used to take a set of sequences and to determine a set of maximal patterns common to a number of sequences. The maximal patterns are determined without any previous knowledge about any properties or features that may be present in the processed sequences.
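The abstract does not spell out the discovery procedure, so the following is only an illustrative sketch of the general idea: find patterns shared by several unaligned sequences, then use them to screen candidate sequences. It uses brute-force enumeration over fixed-length windows, which is an assumption made for brevity and is not the method claimed in the patent. The names `discover_patterns`, `matches`, and `is_candidate_member`, the `.` wildcard, and all parameters are hypothetical.

```python
from itertools import combinations

WILDCARD = "."  # hypothetical "don't care" symbol; matches any residue


def discover_patterns(sequences, length=4, max_wildcards=1, min_support=2):
    """Return {pattern: support} for fixed-length patterns that occur in at
    least `min_support` of the unaligned input sequences.

    Brute-force illustration only: every window of `length` residues is
    expanded into variants with up to `max_wildcards` interior wildcards.
    """
    support = {}
    interior = range(1, length - 1)  # first and last positions stay literal
    for idx, seq in enumerate(sequences):
        seen = set()
        for start in range(len(seq) - length + 1):
            window = seq[start:start + length]
            for n in range(max_wildcards + 1):
                for positions in combinations(interior, n):
                    chars = list(window)
                    for p in positions:
                        chars[p] = WILDCARD
                    seen.add("".join(chars))
        # Count each sequence at most once per pattern.
        for pattern in seen:
            support.setdefault(pattern, set()).add(idx)
    return {p: len(s) for p, s in support.items() if len(s) >= min_support}


def matches(pattern, seq):
    """True if `pattern` occurs somewhere in `seq`, honoring wildcards."""
    n = len(pattern)
    return any(
        all(pc == WILDCARD or pc == sc for pc, sc in zip(pattern, seq[i:i + n]))
        for i in range(len(seq) - n + 1)
    )


def is_candidate_member(seq, patterns, min_hits=1):
    """Screen a candidate: it must contain at least `min_hits` patterns."""
    return sum(matches(p, seq) for p in patterns) >= min_hits


if __name__ == "__main__":
    family = ["MKTAYIAK", "GGTAYLAK", "TAYQAKPP"]  # made-up exemplar sequences
    found = discover_patterns(family, length=4, max_wildcards=1, min_support=3)
    print(found)
    print(is_candidate_member("XXAYWAKXX", found))
```

No alignment or prior knowledge of the patterns is supplied here; the only inputs are the exemplar sequences and a support threshold. A practical implementation would avoid exhaustive enumeration and would also filter the results down to maximal patterns, which this sketch does not attempt.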

REFERENCES:
Benson et al., Nucleic Acids Research, vol. 25, pp. 1-6, 1997.
Kleffe et al., Bioinformatics, vol. 14, pp. 232-243, 1998.
NCBI, NCBI News, pp. 1-18, Aug. 1996.
Altschul et al., Journal of Molecular Biology, vol. 215, pp. 403-410, 1990.
JCBN, Amino Acids and Peptides Home Page, pp. 1-10, 1983.
Wu et al., Bioinformatics, vol. 16, No. 3, 2000, pp. 233-244.
Stormo, G., Bioinformatics, vol. 16, No. 1, 2000, pp. 16-23.
Attwood et al., “The PRINTS Protein Fingerprint Database in its Fifth Year,” Nucleic Acids Res., 26(1):304-308, (1998).
Bailey et al., “The Value of Prior Knowledge in Discovering Motifs with MEME,” In Proceedings of the Third International Conference on Intelligent Systems for Molecular Biology (ISMB '95), Menlo Park, California, AAAI Press, (1995).
Bairoch et al., “The PROSITE Database, its Status in 1997,” Nucleic Acids Res., 25(1):217-221, (1997).
Bork et al., “Applying Motif and Profile Searches,” Methods Enzymol., 266:162-184, (1996).
Gao et al., “Motif Detection in Protein Sequences,” In Proceedings of SPIRE'99, 63-72, (1999).
Grundy et al., “Meta-MEME: Motif-based Hidden Markov Models of Protein Families,” Computer Applications in the Biological Sciences (CABIOS), 13:397-406, (1997).
Henikoff et al., “Blocks Database and its Applications,” Methods Enzymol., 266:88-105, (1996).
Nevill-Manning et al., “Highly Specific Protein Sequence Motifs for Genome Analysis,” Proc. Natl. Acad. Sci. USA, 95(11):5865-5871, (1998).
Ogiwara et al., “Construction of a Dictionary of Sequence Motifs that Characterize Groups of Related Proteins,” Protein Eng., 5(6):479-488, (1992).
Rigoutsos et al., “Dictionary Building Via Unsupervised Hierarchical Motif Discovery In the Sequence Space Of Natural Proteins,” Proteins: Structure, Function and Genetics, 37(2): 264-277, (1999).
Saqi et al., “Identification of Sequence Motifs from a Set of Proteins with Related Function,” Protein Engineering, 7(2):165-71, (1994).
Sonnhammer et al., “Pfam: A Comprehensive Database of Protein Domain Families Based on Seed Alignments,” Proteins, 28(3):405-420, (1997).
Tatusov et al., “A Genomic Perspective on Protein Families,” Science, 278(5338):631-637, (1997).
Altschul et al., “Basic Local Alignment Search Tool,” Academic Press Limited, J. Mol. Biol. 215, pp. 403-410 (1990).
Altschul et al., “Issues in Searching Molecular Sequence Databases,” Nature Genetics, 6:119-129 (Feb. 1994).
Califano et al., “FLASH: A Fast Look-Up Algorithm for String Homology,” Proceedings 1993 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, New York, pp. 353-359 (Jun. 15-18, 1993).
Coulson et al., “Protein and Nucleic Acid Sequence Database Searching: A Suitable Case for Parallel Processing,” The Computer Journal, vol. 30, No. 5, pp. 420-424 (1987).
Lipman et al., “Rapid and Sensitive Protein Similarity Searches,” Science, vol. 227, No. 4693, pp. 1435-1441 (Mar. 22, 1985).
Neuwald et al., “Detecting Patterns in Protein Sequences,” Academic Press Limited, J. Mol. Biol. 239, pp. 698-712 (1994).
Pearson et al., “Improved Tools for Biological Sequence Comparison,” Proc. Nat'l Acad. Sci. USA, vol. 85, pp. 2444-2448 (Apr. 1988).
Rigoutsos et al., “Combinatorial Pattern Discovery in Biological Sequences: The TEIRESIAS Algorithm,” Oxford University Press, Bioinformatics, vol. 14, No. 1, pp. 55-67 (1998).
Rigoutsos et al., “Motif Discovery Without Alignment or Enumeration,” RECOMB, New York, pp. 221-227 (1998).
Suyama et al., “Searching for Common Sequence Patterns Among Distantly Related Proteins,” Protein Engineering, vol. 8, No. 11, pp. 1075-1080 (1995).
Yamaguchi et al., “Protein Motif Discovery from Amino Acid Sequences by Sets of Regular Patterns,” Academic Publications, Information Research Report, 95(76):95-FI-38 (Jul. 1995).
“Part II, Sequence Analysis,” Chapter 9, Pattern Discovery, pp. 130-169 (Sep. 1993).

