Method for identifying sub-sequences of interest in a sequence

Data processing: measuring, calibrating, or testing – Measurement system in a specific environment – Biological or biochemical

Reexamination Certificate

Details

active

08046173

ABSTRACT:
The present technique provides for the analysis of a data series to identify sequences of interest within the series. The analysis may be used to iteratively update a grammar used to analyze the data series or updated versions of the data series. Furthermore, the technique provides for the calculation of a minimum description length heuristic, such as a symbol compression ratio, for each sub-sequence of the analyzed data sequence. The technique may then compare a selected heuristic value against one or more reference conditions to determine if additional iteration is to be performed. The grammar and the data sequence may be updated between iterations to include a symbol representing a string corresponding to the selected heuristic value based upon a non-termination result of the comparison. Alternatively, the string corresponding to the selected heuristic value may be identified as a sequence of interest based upon a termination result of the comparison.
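
The iterative loop described in the abstract can be illustrated with a small Python sketch. It is not the patented method: the function names (`infer_grammar`, `best_candidate`, `substitute`), the `<R1>`-style rule symbols, the `max_len` and `threshold` parameters, and especially the ratio formula are assumptions chosen for illustration. The ratio used here is a simplified stand-in for a symbol compression ratio: the estimated cost after replacing every occurrence of a candidate string with one new symbol, divided by the cost before. The loop stops when the best ratio no longer indicates compression, and the grammar rules collected up to that point are returned as candidate sequences of interest.

```python
from itertools import count


def count_nonoverlapping(seq, sub):
    """Count non-overlapping occurrences of tuple `sub` in tuple `seq`."""
    k, i, hits = len(sub), 0, 0
    while i <= len(seq) - k:
        if seq[i:i + k] == sub:
            hits += 1
            i += k
        else:
            i += 1
    return hits


def compression_ratio(length, repeats):
    # Assumed, simplified heuristic (not the patent's formula):
    # cost after substitution = one symbol per occurrence
    #                           + the rule body + one delimiter
    # cost before             = length * repeats
    return (repeats + length + 1) / (length * repeats)


def best_candidate(seq, max_len):
    """Return (ratio, sub) with the lowest ratio, or None if nothing repeats."""
    best, seen = None, set()
    for k in range(2, max_len + 1):
        for i in range(len(seq) - k + 1):
            sub = seq[i:i + k]
            if sub in seen:
                continue
            seen.add(sub)
            repeats = count_nonoverlapping(seq, sub)
            if repeats < 2:
                continue
            ratio = compression_ratio(k, repeats)
            if best is None or ratio < best[0]:
                best = (ratio, sub)
    return best


def substitute(seq, sub, symbol):
    """Replace each non-overlapping occurrence of `sub` with `symbol`."""
    out, i, k = [], 0, len(sub)
    while i < len(seq):
        if seq[i:i + k] == sub:
            out.append(symbol)
            i += k
        else:
            out.append(seq[i])
            i += 1
    return tuple(out)


def infer_grammar(text, max_len=8, threshold=1.0):
    """Iterate: pick the best candidate, test it, update grammar and sequence."""
    seq, grammar = tuple(text), {}
    for n in count(1):
        cand = best_candidate(seq, max_len)
        # Termination condition (assumed): no candidate compresses the data.
        if cand is None or cand[0] >= threshold:
            break
        symbol = f"<R{n}>"          # hypothetical naming for new symbols
        grammar[symbol] = cand[1]
        seq = substitute(seq, cand[1], symbol)
    return grammar, seq


if __name__ == "__main__":
    rules, rewritten = infer_grammar("abcabcabcxyz")
    print(rules)
    print(rewritten)
```

Because the sequence is rewritten between iterations, later rules can contain earlier rule symbols, which yields a hierarchical grammar. The exhaustive candidate search is kept for readability and would be too slow for long inputs such as genomic data.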

REFERENCES:
patent: 5136686 (1992-08-01), Koza
patent: 5455577 (1995-10-01), Slivka et al.
patent: 5617552 (1997-04-01), Garber et al.
patent: 5703581 (1997-12-01), Matias et al.
patent: 5977890 (1999-11-01), Rigoutsos et al.
patent: 6373971 (2002-04-01), Floratos et al.
patent: 6473757 (2002-10-01), Garofalakis et al.
patent: 7043371 (2006-05-01), Wheeler
patent: 2004/0249574 (2004-12-01), Tishby et al.
Evans et al. (GE Global Research Technical Information Series Oct. 2002; Kolmogorov Complexity Estimation and Analysis).
O. G. Troyanskaya et al., “Sequence Complexity Profiles of Prokaryotic Genome Sequences: A Fast Algorithm for Calculating Linguistic Complexity”, Bioinformatics, vol. 18, No. 5, 2002, pp. 679-688.
Scott C. Evans; Title: “Kolmogorov Complexity Estimation and Application for Information System Security”; Rensselaer Polytechnic Institute; Troy, New York; Jul. 2003.
Moses Charikar et al., “Approximating the Smallest Grammar: Kolmogorov Complexity in Natural Models”, Proceedings of Annual ACM Symposium, Feb. 20, 2002, pp. 792-801.
S.C. Evans and S.F. Bush, “Symbol Compression Ratio for String Compression and Estimation of Kolmogorov Complexity”, 2001CRD159, Class 1, Nov. 2001, 16 Pages.
Craig G. Nevill-Manning and Ian H. Witten, “On-Line and Off-Line Heuristics for Inferring Hierarchies of Repetitions in Sequences”, Proceedings of the IEEE, vol. 88, No. 11, Nov. 2000, pp. 1745-1755.
Alberto Apostolico and Stefano Lonardi, “Off-Line Compression by Greedy Textual Substitution”, Proceedings of the IEEE, vol. 88, No. 11, Nov. 2000, pp. 1733-1744.

Profile ID: LFUS-PAI-O-4295073
