Systems and methods for tokenizing and interpreting uniform...

Data processing: database and file management or data structures – Database and file access – Search engines

Reexamination Certificate

Details

C707S755000

active

08001106

ABSTRACT:
Aspects include methods, computer readable media storing instructions for such methods, and systems for processing text strings, such as URLs, that comprise patterns of parameters and values for such parameters, delimited in a site-specific manner. Such aspects provide for accepting a number of text strings that are expected to have a common delimiting strategy, then deeply tokenizing those text strings to arrive at a set of tokens. From that set, anchor tokens are selected and used to form patterns in which the anchor tokens are separated by wildcard portions for recursive processing. The patterns formed can be mapped to a tree of nodes. Information concerning relationships between nodes and between tokens within a given node, as well as other heuristics concerning which tokens are parameters and which are values, can be used as observed events for producing probabilities that certain tokens are parameters or values, using a dynamic programming algorithm such as a Viterbi algorithm.
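The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the patented method: the function names, the rule that an anchor token must appear in every input string, the two observation symbols ("anchor" and "wild"), and all probabilities below are assumptions chosen only for illustration. The sketch skips the recursive processing of wildcard portions and the tree of nodes, and shows just deep tokenization, pattern formation, and Viterbi decoding over a two-state (parameter/value) model.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

def deep_tokenize(text):
    """Split on every run of non-alphanumeric characters (assumed delimiter rule)."""
    return [t for t in re.split(r"[^A-Za-z0-9]+", text) if t]

def anchor_tokens(texts):
    """Assumption: a token is an anchor if it occurs in every input string."""
    counts = Counter()
    for text in texts:
        counts.update(set(deep_tokenize(text)))
    return {tok for tok, n in counts.items() if n == len(texts)}

def to_pattern(text, anchors):
    """Keep anchor tokens; collapse each run of other tokens into one '*'."""
    pattern = []
    for tok in deep_tokenize(text):
        if tok in anchors:
            pattern.append(tok)
        elif not pattern or pattern[-1] != "*":
            pattern.append("*")
    return pattern

# Hypothetical model: every number here is invented for the example.
STATES = ("param", "value")
START = {"param": 0.6, "value": 0.4}
TRANS = {"param": {"param": 0.1, "value": 0.9},
         "value": {"param": 0.8, "value": 0.2}}
EMIT = {"param": {"anchor": 0.9, "wild": 0.1},
        "value": {"anchor": 0.2, "wild": 0.8}}

def viterbi(observations):
    """Return the most probable state sequence for the observed symbols."""
    best = {s: (START[s] * EMIT[s][observations[0]], [s]) for s in STATES}
    for obs in observations[1:]:
        # For each state, keep only the best-scoring path that ends in it.
        best = {
            s: max(
                ((p * TRANS[prev][s] * EMIT[s][obs], path + [s])
                 for prev, (p, path) in best.items()),
                key=lambda cand: cand[0],
            )
            for s in STATES
        }
    return max(best.values(), key=lambda cand: cand[0])[1]

URLS = [
    "http://example.com/item?id=12&cat=shoes",
    "http://example.com/item?id=7&cat=hats",
    "http://example.com/item?id=981&cat=bags",
]
# Only the part after the host is analysed in this sketch.
TAILS = [urlsplit(u).path + "?" + urlsplit(u).query for u in URLS]
ANCHORS = anchor_tokens(TAILS)
PATTERN = to_pattern(TAILS[0], ANCHORS)
OBSERVED = ["wild" if tok == "*" else "anchor" for tok in PATTERN]
LABELS = viterbi(OBSERVED)

for tok, label in zip(PATTERN, LABELS):
    print(tok, label)
```

With these invented probabilities, the tokens `id` and `cat` decode as parameters and the wildcard positions as values. A real system would derive the observed events from the node and token relationships the abstract mentions rather than from a fixed table.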

REFERENCES:
patent: 7680785 (2010-03-01), Najork
patent: 2004/0054750 (2004-03-01), de Jong et al.
patent: 2008/0010291 (2008-01-01), Poola et al.
patent: 2008/0010292 (2008-01-01), Poola
patent: 2009/0164485 (2009-06-01), Burke et al.
Z. Bar-Yossef, I. Keidar and U. Schonfeld, “Do not Crawl in the DUST: Different URLs with Similar Text Extended Abstract,” Proceedings of the 15th international conference on World Wide Web, May 22-26, 2006, Edinburgh, Scotland, pp. 1015-1016.
S.H. Lee, S.J. Kim and S.H. Hong, “On URL Normalization,” O. Gervasi et al. (Eds.): ICCSA 2005, LNCS 3481, pp. 1076-1085, 2005, Springer-Verlag Berlin Heidelberg 2005. (Available at http://dblab.soongsil.ac.kr/publication/LeKi05a.pdf, last visited on Jun. 19, 2008.).
P.-J. Yeh, J.-T. Li and S.-M. Yuan, “Tracking the Changes of Dynamic Web Pages in the Existence of URL Rewriting,” Proceedings of the fifth Australasian conference on Data Mining and Analytics, vol. 61, Sydney, Australia, pp. 169-176, 2006.

