Method and apparatus for dictionary sorting

Data processing: speech signal processing – linguistics – language – Linguistics – Natural language

Reexamination Certificate

Details

C707S793000, C704S010000

active

06466902

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The invention generally relates to the field of data processing and more particularly to dictionary sorting of data.
2. Background Information
Sorting in general is well developed and optimized for putting a sequence of numbers into increasing or decreasing numerical order. See for instance Numerical Recipes in C, Chapter 8 (Sorting), (WILLIAM H. PRESS, et al., NUMERICAL RECIPES IN C, Cambridge University Press, 1988). Sorting routines for use in sorting other forms of data are often derived from the routines developed for sorting numbers. However, routines thus derived typically do not give the optimal solutions to the problems associated with sorting non-numeric data. Non-numeric data typically has special characteristics that make it poorly suited for use with routines derived from numerical sorting routines.
For example, textual data is composed of characters, and a commonly used sorting order for textual data is dictionary order. When two words or sentences are compared, the first characters of each are compared first, then the second characters are compared if the first characters were the same, and so forth. Thus, one comparison of text is made up of several numerical comparisons. What is needed is a method of sorting that takes advantage of the characteristics of textual data.
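The character-by-character comparison described above can be sketched as follows. This is only an illustration, not part of the patent text: the function name `dictionary_compare` and the returned comparison count are assumptions introduced here to show that one text comparison decomposes into several numerical comparisons of character codes.

```python
def dictionary_compare(a: str, b: str) -> tuple:
    """Compare two strings in dictionary order.

    Returns (result, comparisons), where result is -1, 0, or 1 and
    comparisons is the number of character pairs that were examined.
    """
    comparisons = 0
    for ca, cb in zip(a, b):
        comparisons += 1
        if ca != cb:
            # The first differing character pair decides the order.
            return (-1 if ord(ca) < ord(cb) else 1), comparisons
    # All shared positions matched: the shorter string sorts first.
    return (len(a) > len(b)) - (len(a) < len(b)), comparisons
```

Comparing "data" with "date", for instance, takes four character comparisons before the order is decided at the last position.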
Moreover, dictionary sorting is an integral part of the Burrows-Wheeler transform as described by Burrows and Wheeler (M. Burrows and D. J. Wheeler, A Block-sorting Lossless Data Compression Algorithm, Digital Systems Research Center Research Report 124, http://gatekeeper.dec.com/pub/DEC/SRC/research-reports/abstracts/src-rr-124.html). Implementing this transform efficiently requires use of a method of sorting that is close to optimum for dictionary sorting of text. Thus, what is needed is a more optimal method of sorting textual data than the methods derived from methods of sorting numerical data.
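To show where dictionary sorting enters the Burrows-Wheeler transform, here is a deliberately naive sketch. It uses Python's built-in `sorted` on all rotations of the block rather than any optimized method, and the function name `bwt_naive` is an assumption made for illustration.

```python
def bwt_naive(block: str) -> tuple:
    """Naive Burrows-Wheeler transform of a non-empty block.

    Returns (last_column, index), where index is the row of the
    original block among the dictionary-sorted rotations.
    """
    n = len(block)
    # Build every rotation of the block, then sort them in dictionary order.
    rotations = sorted(block[i:] + block[:i] for i in range(n))
    # The transform output is the last character of each sorted rotation.
    last_column = "".join(rotation[-1] for rotation in rotations)
    return last_column, rotations.index(block)
```

The sort dominates the cost here, since each comparison between rotations may inspect many characters, which is why an efficient dictionary sort matters for this transform.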
SUMMARY OF THE INVENTION
The invention involves a method of sorting a text document, the text document composed of a sequence of characters. The method comprises counting each character of the sequence of characters pointed to by a marker. The method further comprises sorting markers for each character into a set of groups, each group corresponding to a distinct value of the characters in the sequence of characters, the groups created based on the count of each distinct value of the characters in the sequence of characters. The method further comprises repeating for each group of the set of groups containing more than one marker, counting each character following the character previously counted for that marker, and sorting the markers within each group into further groups of the set of groups, each further group of the set of groups corresponding to a distinct value of the characters in the sequence of characters, each further group of the set of groups created based on the count of each distinct value of the characters in the sequence of characters, until no group contains more than one marker.
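One way to read the summarized method is as a most-significant-character-first counting sort over markers. The sketch below is an interpretation under stated assumptions, not the patented implementation: markers are taken to be start positions in the text, a marker that runs past the end of the text is assumed to sort before any character, and the names `sort_markers` and `char_key` are hypothetical.

```python
def char_key(text: str, marker: int, depth: int) -> int:
    """Key of the character `depth` places after `marker` (0 = past the end)."""
    pos = marker + depth
    return ord(text[pos]) + 1 if pos < len(text) else 0


def sort_markers(text: str) -> list:
    """Sort markers (start positions) into dictionary order of the text they point to."""
    n = len(text)
    markers = list(range(n))
    # Each pending group is (start, end, depth): a slice of `markers`
    # whose members agree on their first `depth` characters.
    pending = [(0, n, 0)]
    while pending:
        lo, hi, depth = pending.pop()
        if hi - lo <= 1:
            continue  # a group with a single marker is already sorted
        group = markers[lo:hi]
        # Count each distinct character value seen at this depth.
        counts = {}
        for m in group:
            k = char_key(text, m, depth)
            counts[k] = counts.get(k, 0) + 1
        # Derive the start of each sub-group from the counts.
        next_slot = {}
        start = lo
        for k in sorted(counts):
            next_slot[k] = start
            # Sub-groups with more than one marker are refined on the next character.
            pending.append((start, start + counts[k], depth + 1))
            start += counts[k]
        # Distribute the markers into their sub-groups.
        for m in group:
            k = char_key(text, m, depth)
            markers[next_slot[k]] = m
            next_slot[k] += 1
    return markers
```

Because every marker points to text of a different remaining length, at most one marker in a group can run off the end at a given depth, so the refinement stops once every group holds a single marker.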


REFERENCES:
patent: 4295206 (1981-10-01), Caid et al.
patent: 5371673 (1994-12-01), Fan
patent: 5551018 (1996-08-01), Hansen
patent: 5675818 (1997-10-01), Kennedy
patent: 5787426 (1998-07-01), Koshiba et al.
patent: 5937422 (1999-08-01), Nelson et al.
