System and method for providing lossless compression of n-gram l

Data processing: speech signal processing – linguistics – language – Linguistics – Natural language

Patent

Details

704/257, G06F 17/27, G06F 17/28

active

060920386

ABSTRACT:
A system and methods for losslessly compressing n-gram language models for use in real-time decoding, whereby the size of the model is significantly reduced without increasing the decoding time of the recognizer. Lossless compression is achieved using several techniques.

In one aspect, the n-gram records of an N-gram language model are split into (i) a set of common history records, which include subsets of n-tuple words having a common history, and (ii) sets of hypothesis records that are associated with the common history records. The common history records are separated into a first group, in which each record has only one associated hypothesis record, and a second group, in which each record has more than one associated hypothesis record. The records of the first group are stored together with their corresponding hypothesis records in an index portion of a memory block comprising the N-gram language model. The records of the second group are stored in the index together with addresses pointing to the memory locations that hold the corresponding hypothesis records.

Other compression techniques include, for instance: mapping word records of the hypothesis records into word numbers and storing the difference value between subsequent word numbers; segmenting the addresses and storing indexes to the addresses in each segment to multiples of the addresses; storing word records and probability records as fractions of bytes, such that each pair of word-probability records occupies a multiple of bytes, and storing flags indicating the length; and storing the probability records as indexes to sorted count values that are used to compute the probability on the fly.
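Two of the techniques named in the abstract can be illustrated with a minimal Python sketch: the split of the index into single-hypothesis histories (stored inline) and multi-hypothesis histories (stored with an address), and the storage of differences between subsequent word numbers. This is only an illustration under stated assumptions, not the patented implementation. The names build_model and lookup, the "inline" and "addr" tags, and the use of a Python dict and list in place of a packed memory block are all hypothetical.

```python
from collections import defaultdict


def build_model(ngrams):
    """Split n-gram records into an index and a flat hypothesis store.

    ngrams maps (history, word_number) -> value, where history is a tuple
    of word numbers and value stands in for the probability record.
    (Hypothetical layout chosen for illustration only.)
    """
    by_history = defaultdict(list)
    for (history, word), value in ngrams.items():
        by_history[history].append((word, value))

    index = {}       # history -> ("inline", word, value) or ("addr", offset, count)
    hypotheses = []  # flat list of (word_delta, value) pairs

    for history, hyps in by_history.items():
        hyps.sort()  # ascending word numbers, so every delta is non-negative
        if len(hyps) == 1:
            # First group: the single hypothesis lives in the index itself.
            word, value = hyps[0]
            index[history] = ("inline", word, value)
        else:
            # Second group: the index holds an address into the hypothesis store.
            index[history] = ("addr", len(hypotheses), len(hyps))
            previous = 0
            for word, value in hyps:
                hypotheses.append((word - previous, value))  # store the difference
                previous = word
    return index, hypotheses


def lookup(index, hypotheses, history, word):
    """Return the stored value for (history, word), or None if absent."""
    entry = index.get(history)
    if entry is None:
        return None
    if entry[0] == "inline":
        _, stored_word, value = entry
        return value if stored_word == word else None
    _, offset, count = entry
    current = 0
    for delta, value in hypotheses[offset:offset + count]:
        current += delta  # undo the difference coding
        if current == word:
            return value
        if current > word:
            break  # word numbers are ascending, so the word is not present
    return None


if __name__ == "__main__":
    records = {
        ((1, 2), 5): 0.1,
        ((1, 2), 9): 0.2,
        ((1, 2), 12): 0.3,
        ((3, 4), 7): 0.5,
    }
    index, hypotheses = build_model(records)
    # Every original record is recovered exactly, i.e. the encoding is lossless.
    for (history, word), value in records.items():
        assert lookup(index, hypotheses, history, word) == value
```

In this sketch the history (3, 4) has a single hypothesis and therefore never touches the hypothesis store, while the three hypotheses for (1, 2) are stored as small differences rather than full word numbers. The saving in a real model would come from packing those small differences into fewer bits, which this sketch does not attempt.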

REFERENCES:
patent: 4342085 (1982-07-01), Glickman, et al.
patent: 5467425 (1995-11-01), Lau, et al.
patent: 5649060 (1997-07-01), Ellozy, et al.
patent: 5724593 (1998-03-01), Hargrave, III, et al.
patent: 5794249 (1998-08-01), Orsolono, et al.
patent: 5835888 (1998-11-01), Kanevsky, et al.
