Apparatus and method for estimating, from sparse data, the proba

Patent

Details

381/43, G10L 1/00

Patent

active

048315508

ABSTRACT:
Apparatus and method for evaluating the likelihood of an event (such as a word) following a string of known events, based on event sequence counts derived from sparse sample data. Event sequences, or m-grams, each include a key and a subsequent event. For each m-gram, a discounted probability is stored, generated by applying, for example, a modified Turing's estimate to a count-based probability. For each key occurring in the sample data, a normalization constant is stored which preferably (a) adjusts the discounted probabilities for multiple counting, if any, and (b) includes a freed probability mass allocated to m-grams which do not occur in the sample data. To determine the likelihood of a selected event following a string of known events, a "backing off" scheme is employed: successively shorter keys (of known events), each followed by the selected event to form an m-gram, are searched until an m-gram is found that has a stored discounted probability. The normalization constants of the longer searched keys, for which the corresponding m-grams have no stored discounted probability, are combined with the found discounted probability to produce the likelihood of the selected event being next.
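
The backing-off scheme in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names (`build_model`, `prob`) are hypothetical, and a fixed absolute discount (`DISCOUNT`) is swapped in for the modified Turing's estimate to keep the example short. Unigram probabilities are left undiscounted, so an event never seen in the sample data gets probability zero.

```python
from collections import Counter, defaultdict

DISCOUNT = 0.5  # assumption: fixed absolute discount, a stand-in for modified Turing's estimate


def build_model(tokens, max_order=3):
    """Return (disc, alpha): discounted m-gram probabilities and per-key constants."""
    counts = Counter()
    for m in range(1, max_order + 1):
        for i in range(len(tokens) - m + 1):
            counts[tuple(tokens[i:i + m])] += 1

    # Group m-grams (m > 1) by key: key -> {subsequent event: count}.
    successors = defaultdict(dict)
    for gram, c in counts.items():
        if len(gram) > 1:
            successors[gram[:-1]][gram[-1]] = c

    disc = {}
    total = len(tokens)
    for gram, c in counts.items():
        if len(gram) == 1:
            disc[gram] = c / total  # lowest order: plain relative frequency
    for key, nxt in successors.items():
        key_total = sum(nxt.values())
        for w, c in nxt.items():
            disc[key + (w,)] = (c - DISCOUNT) / key_total  # discounted probability

    # Normalization constant per key: freed mass divided by the lower-order
    # mass of events NOT seen after the key (avoids multiple counting).
    alpha = {}
    model = (disc, alpha)
    for key in sorted(successors, key=len):  # shorter keys first
        seen = successors[key]
        kept = sum(disc[key + (w,)] for w in seen)
        lower = sum(prob(model, key[1:], w) for w in seen)
        alpha[key] = (1.0 - kept) / (1.0 - lower) if lower < 1.0 else 0.0
    return model


def prob(model, key, event):
    """Likelihood of `event` following `key`, backing off to shorter keys."""
    disc, alpha = model
    key = tuple(key)
    factor = 1.0
    while True:
        gram = key + (event,)
        if gram in disc:
            return factor * disc[gram]  # found a stored discounted probability
        if not key:
            return 0.0  # event never seen, even on its own
        factor *= alpha.get(key, 1.0)  # unseen key contributes no constant
        key = key[1:]  # drop the oldest known event
```

For example, `model = build_model("a b a b a c".split())` followed by `prob(model, ("b", "a"), "a")` searches the trigram `b a a`, then the bigram `a a`, and finally finds the unigram `a`, multiplying its probability by the constants stored for the keys `("b", "a")` and `("a",)`. With this construction the probabilities of all events following a seen key sum to one.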

REFERENCES:
patent: 3188609 (1965-06-01), Harmon et al.
patent: 3925761 (1975-12-01), Chaires et al.
patent: 3969700 (1976-07-01), Bollinger et al.
patent: 4038503 (1977-07-01), Moshier
patent: 4156868 (1979-05-01), Levinson
patent: 4277644 (1981-07-01), Levinson et al.
patent: 4400788 (1983-08-01), Myers et al.
patent: 4435617 (1984-03-01), Griggs
patent: 4489435 (1984-12-01), Moshier
patent: 4530110 (1985-07-01), Nojiri et al.
patent: 4538234 (1985-08-01), Honda
Interpolation of Estimators Derived from Sparse Data, L. R. Bahl et al., IBM Technical Disclosure Bulletin, vol. 24 No. 4, Sep. 1981.
Variable N-Gram Method for Statistical Language Processing, F. J. Damerau, IBM Technical Disclosure Bulletin, vol. 24 No. 11A, Apr. 1982.
Probability Distribution Estimation from Sparse Data, F. Jelinek et al., IBM Technical Disclosure Bulletin, vol. 28 No. 6, Nov. 1985.
Recursive Self-Smoothing of Linguistic Contingency Tables, A. J. Nadas, IBM Technical Disclosure Bulletin, vol. 27 No. 7B, Dec. 1984.
The Development of an Experimental Discrete Dictation Recognizer, F. Jelinek, Proceedings of the IEEE, vol. 73 No. 11, Nov. 1985, pp. 1616-1624.
