Reexamination Certificate
2007-09-24
2011-11-08
Sked, Matthew (Department: 2626)
Data processing: speech signal processing, linguistics, language
Linguistics
Dictionary building, modification, or prioritization
C704S001000, C707S780000, C707S797000, C707S798000, C707S808000
active
08055498
ABSTRACT:
The present invention automatically builds a contracted dictionary from a given list of multi-word proper names and performs fuzzy searches in that dictionary. The contracted dictionary of proper names comprises two linked trie-based dictionaries: the first stores single-word names, each with an ID number; the second stores multi-word names encoded with those ID numbers. Information related to each multi-word name is stored as a gloss on the terminal node of the multi-word entry in the trie-based dictionary. An approximate lookup for a multi-word name is conducted first for each word of the name, using an approximate matching technique such as phonetic proximity or simple edit distance. This yields N suggestions for each word of the multi-word name under consideration. Multi-word candidates are then assembled in ID notation. Finally, an approximate search for each assembled candidate is performed based on edit distance or n-gram approximate string matching, both of which measure how similar two strings are. The result is a set of multi-word suggestions in ID notation, which is decoded back to the original form using the first trie-based dictionary.
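The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names Trie, build, suggest_words and fuzzy_lookup are hypothetical, words are lowercased as a simplifying assumption, plain Levenshtein distance stands in for both matching stages (the abstract also allows phonetic proximity and n-grams), and the approximate searches scan all trie entries instead of pruning during traversal.

```python
from itertools import product


def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or ID tuples)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


class Trie:
    """Minimal trie over any sequence; the terminal node carries a value."""

    def __init__(self):
        self.root = {}

    def insert(self, key, value):
        node = self.root
        for symbol in key:
            node = node.setdefault(symbol, {})
        node[None] = value  # the None key marks a terminal node

    def items(self):
        """Yield (key_tuple, value) for every stored entry."""
        stack = [((), self.root)]
        while stack:
            prefix, node = stack.pop()
            for symbol, child in node.items():
                if symbol is None:
                    yield prefix, child
                else:
                    stack.append((prefix + (symbol,), child))


def build(names):
    """Build the two linked tries from (multi-word name, gloss) pairs."""
    words, multi, word_ids = Trie(), Trie(), {}
    for name, gloss in names:
        ids = []
        for w in name.lower().split():
            if w not in word_ids:
                word_ids[w] = len(word_ids) + 1
                words.insert(w, word_ids[w])  # first trie: word -> ID
            ids.append(word_ids[w])
        multi.insert(tuple(ids), gloss)  # second trie: ID sequence -> gloss
    return words, multi


def suggest_words(words, query, n):
    """IDs of the n stored words closest to query by edit distance."""
    scored = sorted((edit_distance(query, "".join(k)), wid)
                    for k, wid in words.items())
    return [wid for _, wid in scored[:n]]


def fuzzy_lookup(words, multi, query, n=2, max_dist=0):
    """Return sorted (distance, decoded name, gloss) suggestions."""
    per_word = [suggest_words(words, w, n) for w in query.lower().split()]
    candidates = set(product(*per_word))  # candidates in ID notation
    id_to_word = {wid: "".join(k) for k, wid in words.items()}
    results = []
    for key, gloss in multi.items():
        d = min(edit_distance(c, key) for c in candidates)
        if d <= max_dist:
            decoded = " ".join(id_to_word[i] for i in key)
            results.append((d, decoded, gloss))
    return sorted(results)


names = [("New York", "city, US"), ("New Orleans", "port, US"), ("York", "city, UK")]
words, multi = build(names)
print(fuzzy_lookup(words, multi, "Nwe Yrok"))
```

Raising max_dist in this sketch loosens the second-stage match on ID sequences, so names that share only some words with the query are also returned.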
REFERENCES:
patent: 4672571 (1987-06-01), Bass et al.
patent: 5893102 (1999-04-01), Maimone et al.
patent: 6298321 (2001-10-01), Karlov et al.
patent: 6895377 (2005-05-01), Kroeker et al.
patent: 6912516 (2005-06-01), Ikeda et al.
patent: 7013304 (2006-03-01), Schuetze et al.
patent: 7222067 (2007-05-01), Glushnev et al.
patent: 7848926 (2010-12-01), Goto et al.
patent: 2006/0004744 (2006-01-01), Nevidomski et al.
patent: 2006/0277032 (2006-12-01), Hernandez-Abrego et al.
patent: 2006/0293880 (2006-12-01), Elshishiny et al.
patent: 0179215 (1986-04-01), None
patent: 1197885 (2002-04-01), None
Oerder, M. et al. “Word graphs: an efficient interface between continuous-speech recognition and language understanding,” Acoustics, Speech, and Signal Processing, 1993. ICASSP-93., 1993 IEEE International Conference on, pp. 119-122 vol. 2.
Morimoto, Katsushi, Iriguchi, Hirokazu, Aoe, Jun-Ichi, “A Dictionary Retrieval Algorithm Using Two Trie Structures,” Systems & Computers in Japan; Feb. 1, 1995, vol. 26 Issue 2, p. 85-97, 13p.
PCT, International Searching Authority, International Search Report and the Written Opinion, International Application No. PCT/EP2007/054293, Date of Mailing Jan. 4, 2008.
El-Shishiny Hisham
Volkov Pavel
Bauer Andrea
Hoffman Warnick LLC
International Business Machines Corporation
Sked Matthew
Profile ID: LFUS-PAI-O-4293631