Reexamination Certificate
2007-09-11
Edouard, Patrick N. (Department: 2626)
Data processing: speech signal processing, linguistics, language
Linguistics
Natural language
C704S001000
active
11182477
ABSTRACT:
The present invention is a segmenter used in a natural language processing system. The segmenter segments a textual input string into tokens for further natural language processing. In accordance with one feature of the invention, the segmenter includes a tokenizer engine that proposes segmentations and submits them to a linguistic knowledge component for validation. In accordance with another feature of the invention, the segmentation system includes language specific data that contains a precedence hierarchy for punctuation. If proposed tokens in the input string contain punctuation, they can illustratively be broken into subtokens based on the precedence hierarchy.
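The propose-and-validate loop and the punctuation precedence hierarchy described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the names LEXICON, PUNCTUATION_PRECEDENCE, is_valid, segment_token and segment are hypothetical, the small word set merely stands in for the linguistic knowledge component, and the ordering of punctuation marks is an assumed example of language specific data.

```python
# Hypothetical stand-in for the linguistic knowledge component: a tiny lexicon.
LEXICON = {"the", "well-known", "cat", "sat", "e.g.", "mat"}

# Assumed language-specific precedence hierarchy: marks listed earlier
# are tried as split points before marks listed later.
PUNCTUATION_PRECEDENCE = [",", ";", ":", "/", "-", "."]


def is_valid(token):
    """Validate a proposed token: a lexicon entry or a single known punctuation mark."""
    return token.lower() in LEXICON or token in PUNCTUATION_PRECEDENCE


def segment_token(token):
    """Propose a token; if validation fails, break it into subtokens by precedence."""
    if is_valid(token):
        return [token]
    for mark in PUNCTUATION_PRECEDENCE:
        if mark in token:
            # Split at the first occurrence of the highest-precedence mark present.
            left, _, right = token.partition(mark)
            pieces = []
            for part in (left, mark, right):
                if part:
                    pieces.extend(segment_token(part))
            return pieces
    # No punctuation left to split on: pass the token through unvalidated.
    return [token]


def segment(text):
    """Segment an input string into tokens, starting from whitespace-delimited candidates."""
    tokens = []
    for candidate in text.split():
        tokens.extend(segment_token(candidate))
    return tokens


print(segment("The well-known cat, e.g. sat/mat."))
# ['The', 'well-known', 'cat', ',', 'e.g.', 'sat', '/', 'mat', '.']
```

In this sketch "well-known" and "e.g." survive intact because the stand-in lexicon validates them before any punctuation is considered, while "sat/mat." is broken first at the slash and then at the period, following the assumed precedence order.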
REFERENCES:
patent: 4777617 (1988-10-01), Frisch et al.
patent: 5225981 (1993-07-01), Yokogawa
patent: 5487147 (1996-01-01), Brisson
patent: 5634084 (1997-05-01), Malsheen et al.
patent: 5806021 (1998-09-01), Chen et al.
patent: 5828991 (1998-10-01), Skiena et al.
patent: 5890103 (1999-03-01), Carus
patent: 6185524 (2001-02-01), Carus et al.
patent: 6269189 (2001-07-01), Chanod
patent: 6289304 (2001-09-01), Grefenstette
patent: 6374210 (2002-04-01), Chu
patent: 6401060 (2002-06-01), Critchlow et al.
patent: 6442524 (2002-08-01), Ecker et al.
patent: 6539348 (2003-03-01), Bond et al.
patent: 0 971 294 (1996-07-01), None
patent: WO 00/11576 (2000-03-01), None
Kawtrakul et al. “A Gradual Refinement Model for a Robust Thai Morphological Analyser.” In Proceedings of the 16th International Conference on Computational Linguistics (COLING-96), Copenhagen, Denmark, 1996, pp. 1086-1089.
Habert et al. “Towards Tokenization Evaluation.” In First International Conference on Language Resources and Evaluation (LREC'98). Granada, Spain, ELRA, 1998, pp. 427-431.
Briscoe. “Parsing (with) Punctuation etc.” Research Paper, Rank Xerox Research Centre, Grenoble, 1994.
Jones. “Can punctuation help parsing?” In Proceedings of the International Conference on Computational Linguistics, COLING-94, Kyoto, Japan, 1994.
Sproat. “Multilingual Text Analysis for Text-to-Speech Synthesis.” In ECAI 96, 12th European Conference on Artificial Intelligence, 1996.
Bradlee David G.
Knoll Sonja S.
Pentheroudakis Joseph E.
Edouard Patrick N.
Kelly Joseph R.
Microsoft Corporation
Westman Champlin & Kelly P.A.
Wozniak James S.
Tokenizer for a natural language processing system