Tokenizer for a natural language processing system
Reexamination Certificate
2006-08-15
Young, W. R. (Department: 2655)
Data processing: speech signal processing, linguistics, language
Linguistics
Natural language
active
07092871
ABSTRACT:
The present invention is a segmenter used in a natural language processing system. The segmenter segments a textual input string into tokens for further natural language processing. In accordance with one feature of the invention, the segmenter includes a tokenizer engine that proposes segmentations and submits them to a linguistic knowledge component for validation. In accordance with another feature of the invention, the segmentation system includes language-specific data that contains a precedence hierarchy for punctuation. If proposed tokens in the input string contain punctuation, they can illustratively be broken into subtokens based on the precedence hierarchy.
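The two features described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the lexicon, the punctuation ordering, and the names LEXICON, PUNCT_PRECEDENCE, is_valid, split_token, and segment are all assumptions made for illustration. A set lookup stands in for the linguistic knowledge component, and whitespace splitting stands in for the tokenizer engine's proposals.

```python
# Hypothetical precedence hierarchy: marks listed earlier are split on first.
PUNCT_PRECEDENCE = ["/", "-", ".", "'"]

# Hypothetical stand-in for the linguistic knowledge component.
LEXICON = {"either", "or", "well", "known", "e.g.", "don't"}


def is_valid(token):
    """Ask the (stand-in) knowledge component whether a token is acceptable."""
    return token.lower() in LEXICON


def split_token(token):
    """Validate a proposed token; if rejected, break it into subtokens."""
    if not token:
        return []
    if is_valid(token):
        return [token]
    # Split on the highest-precedence punctuation mark present, then
    # validate each side recursively.
    for mark in PUNCT_PRECEDENCE:
        if mark in token:
            left, _, right = token.partition(mark)
            return split_token(left) + [mark] + split_token(right)
    # No punctuation left to split on: pass the token through unvalidated.
    return [token]


def segment(text):
    """Propose whitespace-delimited tokens and refine each one."""
    tokens = []
    for proposed in text.split():
        tokens.extend(split_token(proposed))
    return tokens


print(segment("either/or well-known e.g."))
```

In this sketch "e.g." is kept whole because the lexicon validates it before any punctuation is considered, while "either/or" and "well-known" are rejected and broken at "/" and "-" respectively. A language-specific data file could supply a different ordering for PUNCT_PRECEDENCE without changing the engine.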
REFERENCES:
patent: 5634084 (1997-05-01), Malsheen et al.
patent: 5806021 (1998-09-01), Chen et al.
patent: 5870700 (1999-02-01), Parra
patent: 5890103 (1999-03-01), Carus
patent: 5963742 (1999-10-01), Williams
patent: 6016467 (2000-01-01), Newsted et al.
patent: 6185524 (2001-02-01), Carus et al.
patent: 6269189 (2001-07-01), Chanod
patent: 6289304 (2001-09-01), Grefenstette
patent: 6523172 (2003-02-01), Martinez-Guerra et al.
patent: 6539348 (2003-03-01), Bond et al.
patent: 0 971 294 (1996-07-01), None
patent: WO 00/11576 (2000-03-01), None
Multilingual Text Analysis for Text-to-Speech Synthesis by Richard Sproat © 1996 ECAI 96. 12th European Conference on Artificial Intelligence.
Bradlee David G.
Knoll Sonja S.
Pentheroudakis Joseph E.
Kelly Joseph R.
Microsoft Corporation
Westman Champlin & Kelly P.A.
Wozniak James S.
Young W. R.