Tokenizer for a natural language processing system

Data processing: speech signal processing – linguistics – language – Linguistics – Natural language

Reexamination Certificate

Details

active

07092871

ABSTRACT:
The present invention is a segmenter used in a natural language processing system. The segmenter segments a textual input string into tokens for further natural language processing. In accordance with one feature of the invention, the segmenter includes a tokenizer engine that proposes segmentations and submits them to a linguistic knowledge component for validation. In accordance with another feature of the invention, the segmentation system includes language-specific data that contains a precedence hierarchy for punctuation. If proposed tokens in the input string contain punctuation, they can illustratively be broken into subtokens based on the precedence hierarchy.
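The two features described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the names `PUNCT_PRECEDENCE`, `split_keep`, `segment_token`, and `segment`, the ordering of the punctuation marks, and the use of a plain callable `is_valid` in place of the linguistic knowledge component are all assumptions made for illustration. The sketch proposes each whitespace-delimited string as a token, asks the validator whether it is acceptable, and otherwise breaks it into subtokens at the highest-precedence punctuation mark it contains.

```python
from typing import Callable, List, Sequence

# Hypothetical precedence hierarchy: marks listed earlier are split on first.
# A real system would load this from language-specific data.
PUNCT_PRECEDENCE: Sequence[str] = ("/", "-", ".")


def split_keep(token: str, mark: str) -> List[str]:
    """Split token at every occurrence of mark, keeping mark as a subtoken."""
    parts: List[str] = []
    for i, piece in enumerate(token.split(mark)):
        if i > 0:
            parts.append(mark)
        if piece:
            parts.append(piece)
    return parts


def segment_token(
    token: str,
    is_valid: Callable[[str], bool],
    precedence: Sequence[str] = PUNCT_PRECEDENCE,
) -> List[str]:
    """Propose token as-is; if the validator rejects it, split by precedence."""
    # is_valid is a stand-in for the linguistic knowledge component.
    if is_valid(token):
        return [token]
    for idx, mark in enumerate(precedence):
        if mark in token:
            result: List[str] = []
            for sub in split_keep(token, mark):
                if sub == mark:
                    result.append(sub)
                else:
                    # Subtokens are re-proposed using lower-precedence marks.
                    result.extend(
                        segment_token(sub, is_valid, precedence[idx + 1:])
                    )
            return result
    # No punctuation left to split on: keep the token unvalidated.
    return [token]


def segment(text: str, is_valid: Callable[[str], bool]) -> List[str]:
    """Segment text into tokens, starting from whitespace-delimited strings."""
    tokens: List[str] = []
    for candidate in text.split():
        tokens.extend(segment_token(candidate, is_valid))
    return tokens


# Toy lexicon standing in for the linguistic knowledge component.
LEXICON = {"and", "or", "well-known", "e.g.", "fact"}


def in_lexicon(token: str) -> bool:
    return token.lower() in LEXICON
```

With the toy lexicon above, `segment("and/or well-known e.g. semi-fact.", in_lexicon)` keeps `well-known` and `e.g.` whole because the validator accepts them, while `and/or` and `semi-fact.` are broken at their punctuation.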

REFERENCES:
patent: 5634084 (1997-05-01), Malsheen et al.
patent: 5806021 (1998-09-01), Chen et al.
patent: 5870700 (1999-02-01), Parra
patent: 5890103 (1999-03-01), Carus
patent: 5963742 (1999-10-01), Williams
patent: 6016467 (2000-01-01), Newsted et al.
patent: 6185524 (2001-02-01), Carus et al.
patent: 6269189 (2001-07-01), Chanod
patent: 6289304 (2001-09-01), Grefenstette
patent: 6523172 (2003-02-01), Martinez-Guerra et al.
patent: 6539348 (2003-03-01), Bond et al.
patent: 0 971 294 (1996-07-01), None
patent: WO 00/11576 (2000-03-01), None
Richard Sproat, "Multilingual Text Analysis for Text-to-Speech Synthesis," ECAI 96: 12th European Conference on Artificial Intelligence, 1996.
