Tokenizer for a natural language processing system

Data processing: speech signal processing – linguistics – language – Linguistics – Natural language

Reexamination Certificate

Details

C704S001000

active

11182477

ABSTRACT:
The present invention is a segmenter used in a natural language processing system. The segmenter segments a textual input string into tokens for further natural language processing. In accordance with one feature of the invention, the segmenter includes a tokenizer engine that proposes segmentations and submits them to a linguistic knowledge component for validation. In accordance with another feature of the invention, the segmentation system includes language-specific data that contains a precedence hierarchy for punctuation. If proposed tokens in the input string contain punctuation, they can illustratively be broken into subtokens based on the precedence hierarchy.
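
The abstract does not spell out the algorithm, so the following Python sketch is only one plausible reading of the two features it names: a tokenizer engine that proposes tokens and asks a linguistic knowledge component to validate them, and a punctuation precedence hierarchy used to break unvalidated tokens into subtokens. The lexicon, the precedence order, and the names `LEXICON`, `PUNCT_PRECEDENCE`, `is_valid`, `split_token`, and `segment` are all assumptions made for illustration and do not come from the patent.

```python
import re

# Hypothetical stand-in for the linguistic knowledge component: a tiny lexicon.
LEXICON = {"it", "is", "well-known", "well", "known", "e.g.", "hello", "world"}

# Hypothetical language-specific data: punctuation marks ordered from highest
# to lowest precedence. A token is split on the highest-ranked mark it contains.
PUNCT_PRECEDENCE = [",", ";", ":", ".", "-", "'"]


def is_valid(token):
    """Assumed validator: accept lexicon entries, digit strings, and single marks."""
    if token.lower() in LEXICON or token.isdigit():
        return True
    return len(token) == 1 and not token.isalnum()


def split_token(token):
    """Propose a token; if it is not validated, split it by punctuation precedence."""
    if is_valid(token):
        return [token]
    for mark in PUNCT_PRECEDENCE:
        if mark in token:
            # Split around the mark and keep the mark itself as a subtoken.
            pattern = "(" + re.escape(mark) + ")"
            parts = [p for p in re.split(pattern, token) if p]
            subtokens = []
            for part in parts:
                subtokens.extend(split_token(part))
            return subtokens
    # No punctuation left to split on: pass the token through unvalidated.
    return [token]


def segment(text):
    """Propose whitespace-delimited tokens, then validate or subdivide each one."""
    tokens = []
    for proposed in text.split():
        tokens.extend(split_token(proposed))
    return tokens


print(segment("It is well-known, e.g."))
print(segment("semi-known"))
```

In this sketch "well-known," is first split on the comma, which ranks above the hyphen, and the remaining "well-known" is then accepted whole because the assumed lexicon validates it. "semi-known" is not validated, so it falls through to the hyphen and is broken into three subtokens.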

REFERENCES:
patent: 4777617 (1988-10-01), Frisch et al.
patent: 5225981 (1993-07-01), Yokogawa
patent: 5487147 (1996-01-01), Brisson
patent: 5634084 (1997-05-01), Malsheen et al.
patent: 5806021 (1998-09-01), Chen et al.
patent: 5828991 (1998-10-01), Skiena et al.
patent: 5890103 (1999-03-01), Carus
patent: 6185524 (2001-02-01), Carus et al.
patent: 6269189 (2001-07-01), Chanod
patent: 6289304 (2001-09-01), Grefenstette
patent: 6374210 (2002-04-01), Chu
patent: 6401060 (2002-06-01), Critchlow et al.
patent: 6442524 (2002-08-01), Ecker et al.
patent: 6539348 (2003-03-01), Bond et al.
patent: 0 971 294 (1996-07-01), None
patent: WO 00/11576 (2000-03-01), None
Kawtrakul et al. “A Gradual Refinement Model for a Robust Thai Morphological Analyser.” In Proceedings of the 16th International Conference on Computational Linguistics (COLING-96), Copenhagen, Denmark, 1996, pp. 1086-1089.
Habert et al. “Towards Tokenization Evaluation.” In First International Conference on Language Resources and Evaluation (LREC'98), Granada, Spain, ELRA, 1998, pp. 427-431.
Briscoe. “Parsing (with) Punctuation etc”, Research Paper, Rank Xerox Research Centre, Grenoble, 1994.
Jones. “Can punctuation help parsing?” In Proceedings of the International Conference on Computational Linguistics, COLING-94, Kyoto, Japan, 1994.
Sproat. “Multilingual Text Analysis for Text-to-Speech Synthesis.” In ECAI 96, 12th European Conference on Artificial Intelligence, 1996.

Profile ID: LFUS-PAI-O-3744048
