Method and apparatus for breaking words in a stream of text

Data processing: speech signal processing – linguistics – language – Linguistics – Natural language

Patent


Details

G06F 15/38

Patent

active

06035268

ABSTRACT:
A word breaker uses a lexicon module and a processing module to identify word breaks in a stream of Asian-language text (e.g., Japanese, Chinese, or Korean). The lexicon module is a dictionary or database containing words native to the language of the input text. The processing module includes a plurality of analysis modules that operate on the input text. In particular, the processing module can include modules that analyze the input text using heuristic rules and statistical analysis to identify a first set of word breaks, thereby reducing the size of the segments whose word breaks remain undefined. The processing module also includes a database analysis module that identifies the remaining undefined word breaks in the smaller segments left by the heuristic or statistical analysis.
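
The two-stage flow described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the function names, the use of punctuation as the only heuristic rule, the greedy longest-match lookup standing in for the database analysis module, and the toy lexicon. The statistical analysis stage is omitted.

```python
# Hypothetical sketch of a two-stage word breaker; not the patented method.

# Assumed heuristic: punctuation and spaces always mark a word break.
PUNCTUATION = set("。、！？,.!? ")


def heuristic_split(text):
    """Stage 1: cut the stream into smaller segments at punctuation."""
    segments, current = [], []
    for ch in text:
        if ch in PUNCTUATION:
            if current:
                segments.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        segments.append("".join(current))
    return segments


def lexicon_split(segment, lexicon):
    """Stage 2: resolve remaining breaks by greedy longest match.

    Characters that start no lexicon word are emitted as single-character
    words (an assumed fallback).
    """
    max_len = max((len(word) for word in lexicon), default=1)
    words, i = [], 0
    while i < len(segment):
        # Try the longest candidate first, shrinking down to one character.
        for length in range(min(max_len, len(segment) - i), 0, -1):
            candidate = segment[i:i + length]
            if length == 1 or candidate in lexicon:
                words.append(candidate)
                i += length
                break
    return words


def break_words(text, lexicon):
    """Run stage 1, then stage 2 on each resulting segment."""
    return [word
            for segment in heuristic_split(text)
            for word in lexicon_split(segment, lexicon)]


# Toy lexicon, invented for this example.
LEXICON = {"私", "は", "東京", "に", "住んで", "います"}
print(break_words("私は東京に住んでいます。", LEXICON))
```

Splitting at punctuation first means the lexicon lookup only ever scans short segments, which mirrors the abstract's point that the early analysis reduces the size of the segments with undefined word breaks.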

REFERENCES:
patent: 5029084 (1991-07-01), Morohasi et al.
patent: 5268840 (1993-12-01), Chang et al.
patent: 5299124 (1994-03-01), Fukumochi et al.
patent: 5548507 (1996-08-01), Martino et al.
patent: 5598518 (1997-01-01), Saito
Teller, V. et al., "A Probabilistic Algorithm for Segmenting Non-Kanji Japanese Strings," Proceedings of the 12th National Conference on Artificial Intelligence, vol. 1, 742-7 (Jul. 31-Aug. 4, 1994).
