Method and apparatus providing capitalization recovery for text

Data processing: presentation processing of document – operator i – Presentation processing of document – Layout

Reexamination Certificate


Details

C715S252000, C704S251000

Reexamination Certificate

active

06922809

ABSTRACT:
A method for capitalizing text in a document includes processing a reference corpus to construct a plurality of dictionaries of capitalized terms, where the plurality of dictionaries includes a singleton dictionary and a phrase dictionary. Each record in the singleton dictionary contains a word in lowercase and a range of phrase lengths m:n for capitalized phrases that the word begins, where m is a minimum phrase length and n is a maximum phrase length. Each record in the phrase dictionary includes a multi-word phrase in lowercase. The method adds proper capitalization to an input monocase document by capitalizing words found in mandatory capitalization positions, and by looking up each word in the singleton dictionary and, if the word is found there, testing the corresponding phrase length range. If the phrase length range indicates that the word does not start a multi-word phrase, the method capitalizes the word. If the phrase length range indicates that the word does start a multi-word phrase, the method tests the word and an indicated plurality of next words as a candidate phrase to determine whether the candidate phrase is found in the phrase dictionary and, if it is, capitalizes the words of the multi-word phrase. If the candidate phrase is not found in the phrase dictionary, the method changes the number of words in the candidate phrase (e.g., decrements it by one) to form a revised candidate phrase, and determines whether the revised candidate phrase is found in the phrase dictionary.
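The lookup procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the punctuation set, the treatment of sentence-ending punctuation as the only mandatory capitalization position, and the choice to try the longest candidate phrase first are all assumptions made for illustration. The abstract also does not say how the reference corpus is processed, so the sketch starts from a hypothetical list of capitalized terms that has already been extracted.

```python
PUNCT = ".,;:!?\"'()"  # assumed punctuation stripped before dictionary lookup


def build_dictionaries(capitalized_terms):
    """Build the singleton and phrase dictionaries from capitalized terms.

    Assumption: the terms were already extracted from a reference corpus.
    singleton maps a lowercase first word to its phrase length range (m, n);
    phrases is the set of lowercase multi-word phrases.
    """
    singleton, phrases = {}, set()
    for term in capitalized_terms:
        words = term.lower().split()
        if not words:
            continue
        length = len(words)
        if length > 1:
            phrases.add(" ".join(words))
        m, n = singleton.get(words[0], (length, length))
        singleton[words[0]] = (min(m, length), max(n, length))
    return singleton, phrases


def cap(token):
    """Uppercase the first character of a token."""
    return token[:1].upper() + token[1:]


def recover_capitalization(text, singleton, phrases):
    """Add capitalization to monocase text using the two dictionaries."""
    tokens = text.lower().split()
    keys = [t.strip(PUNCT) for t in tokens]  # lookup keys without punctuation
    out = list(tokens)
    new_sentence = True  # assumed mandatory position: start of a sentence
    i = 0
    while i < len(tokens):
        span = 1
        if new_sentence:
            out[i] = cap(out[i])
        if keys[i] in singleton:
            m, n = singleton[keys[i]]
            matched = 0
            # Try candidate phrases from the longest allowed length down to
            # the shortest multi-word length, decrementing by one each time.
            longest = min(n, len(tokens) - i)
            for length in range(longest, max(m, 2) - 1, -1):
                if " ".join(keys[i:i + length]) in phrases:
                    matched = length
                    break
            if matched:
                for j in range(i, i + matched):
                    out[j] = cap(out[j])
                span = matched
            elif m == 1:
                # The word is also a capitalized term on its own.
                out[i] = cap(out[i])
        new_sentence = tokens[i + span - 1].endswith((".", "!", "?"))
        i += span
    return " ".join(out)


singleton, phrases = build_dictionaries(["London", "New York City", "New Jersey"])
print(recover_capitalization(
    "she flew from london to new york city. the new plan worked.",
    singleton, phrases))
```

In this sketch "new" is stored with the range 2:3, so the second occurrence of "new" stays lowercase because neither "new plan worked" nor "new plan" is in the phrase dictionary. The sketch capitalizes every word of a matched phrase, as the abstract states; it does not handle phrases with lowercase function words, acronyms, or candidates that span a sentence boundary.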

REFERENCES:
patent: 5761689 (1998-06-01), Rayson et al.
patent: 6012088 (2000-01-01), Li et al.
David C. Gibbon, ‘Automated Authoring of Hypermedia Documents of Video Program’, ACM Multimedia 1995, Electronic Proceedings, pp. 1-12.
“A Parser for Real-Time Speech Synthesis of Conversational Texts” Bachenko et al., AT&T Bell Laboratories, date unknown, pp. 25-32.
“Automated Authoring of Hypermedia Documents of Video Programs” Shahraray et al., ACM Multimedia 95—Electronic Proceedings, Nov. 5-9 1995, 12 pages.
