Reexamination Certificate
2005-07-26
Hong, Stephen (Department: 2178)
Data processing: presentation processing of document, operator i
Presentation processing of document
Layout
C715S252000, C704S251000
active
06922809
ABSTRACT:
A method for capitalizing text in a document includes processing a reference corpus to construct a plurality of dictionaries of capitalized terms, where the plurality of dictionaries includes a singleton dictionary and a phrase dictionary. Each record in the singleton dictionary contains a word in lowercase and a range of phrase lengths m:n for capitalized phrases that the word begins, where m is a minimum phrase length and n is a maximum phrase length. Each record in the phrase dictionary includes a multi-word phrase in lowercase. The method adds proper capitalization to an input monocase document by capitalizing words found in mandatory capitalization positions, and by looking up each word in the singleton dictionary and, if the word is found there, testing the corresponding phrase length range. If the phrase length range indicates that the word does not start a multi-word phrase, the method capitalizes the word. If the phrase length range indicates that the word does start a multi-word phrase, the method tests the word and an indicated plurality of next words as a candidate phrase to determine whether the candidate phrase is found in the phrase dictionary and, if it is, capitalizes the words of the multi-word phrase. If the candidate phrase is not found in the phrase dictionary, the method changes the number of words in the candidate phrase (e.g., decrements it by one) to form a revised candidate phrase, and determines whether the revised candidate phrase is found in the phrase dictionary.
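The lookup procedure described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation. The function names `build_dictionaries` and `recover_capitalization`, the whitespace tokenization, the rule that sentence starts are the only mandatory capitalization positions, and the choice to capitalize a phrase by uppercasing the first letter of each of its words are all assumptions made for illustration. The sketch also assumes that a range with m equal to 1 means the word is capitalized even when it stands alone, and it tries the longest candidate phrase first.

```python
import string
from typing import Dict, List, Set, Tuple

Range = Tuple[int, int]  # (m, n): minimum and maximum phrase length


def build_dictionaries(terms: List[str]) -> Tuple[Dict[str, Range], Set[str]]:
    """Build a singleton dictionary and a phrase dictionary.

    `terms` stands in for the capitalized terms extracted from a reference
    corpus; the extraction step itself is not shown here.
    """
    singletons: Dict[str, Range] = {}
    phrases: Set[str] = set()
    for term in terms:
        words = term.lower().split()
        if not words:
            continue
        length = len(words)
        if length > 1:
            phrases.add(" ".join(words))
        # Widen the m:n range recorded for the word that begins this term.
        lo, hi = singletons.get(words[0], (length, length))
        singletons[words[0]] = (min(lo, length), max(hi, length))
    return singletons, phrases


def _cap(token: str) -> str:
    """Uppercase the first character of a token (illustrative rule)."""
    return token[:1].upper() + token[1:]


def recover_capitalization(
    text: str, singletons: Dict[str, Range], phrases: Set[str]
) -> str:
    """Add capitalization to monocase text using the two dictionaries."""
    tokens = text.lower().split()
    # Lookup keys ignore punctuation attached to a token, e.g. "city." -> "city".
    keys = [t.strip(string.punctuation) for t in tokens]
    out = list(tokens)
    i = 0
    sentence_start = True  # assumed mandatory capitalization position
    while i < len(tokens):
        consumed = 1
        if keys[i] in singletons:
            m, n = singletons[keys[i]]
            longest = min(n, len(tokens) - i)
            # Try the longest candidate phrase, then decrement by one word.
            for length in range(longest, max(m, 2) - 1, -1):
                if " ".join(keys[i:i + length]) in phrases:
                    consumed = length
                    break
            if consumed > 1 or m == 1:
                for j in range(i, i + consumed):
                    out[j] = _cap(tokens[j])
        if sentence_start:
            out[i] = _cap(tokens[i])
        sentence_start = tokens[i + consumed - 1].endswith((".", "!", "?"))
        i += consumed
    return " ".join(out)


singletons, phrases = build_dictionaries(
    ["London", "New York", "New York City", "New Jersey"]
)
print(recover_capitalization(
    "she flew from london to new york city. the new car stayed in new jersey.",
    singletons, phrases,
))
# She flew from London to New York City. The new car stayed in New Jersey.
```

In this example "new" carries the range 2:3, so the second occurrence ("new car") is left in lowercase after the three-word and two-word candidates both fail the phrase lookup. The sketch does not stop a candidate phrase from spanning a sentence boundary and does not preserve irregular capitalization such as acronyms; a fuller version would store the cased form alongside each record.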
REFERENCES:
patent: 5761689 (1998-06-01), Rayson et al.
patent: 6012088 (2000-01-01), Li et al.
David C. Gibbon, ‘Automated Authoring of Hypermedia Documents of Video Program’, ACM Multimedia 1995, Electronic Proceedings, pp. 1-12.
“A Parser for Real-Time Speech Synthesis of Conversational Texts” Bachenko et al., AT&T Bell Laboratories, date unknown, pp. 25-32.
“Automated Authoring of Hypermedia Documents of Video Programs” Shahraray et al., ACM Multimedia 95—Electronic Proceedings, Nov. 5-9 1995, 12 pages.
Brown Eric William
Coden Anni Rosa
Hong Stephen
International Business Machines Corporation
Karra Satheesh K.
Ohland, Greeley, Ruggiero & Perle, L.L.P.
Method and apparatus providing capitalization recovery for text
Profile ID: LFUS-PAI-O-3374640