Text correction for PDF converters

Data processing: presentation processing of document – operator i – Presentation processing of document – Edit, composition, or storage control

Reexamination Certificate

Details

C715S245000

active

07827484

ABSTRACT:
To correct at least one extraneous or missing space in a document, weights are assigned to tokens contained in a dictionary. Each token is defined by an ordered sequence of non-space symbols. The weights are assigned based on at least one of a token length and frequency of occurrence of the token in the document. Corrected text is generated from text of the document by applying an ordered sequence of symbol-level transformations selected from a group of symbol-level transformations including at least (i) deleting a space, (ii) inserting a space, and (iii) copying a symbol. The ordered sequence of symbol-level transformations is optimized respective to an objective function dependent upon the weights of tokens of the corrected text.
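The optimization described in the abstract can be illustrated with a small dynamic program. The following is only a sketch under stated assumptions, not the patented method: the weight formula (token length scaled by a log frequency term), the zero weight for tokens outside the dictionary, the per-edit penalty `edit_cost`, and the names `assign_weights` and `correct_spaces` are all hypothetical choices made for illustration. Copying a symbol is free, deleting or inserting a space costs `edit_cost`, and the objective is the sum of the weights of the tokens in the corrected text minus the edit penalties.

```python
import math
from collections import Counter


def assign_weights(dictionary, document_text):
    """Weight each dictionary token by its length and its frequency in the document.

    The formula is an assumption for illustration; the abstract only says the
    weights depend on token length and/or frequency of occurrence.
    """
    counts = Counter(document_text.split())
    return {tok: len(tok) * (1.0 + math.log1p(counts[tok])) for tok in dictionary}


def correct_spaces(text, weights, edit_cost=0.5):
    """Re-segment `text` by deleting/inserting spaces to maximize total token weight.

    Non-space symbols are always copied in order, so the search is over where
    the spaces go. `edit_cost` (hypothetical) penalizes each changed space.
    """
    # Separate the non-space symbols from the positions where the input had a space.
    symbols, space_at = [], set()
    for ch in text.strip():
        if ch.isspace():
            space_at.add(len(symbols))  # a space sits just before symbol index len(symbols)
        else:
            symbols.append(ch)
    n = len(symbols)

    # best[j]: best objective value for the first j symbols; back[j]: start of last token.
    best = [float("-inf")] * (n + 1)
    back = [0] * (n + 1)
    best[0] = 0.0
    for j in range(1, n + 1):
        for i in range(j):
            token = "".join(symbols[i:j])
            # Original spaces strictly inside the token must be deleted.
            deleted = sum(1 for k in range(i + 1, j) if k in space_at)
            # A token boundary where the input had no space requires an insertion.
            inserted = 1 if (j < n and j not in space_at) else 0
            score = best[i] + weights.get(token, 0.0) - edit_cost * (deleted + inserted)
            if score > best[j]:
                best[j], back[j] = score, i

    # Walk the back-pointers to recover the chosen tokens.
    tokens, j = [], n
    while j > 0:
        i = back[j]
        tokens.append("".join(symbols[i:j]))
        j = i
    return " ".join(reversed(tokens))


noisy = "the qu ick brownfox jumps"
weights = assign_weights({"the", "quick", "brown", "fox", "jumps"}, noisy)
print(correct_spaces(noisy, weights))
```

With this toy dictionary, the call removes the extraneous space in "qu ick" and inserts the missing space in "brownfox". Tokens that are not in the dictionary score zero, so they are left as they appear in the input unless a higher-weight segmentation justifies the edit cost. The inner loops make this sketch cubic in the text length; a practical version would bound the token length or use a trie.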

REFERENCES:
patent: 5572423 (1996-11-01), Church
patent: 5715469 (1998-02-01), Arning
patent: 5933525 (1999-08-01), Makhoul et al.
patent: 6167369 (2000-12-01), Schulze
patent: 6618697 (2003-09-01), Kantrowitz et al.
patent: 2003/0216913 (2003-11-01), Keely et al.
patent: 2005/0007299 (2005-01-01), Gormish
patent: 2005/0034068 (2005-02-01), Jaeger
patent: 2007/0016862 (2007-01-01), Kuzmin
Taghva, Kazem et al.; An expert system for automatically correcting OCR output; 1994; Information Science Research Institute.
Kukich, Karen; Techniques for Automatically Correcting Words in Text; Dec. 1992; ACM Computing Surveys; vol. 24, No. 4.
DCLab, “Converting from PDF to XML & MS Word: Avoiding the Pitfalls,” at http://www.dclab.com/converting_from_pdf.asp, p. 1, Oct. 3, 2003 and p. 2, Nov. 3, 2003.
Iceni Technology, “Gemini,” 4 pages, at http://www.iceni.com/content/Gemini/, last visited Jun. 30, 2005.
Forney, “The Viterbi Algorithm,” Proceedings of the IEEE, vol. 61, No. 3, Mar. 1973.
Nevill-Manning et al., “Extracting Text from Postscript,” Software-Practice and Experience, vol. 28, No. 5, pp. 481-491, Apr. 1998.
ScanSoft, OmniPage, at http://www.scansoft.com/omnipage/capturesdk/, 2 pages, last visited Jul. 14, 2005.
Cambridgedocs XML Conversion and Publishing Technologies, at http://www.cambridgedoc.com/, last visited Jul. 14, 2005.
Kempe et al., “WFSC—A New Weighted Finite State Compiler,” Lecture Notes in Computer Science, vol. 2759/2003, pp. 108-119, Aug. 2003.
