Excavating
Patent
1992-03-18
1995-10-17
Black, Thomas G.
371 671, 371 681, 382310, 3642255, 3642376, 36423783, 364DIG1, G06F 1118, G06F 1716
active
054597390
ABSTRACT:
Three OCR systems are employed for text conversion, and the results generated by each of the three are merged using an edit distance algorithm to estimate a correct common text ancestor. To make the process computationally feasible for large strings, such as pages of documentation with 3,000 characters, the method is executed in two stages. In the first stage, each page is treated as a string of lines, and where differences exist, the edit distance between the lines on a page is used to find the optimal alignment of the lines. In the event that a choice must be made among three non-null lines, the procedure is then invoked on the three lines, using the edit distance between the characters on a line to find the optimal alignment. The number of computations required by the procedure is further reduced by corner-cutting, which heuristically determines an upper bound on the edit distance and limits calculations to those that do not exceed the upper bound.
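The corner-cutting idea in the abstract can be illustrated with a small sketch. The abstract does not say how the patent derives its upper bound, so the code below simply takes the bound as a parameter and applies a band cutoff in the style of Ukkonen (cited in the references). The function name `bounded_edit_distance`, the unit edit costs, and the convention of returning `None` when the bound is exceeded are assumptions made for illustration, not details from the patent.

```python
def bounded_edit_distance(a, b, bound):
    """Unit-cost edit distance between sequences a and b.

    Returns the distance if it is <= bound, otherwise None.
    Only cells within `bound` of the main diagonal are computed,
    because any alignment that leaves this band costs more than `bound`.
    Works on strings (characters) or lists (e.g. lines of a page).
    """
    n, m = len(a), len(b)
    if abs(n - m) > bound:
        return None  # the length difference alone already exceeds the bound
    inf = bound + 1  # sentinel meaning "more than the bound"
    prev = [j if j <= bound else inf for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [inf] * (m + 1)
        lo = max(0, i - bound)
        hi = min(m, i + bound)
        if lo == 0:
            cur[0] = i  # deleting the first i items of a
        for j in range(max(1, lo), hi + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            cur[j] = min(
                prev[j - 1] + cost,  # match or substitution
                prev[j] + 1,         # deletion from a
                cur[j - 1] + 1,      # insertion into a
                inf,                 # cap so out-of-band values stay at the sentinel
            )
        prev = cur
    return prev[m] if prev[m] <= bound else None


print(bounded_edit_distance("kitten", "sitting", 3))  # 3
print(bounded_edit_distance("kitten", "sitting", 2))  # None
```

Because the function only compares elements for equality, the same routine could serve both stages described above: called on lists of lines for the page-level alignment, and on strings of characters for the line-level alignment. Recovering the alignment itself, and merging three outputs rather than two, would require additional bookkeeping not shown here.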
REFERENCES:
patent: 3431557 (1964-03-01), Thomas et al.
patent: 3988715 (1976-10-01), Mullan et al.
patent: 4958379 (1990-09-01), Yamaguchi et al.
patent: 5181162 (1993-01-01), Smith et al.
patent: 5257323 (1993-10-01), Melen et al.
patent: 5265174 (1993-11-01), Nakatsuka
Time Warps, String Edits, and Macromolecules: The Theory and Practice Of Sequence Comparison Kruskal & Sankoff ch. p, pp. 1-44, Addison-Wesley Publishing Co., Reading, Mass 1983.
Speeding Up Dynamic Programming Algorithms for Finding Optimal Lattice Paths by J. L. Spouge, SIAM J. Appl. Math, vol. 49 pp. 1552-1566, Oct. 1989.
The String-to-String Correction Problem, R. W. Wagner, N. J. Fischer, Journal of the ACM, vol. 21, Jan. 1974, pp. 168-173.
Algorithms for Approximate String Matching by E. Ukkonen, Information and Control 64, 100-118 (1985).
Minimum Detour Methods for String or Sequence Comparison by Hadlock, Florida Atlantic Univ., Boca Raton, Fla., Congressus Numerantium 61 (1988), pp. 263-274.
A Linear Space Algorithm for Computing Maximal Common Subsequences by Hirschberg, Princeton University, Communications of the ACM, Jun., 1975, vol. 18, No. 6, pp. 341-343.
Fast Optimal Alignment by Spouge, CABIOS, vol. 7, No. 1, 1991 pp. 1-7.
On Approximate String Matching by Ukkonen, Lecture Notes in Computer Science, pp. 486-495, Proceedings of the 1983 International FCT-Conference, Borgholm, Sweden, Aug. 21-27, 1983.
Stroustrup, The C++ Programming Language, 1991 pp. 537-539.
Handley John C.
Hickey Thomas B.
Black Thomas G.
Choulks Jack
OCLC Online Computer Library Center Incorporated
LandOfFree
Merging three optical character recognition outputs for improved
Profile ID: LFUS-PAI-O-603644