Method and apparatus for determination of verified data

Image analysis – Editing – error checking – or correction – Including operator interaction

Reexamination Certificate

active

06295387

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a method and apparatus for determining the veracity of data which previously required manual input. More particularly, the present invention relates to a method that seeks to determine the veracity of data by comparing data read by an OCR (Optical Character Reader) with data keyed into a system by an operator.
2. Related Art
Traditionally, optical character recognition has been used to read and process large amounts of data, such as that collected in a census. For each character image read and processed, the OCR program generates a “classification”, which is its guess as to what the processed character is, and a “confidence”, which is the OCR's evaluation of how likely it is that the data has been correctly read. It has been normal practice to retype low-confidence data. Such re-keying of data is performed by workers at manual keying workstations, where the image is redisplayed for the operator, who presses the appropriate character key. Typically, low-confidence data has been discarded, but an examination of such low-confidence raw OCR data has shown that the OCR often gets the classification correct, but at a low confidence level.
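The routing described above can be sketched in Python as follows. This is only an illustrative sketch, not the patented implementation: the names OcrResult and route and the 0.90 cutoff are hypothetical, since the source gives no threshold value. The sketch keeps the low-confidence classification rather than discarding it, because the text notes that such classifications are often correct.

```python
from dataclasses import dataclass

# Hypothetical cutoff; the source does not specify a threshold value.
CONFIDENCE_THRESHOLD = 0.90


@dataclass(frozen=True)
class OcrResult:
    field_id: str
    classification: str  # the OCR's guess at the character
    confidence: float    # the OCR's own estimate, in [0, 1], that the guess is right


def route(results, threshold=CONFIDENCE_THRESHOLD):
    """Split OCR results into accepted ones and ones sent for manual keying.

    Low-confidence results keep their classification so that it can later
    be compared with what the operator keys in.
    """
    accepted, needs_keying = [], []
    for result in results:
        if result.confidence >= threshold:
            accepted.append(result)
        else:
            needs_keying.append(result)
    return accepted, needs_keying


results = [
    OcrResult("f1", "7", 0.99),
    OcrResult("f2", "1", 0.42),
    OcrResult("f3", "B", 0.90),
]
accepted, needs_keying = route(results)
```

Here f2 falls below the assumed cutoff and would be redisplayed at a keying workstation, while its OCR classification "1" stays available for comparison.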
The accepted industry method to measure the accuracy of the OCR software is to process known test data through a system or to use a set of “trusted keyers”, i.e., people proven to be reliable and accurate, although not 100% accurate, in the entry of data. This typically requires that the best keyboard operators be assigned to re-inputting data rather than performing “real” work. Thus, it is desirable to maximize the speed of processing the incoming data and not waste the time of the best keyboard operators.
U.S. Pat. No. 5,282,267 describes a basic OCR system having an operator correction system. A dictionary is made available to the operator to look up correct data, while errors are introduced to provide incorrect data for measuring operator efficiency and giving feedback. The described method of quality assurance is different from the present invention. The present invention seeks to overcome the time and cost inefficiencies of the previously known art.
SUMMARY OF THE INVENTION
The method of the present invention for determining the veracity of data includes scanning a document containing characters and images as data into a memory, generating predetermined accuracy statistics for use in determining the accuracy of the data, and performing automated recognition of the data to generate data classifications and confidences. A first operator manually inputs classification data for data having a low confidence, and data is randomly selected to determine whether the manually input classification data is the same as the classification data generated by automated recognition. If the two are the same, the manually input data is found to be accurate and thus passes a quality assurance test. A similar type of comparison is performed to test the veracity of the scanned data. The method re-uses the low-confidence OCR classification data along with the manually keyed data of questionable accuracy to determine whether both data inputs are correct. When the OCR and manually keyed data disagree, the selected field is sent to a second keyer, who inputs the selected data. The results of the two keyers and the OCR data are then compared. Two of these three must agree, and the third (whether it is the OCR data or the manually keyed data) will be marked as incorrect.
The apparatus used to perform the above method includes a scanner unit to scan the data into the apparatus, an automated recognition unit to generate automated recognition data from the data scanned into the apparatus, a memory unit to store the scanned data and manipulations thereof, and a mathematical processor unit to generate accuracy statistics. There will be at least one manual input station to manually input data to the memory unit, and at least one comparator to compare the automated recognition data with the manually input data.
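The comparison between the OCR data and the two keyers can be sketched as a two-of-three vote. The Python below is a minimal illustration under stated assumptions, not the patented implementation: the function name adjudicate and its return shape are hypothetical, and the handling of the case where all three values differ is an assumption, since the source says only that two of the three must agree.

```python
from collections import Counter
from typing import List, Optional, Tuple


def adjudicate(
    ocr: str, keyer1: str, keyer2: Optional[str] = None
) -> Tuple[Optional[str], List[str]]:
    """Return (accepted value, sources marked incorrect).

    With only the OCR and first-keyer values, agreement passes the field.
    A disagreement returns (None, []) to signal that the field should go
    to a second keyer. With all three values, the majority value wins and
    the dissenting source is marked incorrect.
    """
    if keyer2 is None:
        if ocr == keyer1:
            return ocr, []
        return None, []  # disagreement: send the field to a second keyer

    votes = {"ocr": ocr, "keyer1": keyer1, "keyer2": keyer2}
    value, count = Counter(votes.values()).most_common(1)[0]
    if count < 2:
        # Assumption: the source does not say what happens when all three
        # values differ, so this sketch leaves the field unresolved.
        return None, []
    incorrect = [name for name, v in votes.items() if v != value]
    return value, incorrect


first_pass = adjudicate("7", "1")        # OCR and first keyer disagree
resolved = adjudicate("7", "1", "1")     # second keyer sides with the first
```

In this example the second keyer agrees with the first, so "1" is accepted and the OCR classification is the one marked as incorrect. Per-source counts of such marks could feed the accuracy statistics mentioned above.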


REFERENCES:
patent: 4032887 (1977-06-01), Roberts
patent: 5257328 (1993-10-01), Shimizu
patent: 5271067 (1993-12-01), Abe et al.
patent: 5282267 (1994-01-01), Woo, Jr. et al.
patent: 5455875 (1995-10-01), Chevion et al.
patent: 5519786 (1996-05-01), Courtney et al.
patent: 5696854 (1997-12-01), Shepard
patent: 5697504 (1997-12-01), Hiramatsu et al.
patent: 6125362 (2000-09-01), Elworthy
patent: 6151423 (2000-11-01), Melen
