Method of identifying script of line of text

Image analysis – Pattern recognition – Context analysis or word recognition

Reexamination Certificate

Details

C382S296000, C382S298000, C704S008000

active

07020338

ABSTRACT:
A method of identifying the script of a line of text. First, a weight is assigned to each n-gram in a group of documents of known scripts, where each n-gram is a sequence of numbers identifying the k-means cluster centroids of a known script that the character segments in those documents most closely match. A line of text made up of pixels is then identified and cropped so that only a percentage of the pixels remains. The cropped line is rescaled vertically and horizontally into gray-scale pixels. Each vertical column of gray-scale pixels is replaced with the sequence number of the k-means cluster centroid of a known script that it most closely matches. The n-grams of the number sequence that represents the line of text are scored against the n-gram weights of the documents of known scripts. The highest score of the line of text is identified by comparing its scores against the documents of known scripts, and the script of the line of text is determined to be the script of the document against which it scores highest.
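The quantize-and-score steps described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation. It assumes the line has already been cropped and rescaled into fixed-height gray-scale columns, that each script has its own precomputed k-means centroids, that "closest match" means smallest squared Euclidean distance, and that an n-gram weight is a smoothed log relative frequency. The abstract specifies none of these details, and all function names and toy values are hypothetical.

```python
from collections import Counter
from math import log


def nearest_centroid(column, centroids):
    """Index of the centroid closest to one pixel column (squared Euclidean, assumed)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((x - y) ** 2 for x, y in zip(column, centroids[i])),
    )


def quantize(columns, centroids):
    """Replace every gray-scale column with the number of its nearest centroid."""
    return [nearest_centroid(col, centroids) for col in columns]


def ngrams(codes, n):
    """All contiguous length-n subsequences of a code sequence."""
    return [tuple(codes[i:i + n]) for i in range(len(codes) - n + 1)]


def train_weights(code_sequences, n):
    """Assumed weighting: add-one smoothed log relative frequency of each n-gram."""
    counts = Counter(g for codes in code_sequences for g in ngrams(codes, n))
    denom = sum(counts.values()) + len(counts) + 1
    weights = {g: log((c + 1) / denom) for g, c in counts.items()}
    floor = log(1 / denom)  # weight used for n-grams never seen in training
    return weights, floor


def score(codes, weights, floor, n):
    """Mean n-gram weight of a quantized line under one script model."""
    grams = ngrams(codes, n)
    if not grams:
        return float("-inf")
    return sum(weights.get(g, floor) for g in grams) / len(grams)


def identify_script(columns, models, n=2):
    """Score the line against every script model and return the best script."""
    scores = {}
    for script, (centroids, weights, floor) in models.items():
        scores[script] = score(quantize(columns, centroids), weights, floor, n)
    return max(scores, key=scores.get), scores


# Toy data: two made-up "scripts" with 2-pixel-high columns (hypothetical values).
n = 2
models = {
    "A": ([(0, 0), (255, 255)], *train_weights([[0, 1, 0, 1, 0, 1]], n)),
    "B": ([(0, 255), (255, 0)], *train_weights([[0, 0, 1, 1, 0, 0, 1, 1]], n)),
}
line = [(10, 5), (250, 240), (10, 5), (250, 240)]
best, scores = identify_script(line, models, n)
print(best, scores)
```

Averaging the weights per n-gram keeps scores comparable across lines of different length. That normalization is a design choice of this sketch rather than something stated in the abstract.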

REFERENCES:
patent: 5060276 (1991-10-01), Morris et al.
patent: 5062143 (1991-10-01), Schmitt
patent: 5410611 (1995-04-01), Huttenlocher et al.
patent: 5418951 (1995-05-01), Damashek
patent: 5442715 (1995-08-01), Gaborski et al.
patent: 5444797 (1995-08-01), Spitz et al.
patent: 5745600 (1998-04-01), Chen et al.
patent: 5844991 (1998-12-01), Hochberg et al.
patent: 5933525 (1999-08-01), Makhoul et al.
patent: 5982933 (1999-11-01), Yoshii et al.
patent: 5991714 (1999-11-01), Shaner
patent: 6005986 (1999-12-01), Ratner
patent: 6009392 (1999-12-01), Kanevsky et al.
patent: 6047251 (2000-04-01), Pon et al.
patent: 6061646 (2000-05-01), Martino et al.
patent: 6157905 (2000-12-01), Powell
patent: 6246976 (2001-06-01), Mukaigawa et al.
patent: 6253173 (2001-06-01), Ma
patent: 6272456 (2001-08-01), de Campos
patent: 6327386 (2001-12-01), Mao et al.
patent: 6404900 (2002-06-01), Qian et al.
patent: 6470094 (2002-10-01), Lienhart et al.
patent: 6658151 (2003-12-01), Lee et al.
patent: 6704698 (2004-03-01), Paulsen et al.
Parisse, “Global Word Shape Processing in Off-Line Recognition of Handwriting,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, No. 4, Apr. 1996, pp. 460-464.
A. L. Spitz, “Determination of the Script and Language Content of Document Images,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, No. 3, 1997.
