Pure adversarial approach for identifying text content in...

Image analysis – Pattern recognition – Context analysis or word recognition

Reexamination Certificate

Details

C382S218000, C382S203000

active

08045808

ABSTRACT:
A pure adversarial optical character recognition (OCR) approach for identifying text content in images. An image and a search term are input to a pure adversarial OCR module, which searches the image for the presence of the search term. The image may be extracted from an email by an email processing engine. The OCR module may split the image into several character blocks, each of which has a reasonable probability of containing a character (e.g., an ASCII character). The OCR module may then form a sequence of blocks that represents a candidate match to the search term and calculate the similarity of that candidate sequence to the search term. The OCR module may be configured to output whether or not the search term is found in the image and, if applicable, the location of the search term in the image.

REFERENCES:
patent: 2005/0216564 (2005-09-01), Myers et al.
patent: 2008/0008348 (2008-01-01), Metois et al.
Gatos et al. “A Segmentation-free Approach for Keyword Search in Historical Typewritten Documents.” Proceedings of the Eighth International Conference on Document Analysis and Recognition, Aug. 29, 2005, 5 pages.
Chen et al. “Detecting and Locating Partially Specified Keywords in Scanned Images Using Hidden Markov Models.” Proceedings of the Second International Conference on Document Analysis and Recognition, Oct. 20, 1993, 6 pages.
G S Lehal, et al. “A Shape Based Post Processor for Gurmukhi OCR”, 2001 IEEE, pp. 1105-1109, Punjabi University, India.
Jeff DeCurtins, et al. “Keyword spotting via word shape recognition”, Feb. 6, 1995, pp. 270-277, SPIE vol. 2422, XP 000642554.
PCT International Search Report for Application No. PCT/JP2007/071448 (4 sheets).
“Avira Warns: New Spam Wave With Anti OCR Techniques”, Nov. 17, 2006, p. 1 [retrieved on Apr. 19, 2007]. Retrieved from the internet: <http://www.avira.com/en/security_news/ocr_spam_wave.html>.
“Barracuda Spam Firewall Protects Customers From Growing Incidence of Image Spam”, Jul. 19, 2006, p. 1 [retrieved on Apr. 19, 2007] [retrieved from the internet: <http://www.barracudanetworks.com/ns/news_and_events/index.php?nid=105>].
Optical Character Recognition from Wikipedia, the free encyclopedia; pp. 1-5, [retrieved on Apr. 19, 2007] [retrieved from the internet: <http://en.wikipedia.org/wiki/Optical_character_recognition>].
OcrPlugin - Spamassassin Wiki, Aug. 28, 2006, pp. 1-4 [retrieved on Apr. 19, 2007] [retrieved from the internet: <http://wiki.apache.org/spamassassin/OcrPlugin?action=print>].
Mehran Sahami, Susan Dumais, David Heckerman and Eric Horvitz, “A Bayesian Approach to Filtering Junk E-Mail”, AAAI'98 Workshop on Learning for Text Categorization, Jul. 27, 1998, Madison, Wisconsin.
Serge Belongie, Jitendra Malik and Jan Puzicha, “Shape Matching and Object Recognition Using Shape Contexts”, Apr. 24, 2002, pp. 509-522, Transactions on Pattern Analysis and Machine Intelligence, vol. 24, No. 24. IEEE 2002.
“SunbeltBLOG: Image Spam”, May 2, 2006, pp. 1-2 [retrieved on Apr. 19, 2007] [retrieved from the internet: <http://sunbeltblog.blogspot.com/2006/05/image-spam.html>].
Messagelabs website - Email Control, pp. 1-4, [retrieved on Apr. 19, 2007] [retrieved from the internet: <www.messagelabs.com/Services/Email_Services/Email_Control>].
Seunghak Lee, Iryoung Jeong and Seungjin Choi, “Dynamically Weighted Hidden Markov Model for Spam Deobfuscation”, pp. 2523-2529, IJCAI-07.
Honglak Lee and Andrew Y. Ng, “Spam Deobfuscation Using a Hidden Markov Model”, 8 sheets, Proceedings of the Second Conference on Email and Anti-Spam, CEAS 2005.
Eric Sven Ristad and Peter N. Yianilos, “Learning String Edit Distance”, Research Report CS-TR-532-96, pp. 1-33, Oct. 1997. Department of Computer Science, Princeton University.
Edit Distance from Wikipedia, the free encyclopedia. Mar. 20, 2007, p. 1 [retrieved on May 12, 2007] [retrieved from the internet: <http://en.wikipedia.org/wiki/Edit_distance>].
Dynamic Programming Algorithm (DPA) for Edit-Distance from Monash University website, 1999, pp. 1-5 [retrieved on May 12, 2007] [retrieved from the internet: <www.csse.monash.edu.au/~lloyd/tildeAlgDS/Dynamic/Edit/>].
Jonathan Oliver, “Using Lexigraphical Distancing to Block Spam”, Jan. 21, 2005, pp. 1-14, MailFrontier - Email is good again, Spam Conference.
Optical Character Recognition from Wikipedia, the free encyclopedia, pp. 1-5 [retrieved on Jul. 12, 2007] [retrieved from the internet: <http://en.wikipedia.org/wiki/Optical_character_recognition>].
Cheng-Lin Liu and Hiromichi Fujisawa, “Classification and Learning for Character Recognition: Comparison of Methods and Remaining Problems”, 7 sheets [retrieved on Jul. 13, 2007] [retrieved from the internet: <www.dsi.unifi.it/NNLDAR/Papers/01-NNLDAR05-Liu.pdf>].
Open-Source Character Recognition from GOCR website, pp. 1-2 [retrieved on Jul. 13, 2007] [retrieved from the internet: <http://jocr.sourceforge.net/>].
Dynamic Programming Algorithm for Sequence Alignment, pp. 1-3 [retrieved on Aug. 9, 2007] [retrieved from the internet: <http://www.csse.monash.edu.au/~lloyd/tildeStrings/Notes/DPA.html>].
Battista Biggio, Giorgio Fumera, Ignazio Pillai and Fabio Roli, “Image Spam Filtering by Content Obscuring Detection”, 5 sheets, Aug. 2-3, 2007, Fourth Conference on Email and Anti-Spam CEAS 2007, California.
Zhe Wang, William Josephson, Qin Lv, Moses Charikar and Kai Li, “Filtering Image Spam with Near-Duplicate Detection”, 10 sheets, Aug. 2-3, 2007, Fourth Conference on Email and Anti-Spam CEAS 2007.
Mark Dredze, Reuven Gevaryahu and Ari Elias-Bachrach, “Learning Fast Classifiers for Image Spam”, 9 sheets, Aug. 2-3, 2007, Fourth Conference on Email and Anti-Spam CEAS 2007.
Byungki Byun, Chin-Hui Lee, Steve Webb and Calton Pu, “A Discriminative Classifier Learning Approach to Image Modeling and Spam Image Identification”, 9 sheets, Aug. 2-3, 2007, Fourth Conference on Email and Anti-Spam CEAS 2007.
