Reexamination Certificate
2007-08-16
2011-10-25
Chang, Jon (Department: 2624)
Image analysis
Pattern recognition
Context analysis or word recognition
C382S218000, C382S203000
active
08045808
ABSTRACT:
A pure adversarial optical character recognition (OCR) approach for identifying text content in images. An image and a search term are input to a pure adversarial OCR module, which searches the image for the presence of the search term. The image may be extracted from an email by an email processing engine. The OCR module may split the image into several character-blocks, each of which has a reasonable probability of containing a character (e.g., an ASCII character). The OCR module may form a sequence of blocks that represents a candidate match to the search term and calculate the similarity of the candidate sequence to the search term. The OCR module may be configured to output whether or not the search term is found in the image and, if applicable, the location of the search term in the image.
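The flow described in the abstract (character-blocks, a candidate block sequence, a similarity score, and a found/location output) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the patent: the `Block` type, the per-character `scores` dictionary, the mean-score similarity measure, the fixed `threshold`, and the restriction to contiguous candidate sequences. The abstract does not specify how blocks are produced or how similarity is calculated, so the blocks are simply supplied as input.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


@dataclass
class Block:
    """Hypothetical character-block: an image region plus per-character scores."""
    box: Box
    scores: Dict[str, float]  # assumed probability that the block shows each character


def sequence_similarity(blocks: List[Block], term: str) -> float:
    """Assumed similarity: mean score of the expected character in each block."""
    return sum(b.scores.get(ch, 0.0) for b, ch in zip(blocks, term)) / len(term)


def union_box(blocks: List[Block]) -> Box:
    """Smallest box that covers every block in the sequence."""
    return (
        min(b.box[0] for b in blocks),
        min(b.box[1] for b in blocks),
        max(b.box[2] for b in blocks),
        max(b.box[3] for b in blocks),
    )


def find_term(
    blocks: List[Block], term: str, threshold: float = 0.6
) -> Tuple[bool, Optional[Box]]:
    """Report whether `term` appears in the blocks and, if so, where."""
    n = len(term)
    if n == 0 or len(blocks) < n:
        return False, None
    best_score, best_start = -1.0, 0
    # Each run of n contiguous blocks is one candidate match to the term.
    for start in range(len(blocks) - n + 1):
        score = sequence_similarity(blocks[start:start + n], term)
        if score > best_score:
            best_score, best_start = score, start
    if best_score >= threshold:
        return True, union_box(blocks[best_start:best_start + n])
    return False, None


# Toy input: four blocks in reading order with made-up scores.
example_blocks = [
    Block((0, 0, 10, 10), {"x": 0.9}),
    Block((10, 0, 20, 10), {"b": 0.8, "h": 0.1}),
    Block((20, 0, 30, 12), {"u": 0.7, "v": 0.3}),
    Block((30, 0, 40, 10), {"y": 0.9}),
]
found, location = find_term(example_blocks, "buy")
```

The sketch only asks whether the search term is present; it never transcribes the whole image. A real module would likely need a more tolerant similarity measure, such as an edit-distance variant, to cope with inserted or merged blocks.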
REFERENCES:
patent: 2005/0216564 (2005-09-01), Myers et al.
patent: 2008/0008348 (2008-01-01), Metois et al.
Gatos et al. “A Segmentation-free Approach for Keyword Search in Historical Typewritten Documents.” Proceedings of the Eighth International Conference on Document Analysis and Recognition, Aug. 29, 2005, 5 pages.
Chen et al. “Detecting and Locating Partially Specified Keywords in Scanned Images Using Hidden Markov Models.” Proceedings of the Second International Conference on Document Analysis and Recognition, Oct. 20, 1993, 6 pages.
G S Lehal, et al. “A Shape Based Post Processor for Gurmukhi OCR”, 2001 IEEE, pp. 1105-1109, Punjabi University, India.
Jeff DeCurtins, et al. “Keyword spotting via word shape recognition”, Feb. 6, 1995, pp. 270-277, SPIE vol. 2422, XP 000642554.
PCT International Search Report for Application No. PCT/JP2007/071448 (4 sheets).
“Avira Warns: New Spam Wave With Anti OCR Techniques”, Nov. 17, 2006, p. 1 [retrieved on Apr. 19, 2007]. Retrieved from the internet: <http://www.avira.com/en/security_news/ocr_spam_wave.html>.
“Barracuda Spam Firewall Protects Customers From Growing Incidence of Image Spam”, Jul. 19, 2006, p. 1 [retrieved on Apr. 19, 2007] [retrieved from the internet: <http://www.barracudanetworks.com/ns/news_and_events/index.php?nid=105>.
Optical Character Recognition from Wikipedia, the free encyclopedia; pp. 1-5, [retrieved on Apr. 19, 2007] [retrieved from the internet: <http://en.wikipedia.org/wiki/Optical_character_recognition>.
OcrPlugin—Spamassassin Wiki, Aug. 28, 2006, pp. 1-4 [retrieved on Apr. 19, 2007] [retrieved from the internet: <http://wiki.apache.org/spamassassin/OcrPlugin?action=print>.
Mehran Sahami, Susan Dumais, David Heckerman and Eric Horvitz, “A Bayesian Approach to Filtering Junk E-Mail”, AAAI'98 Workshop on Learning for Text Categorization, Jul. 27, 1998, Madison, Wisconsin.
Serge Belongie, Jitendra Malik and Jan Puzicha, “Shape Matching and Object Recognition Using Shape Contexts”, Apr. 24, 2002, pp. 509-522, Transactions on Pattern Analysis and Machine Intelligence, vol. 24, No. 4. IEEE 2002.
“SunbeltBLOG: Image Spam”, May 2, 2006, pp. 1-2 [retrieved on Apr. 19, 2007] [retrieved from the internet: <http://sunbeltblog.blogspot.com/2006/05/image-spam.html>.
Messagelabs website—Email Control, pp. 1-4, [retrieved on Apr. 19, 2007] [retrieved from the internet<www.messagelabs/com/Services/Email—Services/Email—Control>.
Seunghak Lee, Iryoung Jeong and Seungjin Choi, “Dynamically Weighted Hidden Markov Model for Spam Deobfuscation”, pp. 2523-2529, IJCAI-07.
Honglak Lee and Andrew Y. Ng, “Spam Deobfuscation Using a Hidden Markov Model”, 8 sheets, Proceedings of the Second Conference on Email and Anti-Spam, CEAS 2005.
Eric Sven Ristad and Peter N. Yianilos, “Learning String Edit Distance”, Research Report CS-TR-532-96, pp. 1-33, Oct. 1997. Department of Computer Science, Princeton University.
Edit Distance from Wikipedia, the free encyclopedia. Mar. 20, 2007, p. 1 [retrieved on May 12, 2007] [retrieved from the internet: <http://en.wikipedia.org/wiki/Edit_distance>.
Dynamic Programming Algorithm (DPA) for Edit-Distance from Monash University website, 1999, pp. 1-5 [retrieved on May 12, 2007] [retrieved from the internet: <www.csse.monash.edu.au/~lloyd/tildeAlgDS/Dynamic/Edit/>.
Jonathan Oliver, “Using Lexigraphical Distancing to Block Spam” Jan. 21, 2005, pp. 1-14, MailFrontier—Email is good again, Spam Conference.
Optical Character Recognition from Wikipedia, the free encyclopedia, pp. 1-5 [retrieved on Jul. 12, 2007] [retrieved from the internet: <http://en.wikipedia.org/wiki/Optical_character_recognition>.
Cheng-Lin Liu and Hiromichi Fujisawa, “Classification and Learning for Character Recognition: Comparison of Methods and Remaining Problems”, 7 sheets [retrieved on Jul. 13, 2007] [retrieved from the internet:<www.dsi.unifi.it/NNLDAR/Papers/01-NNLDAR05-Liu.pdf>.
Open-Source Character Recognition from GOCR website, pp. 1-2 [retrieved on Jul. 13, 2007] [retrieved from the internet:<http://jocr.sourceforge.net/>.
Dynamic Programming Algorithm for Sequence Alignment, pp. 1-3 [retrieved on Aug. 9, 2007] [retrieved from the internet: <http://www.csse.monash.edu.au/~lloyd/tildeStrings/Notes/DPA.html>.
Battista Biggio, Giorgio Fumera, Ignazio Pillai and Fabio Roli, “Image Spam Filtering by Content Obscuring Detection”, 5 sheets, Aug. 2-3, 2007, Fourth Conference on Email and Anti-Spam CEAS 2007, California.
Zhe Wang, William Josephson, Qin Lv, Moses Charikar and Kai Li, “Filtering Image Spam with Near-Duplicate Detection”, 10 sheets, Aug. 2-3, 2007, Fourth Conference on Email and Anti-Spam CEAS 2007.
Mark Dredze, Reuven Gevaryahu and Ari Elias-Bachrach, “Learning Fast Classifiers for Image Spam”, 9 sheets, Aug. 2-3, 2007, Fourth Conference on Email and Anti-Spam CEAS 2007.
Byungki Byun, Chin-Hui Lee, Steve Webb and Calton Pu, “A Discriminative Classifier Learning Approach to Image Modeling and Spam Image Identification”, 9 sheets, Aug. 2-3, 2007, Fourth Conference on Email and Anti-Spam CEAS 2007.
Chang Jon
Okamoto & Benedicto LLP
Trend Micro Incorporated
Profile ID: LFUS-PAI-O-4281196