Data processing: presentation processing of document – operator i – Presentation processing of document – Layout
Reexamination Certificate
1999-02-26
2003-05-20
Feild, Joseph H. (Department: 2176)
C715S252000, C715S252000, C382S224000, C707S793000
active
06565611
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention is directed toward the field of computer systems that capture digital ink, and more particularly toward automated generation of an index for handwritten notes.
2. Art Background
Some computer systems, including personal digital assistants (PDAs), permit users to enter handwritten material into the computer. Essentially, these computers and PDAs include a user interface that permits a user to write handwritten material onto a surface, and the handwritten material or notes are subsequently sampled into “digital ink.” One application of these computer systems is to permit a user to perform electronic note-taking.
One potential advantage of electronic note-taking over paper note-taking is the ability, in electronic note-taking, to create indexes. In general, indexes provide a means to locate specific information within the handwritten notes. With paper note-taking, such indexes must be created manually. Since this manual process is difficult, paper note-takers tend to mark important items or keywords by underlining, circling, or entering asterisks next to the important material. Although this type of highlighting helps users to locate important information while browsing notes, it does not provide an index.
In electronic document or text systems (i.e., systems where the text is cognitively recognized by the system), techniques exist to create automatic “back-of-the-book” indexes (See H. Schutze, “The Hypertext Concordance: A Better Back-Of-The-Book Index”, Proc. COMPUTERM, ACL Coling, 1998). These back-of-book indexes allow users to scan a list of keywords in the index and find occurrences of the index terms in the text. However, these electronic text systems are based on the user entering text, such as from a keyboard, directly into the system.
In other electronic text systems, information retrieval techniques are used to automatically index textual documents. For example, in one such system, index terms are selected for Web pages based on relative frequency of term occurrence (See H. Schutze, “The Hypertext Concordance: A Better Back-Of-The-Book Index”, Proc. COMPUTERM, ACL Coling, 1998). However, these techniques do not apply directly to digital ink, since words in digital ink are not cognitively identified. In theory, digital ink could be converted to text using character recognition. However, character recognition is not accurate on handwritten data. Accordingly, it is desirable to automatically generate indexes from handwritten data entered as digital ink into a computer without character recognition.
Manual indexing by the user is possible in electronic systems that use digital ink rather than text. One example is the application of keywords to sections of electronic notes, as provided by the Dynomite System, developed at FX Palo Alto Laboratory, and as provided by Marquis (See K. Weber and A. Poon, “Marquis: A Tool For Real-Time Video Logging”, CHI 94). However, requiring the user to manually identify keywords to generate the index demands cognitive effort from the user during the note-taking process.
Another approach to manual indexing of digital ink by a user is the use of ink properties in the Dynomite system. An ink property is a data type, applied to selected digital ink, that allows that ink to be subsequently retrieved by type. Example data types include “name” or “to do” items. Ink index pages for a given ink property are created by a user to subsequently permit quick scanning of all notes that contain that property. In addition, notes on the index page are hyperlinked back to the original location in the notes. One significant problem associated with both the keyword and ink property manual approaches to generating indexes for digital ink systems is that they require significant cognitive effort on the part of the user. As a result, these techniques are impractical, because users typically lack the discipline to apply them consistently.
A system for manually indexing historical handwritten document images is described in R. Manmatha, Chengfeng Han, E. M. Riseman and W. B. Croft, “Indexing Handwriting Using Word Matching”, ACM Digital Libraries, 1996. In this technique, images are segmented into words, and word equivalence classes are found by thresholding match scores between words. This technique requires the user to manually input words to specify the word equivalence classes. Index terms are then chosen from the largest word equivalence classes. In addition, stop words are manually eliminated. Since no stroke information on the handwritten data is available, match scores are computed based on the word images alone. Accordingly, it is desirable to automatically create indexes for handwritten digital ink, without user effort.
In A. Poon, K. Weber, T. Cass, “Scribbler: A Tool For Searching Digital Ink”, CHI 95, a technique called scribble matching is described. In general, scribble matching involves finding occurrences of a given word in a handwritten document. This technique is based on using dynamic programming to compute a score between the given handwritten word and the words in the document. A similar method is also described in D. Lopresti and A. Tomkins, “On The Searchability Of Electronic Ink”, Fourth International Workshop on Frontiers of Handwriting Recognition, December, 1994.
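As a rough illustration of a dynamic-programming match score between two handwritten words, the sketch below uses dynamic time warping over sequences of (x, y) stroke points. This is a generic stand-in, not the scoring method of the cited papers; the function name `dtw_distance` and the choice of raw point coordinates as features are assumptions made for the example.

```python
from math import hypot


def dtw_distance(a, b):
    """Alignment cost between two sequences of (x, y) points (hypothetical helper).

    Assumption: each ink word is represented by its sampled stroke points.
    A lower cost means the two words are more alike.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the cheapest alignment of a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Local cost: Euclidean distance between the two points.
            d = hypot(a[i - 1][0] - b[j - 1][0], a[i - 1][1] - b[j - 1][1])
            # Extend the best of: skip in a, skip in b, or match both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

Searching a document for a query word would then amount to computing this cost against every segmented word and keeping the lowest-cost candidates.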
As is described fully below, the present invention provides a system for automatically generating indexes for handwritten notes based on the strokes of the digital ink.
SUMMARY OF THE INVENTION
A system automatically generates indexes for handwritten notes captured as digital ink in a computer. Ink words, which roughly correspond to words in the notes, are identified. Features of the ink words are computed, and pairwise distances or match scores, which measure the distance in the features between two ink words, are calculated. From the pairwise distances, equivalence classes of ink words are determined by clustering the ink words. Index terms, which appear in the index for the handwritten notes, are selected from the equivalence classes of ink words. The system generates location information for the index terms that identifies a location in the handwritten notes where the index terms appear. An index of the index terms is displayed with the location information. In one embodiment, the notes index contains page numbers, displayed next to the index terms, to identify the page in the handwritten notes where the index term appears. In another embodiment, the index contains hyperlinked index terms.
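The steps from pairwise distances to a page-numbered index can be sketched as follows. This is only an illustration under stated assumptions: the summary does not name a clustering algorithm, so the sketch links any two ink words whose distance is at or below a threshold (single-link grouping via union-find), and it selects index terms by a hypothetical `min_count` rule. The names `equivalence_classes`, `build_index`, and `pages` are invented for the example.

```python
from collections import defaultdict


def equivalence_classes(n, distances, threshold):
    """Group ink words 0..n-1 so that pairs within `threshold` share a class.

    `distances` maps (i, j) index pairs to a pairwise distance.
    Assumption: single-link grouping; the source does not specify the method.
    """
    parent = list(range(n))

    def find(x):
        # Follow parent links to the class representative, halving the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (i, j), d in distances.items():
        if d <= threshold:
            parent[find(i)] = find(j)

    classes = defaultdict(list)
    for w in range(n):
        classes[find(w)].append(w)
    return sorted(classes.values())


def build_index(classes, pages, min_count=2):
    """Return (representative ink word, sorted page numbers) per index term.

    Assumption: a class becomes an index term when it has at least
    `min_count` members; `pages[w]` is the page on which ink word w appears.
    """
    index = []
    for members in classes:
        if len(members) >= min_count:
            index.append((members[0], sorted({pages[w] for w in members})))
    return index
```

In a display, the representative ink word would be rendered as ink next to its page numbers, since no text recognition is performed.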
The system includes a novel technique to identify equivalence classes of ink words in handwritten notes. A threshold is generated to identify a maximum pairwise distance for the clustering of ink words. Specifically, a distribution curve, which represents the relationship between the number of occurrences among pairs of the ink words in the handwritten notes versus pairwise distance, is generated. A knee of the distribution curve, τ, is approximated with a first line of gradient 0 up to τ, and a second line of constant gradient from the knee, τ, through the remaining pairwise distances on the distribution curve. The knee of the distribution curve, τ, is selected as the threshold for clustering.
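One way to picture the knee approximation is a two-segment least-squares fit: a flat line up to a candidate τ, then a line of constant gradient starting at τ. The sketch below tries each sample point as the candidate knee and keeps the one with the smallest squared error. The fitting procedure, the function name `estimate_knee`, and the requirement that the two segments meet at the knee are assumptions for illustration; the summary does not state how the approximation is computed.

```python
def estimate_knee(xs, ys):
    """Estimate the knee of a curve sampled at increasing distances xs.

    Model (assumed): y = level for x <= tau, and
    y = level + slope * (x - tau) for x > tau.
    Returns the candidate tau with the smallest total squared error.
    """
    best_tau, best_err = None, float("inf")
    # Interior points only, so both segments contain at least one sample.
    for k in range(1, len(xs) - 1):
        tau = xs[k]
        left_y = ys[: k + 1]
        right = list(zip(xs[k + 1:], ys[k + 1:]))

        # Flat segment: the best constant is the mean of the left samples.
        level = sum(left_y) / len(left_y)
        err = sum((y - level) ** 2 for y in left_y)

        # Sloped segment: least-squares slope for a line through (tau, level).
        num = sum((x - tau) * (y - level) for x, y in right)
        den = sum((x - tau) ** 2 for x, _ in right)
        slope = num / den
        err += sum((y - (level + slope * (x - tau))) ** 2 for x, y in right)

        if err < best_err:
            best_tau, best_err = tau, err
    return best_tau
```

The returned value would then serve as the maximum pairwise distance allowed when grouping ink words into equivalence classes.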
REFERENCES:
patent: 4580218 (1986-04-01), Raye
patent: 5404295 (1995-04-01), Katz et al.
patent: 5428777 (1995-06-01), Perliski et al.
patent: 5513305 (1996-04-01), Maghbouleh
patent: 5524240 (1996-06-01), Barbara et al.
patent: 5537491 (1996-07-01), Mahoney et al.
patent: 5539841 (1996-07-01), Huttenlocher et al.
patent: 5596700 (1997-01-01), Darnell et al.
patent: 5748805 (1998-05-01), Withgott et al.
patent: 5822539 (1998-10-01), van Hoff
patent: 5845288 (1998-12-01), Syeda-Mahmood
patent: 5873107 (1999-02-01), Borovoy et al.
patent: 5963205 (1999-10-01), Sotomayor
patent: 6069618 (2000-05-01), Ogo
patent: 6151021 (2000-11-01), Berquist et al.
patent: 6311189 (2001-10-01), deVries et al.
patent: 6356922 (2002-03-01), Schilit et al.
patent: 6389435 (20
Cass Todd A.
Chiu Patrick
Uchihashi Shingo
Wilcox Lynn D.
Feild Joseph H.
Fliesler Dubb Meyer & Lovejoy LLP
Xerox Corporation
Automatic index creation for handwritten digital ink notes