Data processing: database and file management or data structures – Database design – Data structure types
Reexamination Certificate
2001-05-16
2004-05-18
Pardo, Thuy N. (Department: 2175)
C707S793000
active
06738780
ABSTRACT:
FIELD OF INVENTION
The present invention relates to autonomous citation indexing. Specifically, an autonomous citation indexing system operates fully automatically: it extracts citations, identifies identical citations which occur in different formats, and identifies the context of citations in the body of the articles.
BACKGROUND OF THE INVENTION
A citation index indexes citations contained in articles, linking the articles with the cited works. Citation indexing was originally designed for information retrieval. A citation index allows navigation backward in time by following the list of cited articles and forward in time by tracking which subsequent articles cite any given article.
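The backward and forward navigation described above can be illustrated with a small data structure. The following is only a minimal sketch and not the implementation of the invention; the class name `CitationIndex`, its methods, and the article identifiers are hypothetical names chosen for illustration.

```python
from collections import defaultdict


class CitationIndex:
    """Toy citation index linking citing articles with cited works."""

    def __init__(self):
        # article -> works it cites (navigation backward in time)
        self.cites = defaultdict(set)
        # article -> later articles that cite it (navigation forward in time)
        self.cited_by = defaultdict(set)

    def add_citation(self, citing, cited):
        """Record that `citing` contains a citation of `cited`."""
        self.cites[citing].add(cited)
        self.cited_by[cited].add(citing)

    def backward(self, article):
        """Works cited by `article`, in sorted order."""
        return sorted(self.cites.get(article, ()))

    def forward(self, article):
        """Subsequent articles that cite `article`, in sorted order."""
        return sorted(self.cited_by.get(article, ()))


# Hypothetical article identifiers, for illustration only.
index = CitationIndex()
index.add_citation("paper_1997", "paper_1979")
index.add_citation("paper_1997", "paper_1994")
index.add_citation("paper_1998", "paper_1997")

print(index.backward("paper_1997"))
print(index.forward("paper_1997"))
```

Storing both directions of each link keeps either kind of lookup a single dictionary access, at the cost of recording every citation twice.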
The rate of production of scientific literature continues to increase, making it time-consuming for researchers to stay current. Most published scientific research appears in paper documents such as scholarly journals or conference proceedings, and there is a considerable delay between the completion of research and the availability of the publication. As a result, the World Wide Web (“the web”) has become an important distribution medium for scientific research. An increasing number of authors are making new research available on the web in the form of preprints or technical reports which can be downloaded and printed. These web publications are often available before any corresponding printed publication in journals or conference proceedings. In order to keep current on research, especially in rapidly advancing fields, a researcher can search the web to download papers as soon as they are written. Literature available on the web is also easier to access than printed literature.
The web, however, has its own limitations for searching. It lacks a standardized organization, publications themselves tend to be poorly organized (each institution or researcher may have its own organizational scheme), and publications are often spread throughout the web. A researcher could spend large amounts of time solely on searching for, downloading and printing papers in order to find those publications that may be important. However, web-based literature can be read and processed by autonomous agents far more easily than printed documents. Agents can search the web and thereby provide an automated means to find, download and judge the relevance of web publications. The present invention concerns just such an agent. An assistant agent is a computer program which automatically performs some task on behalf of a user.
One method for finding relevant and important publications on the web is to use a combination of web search engines and manual web browsing. Web search engines such as AltaVista (http://altavista.digital.com) index the text contained on web pages, allowing users to find information by keyword search. Some research publications on the web are made available in HTML (HyperText Markup Language) format, making the text of these papers searchable with web search engines. However, most of the published research papers on the web are in PostScript form (which preserves the formatting of the original), rather than HTML. The text of these papers is not indexed by search engines such as AltaVista, so researchers must instead locate pages which contain links to these papers, e.g. by searching for a paper by title or author name. Another limitation of web search engines is that they typically use only word frequency information to find relevant web pages, although other types of information are potentially useful, e.g. papers which contain citations of common earlier papers may be related.
In the following text, reference will be made to several publications in the open literature. These publications are herein incorporated by reference.
The present invention benefits from three areas of prior work. The first involves citation indexing, which indexes the citations made between academic articles. See, for example, Chapters 1 to 3 and Chapter 10 of the book by E. Garfield, entitled “Citation Indexing: Its Theory and Application in Science, Technology and Humanities”, ISI Press, Philadelphia, 1979.
The second concerns semantic distance measures between text documents. Research in this area is directed towards finding quantifiable and useful measures of similarity or relatedness between bodies of text.
The third involves web, interface and assistant software agents. Several papers have addressed the problem of locating “interesting” web pages. Examples include articles by M. Pazzani, J. Muramatsu and D. Billsus, entitled “Syskill & Webert: Identifying interesting web sites” in Proceedings of the National Conference on Artificial Intelligence (AAAI96), 1996; by F. Menczer, entitled “Arachnid: Adaptive retrieval agents choosing heuristic neighborhoods for information discovery” in Machine Learning: Proceedings of the Fourteenth International Conference, pp. 227-235, 1997; by M. Balabanovic, entitled “An adaptive web page recommendation service” in Proceedings of the First International Conference on Autonomous Agents, ACM Press, New York, pp. 378-385, 1997; and by A. Moukas, entitled “Amalthaea: Information discovery and filtering using a multiagent evolving ecosystem” in Proceedings of the Conference on Practical Applications of Agents and Multiagent Technology, 1996. This includes work which uses learning techniques based on user feedback.
In citation indexing, references contained in articles are used to give credit to previous work in the literature and provide a link between the “citing” and “cited” articles. A citation index, such as that described in Garfield, supra, indexes the citations that an article makes, linking the article with the cited works. Citation indexes were originally designed mainly for information retrieval, as described by E. Garfield in an article entitled “The concept of citation indexing: A unique and innovative tool for navigating the research literature”, Current Contents, Jan. 3, 1994. The citation links allow the literature to be navigated in unique ways. Papers can be located independent of language and of the words in the title, keywords or document. A citation index allows navigation backward in time (the list of cited articles) and forward in time (subsequent articles which cite the current article). Citation indexing can be a powerful tool for literature search, in particular:
a. A citation index allows finding out where and how often a particular article is cited in the literature, thus providing an indication of the importance of the article. Older articles may define methodology or set the research agenda. Newer articles may respond to or build upon the original article.
b. Citations can help to find other publications which may be of interest. Using citation information in addition to keyword information should allow the identification of more relevant literature.
c. The context of citations in citing publications may be helpful in judging the important contributions of a cited paper.
d. A citation index can provide detailed analyses of research trends and identify emerging areas of science.
The Institute for Scientific Information (ISI)® (Institute for Scientific Information, 1997) produces multi-disciplinary citation indexes, which are used to provide several commercial services for searching scientific periodicals. One ISI service is the Keywords Plus® service, which adds citation information to the indexing of an article. Specifically, in addition to the title, author-supplied keywords, and abstract, Keywords Plus adds indexing terms which are derived from the titles of cited papers. As a user browses through papers in the ISI databases, bibliographic coupling allows navigation by locating papers which share one or more references.
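Bibliographic coupling, as described above, amounts to finding papers whose reference lists overlap. The following is a minimal sketch under that reading only; it does not represent how the ISI services are implemented, and the function name `coupled_papers` and the sample data are hypothetical.

```python
def coupled_papers(references, paper):
    """Return {other_paper: shared_reference_count} for every paper that
    shares one or more references with `paper`.

    `references` maps each paper identifier to the set of works it cites.
    """
    own = references.get(paper, set())
    coupled = {}
    for other, refs in references.items():
        if other == paper:
            continue
        shared = own & refs  # references cited by both papers
        if shared:
            coupled[other] = len(shared)
    return coupled


# Hypothetical reference lists, for illustration only.
references = {
    "A": {"r1", "r2", "r3"},
    "B": {"r2", "r3"},
    "C": {"r4"},
    "D": {"r1"},
}

print(coupled_papers(references, "A"))
```

The shared-reference count gives a rough ordering of how closely two papers are coupled; papers with no references in common are omitted from the result.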
Another commercial citation index is the legal database offered by the West Group (KeyCite). This database indexes case law as opposed to scientific research articles.
Compared to the current commercial citation indexes, the citation indexing performed by using the present invention has the following limitations: it does not cover the significant journals
Bollacker Kurt D.
Giles C. Lee
Lawrence Stephen R.
NEC Laboratories America, Inc.