METHOD FOR DYNAMICALLY DELIVERING CONTENTS ENCAPSULATED WITH...

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate


Details

C707S793000, C704S009000

active

06553373

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates generally to a system and method for reviewing documents. More particularly, the present invention relates to presentation of documents in a manner that allows the user to quickly ascertain their contents.
BACKGROUND OF THE INVENTION
Documents obtained via an electronic medium (e.g., the Internet or on-line services such as AOL, Compuserve or other services) are often provided in such volume that it is important to be able to summarize them. Often a reader wants a brief summary of a document (e.g., a few sentences or a paragraph) rather than reading it in its entirety. Most typically, such documents span several paragraphs to several pages in length. The present invention concerns itself with this kind of document, hereinafter referred to as an average-length document.
Present-day summarization technologies fall short of delivering fully informative summaries of documents. To some extent, this is so because of shortcomings of the state of the art in natural language processing; in general, the issue of how to customize a summarization procedure for a specific information-seeking task is still an open one. However, given the rapidly growing volume of document-based information on-line, the need for any kind of document abstraction mechanism is so great that summarization technologies are beginning to get deployed in real-world situations.
The majority of techniques for “summarization”, as applied to average-length documents, fall within two broad categories. One class of techniques mines a document for certain pre-specified pieces of information, typically defined a priori, on the basis of fixing the most characteristic features of a known domain of interest. Other approaches rely, in effect, on ‘re-using’ certain fragments of the original text; these have been identified, typically by some similarity metric, as closest in meaning to the whole document. This categorization is not a rigid one: a number of approaches (as exhibited, for instance, in a recent workshop of the Association for Computational Linguistics, “Proceedings of a Workshop on Intelligent, Scalable, Text Summarization,” Madrid, Spain, 1997) use strong notions of topicality (B. Boguraev and C. Kennedy, “Salience-based content characterization of text documents,” in Proceedings of ACL '97 Workshop on Intelligent, Scalable Text Summarization, Madrid, Spain, 1997), (E. Hovy and C. Y. Lin, “Automated text summarization in SUMMARIST,” in Proceedings of ACL '97 Workshop on Intelligent, Scalable Text Summarization, Madrid, Spain, 1997), lexical chains (R. Barzilay and M. Elhadad, “Using lexical chains for text summarization,” in Proceedings of ACL '97 Workshop on Intelligent, Scalable Text Summarization, Madrid, Spain, 1997), and discourse structure (D. Marcu, “From discourse structures to text summaries,” in Proceedings of ACL '97 Workshop on Intelligent, Scalable Text Summarization, Madrid, Spain, 1997), (U. Hahn and M. Strube, “Centered segmentation: scaling up the centering model to global discourse structure,” in Proceedings of ACL-EACL/97, 35th Annual Meeting of the Association for Computational Linguistics and 8th Conference of the European Chapter of the Association for Computational Linguistics, Madrid, Spain, 1997), thus laying claim to newer sets of methods.
Still, at a certain level of abstraction, all approaches share a fundamental similarity: summarization methods today rely, in essence, on substantial data reduction over the original document source. Such a position leads to several usability questions.
Given the extracted fragments which any particular method has identified as worth preserving, what is an optimal way of encapsulating these into a coherent whole, for presenting to the user? Acknowledging that different information management tasks may require different kinds of summary, even from the same document, how should the data discarded by the reduction process be retained, in case a reference is necessary to a part of the document not originally included in the summary? What are the trade-offs in fixing the granularity of analysis: for instance, are sentences better than paragraphs as information-bearing passages, or are phrases even better? Of particular importance to this invention is the question of “user involvement.” From the end-user's point of view, making judgements, on the basis of a summary, concerning what a document is about and whether to pay it closer attention would engage the user in a sequence of actions: look at the summary, absorb its semantic impact, infer what the document might be about, decide whether to consult the source, somehow call up the full document, and navigate to the point(s) of interest. Given that this introduces a serious amount of cognitive and operational overhead, what are the implications for users when they are faced with a large, and growing, number of documents to deal with on a daily basis?
These are only some of the questions concerning the acceptability of summarization technology by end users. There is particular urgency, given the currently evolving notion of “information push”, where content arriving unsolicited, and in large quantities, at individual workstations threatens users with real and immediate information overload. To the extent that broad-coverage summarization techniques are beginning to get deployed in real-world situations, it is still the case that these techniques are based primarily on sentence extraction methods. In such a context, the above questions take on more specific interpretations. Thus, is it appropriate to concatenate together the sentences extracted as representative, especially when they come from disjoint parts of the source document? What could be done, within a sentence extraction framework, to ensure that all ‘themes’ in a document get represented by the set of sentences identified by the technology? How can the jarring effect of ‘dangling’ (and unresolved) references in the selection, without any obvious means of identifying the referents in the original text, be overcome? What mechanisms could be developed for offering the user additional information from the document, for more focused attention to detail? What is the value of the sentence, as a basic information-bearing unit, as a window into a multi-document space?
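The concatenation problem raised above can be made concrete with a small sketch. The Python below is an illustrative assumption, not the method of the present invention or of any cited system: it scores sentences by average word frequency (a crude stand-in for a similarity metric), keeps each extracted sentence together with the index of its source paragraph, and inserts a visible marker wherever whole paragraphs were skipped. The names `split_sentences`, `extract_summary` and `render_with_gaps` are hypothetical.

```python
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_summary(paragraphs, k=2):
    """Pick the k highest-scoring sentences, returned in document order.

    Each result is a (paragraph_index, sentence) pair so that the
    position of the sentence in the source is not lost.
    """
    sentences = [
        (p_idx, s)
        for p_idx, para in enumerate(paragraphs)
        for s in split_sentences(para)
    ]
    # Document-wide word frequencies (assumed scoring scheme).
    freq = Counter(
        w for _, s in sentences for w in re.findall(r"[a-z']+", s.lower())
    )

    def score(sentence):
        words = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[w] for w in words) / max(len(words), 1)

    top = sorted(
        range(len(sentences)),
        key=lambda i: score(sentences[i][1]),
        reverse=True,
    )[:k]
    return [sentences[i] for i in sorted(top)]


def render_with_gaps(extracted):
    """Join extracted sentences, marking places where paragraphs were skipped."""
    parts = []
    prev = None
    for p_idx, sentence in extracted:
        if prev is not None and p_idx - prev > 1:
            parts.append("[...]")
        parts.append(sentence)
        prev = p_idx
    return " ".join(parts)
```

Joining the extracted sentences with a plain space would hide the fact that they come from disjoint parts of the source; keeping the paragraph index is one simple way to let a presentation layer signal the gap.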
To illustrate some of these issues, consider several examples from an operational news tracking site: the News Channel page of Excite, an information vendor and a popular search engine host for the World Wide Web, which is available via the “Ongoing Coverage” section of the news tracking page, (http:/
t.excite.com). Under the heading of Articles about IRS Abuses Alleged, some entries read:
EXAMPLE 1
RENO ON Sunday/Reform Taxes the . . .
The problem, of course, is that the enemies of the present system are all grinding different axes. How true, how true, and ditto for most of the people who sit on the Finance Committee. (First found: Oct. 18, 1997)
EXAMPLE 2
Scheduled IRS Layoffs For 500 Are . . .
The Agency's original plan called for eliminating as many as 5,000 jobs in field offices and at the Washington headquarters. “The way this has turned out, it works to the agency's advantage, the employees' advantage and the union's advantage.” (First found: Oct. 17, 1997.)
Both examples present summaries as sentences which almost seamlessly follow one another. While this may account for acceptable readability, it is at best misleading, as in the original documents these sentences are several paragraphs apart. This makes it hard to know that the references to “How true, how true”, in the first example, and “The way this has turned out”, in the second, are not whatever might be mentioned in the preceding summary sentences, but are, in fact, hidden somewhere in the original text of the documents. Opening references to “The problem” and “the agency” are hard to resolve. The thrust of the second
