System and method for dynamically associating keywords with...

Reexamination Certificate

Filed: 2000-04-04
Issued: 2003-02-04
Examiner: Vu, Kim (Department: 2172)
Classification: Data processing: database and file management or data structures; Database design; Data structure types
U.S. class: C707S793000
Status: active
Patent number: 06516312

FIELD OF THE INVENTION
The present invention relates to the field of data processing, and particularly to a software system and associated method for use with a search engine, to search data maintained in systems that are linked together over an associated network such as the Internet. More specifically, this invention pertains to a computer software product for dynamically associating keywords encountered in abstracts or summaries of a search result set, with domain-specific search engine queries, in order to retrieve resources pertaining to the keywords within the context of a current information sphere.
BACKGROUND OF THE INVENTION
The World Wide Web (WWW) is comprised of an expansive network of interconnected computers upon which businesses, governments, groups, and individuals throughout the world maintain inter-linked computer files known as web pages. Users navigate these pages by means of computer software programs commonly known as Internet browsers. Due to the vast number of WWW sites, many web pages have a redundancy of information or share a strong likeness in either function or title. The vastness of the unstructured WWW causes users to rely primarily on Internet search engines to retrieve information or to locate businesses. These search engines use various means to determine the relevance of a user-defined search to the information retrieved.
The authors of web pages provide information known as metadata within the body of the hypertext markup language (HTML) document that defines the web pages. A computer software product known as a web crawler systematically accesses web pages by sequentially following hypertext links from page to page. The crawler indexes the pages for use by the search engines using information about a web page as provided by its address or Uniform Resource Locator (URL), metadata, and other criteria found within the page. The crawler is run periodically to update previously stored data and to append information about newly created web pages. The information compiled by the crawler is stored in a metadata repository or database. The search engines search this repository to identify matches for the user-defined search rather than attempt to find matches in real time.
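The crawl-and-index cycle described above can be illustrated with a minimal Python sketch. It is not taken from the patent: the `PAGES` dictionary is a hypothetical in-memory stand-in for the web (no network access is performed), and the names `PageParser` and `crawl` are illustrative assumptions.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical in-memory "web": URL -> HTML source (stands in for HTTP fetches).
PAGES = {
    "http://example.com/a": '<html><head><title>Page A</title>'
        '<meta name="keywords" content="java, applet"></head>'
        '<body><a href="http://example.com/b">B</a></body></html>',
    "http://example.com/b": '<html><head><title>Page B</title>'
        '<meta name="description" content="About servlets"></head>'
        '<body><a href="http://example.com/a">A</a></body></html>',
}


class PageParser(HTMLParser):
    """Collects the title, named meta tags, and outgoing links of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name"):
            self.meta[attrs["name"].lower()] = attrs.get("content") or ""
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def crawl(start_url, pages):
    """Follow links breadth-first and store per-page metadata in a repository."""
    repository = {}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        if url in repository or url not in pages:
            continue  # already indexed, or not reachable in this toy web
        parser = PageParser()
        parser.feed(pages[url])
        repository[url] = {"title": parser.title, "meta": parser.meta}
        queue.extend(parser.links)
    return repository


repository = crawl("http://example.com/a", PAGES)
```

A search engine built on this sketch would query `repository` rather than the pages themselves, which mirrors the point that matches are found in the stored metadata rather than in real time.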
A typical search engine has an interface with a search window where the user enters an alphanumeric search expression or keywords. The search engine sifts through available web sites for the user's search terms, and returns the search results in the form of HTML pages. Each search result includes a list of individual entries that have been identified by the search engine as satisfying the user's search expression. Each entry or “hit” includes a hyperlink that points to a Uniform Resource Locator (URL) location or web page.
In addition to the hyperlink, certain search result pages include a short summary or abstract that describes the content of the URL location. Typically, search engines generate this abstract from the file at the URL, and only provide acceptable results for URLs that point to HTML format documents. For URLs that point to HTML documents or web pages, a typical abstract includes a combination of values selected from HTML tags. These values may include text from the web page's “title” tag, from what are referred to as “annotations” or “meta tag values” such as “description,” “keywords,” etc., from “heading” tag values (e.g., H1, H2 tags), or from some combination of the content of these tags.
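As a rough illustration of how such an abstract might be assembled from tag values, the following Python sketch combines the title, selected meta tags, and headings. The field order, the `" | "` separator, the length limit, and the names `AbstractParser` and `make_abstract` are assumptions made for this example, not details given in the text.

```python
from html.parser import HTMLParser


class AbstractParser(HTMLParser):
    """Collects text from the title, H1/H2 headings, and selected meta tags."""

    CAPTURE = {"title", "h1", "h2"}

    def __init__(self):
        super().__init__()
        self.fields = {}      # tag name or meta name -> collected text
        self._current = None  # tag whose text is currently being captured

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = (attrs.get("name") or "").lower()
        if tag == "meta" and name in ("description", "keywords"):
            self.fields[name] = attrs.get("content") or ""
        elif tag in self.CAPTURE:
            self._current = tag

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current:
            previous = self.fields.get(self._current, "")
            self.fields[self._current] = previous + data


def make_abstract(html_text,
                  order=("title", "description", "h1", "h2", "keywords"),
                  max_len=200):
    """Join the non-empty fields in the given order and truncate the result."""
    parser = AbstractParser()
    parser.feed(html_text)
    parts = [parser.fields[k].strip() for k in order
             if parser.fields.get(k, "").strip()]
    return " | ".join(parts)[:max_len]
```

Because this sketch reads only HTML tags, it would have nothing useful to say about a non-HTML document, which is consistent with the limitation noted above.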
However, for one HTML parent page with links to multiple different relevant non-HTML documents that satisfy the user's search criteria, the search result may include multiple identical URLs, one for each relevant non-HTML document. Each of these identical URLs points to the same HTML parent page, and each may include an identical abstract that is descriptive of the parent HTML page. As a result, the search results in redundant abstracts that can be practically useless, distracting, and time consuming to review.
More specifically, the popularity of domain-specific portal sites that act as gateways to very specialized information sources has grown concurrently with the WWW, both in complexity and volume of data. The term “portal” is generally synonymous with gateway, and is typically used to refer to a WWW site which is intended to be a major starting site or an anchor site for web users. Current leading general-purpose portal sites include: Yahoo!®, Excite®, Netscape®, Lycos®, Cnet®, and MSN The Microsoft Network®. However, while such portal sites attempt to serve as gateways to a wide variety of general-purpose information, specialized portals have also been gaining popularity in recent years.
Specialized portal sites, such as the jCentral®, xCentral, etc., attempt to focus on a particular domain that appeals to a target audience. By limiting the scope of their operation, the belief is that specialized portal sites will be able to present information of greater relevance to their target audience.
For example, in a portal site such as jCentral® that caters to users interested in learning more about the Java programming language and related topics, the users are allowed to conduct a search by querying the portal database. The portal database is a vast repository of pre-collected, indexed, and summarized information, typically gathered from the WWW using automated crawling tools. When a user enters a query, the portal's search engine attempts to match the keywords specified by the user with summarized metadata that have been previously extracted from the documents stored in the repository, and then returns an ordered list of potential candidate matches relevant to the user's query.
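The matching step can be sketched in Python as follows. The scoring rule (a count of query-term occurrences in the stored summary) and the name `search` are assumptions chosen for illustration; the text does not specify how relevance is computed.

```python
def search(query, repository):
    """Return URLs ordered by how often the query terms occur in their metadata.

    `repository` maps each URL to a summary string previously extracted
    from the document (a hypothetical, simplified portal database).
    """
    terms = query.lower().split()
    scored = []
    for url, metadata in repository.items():
        words = metadata.lower().split()
        score = sum(words.count(term) for term in terms)
        if score > 0:
            scored.append((score, url))
    # Highest score first; ties are broken alphabetically by URL.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [url for _, url in scored]
```

Documents that share no term with the query are left out of the result, so the returned list contains only candidate matches.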
Typically, the search engine will return a result set for a search query including a URL and a text based abstract of the original resource. Sometimes, users are able to control the length of the abstract. For instance, the HotBot® site at URL: http://www.hotbot.com, provides the choice of having only a list of URLs displayed as the search result, the URL with a brief abstract, or a comprehensive abstract.
However, since the abstract is usually generated on the server side, a resulting problem is the inability of the users to obtain more detailed information pertaining to domain-specific terms that appear in the abstract, without issuing a separate query with the relevant term as the new keyword. By so doing, the user might become distracted and distanced from the original search result. Moreover, the conventional search engines do not provide the capability to allow users to dynamically conduct an automatic search based on keywords that appear in an abstract or summary. Rather, the full text of the abstract or summary is displayed to the user.
There is currently no adequate mechanism by which search engines allow the user to dynamically interface with the search abstract, such as by selecting a term of interest in the abstract to obtain more information about this term within the context of the domain being queried. The need for such a mechanism has heretofore remained unsatisfied.
SUMMARY OF THE INVENTION
The abstract keywords association system and associated method of the present invention satisfy this need. In accordance with one embodiment, the abstract keywords association system allows the user to dynamically interface with the search abstract. The user selects a term of interest in the abstract, and the abstract keywords association system automatically provides the user with additional information about this term within the context of the domain being queried. This permits the user to consider more information and to better judge the usefulness of the resource and search result.
The abstract keywords association system of the present invention provides several features and advantages, among which are the following:
The ability to automatically detect and select keywords from abstracts of search result items, by using a domain-specific dictionary of keywords.
The ability to select and generate an optimal query string for a particular keyword.
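The two features listed above can be illustrated with a minimal Python sketch that scans an abstract for dictionary keywords and turns each one into a hyperlink carrying a keyword-specific query. Everything named here is hypothetical: the `DICTIONARY` entries, the `SEARCH_URL` endpoint, the `q` parameter, and the function `link_keywords` are assumptions for illustration, and storing a ready-made query per keyword is only one simple way to stand in for query-string selection.

```python
import html
import re
from urllib.parse import urlencode

# Hypothetical domain-specific dictionary: keyword -> query string for that keyword.
DICTIONARY = {
    "servlet": "servlet AND tutorial",
    "javabeans": "JavaBeans component",
}

# Hypothetical search endpoint of the portal.
SEARCH_URL = "http://portal.example/search"


def link_keywords(abstract, dictionary=DICTIONARY, search_url=SEARCH_URL):
    """Wrap each dictionary keyword found in the abstract in a search hyperlink."""

    def replace(match):
        token = match.group(0)
        query = dictionary.get(token.lower())
        if query is None:
            return html.escape(token)  # not a keyword: emit the text unchanged
        href = search_url + "?" + urlencode({"q": query})
        return '<a href="{}">{}</a>'.format(html.escape(href), html.escape(token))

    # Tokenize into alternating runs of word and non-word characters.
    return re.sub(r"\w+|\W+", replace, abstract)
```

Selecting a linked term in the rendered abstract would then issue the associated query against the same portal, so the user stays within the context of the domain being searched.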

Affiliated with

Kraft Reiner

Inventor

Tewari Gaurav

Inventor

Also associated with

International Business Machines Corporation

Corporate Assignee

Kassatly Samuel A.

Attorney

Pham Hung

Examiner

Vu Kim

Examiner
