Data processing: database and file management or data structures – Database design – Data structure types
Reexamination Certificate
1998-04-22
2001-01-30
Alam, Hosain T. (Department: 2771)
C707S793000
active
06182065
ABSTRACT:
TECHNICAL FIELD
The present invention relates to a search engine for searching data stored in a database.
BACKGROUND OF THE INVENTION
In recent years, there has been explosive growth in the Internet, and in particular of the World Wide Web (WWW), which is one of the facilities provided via the Internet. The WWW comprises many pages or files of information, distributed across many different remote servers. Each page is identified by an individual address or “Uniform Resource Locator (URL)”. Each URL denotes both a remote server, and a particular file or page on that remote server. There may be many pages or URLs resident on a single remote server.
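As an illustration of how a URL denotes both a remote server and a page on that server, the following minimal Python sketch splits a URL into those two parts. The function name `split_url` and the example address are assumptions made for illustration only; they do not appear in the specification.

```python
from urllib.parse import urlsplit


def split_url(url):
    """Return (server, page) for a URL; illustrative helper, not from the patent."""
    parts = urlsplit(url)
    # netloc names the remote server; path names the page or file on that server.
    # An empty path is treated as the server's root page.
    return parts.netloc, parts.path or "/"


# Hypothetical example address.
server, page = split_url("http://www.example.com/docs/index.html")
print(server)  # www.example.com
print(page)    # /docs/index.html
```

Several URLs that share the same server part correspond to several pages resident on a single remote server.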
Typically, to utilise the WWW, a user runs a computer program called a Web browser on a user terminal such as a personal computer system. Examples of widely available Web browsers include the “WebExplorer” Web browser provided by International Business Machines Corporation in the OS/2 Operating System software, or the “Navigator” Web browser available from Netscape Communications Corporation. The user interacts with the Web browser to select a particular URL. The interaction causes the browser to send a request for the page or file identified in the selected URL to the server identified in the selected URL. Typically, the remote server responds to the request by retrieving the requested page, and transmitting the data for that page back to the requesting user terminal. The client-server interaction between the user terminal and the remote server is usually performed in accordance with a protocol called the hypertext transfer protocol (“http”). The page received by the user terminal is then displayed to the user on a display screen of the client. The client may also cause the server to launch an application such as a search engine to search for WWW pages relating to particular topics stored on other servers connected to the Internet.
WWW pages are typically formatted in accordance with a computer language known as hypertext markup language (“html”). Thus a typical WWW page includes text together with embedded formatting commands, referred to as tags, that can be employed to control, for example, font style, font size, layout, etc. The Web browser parses the html in order to display the text in accordance with the specified format. In addition, an html page may also contain a reference, in terms of another URL, to a portion of multimedia data such as an image, video segment, or audio file. The Web browser responds to such a reference by retrieving and displaying or playing the multimedia data. Alternatively, the multimedia data may reside on its own WWW page, without surrounding html text.
Most WWW pages also contain one or more references to other WWW pages, which need not reside on the same server as the original page. Such references may be activated by the user selecting particular locations on the screen, typically by clicking a mouse control button. These references or locations are known as hyperlinks, and are typically flagged by the Web browser in a particular manner. For example, any text associated with a hyperlink may be displayed in a different colour. If a user selects the hyperlinked text, then the referenced page is retrieved and replaces the currently displayed page.
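By way of illustration, the references that one page makes to other pages can be collected by parsing its html tags. The sketch below uses Python's standard `html.parser` module; the class name `LinkExtractor`, the function `extract_links` and the sample page are hypothetical and serve only to show the idea.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the target URL of every <a href="..."> tag in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative references against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html_text):
    parser = LinkExtractor(base_url)
    parser.feed(html_text)
    parser.close()
    return parser.links


# Hypothetical page containing one same-server and one other-server hyperlink.
sample = ('<p>See <a href="/about.html">about</a> and '
          '<a href="http://other.example.org/">another site</a>.</p>')
print(extract_links("http://www.example.com/index.html", sample))
```

The second link in the sample shows that a referenced page need not reside on the same server as the page that refers to it.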
Further information about html and the WWW can be found in “World Wide Web and HTML” by Douglas McArthur, pp. 18-26 in Dr. Dobb's Journal, December 1994, and in “The HTML Source Book” by Ian Graham, John Wiley, New York, 1995.
Conventional search engines, such as AltaVista (trade mark of Digital Equipment Corporation) and Yahoo! (trade mark of Yahoo! Inc.) search a database containing URLs of WWW pages together with one or more keywords associated with each URL. The URLs and keywords are typically sent to the entity responsible for maintaining the database by the entities responsible for the corresponding WWW pages. In operation, a typical search engine receives a search parameter from a user terminal and responds by searching the database for keywords matching the search parameter. When a match is found, the search engine adds the corresponding URL, typically in the form of a hypertext link, to a list which, in turn, is sent to the user. The user then selects a WWW page to access from the list.
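The conventional keyword search described above might be sketched as follows. This is a simplified illustration, not the implementation of any named search engine: the in-memory `DATABASE`, its URLs and keywords, and the function names are all assumptions.

```python
# Hypothetical database: each URL is stored with its associated keywords.
DATABASE = {
    "http://a.example.com/cats.html": {"cats", "pets"},
    "http://a.example.com/dogs.html": {"dogs", "pets"},
    "http://b.example.org/cars.html": {"cars"},
}


def keyword_search(database, search_parameter):
    """Return the URLs whose keywords match the search parameter."""
    term = search_parameter.strip().lower()
    return sorted(url for url, keywords in database.items() if term in keywords)


def as_hyperlinks(urls):
    """Format each matching URL as a hypertext link for the result list."""
    return ['<a href="{0}">{0}</a>'.format(url) for url in urls]


for line in as_hyperlinks(keyword_search(DATABASE, "Pets")):
    print(line)
```

Note that the result is a flat list: nothing in it indicates which of the matching pages are related to one another.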
A problem with conventional search engines is that they tend to return very large lists of WWW pages in response to each inquiry. Typically, a large fraction of the WWW pages listed in response to an inquiry originate at a single WWW site. The volume of WWW pages listed by a search engine in response to an inquiry makes subsequent selection of a desired WWW page by a user difficult and time-consuming.
SUMMARY OF THE INVENTION
In accordance with the present invention, there is now provided a search engine for searching a database containing a plurality of data entries wherein one or more of the data entries comprise a link to one or more others of the data entries, the search engine comprising: means for receiving an input search parameter from a user; means for comparing the input search parameter with the plurality of data entries; means responsive to the comparison means for identifying from the plurality of data entries a set of data entries matching the input search parameter; means for dividing the set of matched data entries into sub-sets, each sub-set comprising data entries having links to each other; and, means for determining, for each data entry of each sub-set, a weighting in dependence on the number of links contained in each data entry to others of the data entries of the corresponding sub-set.
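One possible reading of the dividing and weighting steps is sketched below in Python. It rests on two assumptions that the text does not state: entries are grouped whenever a link in either direction connects them (so each sub-set is a connected group of matched entries), and the weighting of an entry is simply the count of its own links to other entries of the same sub-set. The function names and the entries "A" to "D" and "X" are hypothetical.

```python
def divide_into_subsets(matched, links):
    """Group matched entries that are connected by links (assumed undirected).

    matched: set of entry identifiers that matched the search parameter.
    links:   dict mapping an entry to the entries it links to.
    """
    neighbours = {entry: set() for entry in matched}
    for src in matched:
        for dst in links.get(src, ()):
            if dst in matched and dst != src:
                neighbours[src].add(dst)
                neighbours[dst].add(src)

    subsets, seen = [], set()
    for start in sorted(matched):
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:  # depth-first walk over linked entries
            entry = stack.pop()
            if entry in group:
                continue
            group.add(entry)
            stack.extend(neighbours[entry] - group)
        seen |= group
        subsets.append(group)
    return subsets


def weigh(subset, links):
    """Weight = number of links an entry contains to others in its sub-set."""
    return {e: len((set(links.get(e, ())) & subset) - {e}) for e in subset}


# Hypothetical data: "X" is linked to but did not match the search.
matched = {"A", "B", "C", "D"}
links = {"A": {"B", "C", "X"}, "B": {"A"}, "C": set(), "D": {"X"}}
for subset in divide_into_subsets(matched, links):
    print(sorted(weigh(subset, links).items()))
```

With this data the matched entries fall into two sub-sets, {A, B, C} and {D}; links to the unmatched entry "X" contribute nothing to any weighting.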
In preferred embodiments of the present invention, the search engine comprises means for providing the sub-sets of matched data entries to the user.
In particularly preferred embodiments of the present invention, the search engine comprises means for providing the sub-sets of matched data entries to the user, arranged as a function of the weights determined for each data entry therein.
Preferred examples of the present invention comprise means for providing the weights determined for each data entry in the sub-sets to the user.
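A minimal sketch of how weighted sub-sets might be arranged and shown to the user follows. It assumes each sub-set is already available as a mapping from entry to weight, and it assumes "arranged as a function of the weights" means descending weight order within each sub-set; both the ordering rule and the names `arrange_subsets` and `render` are illustrative choices rather than details given in the text.

```python
def arrange_subsets(weighted_subsets):
    """Order the entries of each sub-set by descending weight.

    weighted_subsets: list of dicts, each mapping entry -> weight.
    Ties are broken by entry name so the output is deterministic.
    """
    arranged = []
    for weights in weighted_subsets:
        ranked = sorted(weights.items(), key=lambda item: (-item[1], item[0]))
        arranged.append(ranked)
    return arranged


def render(arranged):
    """Produce a plain-text listing that shows each entry with its weight."""
    lines = []
    for number, ranked in enumerate(arranged, start=1):
        lines.append("Group {}".format(number))
        for entry, weight in ranked:
            lines.append("  {} (weight {})".format(entry, weight))
    return "\n".join(lines)


# Hypothetical weights for two sub-sets of matched entries.
example = [{"B": 1, "A": 2, "C": 0}, {"D": 0}]
print(render(arrange_subsets(example)))
```

Listing the highest-weighted entry first in each group gives the user a natural starting point for exploring the related pages of that group.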
Preferably, the data entries contained in the database are representative of WWW pages stored on the Internet.
It will be appreciated that the present invention extends to a computer system comprising a central processing unit, memory means, a bus architecture interconnecting the memory means and the central processing unit, and a search engine as hereinbefore described stored in the memory means for activation by the central processing unit.
Viewing the present invention from another aspect, there is now provided a method for searching a database containing a plurality of data entries wherein one or more of the data entries comprise a link to one or more others of the data entries, the method comprising: receiving an input search parameter from a user; comparing the input search parameter with the plurality of data entries; in response to the comparison, identifying from the plurality of data entries a set of data entries matching the input search parameter; dividing the set of matched data entries into sub-sets, each sub-set comprising data entries having links to each other; and, determining, for each data entry of each sub-set, a weighting in dependence on the number of links contained in each data entry to others of the data entries of the corresponding sub-set.
Viewing the present invention from yet another aspect, there is now provided a computer program product for searching a database containing a plurality of data entries wherein one or more of the data entries comprise a link to one or more others of the data entries, the product comprising: first code means for receiving an input search parameter from a user; second code means for comparing the input search parameter with the plurality of data entries; third code means responsive to the comparison for identifying from the plurality of data entries a set of data entries matching the input search parameter; fourth code means for dividing the set of
Alam Hosain T.
Clay A. Bruce
Corrielus Jean M.
International Business Machines Corp.