Distributed metadata searching system and method

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Details

C707S793000, C709S202000

active

06434548

ABSTRACT:

CROSS-REFERENCE TO RELATED APPLICATIONS
Not Applicable
BACKGROUND OF THE INVENTION
1. Field of the Invention
The invention relates to the field of Internet Search Engines, Web browsers, and resource gathering and has special application in situations where these functions must be implemented in extremely large networks.
2. Description of the Related Art
One of the greatest challenges for a repository that contains summary data (metadata) of external data resources is keeping this metadata up to date. Currently, the broad solution is to periodically and exhaustively search (recrawl) the stored resources, summarize them, and store the updated summary data in the repository, replacing the old data. This produces a huge workload. For example, a repository with 100 million documents and an average download and processing time of 30 seconds per document would require about 3 billion seconds (roughly 95 years) of sequential processing, which makes a complete recrawl very time consuming.
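To put this workload in perspective, the following sketch estimates how many gatherers working in parallel would be needed to refresh such a repository within a chosen period. The 100 million documents and the 30 seconds per document come from the text above; the seven-day refresh target, the assumption of perfectly parallel gatherers, and the function names are illustrative only.

```python
import math


def total_recrawl_seconds(documents: int, seconds_per_document: float) -> float:
    """Total processing time for one full recrawl, summed over all documents."""
    return documents * seconds_per_document


def gatherers_needed(documents: int, seconds_per_document: float,
                     refresh_seconds: float) -> int:
    """Smallest number of perfectly parallel gatherers that finishes in time."""
    total = total_recrawl_seconds(documents, seconds_per_document)
    return math.ceil(total / refresh_seconds)


if __name__ == "__main__":
    total = total_recrawl_seconds(100_000_000, 30)
    print(f"sequential recrawl: {total / (365 * 24 * 3600):.1f} years")

    one_week = 7 * 24 * 3600  # assumed refresh target, not from the source
    print("gatherers for a weekly refresh:",
          gatherers_needed(100_000_000, 30, one_week))
```

Under these assumptions a weekly refresh needs several thousand gatherers running continuously, which is the cost the related art places entirely on the search engine.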
The problem present with the prior art is the inherent difficulty for web crawlers to adequately search and process the vast amounts of information available on the Internet. Referring to FIG. 1, a typical web crawler (0101) would use one or more communication media (0111, 0112, 0113) with corresponding communication links (0121, 0122, 0123) to access a plethora of Internet web sites (0131, 0132, 0133) and thus incur 100% of the computing and time penalty cost for performing the search and organizing the data.
As stated previously, with the volume of data available on the Internet increasing at an exponential rate, the resources required to perform this searching and organizing effort are substantial and are becoming a significant burden on Internet web search engines. While current approaches to this problem involve the use of additional web crawlers (0101), this represents an unacceptable cost burden on Internet search sites given the recent trend toward nonlinear data growth on the Internet.
Accordingly, a need exists for a method and a system to permit reduction of the resources required for Internet search engines and their corresponding data retrieval and organization tasks.
SUMMARY OF THE INVENTION
The present invention relates to Internet search engines. Today's Internet search engines consist roughly of two major parts. One part is responsible for resource gathering. The other part handles information storage and indexing. The present invention addresses one problem that arises in resource gathering using web crawlers or gatherers.
By using a parallel architecture for the gatherers and new approaches (distributed team crawling), the amount of time needed to perform a complete web search and index can be dramatically reduced. However, it still takes a considerable amount of time and resources to keep the repository up to date. Adding more resources is generally the sole responsibility of the search engine's owners. The present invention seeks to reduce the time and resources spent by the search engine companies by placing some of the resource gathering tasks in the hands of the user's web browser. The present invention would typically load a small program (e.g., a Java applet) into the browser that would perform some specified resource gathering and summarization (a lightweight task). A user of the search engine would still perform all the steps they currently perform in activating a search, including: starting at the home page of the search engine; typing some keyword(s) and selecting “start search”; and viewing the results screen and selecting from the results.
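The text does not say how the search engine chooses which gathering tasks to hand to a visiting browser. One plausible approach, sketched here purely as an assumption, is to keep the stored URLs ordered by the time of their last refresh and give each browser a small batch of the stalest ones. The class name StaleUrlQueue, its methods, and the batch size are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple


class StaleUrlQueue:
    """Hands out the least recently refreshed URLs first (assumed policy)."""

    def __init__(self, last_refreshed: Dict[str, float]) -> None:
        # Heap entries are (last refresh timestamp, url); the oldest pops first.
        self._heap: List[Tuple[float, str]] = [
            (timestamp, url) for url, timestamp in last_refreshed.items()
        ]
        heapq.heapify(self._heap)

    def assign(self, batch_size: int) -> List[str]:
        """Remove and return up to batch_size of the stalest URLs for one browser."""
        batch: List[str] = []
        while self._heap and len(batch) < batch_size:
            _, url = heapq.heappop(self._heap)
            batch.append(url)
        return batch

    def report(self, url: str, refreshed_at: float) -> None:
        """Record a result returned by a browser and put the URL back in the queue."""
        heapq.heappush(self._heap, (refreshed_at, url))


if __name__ == "__main__":
    queue = StaleUrlQueue({
        "http://a.example/": 100.0,
        "http://b.example/": 50.0,
        "http://c.example/": 75.0,
    })
    batch = queue.assign(2)           # the two stalest URLs go to one browser
    print(batch)
    for url in batch:
        queue.report(url, 200.0)      # the browser reports back; URLs are requeued
    print(queue.assign(1))            # the remaining stale URL is handed out next
```

A real deployment would also need to handle browsers that never report back, for example by requeueing assignments after a timeout; that is omitted here.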
The present invention could be loaded for as many URLs as the search engine owner decides. However, to the user of the search engine the present invention does not need to appear graphically and may operate as a background task. With the present invention in place, a small program is loaded into the user's Internet browser and can be directed to perform several information gathering tasks, such as: crawling a specified URL; informing the search engine whether the site has been updated since a particular date; and informing the search engine of any changes to web data since a particular date.
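The three gathering tasks listed above can be sketched as follows. This is a minimal illustration written in Python for brevity rather than as the Java applet the text describes. The Page and Summary types, the injected fetch function, and the use of a SHA-256 digest and word count as the "summary" are all assumptions made for the example; the source does not specify what the summary data contains.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Optional


@dataclass(frozen=True)
class Page:
    body: str
    last_modified: datetime


@dataclass(frozen=True)
class Summary:
    url: str
    digest: str      # SHA-256 of the body, usable to detect content changes
    word_count: int


# Hypothetical fetcher type: maps a URL to its current page.
Fetch = Callable[[str], Page]


def crawl(url: str, fetch: Fetch) -> Summary:
    """Task 1: download a page and reduce it to a small summary."""
    page = fetch(url)
    digest = hashlib.sha256(page.body.encode("utf-8")).hexdigest()
    return Summary(url, digest, len(page.body.split()))


def updated_since(url: str, since: datetime, fetch: Fetch) -> bool:
    """Task 2: report whether the page was modified after the given date."""
    return fetch(url).last_modified > since


def changes_since(url: str, since: datetime, fetch: Fetch) -> Optional[Summary]:
    """Task 3: return a fresh summary only if the page changed after the date."""
    page = fetch(url)
    if page.last_modified <= since:
        return None
    return crawl(url, lambda _url: page)  # reuse the page already fetched


if __name__ == "__main__":
    # In-memory stand-in for the network; a real gatherer would issue HTTP requests.
    pages = {"http://example.com/": Page("hello metadata world", datetime(2000, 1, 15))}
    fetch = pages.__getitem__
    print(crawl("http://example.com/", fetch))
    print(updated_since("http://example.com/", datetime(2000, 1, 1), fetch))
    print(changes_since("http://example.com/", datetime(2000, 2, 1), fetch))
```

Passing the fetcher in as a parameter keeps the tasks free of local side effects, which matches the text's point that the gatherer only reads remote data and sends results back to the search engine.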
The present invention generally operates within the context of the user's computer on a voluntary basis. Possible motivation for the user to participate in this searching methodology could include (but not be limited to) the following:
Free membership (receive free reviews, articles, research material, free notification service of search results) from the Internet search engine;
Some reward (for example, a free T-shirt, book, or CD in return for a specific amount of donated computing resources);
Through participation, the search engine will be more up to date and provide improved search accuracy. In this manner, users of the search engine community may actively help to improve the search quality by their participation.
Because the present invention can be implemented in Java (sandbox model, etc.), it is secure and cannot inflict any damage on the user's computer, since it has no write access to the user's storage systems. Processing results are sent to the web server over a network connection. Furthermore, Java applets are already a common standard and enjoy high acceptance among Internet users.


