System and method for geographically organizing and...

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Details

C707S793000, C715S252000

active

06691105

ABSTRACT:

BACKGROUND OF THE INVENTION
The present invention generally relates to a resource discovery system and method for facilitating local commerce on the World-Wide Web and for reducing search time by accurately isolating information for end-users. For example, distinguishing and classifying business pages on the Web by business categories using the Standard Industrial Classification (SIC) codes is achieved through an automatic iterative process which effectively localizes the Web.
Description of the Related Art
Resource discovery systems have been widely studied and deployed to collect and index textual content contained on the World-Wide Web. However, as the volume of accessible information continues to grow, it becomes increasingly difficult to index and locate relevant information. Moreover, global flat file indexes become less useful as the information space grows, because user queries match too much information.
Leading organizations are attempting to classify and organize all of Web space in some manner. The most notable example is Yahoo, Inc. which manually categorizes Web sites under fourteen broad headings and 20,000 different sub-headings. Still others are using advanced information retrieval and mathematical techniques to automatically bring order out of chaos on the Web.
This information overload problem has been addressed by C. Mic Bowman et al. in Harvest: A Scalable, Customizable Resource Discovery and Access System. Harvest supports resource discovery through topic-specific content indexing made possible by a very efficient distributed information gathering architecture. However, these topic-specific brokers require manual construction, and they are geared more toward academic and scientific research than commercial applications.
Cornell's SMART engine developed by Gerard Salton uses a thesaurus to automatically expand a user's search and capture more documents. Individual, Inc. uses this system to sift through vast amounts of textual data from news sources by filtering, capturing, and ranking articles and documents based on news industry classification.
The latest attempts for automated topic-specific indexing include the Excite, Inc. search engine which uses statistical techniques to build a self-organizing classification scheme. Excite Inc.'s implementation is based on a modification of the popular inverted word indexing technique which takes into account concepts (i.e., synonymy and homonymy) and analyzes words that frequently occur together. Oracle has developed a system called ConText to automatically classify documents under a nine-level hierarchy that identifies a quarter-million different concepts by understanding the written English language. ConText analyzes a document and then decides which of the concepts best describe the document's topic.
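For orientation, the plain inverted word index that such systems build on can be sketched as follows, together with a simple count of words that occur in the same document. This is a generic illustration, not Excite's implementation; the function name `build_index` and the sample documents are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations

def build_index(docs):
    """Build an inverted index and a word co-occurrence table.

    docs maps a document id to its text. Tokenization is a naive
    whitespace split, which is an assumption made for brevity.
    """
    index = defaultdict(set)   # word -> set of document ids containing it
    cooc = Counter()           # (word_a, word_b) -> number of shared documents
    for doc_id, text in docs.items():
        words = sorted(set(text.lower().split()))
        for word in words:
            index[word].add(doc_id)
        # Count each unordered pair once per document (pairs are sorted).
        for a, b in combinations(words, 2):
            cooc[(a, b)] += 1
    return index, cooc

# Hypothetical sample documents.
docs = {1: "pizza delivery ithaca", 2: "pizza oven", 3: "pizza delivery"}
index, cooc = build_index(docs)
```

Pairs with a high count in `cooc` are candidates for being treated as related terms at query time.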
The systems described above all attempt to organize the vast amounts of data residing on the Web. However, these mathematical information retrieval techniques for classifying documents only work when the message of a document is directly correlated to the words it contains. Isolating documents by region, or separating business content from personal content in an automated fashion, is not addressed by any conventional system or structure.
SUMMARY OF THE INVENTION
It is therefore an object of the present invention to provide a method and system for overcoming the above-mentioned problems of the conventional methods and techniques.
The invention is based on a heuristic algorithm which exploits common Web page design principles. The key challenge is to ascertain the owner of a Web page through an iterative process. Knowing the owner of a Web page helps identify the nature of the content (business or personal) which, in turn, helps identify the geographic location.
In a first aspect of the invention, a method of classifying a source publishing a document on a portion of a network includes the steps of electronically receiving a document; determining, based on the document, a source which published the document; and assigning a code to the document based on whether data associated with the document published by the source matches data contained in a database.
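The three steps of the first aspect can be illustrated with a minimal sketch. It assumes, purely for illustration, that the "data associated with the document" is a phone number found in the text and that the database is an in-memory dictionary keyed by that number. The names `BUSINESS_DB`, `extract_phone`, and `classify_document`, and the sample entry, are hypothetical and not taken from the patent.

```python
import re
from typing import Optional

# Hypothetical database: normalized phone number -> (business name, SIC code).
# The entry is a placeholder; 5812 is used here as an example SIC code.
BUSINESS_DB = {
    "6075550142": ("Example Pizza Co.", "5812"),
}

# Matches common US formats such as (607) 555-0142 or 607.555.0142.
PHONE_RE = re.compile(r"\(?(\d{3})\)?[\s.-]?(\d{3})[\s.-]?(\d{4})")

def extract_phone(text: str) -> Optional[str]:
    """Return the first phone number in text as ten digits, or None."""
    match = PHONE_RE.search(text)
    return "".join(match.groups()) if match else None

def classify_document(text: str) -> Optional[str]:
    """Assign a code to a received document if its data matches the database.

    Step 1: the document text is received as the argument.
    Step 2: the source is identified via data found in the document.
    Step 3: a code is assigned only when that data matches a database entry.
    """
    phone = extract_phone(text)
    if phone is None:
        return None
    entry = BUSINESS_DB.get(phone)
    return entry[1] if entry else None
```

A document with no matching data simply receives no code, which leaves room for the iterative process mentioned above to try other evidence.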
In a second aspect, a search engine is provided for use on a network for distinguishing between business web pages and personal web pages. The search engine includes a mechanism for parsing the content of a hyper-text markup language (HTML) document at a web address and searching for criteria contained therein, a mechanism for analyzing a uniform resource locator (URL) of the web address to determine characteristics of a web page at the web address, a mechanism for determining whether the criteria match data contained in a database, and a mechanism for cross-referencing a match, determined by the determining mechanism, to a second database, to classify a source which published the web page.
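The four mechanisms of the second aspect can be sketched with the Python standard library. Several details here are assumptions and not stated in the text: the criterion searched for is a business name, a "~user" path segment is treated as a hint of a personal page, and both databases (`NAME_DB`, `DETAIL_DB`) are hypothetical in-memory dictionaries with placeholder entries.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class TextExtractor(HTMLParser):
    """Collects the text content of an HTML document."""
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

def page_text(html: str) -> str:
    """Parse HTML and return its text with whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    return " ".join(" ".join(parser.chunks).split())

def url_traits(url: str) -> dict:
    """Derive simple characteristics of a page from its URL."""
    parsed = urlparse(url)
    segments = [seg for seg in parsed.path.split("/") if seg]
    return {
        "host": parsed.hostname or "",
        # Assumed heuristic: "~user" directories often hold personal pages.
        "looks_personal": any(seg.startswith("~") for seg in segments),
        "depth": len(segments),
    }

# Hypothetical first database: business name -> business id.
NAME_DB = {"example pizza co.": "B001"}
# Hypothetical second database: business id -> (SIC code, city).
DETAIL_DB = {"B001": ("5812", "Ithaca, NY")}

def classify_page(url: str, html: str) -> dict:
    """Classify the source of a page as business, personal, or unknown."""
    traits = url_traits(url)
    text = page_text(html).lower()
    if not traits["looks_personal"]:
        for name, biz_id in NAME_DB.items():
            if name in text:
                # Cross-reference the match to the second database.
                sic, city = DETAIL_DB[biz_id]
                return {"kind": "business", "sic": sic, "city": city}
    return {"kind": "personal" if traits["looks_personal"] else "unknown"}
```

In this sketch the URL evidence takes precedence over a name match, so a personal page that merely mentions a business is not classified as that business.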


REFERENCES:
patent: 4586091 (1986-04-01), Panaoussis
patent: 4809081 (1989-02-01), Linehan
patent: 5200993 (1993-04-01), Wheeler et al.
patent: 5384835 (1995-01-01), Wheeler et al.
patent: 5404510 (1995-04-01), Smith et al.
patent: 5412804 (1995-05-01), Krishna
patent: 5452445 (1995-09-01), Hallmark et al.
patent: 5485608 (1996-01-01), Lomet et al.
patent: 5485610 (1996-01-01), Gioielli et al.
patent: 5495608 (1996-02-01), Antoshenkov
patent: 5500929 (1996-03-01), Dickinson
patent: 5530852 (1996-06-01), Meske, Jr. et al.
patent: 5751961 (1998-05-01), Smyk
patent: 5764906 (1998-06-01), Edelstein et al.
patent: 5778367 (1998-07-01), Wesinger, Jr. et al.
patent: 5805810 (1998-09-01), Maxwell
patent: 5828990 (1998-10-01), Nishino et al.
patent: 5878233 (1999-03-01), Schloss
patent: 5878398 (1999-03-01), Tokuda et al.
patent: 5930474 (1999-07-01), Dunworth et al.
patent: 5948040 (1999-09-01), DeLorme et al.
patent: 6002853 (1999-12-01), de Hond
patent: 6148289 (2000-11-01), Virdy
patent: 2 114 407 (1983-08-01), None
patent: WO 93/18484 (1993-09-01), None
patent: WO 95/08809 (1995-03-01), None
patent: WO 95/09395 (1995-04-01), None
Schwartz et al., “Applying an Information Gathering Architecture to Netfind: A White Pages Tool for a Changing and Growing Internet”. IEEE/ACM Transactions on Networking, vol. 2, No. 5, Oct. 1994, pp. 426-439.*
Database 16 (IAC PROMT) on Dialog, No. 6303723, “IMPERATIVE! Announces New Site for Locating Companies on the Internet”, PR Newswire, Jul. 22, 1996, 1 page.
Database 148 (Trade and Industry Database) on Dialog, No. 7559252, “Open Market Inc. offers Internet Users Free Access to Directory of Commercial Sites on the Internet”, Business Wire, Nov. 8, 1994, 2 pages.
