Electrical computers and digital processing systems: multicomput – Miscellaneous
Reexamination Certificate
1998-12-30
2002-10-08
Barot, Bharat (Department: 2154)
C709S203000, C709S217000, C709S229000, C707S793000
active
06463455
ABSTRACT:
FIELD OF THE INVENTION
This invention relates generally to a method and computer-readable medium for analyzing data and, more specifically, to a method and computer-readable medium for analyzing data from a plurality of network sites.
BACKGROUND OF THE INVENTION
Internet crawlers query web sites in order to get index information and provide Internet search data. In the past, no tool has existed that adequately analyzes the data resulting from web crawlers querying web sites. In this regard, it is desirable for an Internet analysis tool to provide statistics about data found on Internet sites. Desirable statistics include such diverse information as the percentage of educational sites, the average amount of graphics per site, the average number of hyperlinks per site, etc.
An acceptable Internet analysis tool must be able to query a large volume of web sites, scan the hypertext markup language (HTML) files downloaded from the sites and provide results of analysis criteria based on the contents of the HTML files. The tool should be able to process large volumes of data without operator intervention. The present invention is directed to providing such a tool.
SUMMARY OF THE INVENTION
In accordance with the present invention, a method and computer-readable medium for analyzing network data, in particular Internet data, is provided. The method and computer-readable medium for analyzing network data comprises: obtaining the identity of one or more sites (web sites in the case of the Internet) to query; obtaining one or more query criteria; accessing the one or more sites; and analyzing the query criteria in the site data.
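The four steps just listed can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `analyze_sites`, the injected `fetch` callable, and the `FAKE_WEB` stand-in data are all hypothetical, and "analyzing" is reduced here to case-insensitive substring counting so the example runs without network access.

```python
from typing import Callable, Dict, Iterable, List


def analyze_sites(
    sites: Iterable[str],
    criteria: List[str],
    fetch: Callable[[str], str],
) -> Dict[str, Dict[str, int]]:
    """Return, per site, how often each criterion string occurs in its data."""
    results: Dict[str, Dict[str, int]] = {}
    for site in sites:
        data = fetch(site).lower()  # access the site (via the injected fetcher)
        # analyze the query criteria in the site data
        results[site] = {c: data.count(c.lower()) for c in criteria}
    return results


# Hypothetical stand-in for a real downloader, so the sketch runs offline.
FAKE_WEB = {
    "http://a.example": "<html><b>Hi</b> <a href='x'>x</a> <b>there</b></html>",
    "http://b.example": "<html><p>No bold here</p></html>",
}

# Site identities and query criteria are supplied by the caller.
report = analyze_sites(FAKE_WEB, ["<b>", "<a "], FAKE_WEB.__getitem__)
print(report)
```

Passing the fetcher in as a parameter keeps the site-access step separate from the analysis step, so a real HTTP client could be substituted without changing the analysis.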
In accordance with another aspect of the present invention, the results of an Internet analysis are displayed.
In accordance with a further aspect of the present invention, the results of an Internet analysis are stored.
In accordance with yet another aspect of the present invention, the query criteria are determined by the user. Preferably, the user-determined query criteria are saved for subsequent analyses.
In accordance with yet a further aspect of the present invention, a default set of query criteria is provided. Preferably, the default query criteria are user-modifiable, and the user can save the modified query criteria either as the new default query criteria or as a different set of query criteria, leaving the existing default criteria unchanged.
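One way to picture default and user-saved criteria is the sketch below. The JSON file layout, the file names, the `DEFAULT_CRITERIA` values, and the helper names `load_criteria` and `save_criteria` are assumptions made for illustration; the source does not say how criteria are stored.

```python
import json
import tempfile
from pathlib import Path
from typing import List

# Hypothetical built-in defaults; the source does not list actual defaults.
DEFAULT_CRITERIA = ["<img", "<a ", "<script"]


def load_criteria(path: Path) -> List[str]:
    """Load saved criteria, falling back to the defaults if nothing is saved."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return list(DEFAULT_CRITERIA)


def save_criteria(criteria: List[str], path: Path) -> None:
    """Save criteria to `path`; any other criteria file is left untouched."""
    path.write_text(json.dumps(criteria), encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    default_path = Path(tmp) / "default.json"
    custom_path = Path(tmp) / "custom.json"

    # Modify the defaults and save them as a *different* set of criteria.
    modified = load_criteria(default_path) + ["<table"]
    save_criteria(modified, custom_path)

    unchanged_default = load_criteria(default_path)
    reloaded_custom = load_criteria(custom_path)
```

Saving to `default_path` instead would replace the defaults for later runs, which corresponds to the other option described above.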
In accordance with still further aspects of the present invention, a user selects the sites (e.g., the Internet web sites) to be analyzed.
In accordance with an alternative aspect of the present invention, the sites to be analyzed are randomly selected. Preferably, the number of sites to be randomly selected is determined by the user.
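Random selection with a user-chosen count might look like the following sketch. The candidate pool, the helper name `pick_random_sites`, and the optional `seed` parameter (added only so runs can be repeated) are assumptions; the source does not describe where the candidate sites come from.

```python
import random
from typing import List, Optional, Sequence


def pick_random_sites(
    candidates: Sequence[str], count: int, seed: Optional[int] = None
) -> List[str]:
    """Pick `count` distinct sites at random (fewer if the pool is smaller)."""
    rng = random.Random(seed)  # a private generator; global state is untouched
    count = min(count, len(candidates))
    return rng.sample(list(candidates), count)


# Hypothetical candidate pool; `3` plays the role of the user-chosen count.
candidates = [f"http://site{i}.example" for i in range(10)]
chosen = pick_random_sites(candidates, 3, seed=42)
print(chosen)
```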
In accordance with further alternative aspects of the present invention, an existing site list is used to identify the sites to be analyzed. Preferably, the user can modify and save the site list.
In accordance with further aspects of the present invention, analyzing the query criteria can be accomplished by counting occurrences of the query criteria in the site data. Alternatively, analysis can be accomplished by determining the size of the data specified by the query criteria.
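The two analysis modes, counting and sizing, can be contrasted in a short sketch. Representing a site's downloaded resources as a name-to-bytes mapping and selecting graphics by file suffix are assumptions for illustration only; the helper names are hypothetical.

```python
from typing import Dict, Tuple


def count_occurrences(site_data: str, criterion: str) -> int:
    """Counting mode: how many times the criterion appears in the site data."""
    return site_data.count(criterion)


def total_size(resources: Dict[str, bytes], suffixes: Tuple[str, ...]) -> int:
    """Size mode: total bytes of the resources selected by the criterion.

    Here the criterion is a tuple of file suffixes (an assumption).
    """
    return sum(
        len(body)
        for name, body in resources.items()
        if name.lower().endswith(suffixes)
    )


# Hypothetical downloaded resources for one site.
resources = {
    "logo.gif": b"\x00" * 120,
    "photo.JPG": b"\x00" * 300,
    "index.html": b"<html></html>",
}

rule_count = count_occurrences("<hr><p>a</p><hr>", "<hr>")
graphics_bytes = total_size(resources, (".gif", ".jpg"))
print(rule_count, graphics_bytes)  # prints: 2 420
```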
In accordance with another aspect of the present invention, Internet trends are tracked by performing the same analysis at different times. Trend tracking can be done manually or automatically.
In accordance with yet another aspect of the present invention, the time increment for automatic trend tracking, for example monthly, is determined by the user.
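Trend tracking can be sketched as comparing dated snapshots of the same analysis. The `Snapshot` structure, the sample numbers, and the choice to schedule the next automatic run on the first day of the following month are illustrative assumptions, not details taken from the source.

```python
from datetime import date
from typing import Dict, List, Tuple

# One analysis run: the run date plus the count found for each criterion.
Snapshot = Tuple[date, Dict[str, int]]


def trend(history: List[Snapshot], criterion: str) -> List[Tuple[date, int]]:
    """Return (run date, change since the previous run) for one criterion."""
    ordered = sorted(history, key=lambda snap: snap[0])
    changes = []
    for (_, prev), (when, curr) in zip(ordered, ordered[1:]):
        changes.append((when, curr.get(criterion, 0) - prev.get(criterion, 0)))
    return changes


def next_monthly_run(last_run: date) -> date:
    """First day of the month after `last_run` (one possible monthly increment)."""
    year = last_run.year + (1 if last_run.month == 12 else 0)
    month = 1 if last_run.month == 12 else last_run.month + 1
    return date(year, month, 1)


# Made-up sample data: files containing JavaScript found in three monthly runs.
history = [
    (date(1998, 11, 1), {"javascript": 40}),
    (date(1998, 10, 1), {"javascript": 25}),
    (date(1998, 12, 1), {"javascript": 70}),
]
changes = trend(history, "javascript")
upcoming = next_monthly_run(date(1998, 12, 30))
```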
In accordance with yet still another aspect of the present invention, occurrences of a text string are counted if found anywhere within the HTML file. Alternatively, occurrences are counted only if found in a specified HTML tag. For example, the analysis can count files containing <script> tags whose “language” attribute has the value “javascript”, which provides the user with summary information on the number of files found during an analysis that include JavaScript. Alternatively, the count may concern the tag itself, for example how often bold text is included in HTML files.
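Tag-scoped counting of this kind can be sketched with Python's standard `html.parser` module. The class name `TagCounter`, the helper `count_tag`, and the sample `PAGE` are hypothetical; the sketch only shows one way to restrict a count to a given tag, attribute, and value.

```python
from html.parser import HTMLParser
from typing import Optional


class TagCounter(HTMLParser):
    """Count start tags, optionally requiring an attribute and a value."""

    def __init__(self, tag: str, attr: Optional[str] = None,
                 value: Optional[str] = None) -> None:
        super().__init__()
        self.tag = tag.lower()
        self.attr = attr.lower() if attr else None
        self.value = value.lower() if value else None
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lowercases tag and attribute names.
        if tag != self.tag:
            return
        if self.attr is None:
            self.count += 1
            return
        for name, val in attrs:
            matches_value = self.value is None or (val or "").lower() == self.value
            if name == self.attr and matches_value:
                self.count += 1
                break


def count_tag(html: str, tag: str, attr: Optional[str] = None,
              value: Optional[str] = None) -> int:
    parser = TagCounter(tag, attr, value)
    parser.feed(html)
    parser.close()
    return parser.count


PAGE = (
    '<html><head><script language="JavaScript">var x = 1;</script>'
    '<script language="VBScript"></script></head>'
    "<body><b>one</b> and <B>two</B></body></html>"
)

js_scripts = count_tag(PAGE, "script", "language", "javascript")
bold_tags = count_tag(PAGE, "b")
uses_javascript = js_scripts > 0  # per-file flag; summing it over files gives the summary
```

Using a parser rather than a plain substring search avoids counting text that merely looks like a tag, such as a string literal inside a script.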
In accordance with a further aspect of the present invention, analysis is only performed on the sites specified in the site list. Alternatively, links found in the site can be followed and analysis can be performed on the linked sites as well as the sites referenced directly in the site list.
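The choice between analyzing only the listed sites and also following their links can be sketched as a breadth-first walk with a switch. The `max_depth` limit, the injected `fetch` callable, and the `FAKE_WEB` data are assumptions added so the example terminates and runs offline; the source does not specify how far links are followed.

```python
from collections import deque
from html.parser import HTMLParser
from typing import Callable, Iterable, List
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect the href values of <a> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, val in attrs:
                if name == "href" and val:
                    self.links.append(val)


def pages_to_analyze(site_list: Iterable[str], fetch: Callable[[str], str],
                     follow_links: bool = False, max_depth: int = 1) -> List[str]:
    """Return the URLs to analyze, in breadth-first order, without repeats."""
    seen = set()
    order: List[str] = []
    queue = deque((url, 0) for url in site_list)
    while queue:
        url, depth = queue.popleft()
        if url in seen:
            continue
        seen.add(url)
        order.append(url)
        if not follow_links or depth >= max_depth:
            continue  # listed-sites-only mode, or depth limit reached
        parser = LinkExtractor()
        parser.feed(fetch(url))
        for href in parser.links:
            queue.append((urljoin(url, href), depth + 1))
    return order


# Hypothetical stand-in for the web.
FAKE_WEB = {
    "http://a.example/": '<a href="/about.html">About</a> <a href="http://b.example/">B</a>',
    "http://a.example/about.html": '<a href="/">Home</a>',
    "http://b.example/": "<p>no links</p>",
}


def fetch(url: str) -> str:
    return FAKE_WEB.get(url, "")


listed_only = pages_to_analyze(["http://a.example/"], fetch)
with_links = pages_to_analyze(["http://a.example/"], fetch, follow_links=True)
```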
REFERENCES:
patent: 5659732 (1997-08-01), Kirsch
patent: 5666526 (1997-09-01), Reiter et al.
patent: 5678041 (1997-10-01), Baker et al.
patent: 5855015 (1998-12-01), Shoham
patent: 5884304 (1999-03-01), Davis, III et al.
patent: 5918010 (1999-06-01), Appleman et al.
patent: 5918013 (1999-06-01), Mighdoll et al.
patent: 5948054 (1999-09-01), Nielsen
patent: 5963944 (1999-10-01), Adams
patent: 6073167 (2000-06-01), Poulton et al.
patent: 6154745 (2000-11-01), Kari et al.
patent: 6182122 (2001-01-01), Berstis
patent: 196 51 788 A 1 (1998-06-01), None
Baldazo, R., “Navigating with a Web Compass,” Byte, Mar. 1, 1996, vol. 21, No. 3, pp. 97-98, XP 000600179.
Douglis, F., et al., “The AT&T Internet Difference Engine: Tracking and Viewing Changes on the Web,” AT&T Labs—Research Technical Report #97.23.1, Apr. 14, 1997, XP-002135690.
Montebello, M., et al., “Optimizing Recall/Precision scores in IR over the WWW,” Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Melbourne, Australia, 1998, pp. 361-362, XP-002135685.
Blumer Thomas P.
Turner Cameron R.
Barot Bharat
Christensen O'Connor Johnson & Kindness PLLC
Microsoft Corporation