Information collecting apparatus

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Details

C707S793000, C709S217000, C709S229000, C713S152000, C345S215000

active

06408297

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates to an information collecting apparatus for collecting information from WWW (World Wide Web) sites, and more particularly to an information collecting apparatus enabling quick and accurate collection of the information required by a user.
The amount of information contained in WWW sites is increasing with the recent rapid development of the Internet. A browser is used for collecting and browsing this enormous amount of information. When information is to be collected and browsed, a browser is started up in a client terminal, a URL (Uniform Resource Locator) specifying the site that contains the information is supplied, and the desired information is collected.
However, given the significant increase in the amount of information on WWW sites in recent years and the high frequency with which that information is updated, it is becoming more difficult day by day to quickly collect the latest information, because the browser must be started up again each time the desired information is to be collected.
BACKGROUND OF THE INVENTION
FIG. 9 is a block diagram showing a general configuration of an information collecting apparatus based on the conventional technology. The information collecting apparatus shown in this figure is connected to a network such as the Internet and collects information from the WWW sites. This information collecting apparatus comprises a display 101 for displaying data thereon, an input device 102 comprising a keyboard or a pointing device such as a mouse, a memory 103 for storing therein scrap data identifying information or the like described later, and a computer 104 providing controls over the display 101, the input device 102, and the memory 103 to execute various processing.
FIG. 10 shows a block diagram of the functions of the information collecting apparatus based on the conventional technology. As shown in this figure, the information collecting apparatus comprises a user interface 201 with which a user specifies a particular area of a WWW document, a scrap data identifying information generating section 202 for generating scrap data identifying information used for identifying the data specified by the user inside a WWW document, a scrap information memory 203 for storing therein, as scrap information, a set of the URL of the WWW document specified by the user and the scrap data identifying information, and a scrap page updating section 207.
The scrap page updating section 207 comprises a WWW document collecting section 205 for collecting a WWW document corresponding to the specified URL from a WWW server 208 via the Internet (not shown), a data extracting section 204 for cutting out a portion of a newly collected WWW document according to the scrap data identifying information, and an extracted data linking section 206 for linking the extracted data items to one another to form one document.
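The three stages of the scrap page updating section 207 (collect, extract, link) can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not the patented implementation: the names `Markers`, `extract`, `link`, and `update_scrap_page` are hypothetical, the identifying information is simplified to the line before and the line after the scrap data, collection is passed in as a `fetch` callable rather than performed over a network, and the `<HR>` separator used for linking is an arbitrary choice.

```python
from typing import Callable, List, NamedTuple, Optional


class Markers(NamedTuple):
    """Hypothetical, simplified scrap information for one WWW document."""
    url: str
    line_before: str  # line expected immediately before the scrap data
    line_after: str   # line expected immediately after the scrap data


def extract(document: str, info: Markers) -> Optional[str]:
    """Role of the data extracting section 204: cut out the lines between the markers."""
    lines = [ln.strip() for ln in document.splitlines() if ln.strip()]
    try:
        start = lines.index(info.line_before) + 1
        end = lines.index(info.line_after, start)
    except ValueError:
        return None  # a marker vanished from the updated document
    return "\n".join(lines[start:end])


def link(pieces: List[str]) -> str:
    """Role of the extracted data linking section 206: join pieces into one document."""
    return "\n<HR>\n".join(pieces)


def update_scrap_page(infos: List[Markers], fetch: Callable[[str], str]) -> str:
    """Role of the scrap page updating section 207: collect, extract, then link."""
    pieces = []
    for info in infos:
        document = fetch(info.url)  # stands in for the WWW document collecting section 205
        data = extract(document, info)
        if data is not None:
            pieces.append(data)
    return link(pieces)
```

Passing `fetch` as a parameter keeps the sketch testable without network access; a real collector would retrieve each URL from the WWW server 208.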
In the description below, data that a user specifies on the user interface 201 is called “scrap data”, information for identifying a starting point and an end point of the scrap data inside a WWW document is called “scrap data identifying information”, and a set of the URL of a WWW document in which the user has specified scrap data and the scrap data identifying information is called “scrap information”. Any kind of interface may be employed as the user interface 201 for specifying the scrap data, provided that the URL of the WWW document containing the data required by the user, as well as a starting point and an end point in that WWW document, can be identified. One example of such a user interface 201 is a browser having a function for selecting text on a display.
When the browser is used as the user interface 201, the user starts up the browser and selects a particular portion of the document as shown in FIG. 13 (the selected portion is shown in FIG. 13 as a hatched area, but in reality the portion may be displayed in reverse video). The selected portion represents the scrap data required by the user. FIG. 13 is a view showing an example (one of the screen displays) of the selection of scrap data on the browser.
When the scrap data is specified using the user interface 201 as described above, the URL of the WWW document currently appearing on the browser is stored in the scrap information memory 203. Further, the browser (the user interface 201) transfers the displayed WWW document, in the form of an HTML (Hyper Text Markup Language) document, together with the data specified by the user as scrap data, to the scrap data identifying information generating section 202.
The scrap data identifying information generating section 202 generates, from the HTML document and the scrap data, the scrap data identifying information for identifying a starting point and an end point of the scrap data in the WWW document, and the scrap information memory 203 stores this information. The data extracting section 204 uses this scrap data identifying information when the information required by the user is later collected from a newly collected WWW document. Therefore, the scrap data identifying information should satisfy the condition that it is highly likely to remain in the WWW document even after the WWW document at the WWW site (WWW server 208) is changed.
As an example of scrap data identifying information satisfying the condition described above, the contents of the initial line of the scrap data and the contents immediately before the starting point or immediately after the end point of the scrap data may be considered. Generally, the user specifies as scrap data a portion of a WWW document that is likely to change, but in many cases the contents before and after such an area in the WWW document do not change. Thus, the contents of the line immediately before the scrap data, the initial line of the scrap data, and the line immediately after the scrap data are important. Therefore, in the conventional type of information collecting apparatus, it is assumed that the contents of the line immediately before the scrap data, the initial line of the scrap data, and the line immediately after the scrap data are stored in the scrap information memory 203.
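As a rough illustration of how the scrap data identifying information generating section 202 might derive these three lines, the following Python sketch locates the selected scrap data inside an HTML document by exact line matching. The names `ScrapDataId` and `generate_scrap_data_id` are hypothetical, the matching strategy is an assumption (the text does not state how the section locates the selection), and the sample document is invented for the example rather than taken from FIG. 12.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScrapDataId:
    """Hypothetical container for the three stored lines."""
    line_before: Optional[str]  # line immediately before the scrap data
    first_line: str             # initial line of the scrap data
    line_after: Optional[str]   # line immediately after the scrap data


def generate_scrap_data_id(html_doc: str, scrap_data: str) -> ScrapDataId:
    """Find scrap_data in html_doc and return its surrounding lines.

    Assumption: the selection matches whole, non-blank lines of the document.
    """
    doc_lines = [ln.strip() for ln in html_doc.splitlines() if ln.strip()]
    scrap_lines = [ln.strip() for ln in scrap_data.splitlines() if ln.strip()]
    if not scrap_lines:
        raise ValueError("scrap data is empty")
    n = len(scrap_lines)
    for start in range(len(doc_lines) - n + 1):
        if doc_lines[start:start + n] == scrap_lines:
            end = start + n
            before = doc_lines[start - 1] if start > 0 else None
            after = doc_lines[end] if end < len(doc_lines) else None
            return ScrapDataId(before, scrap_lines[0], after)
    raise ValueError("scrap data not found in document")


# Invented sample document; only the three marker strings come from the text.
SAMPLE_HTML = """<HTML>
Today's top news
15:00 10/21 Update
Headline A
Headline B
<HR>
</HTML>"""

SAMPLE_SCRAP = """15:00 10/21 Update
Headline A
Headline B"""
```

Calling `generate_scrap_data_id(SAMPLE_HTML, SAMPLE_SCRAP)` on this sample yields "Today's top news" as the line before, "15:00 10/21 Update" as the initial line, and `<HR>` as the line after.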
FIG. 11 is a view showing an example of scrap information stored in the scrap information memory 203. As shown in this figure, the information contained in the line immediately before the scrap data, in the initial line of the scrap data, and in the line immediately after the scrap data is stored in the scrap information memory 203 in correlation with the URL of the WWW document specified by the user. More specifically, when the scrap data (the section displayed in reverse video in FIG. 13) is specified by the user while the HTML document shown in FIG. 12 is displayed using the browser as shown in FIG. 13, the information shown in the third line of FIG. 11 is stored in the scrap information memory 203. Namely, the line immediately before the scrap data is “Today's top news” (refer to FIG. 12), the initial line of the scrap data is “15:00 10/21 Update” (refer to FIG. 12), and the line immediately after the scrap data is <HR> (refer to FIG. 12). It should be noted that <HR> is a tag representing a horizontal line.
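The layout of the scrap information memory 203 can be sketched as a small in-memory store in which each entry pairs a URL with the three identifying lines. This is an illustrative assumption about the structure, not a reproduction of FIG. 11: the class names `ScrapInfo` and `ScrapInfoMemory` and the URL `http://news.example.com/top.html` are hypothetical, while the three line values are the ones quoted in the text.

```python
from typing import List, NamedTuple


class ScrapInfo(NamedTuple):
    """One hypothetical row of scrap information: a URL plus identifying lines."""
    url: str
    line_before: str
    first_line: str
    line_after: str


class ScrapInfoMemory:
    """Minimal stand-in for the scrap information memory 203."""

    def __init__(self) -> None:
        self._entries: List[ScrapInfo] = []

    def store(self, info: ScrapInfo) -> None:
        """Append one set of scrap information."""
        self._entries.append(info)

    def for_url(self, url: str) -> List[ScrapInfo]:
        """Return every stored entry correlated with the given URL."""
        return [entry for entry in self._entries if entry.url == url]

    def all_entries(self) -> List[ScrapInfo]:
        """Return a copy of all stored entries, in insertion order."""
        return list(self._entries)


memory = ScrapInfoMemory()
memory.store(
    ScrapInfo(
        url="http://news.example.com/top.html",  # hypothetical URL
        line_before="Today's top news",
        first_line="15:00 10/21 Update",
        line_after="<HR>",
    )
)
```

A list of rows is used rather than a dictionary keyed by URL so that one WWW document can carry more than one scrap; whether the original apparatus permits that is not stated in the text.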
When the information is stored in the scrap information memory 203 as described above and a request for collecting the latest WWW document is issued by the user, in other words, when the user starts up the browser, the WWW document collecting section 205 collects the latest WWW document corresponding to, for instance, the URL described in the third line of FIG. 11 from the WWW server 208 via the Internet (not shown). When the WWW document is collected, the data extracting section 204 identifies the starting point and the end point of data required by the user from the latest WWW document collected anew according to the scrap information stored in the scrap information memory 203, and extracts the data enclosed within the starting point and the end
