Method and apparatus for retrieving accumulating and sorting...

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Details

C707S793000, C711S202000

active

06643644

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The invention relates to a data processing method and data processing apparatus for processing large amounts of data using a computer or other information processing apparatus, and particularly to a method and apparatus for searching for, tabulating and sorting table-format data.
2. Description of the Prior Art
Conventionally, large amounts of data are accumulated, and searching, tabulating and other types of data processing are performed on the accumulated data. This data processing may be done using, for example, a known computer system including a CPU, memory, peripheral interface, a hard disk or other auxiliary storage device, a display, a printer or other output device, a keyboard, a mouse or other input device, and a power supply unit connected via a bus, and particularly as software that can be run on a readily available commercial computer system. To perform the aforementioned searching, tabulating or other types of data processing, various types of databases that store large amounts of data are known. Among the various kinds of large data sets, there is a particularly strong demand to process data that can be expressed in a table format.
FIG. 1 is a diagram showing an example of expressing the data to be processed in a table format. FIG. 1 shows an example wherein the sex, age and occupation data for a large number of people, e.g. 1 million, are stored in a table. In FIG. 1, the horizontal rows in the table, namely the so-called records, consist of the record number, and the sex, age and occupation fields corresponding to the record number. The vertical columns in the table consist of the record number, sex field, age field and occupation field. The table indicates that the person with the record number of “0” has a sex of female, age of 18 and occupation of programmer. In the following explanation, the data such as “Female,” “18” and “Programmer” set in the various fields are called field values. In addition, in the following explanation, unless otherwise indicated, the table-format data consisting of 1 million records shown in FIG. 1 is used as a specific example of a large amount of data.
Whether or not large amounts of data can be searched for or tabulated efficiently depends on the format in which the large amount of data is stored. Conventionally, typical known storage techniques include the so-called “record-sequential” and “field-sequential” storage techniques shown in FIGS. 2A and 2B, respectively.
FIG. 2A and FIG. 2B show a representation of the data storage format on a storage device, e.g. a hard disk. In the case of the record-sequential storage technique in FIG. 2A, a set of the field values of sex, age and occupation for each record number is stored on disk in the order of increasing logical addresses sequentially for each record number. On the other hand, in the case of the field-sequential storage technique in FIG. 2B, for each field, the field values are stored in record number order grouped by field in the direction of increasing logical addresses. To wit, in the example of FIG. 2B, the field values for the sex field corresponding to record numbers “0” through “999999” are arranged in order, and next, the field values for the age field are arranged in record number order, and then the field values for the occupation field are arranged in record number order.
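The two layouts can be illustrated with a minimal sketch. It assumes a three-record table held in a flat Python list standing in for increasing logical addresses; only record “0” comes from FIG. 1, while the other two rows and all function names are invented for illustration.

```python
# Record "0" follows FIG. 1; the other two rows are invented sample data.
records = [
    ("Female", 18, "Programmer"),
    ("Male", 25, "Teacher"),
    ("Female", 40, "Engineer"),
]
NUM_FIELDS = 3  # sex, age, occupation


def record_sequential(rows):
    """Lay out all fields of record 0, then all fields of record 1, and so on."""
    return [value for row in rows for value in row]


def field_sequential(rows):
    """Lay out every sex value, then every age value, then every occupation value."""
    return [row[f] for f in range(NUM_FIELDS) for row in rows]


def read_record_sequential(flat, record, field):
    # Each record occupies NUM_FIELDS consecutive slots.
    return flat[record * NUM_FIELDS + field]


def read_field_sequential(flat, record, field, num_records):
    # Each field occupies num_records consecutive slots.
    return flat[field * num_records + record]


row_major = record_sequential(records)
col_major = field_sequential(records)
```

In both layouts every field value of every record is still stored once per record; only the address arithmetic used to reach a given value differs.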
In the case of the aforementioned prior art, field values corresponding to all fields for all record numbers are stored as is in a two-dimensional data structure (with the record number as one dimension and the other field values as one dimension). Hereinafter, such a data structure in particular shall be referred to as a “data table.” In the case of the prior art, when searching for and tabulating stored data, this is performed by accessing such a data table.
In addition to the method of storing the value of the fields as field values as is, there is also a known method of converting the values to codes and storing the codes as field values. For example, with respect to the sex field, the value “Male” may be converted to “0” while the value “Female” is converted to “1” and then the values “0” or “1” are stored as the field values instead of “Male” or “Female.” Even in this case, there is no change to the point that the converted codes are stored in a data table as field values.
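As a small sketch of this code conversion, assuming the “Male” to “0” and “Female” to “1” mapping given above (the variable names and the four sample values are hypothetical):

```python
# Mapping taken from the text: "Male" -> 0, "Female" -> 1.
SEX_CODES = {"Male": 0, "Female": 1}
SEX_LABELS = {code: label for label, code in SEX_CODES.items()}

# Invented sample column, one entry per record number.
sex_field = ["Female", "Male", "Female", "Female"]

# The data table now stores codes instead of the original strings.
encoded = [SEX_CODES[value] for value in sex_field]

# Decoding recovers the original field values.
decoded = [SEX_LABELS[code] for code in encoded]
```

The encoded column is more compact per entry, but it still holds one entry per record, which is the point made above: the codes are stored in a data table just as the original values were.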
In the case of searching for and tabulating large amounts of data stored using a data structure of the data table type in the aforementioned prior art, there is a problem in that the processing time for searching and tabulating becomes longer due to the access time required to access such data tables.
In addition, data tables have at least the following intrinsic drawbacks.
(1) The data tables easily become enormous in size and cannot be easily separated (physically) into individual fields. For example, when extracting records in which the sex is “Male,” the age and occupation information is unnecessary, so efficiency could be improved if the table could be separated into a table containing only the sex fields. In the case of the field-sequential storage technique shown in FIG. 2B, while separation into individual fields is simple, when large amounts of data are handled, the size of the data table still becomes enormous, so the actual expansion of a data table into memory or other fast storage device for the purpose of tabulating or searching is difficult.
(2) Data tables cannot be kept in a form with multiple field values sorted simultaneously. For example, in the case of the prior art illustrated in FIG. 2A and FIG. 2B, the field values for the sex field are arranged in record number order in the manner “Female, Male, Female, . . . , Female.” However, when performing searching and tabulating processes, it is typically convenient for them to be arranged in the manner “Female, Female, Female, Male, . . . , Male.” In table data, however, the field values are arranged in a specific matrix order, namely record number order, so sorting the field values on a specific field is not permitted. For this reason, in the case of the prior art, it is not possible to select an arrangement of field values that is convenient for searching and tabulating.
(3) In a data table, identical values appear over and over. For example, in the case of the conventional data table given in FIG. 2A and FIG. 2B, at the time of extracting records (or namely, record numbers) wherein the sex is “‘Male’ or ‘Man’,” because the field value “Male” appears many times, the comparison condition “‘Male’ or ‘Man’” must be matched against the field value “Male” many times. A single comparison should be sufficient to determine whether identical values match.
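The redundancy described in drawback (3) can be made concrete by counting how often the comparison condition is evaluated. The sketch below is illustrative only: the six-entry sex column and every name in it are assumptions, not part of the patent.

```python
# Invented sample column in record number order.
sex_field = ["Female", "Male", "Female", "Male", "Male", "Female"]
targets = {"Male", "Man"}  # the condition "'Male' or 'Man'"

calls = {"n": 0}  # counts evaluations of the comparison condition


def matches(value):
    calls["n"] += 1
    return value in targets


# Data-table scan: the condition is evaluated once per record.
matching_records = [i for i, value in enumerate(sex_field) if matches(value)]
scan_comparisons = calls["n"]

# If each distinct value were examined only once, far fewer evaluations suffice.
calls["n"] = 0
verdict = {value: matches(value) for value in set(sex_field)}
distinct_comparisons = calls["n"]
```

With this sample, the scan evaluates the condition six times, while examining each distinct value once takes two evaluations. For the 1 million records of FIG. 1 the gap would be correspondingly larger.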
In order to increase greatly the speed of searching for and tabulating large amounts of data, the object of the present invention is to provide a method of searching for, tabulating and sorting table-format data and an apparatus for implementing said method by providing a data control mechanism that both has the functions of the conventional data table and solves the aforementioned problems with the data structure based on the data table.
SUMMARY OF THE INVENTION
In order to achieve the aforementioned object, the method and apparatus for searching for and tabulating table-format data according to the present invention proposes a novel data control mechanism that is usable on an ordinary computer system. The data control mechanism according to the present invention comprises a value control table and an array of pointers to the value control table, as a general rule.
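One possible reading of this mechanism, for a single field, is sketched below. It is an assumption-laden illustration rather than the patented implementation: the function name `build_value_table`, the choice to sort the distinct values, and the sample occupation values other than “Programmer” are all hypothetical.

```python
def build_value_table(field_values):
    """Return (value_list, pointers) for one field.

    value_list[n] is the field value assigned field value number n.
    pointers[r] is the field value number for record number r.
    Sorting the distinct values is an assumption made for this sketch.
    """
    value_list = sorted(set(field_values))
    number_of = {value: n for n, value in enumerate(value_list)}
    pointers = [number_of[value] for value in field_values]
    return value_list, pointers


# Invented occupation column in record number order.
occupation = ["Programmer", "Teacher", "Programmer", "Clerk"]
value_list, pointers = build_value_table(occupation)


def field_value(record_number):
    # Follow the pointer for the record into the value table.
    return value_list[pointers[record_number]]
```

Under these assumptions each distinct value is stored once, and the per-record array holds only small integers that point into the value table.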
FIG. 3 is a diagram used to explain the principle of the present invention, showing a value control table 10 and an array of pointers to the value control table 20. A value control table 10 is defined to be a table made by assigning, for each field in table-format data, an (integral) field value number to each field value belonging to that field, and the table thus
