Method and apparatus for storage and retrieval of very large...

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Details

C707S793000, C707S793000, C707S793000

active

06651074

ABSTRACT:

FIELD OF THE INVENTION
This invention relates generally to storage and retrieval of large databases, and more particularly to the use of temporary storage for moving large files.
BACKGROUND OF THE INVENTION
As large computing enterprises continue their migration from centralized mainframes and volumes of data on direct access storage devices (DASD) and serial tape drives, the reliance upon open systems database technology has increased. The ability to quickly adapt mass storage systems to new platforms while retaining high performance and reliability will remain a key concern for system integrators.
In earlier times, rooms full of DASD and tape drives were maintained by legions of storage operators who physically mounted and unmounted tapes and disk packs and moved them to and from libraries according to daily schedules and batch job instructions. Technology improvements allowed the use of self-contained “mass storage” units, using robotic arms to move archived storage media to and from the drive mechanisms in a matter of seconds. Further developments of storage media have enabled a cache model in which large masses of data are held in offline resources and smaller portions can be uploaded to the high-speed cache as necessary. Data availability has also been increased through the use of arrays of mirrored databases, either single or multi-threaded, for multiple simultaneous access capabilities.
Even with mirrored systems, operational concerns often require that the magnetic or other storage media be archived or “backed up.” There are also occasional system reorganizations or restructuring during which a database may be converted by copying it out in one format (“export”) and copying it back in (“import”) with a different format, or into a different structure. Back-up issues will also arise when converting from one database management system (DBMS) to another, or sharing databases. Application programs themselves may also request the operating system to make a large “save” of data files, usually after a modification is made to the file. Backing up large volumes of data can be a time-consuming and resource-intensive operation. Disadvantageously, mass storage systems may be unavailable while large back-up operations are performed.
FIG. 1 illustrates a typical system in which a host computer 10 is connected to a backup system 12 and a storage system 14. Creating a physical backup of the entire database 24 of the storage system 14 often requires a large investment of time and resources. In an open, networked array of storage devices, a physical backup of a database may be handled by arranging for a DBMS 22, such as provided by Oracle Corporation, to communicate with a dedicated back-up system 12, such as the EMC Data Manager, from EMC Corporation of Hopkinton, Mass. The DBMS vendor often supplies an Application Programming Interface (API) 20 that can be installed in the host computer to handle the scheduling of the regular backups. The DBMS typically reads the data from the database via the local DASD interface (such as a SCSI bus), and delivers a buffer of data through the API. The application running the backup may be customized or optimized for the particular mass storage system selected, such as the EMC Data Manager (EDM), which is optimized to run with EMC's Symmetrix storage system(s). The EMC backup application, or something similar, takes the necessary steps to send the data over the network using a connection-oriented protocol such as Transmission Control Protocol (TCP). The receiving backup system then sends the data to a mass storage unit 18, such as to write the data to an archive tape 18A. The major drawbacks of physical backup include that logical structures, such as tables of data, cannot be backed up. Further, data cannot be transferred between machines of differing operating systems. Additionally, data blocks cannot be reorganized into a more efficient layout.
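The buffer-by-buffer transfer over a connection-oriented protocol described above can be sketched as follows. This is only an illustration under stated assumptions, not the EMC or Oracle implementation: the names `send_backup`, `receive_backup`, `run_demo`, and the `BUFFER_SIZE` value are hypothetical, and a loopback TCP socket stands in for the network between the host computer and the backup system.

```python
import socket
import threading

BUFFER_SIZE = 64 * 1024  # hypothetical buffer size; the source does not specify one


def receive_backup(server_sock, sink):
    """Accept one connection and append every received chunk to sink."""
    conn, _ = server_sock.accept()
    with conn:
        while True:
            chunk = conn.recv(BUFFER_SIZE)
            if not chunk:  # an empty read means the sender closed the connection
                break
            sink.extend(chunk)


def send_backup(data, address):
    """Send data to the backup receiver one buffer at a time over TCP."""
    with socket.create_connection(address) as conn:
        for offset in range(0, len(data), BUFFER_SIZE):
            conn.sendall(data[offset:offset + BUFFER_SIZE])


def run_demo(data):
    """Stream data from a 'host' to a 'backup system' on the loopback interface."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    received = bytearray()
    receiver = threading.Thread(target=receive_backup, args=(server, received))
    receiver.start()
    send_backup(data, server.getsockname())
    receiver.join()
    server.close()
    return bytes(received)
```

In a real deployment the receiving side would write each chunk to the mass storage unit rather than hold it in memory; the in-memory `bytearray` is used here only to keep the sketch self-contained.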
Many of the major DBMS companies also provide a more generalized facility in which the data is exported as a standardized file, such as in ASCII format, as part of a so-called “logical backup.” The ASCII format permits the file to be imported into most other systems, without insurmountable compatibility problems. However, presently DBMS companies generally do not provide the API necessary for a customer to properly handle the data stream generated by the logical backup. The result is that many of the DBMSs generate very large backup files that have to be stored locally until they can be written to an archive device.
To overcome this disadvantage, some customers create their own primitive solution by attaching a physical tape drive to the machine. The logical backup data stream is then directed into a process that Unix calls a “pipe,” buffered, and then directed (“piped”) by another Unix command such as one that writes the data to the local tape drive 16, a DASD, or other demountable, writable media. A Unix pipe can be thought of as a FIFO (first-in first-out) data file having one process writing data into it serially and another process serially reading data out. When the pipe is “empty,” the reading process simply waits for more data to be written by the other process. Other non-Unix operating systems such as DOS and Windows NT emulate the Unix pipe in various ways with similar results. Logical data streams are thus directed from a database export into another process that disposes of the data to the physical storage media, thus freeing up storage resources.
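The FIFO behavior of a pipe described above can be illustrated with a minimal sketch. It assumes a single program in which two threads stand in for the writing and reading processes; the function names `writer`, `reader`, and `pipe_records` are hypothetical and chosen only for this example.

```python
import os
import threading


def writer(write_fd, records):
    """Write records into the pipe serially, then close the write end."""
    with os.fdopen(write_fd, "wb") as pipe_in:
        for record in records:
            pipe_in.write(record)
    # Closing the write end is what tells the reader that no more data will come.


def reader(read_fd, out):
    """Read from the pipe until end-of-data; blocks while the pipe is empty."""
    with os.fdopen(read_fd, "rb") as pipe_out:
        while True:
            chunk = pipe_out.read(4096)
            if not chunk:  # write end closed and pipe fully drained
                break
            out.append(chunk)


def pipe_records(records):
    """Push records through an anonymous pipe and return what came out."""
    read_fd, write_fd = os.pipe()
    chunks = []
    consumer = threading.Thread(target=reader, args=(read_fd, chunks))
    consumer.start()
    writer(write_fd, records)
    consumer.join()
    return b"".join(chunks)
```

Because the pipe preserves byte order, the bytes returned by `pipe_records` are the concatenation of the records in the order they were written, even though neither side ever holds the whole stream in the pipe at once.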
This primitive solution has several disadvantages. For one thing, it requires a physical tape drive 16 to be attached to the computer host 10 generating the backup. Alternatively, the logical backup could be piped to a command that writes the data onto disk or equivalent. However, this solution would require each such machine to have huge amounts of excess storage capacity. In either case, additional operations personnel must be assigned to handling the tapes and disks, and maintaining the drives. Extra storage devices, media libraries, and personnel also take up extra space in the facility. Another alternative would be to pipe the logical backup data stream into the network interface and send it to a different machine having a DASD or tape. When dealing with very large databases, these solutions could break down entirely, due to the operational difficulties of maintaining the necessary physical media, or open network connections.
The named pipe provides a standard mechanism that can be used by processes that do not have to share a common process origin for process-to-process-to-device transfers of large amounts of data. The data sent to the named pipe can be read by any authorized process that knows the name of the named pipe. In particular, named pipes are used in conjunction with the Oracle DBMS import/export utility to perform the logical backup/restores necessary to restructure or reorganize very large databases (VLDBs). Typically, the user creates a named pipe and runs an export utility specifying the named pipe as the output device. The DBMS sees the pipe as a regular file. Another process, including for example Unix commands such as dd (convert and copy a file), cpio (copy files in and out), rsh (execute command on a remote system), etc., then reads from the other end of the pipe and writes the data to actual media or the network. This technique is used to write export data to local disk/tape or over the network to available disk/tape on another machine.
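The named-pipe workflow described above (create the pipe, point the export at it as if it were a file, and have a second process drain it to media) might be sketched as follows. This is an illustration under stated assumptions, not the Oracle utility: `export_to` is a hypothetical stand-in for the DBMS export, `drain_to_archive` is a hypothetical stand-in for a command such as dd, the file names are invented, and `os.mkfifo` is available only on POSIX systems.

```python
import os
import shutil
import tempfile
import threading


def export_to(path, rows):
    """Stand-in for a DBMS export utility: it writes to path like any regular file."""
    with open(path, "wb") as out:
        for row in rows:
            out.write(row)


def drain_to_archive(fifo_path, archive_path):
    """Stand-in for dd or cpio: read the named pipe and write the data to media."""
    with open(fifo_path, "rb") as src, open(archive_path, "wb") as dst:
        shutil.copyfileobj(src, dst)


def logical_backup(rows):
    """Export rows through a named pipe into an archive file; return its contents."""
    workdir = tempfile.mkdtemp()
    fifo_path = os.path.join(workdir, "export.pipe")    # hypothetical pipe name
    archive_path = os.path.join(workdir, "archive.dmp")  # hypothetical archive name
    os.mkfifo(fifo_path)  # POSIX only: creates the named pipe in the file system
    try:
        consumer = threading.Thread(
            target=drain_to_archive, args=(fifo_path, archive_path)
        )
        consumer.start()
        export_to(fifo_path, rows)  # blocks in open() until the reader has opened the pipe
        consumer.join()
        with open(archive_path, "rb") as archive:
            return archive.read()
    finally:
        shutil.rmtree(workdir)
```

The exporter never needs to know that its output file is a pipe, which is the property the text relies on: the export data passes through to the archive without a full intermediate copy being stored on local disk.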
As mentioned, a disadvantage of the existing methods is the large amount of time it takes to perform backups, during which the database may be partially or completely offline due to read/write interlocks. Some of this delay can be reduced by segmenting export/backup files, and running several processes in parallel. Even though the logical backup process can be segmented into parallel streams by some DBMSs, the implementations may be proprietary and not necessarily adaptable for import to another DBMS. Also, disadvantageously, a dedicated disk o
