Electrical computers and digital processing systems: memory – Storage accessing and control – Specific memory composition
Reexamination Certificate
1999-05-18
2003-03-04
Kim, Matthew (Department: 2187)
active
06529994
ABSTRACT:
The invention relates to a system for storing and accessing electronic data. More particularly, the invention relates to a data storage, retrieval and distribution system for enabling multiple system users to independently access previously stored streams of electronic data.
BACKGROUND OF THE DISCLOSURE
In traditional electronic data storage and retrieval systems, it is typical to store data in a bank or array of memory elements controlled by a central processing unit (CPU). Such data storage systems form the basis of most contemporary computer systems. Typically, the memory elements are a combination of semiconductor memory, such as dynamic random access memory (DRAM) or static random access memory (SRAM), and rotating disk magnetic memory (disk drive memory), such as a “Winchester” hard disk drive. The semiconductor memory is used for storage of data that requires immediate access by the CPU, while the disk drive memory is typically used for storing data that is less frequently accessed by the CPU.
Typically, the cost associated with using semiconductor memory to store a given amount of data is one or two orders of magnitude greater than using a disk drive memory to store that same amount of data. However, semiconductor memory offers a data latency, i.e., the time delay between when data is requested from memory by the CPU and when the requested data is actually available to the CPU, that is typically three to four orders of magnitude less than the data latency associated with disk drive memory. As such, in applications where data latency is critical, semiconductor memory is well worth the cost.
Moreover, disk drive memory typically requires data to be accessed in “block-serial” form. As such, random access to any bit of data stored in the drive is typically not possible. Also, being mechanical devices, disk drive memories suffer from mechanical failure and, as such, have a lower reliability than semiconductor memory.
In computing or data retrieval systems where multiple users can simultaneously access data stored in the system, various means are used to serially process each user's data requests. Generally, the system must simulate that each of the users has independent access to the data. Commonly, such a simulation is achieved by preemptive or round-robin multitasking algorithms. A system CPU executes these algorithms, which are typically embedded in the operating system of the computing or data retrieval system. As such, the CPU serially transfers control of the system's data storage memory to each user in a “round-robin” manner.
To increase the apparent throughput of a disk storage system, many computing systems employ disk drives interconnected to act as a single disk. A block of data is distributed over N disks such that each disk stores 1/N of the block in a similar location. The disks are addressed in parallel such that, after the initial latency, data from each disk is read simultaneously to decrease the time required to read the block. This increase in throughput allows the storage system to service many additional users when a multitasking algorithm is employed. However, multi-user operation multiplies the effective latency. If M users are being serviced, a user's request for data from a different data stream would have to be queued until M−1 users have been processed. On average, the latency will be increased by a factor of M/2.
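The M/2 figure can be checked with a short calculation. The following Python sketch assumes (this is not stated in the source) that a new request arrives at a uniformly random point in the round-robin cycle, so it waits behind 0, 1, ..., M−1 other users with equal probability. The function name `average_queue_wait` is hypothetical.

```python
def average_queue_wait(num_users: int) -> float:
    """Mean number of service slots a new request waits before being served.

    Assumption (not from the source): the request is equally likely to
    find 0, 1, ..., num_users - 1 other users ahead of it in the cycle.
    """
    if num_users < 1:
        raise ValueError("num_users must be at least 1")
    # Average of 0..M-1 is (M - 1) / 2, which approaches M / 2 for large M.
    return sum(range(num_users)) / num_users


if __name__ == "__main__":
    for m in (10, 100, 1000):
        print(m, average_queue_wait(m), m / 2)
```

Under this assumption the exact mean is (M−1)/2, which is close to the M/2 quoted above once M is large.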
To increase the number of users with a given effective latency, a storage system can employ multiple CPUs arranged in a parallel processing architecture. Since, in such data storage systems, a single instruction is used by each processor to operate upon a different data stream for each processor, a multiple data computer architecture is typically used. In a multiple data architecture, each CPU is attached to a disk drive memory. As such, each CPU accesses its associated disk drive memory as instructed by a host computer. As a result, the processors can simultaneously access all the disk drives in parallel to achieve improved throughput. As such, each user receives a block of data from a disk drive through a given CPU.
To ensure that the data is continuously transferred from the system to the users, a relatively large capacity semiconductor memory is utilized to buffer the parallel output data streams from the plurality of CPUs. Such data buffering is especially necessary when the data is video or audio data that cannot be interrupted during transfer to the users for viewing. In such systems, the video and audio data is transferred from the disk drives to the buffer memory as distinct blocks of data. The blocks are serially arranged in the buffer memory such that, as the buffer memory is read, the blocks form a contiguous data stream for each user.
However, in such an information storage system, the buffer memory must be very large and, as such, very costly. For example, in a round-robin type access system having M users, buffer memory must temporarily store a given user's data while the other M−1 users are serviced by the parallel processing computer. In a typical video storage system, where 10-100 kbyte blocks of data are read from 100-1000 disk drives for 1000 users, the buffer memory must be 1-100 Gbytes. Such a large capacity semiconductor memory array is extremely costly.
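The 1-100 Gbyte range can be reproduced with simple arithmetic. The sketch below assumes the buffer must hold one block per drive for every user, i.e. block size × number of drives × number of users; the source gives only the inputs and the result, so this product is an inferred assumption, as are decimal units (1 kbyte = 1000 bytes) and the function name.

```python
KBYTE = 1000          # decimal units assumed
GBYTE = 1000 ** 3


def round_robin_buffer_bytes(block_bytes: int, num_drives: int, num_users: int) -> int:
    """Buffer capacity if one block per drive is held for every user.

    Assumed formula (not stated in the source): block * drives * users.
    """
    return block_bytes * num_drives * num_users


# Low end: 10 kbyte blocks, 100 drives, 1000 users.
low = round_robin_buffer_bytes(10 * KBYTE, 100, 1000)
# High end: 100 kbyte blocks, 1000 drives, 1000 users.
high = round_robin_buffer_bytes(100 * KBYTE, 1000, 1000)

if __name__ == "__main__":
    print(low / GBYTE, "to", high / GBYTE, "Gbytes")
```

With these inputs the product spans the same 1 to 100 Gbyte range quoted in the text.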
Another disadvantage associated with using disk drives as storage media is the fact that disk drives are not capable of continuous, uninterrupted read or write operations. Typically, external commands requesting access to data are ignored or delayed when the drive performs internal housekeeping or maintenance operations. The most lengthy delay is introduced by the drive's recalibration of the head position. Such recalibration is accomplished periodically to correct mistracking errors that occur due to differential thermal expansion of the disks within the drive. Common, inexpensive disk drives require 0.1-1.0 seconds to complete a recalibration procedure, which is typically performed every 10-100 minutes of operation.
To prevent interruption of the output data streams, the data distribution system must provide additional buffer memory to store data to be used as an output during each disk drive recalibration cycle. In a typical system where data is being transferred to users at 1 to 10 Mbits/sec for each user, the buffer memory must have a capacity of 0.1 to 10 Mbits per user to cover a 0.1 to 1.0 second recalibration. For a system having 1000 users, up to 10 Gbits, or 1.25 Gbytes, of semiconductor memory is required.
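These figures follow from multiplying each user's data rate by the recalibration time. A minimal Python sketch of that arithmetic is given below; the function name is hypothetical, and the recalibration time is expressed in whole milliseconds purely to keep the calculation in integers.

```python
def recalibration_buffer_bits(rate_bits_per_sec: int, recal_ms: int) -> int:
    """Bits that must be buffered per user to cover one recalibration pause."""
    return rate_bits_per_sec * recal_ms // 1000


MBIT = 1_000_000  # decimal units assumed

# Low end: 1 Mbit/s stream, 0.1 s recalibration.
per_user_low = recalibration_buffer_bits(1 * MBIT, 100)
# High end: 10 Mbit/s stream, 1.0 s recalibration.
per_user_high = recalibration_buffer_bits(10 * MBIT, 1000)

# Worst case for 1000 users, in bits and in bytes.
total_bits = 1000 * per_user_high
total_bytes = total_bits // 8

if __name__ == "__main__":
    print(per_user_low, per_user_high, total_bits, total_bytes)
```

The high-end case gives 10 Mbits per user, so 1000 users need 10 Gbits, which is 1.25 Gbytes.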
Therefore, a need exists in the art for a multiple user data distribution system that significantly reduces the necessary capacity of buffer memory and has a data access latency period that is unnoticeable to each user.
SUMMARY OF THE INVENTION
The invention advantageously overcomes the disadvantages heretofore associated with the prior art by utilizing an inventive multiple user data distribution system. Specifically, the multiple user data distribution system contains a digital information server that is a parallel processing computer having a plurality of parallel processors, each connected to an information storage device such as a magnetic disk drive, optical disk drive, random access memory or the like. In the preferred embodiment of the invention, an array of magnetic disk drives is illustratively utilized.
The system uses a heretofore unknown data striping method for storing information in the plurality of disk drives. This data striping method evenly divides the plurality of disk drives into a plurality of subsets of disk drives. For example, if the server contains 500 disk drives and the subset is 5 drives, then there are 100 subsets of drives. A first subset is selected and a contiguous block of data is stored in a repetitive striped fashion across the subset of disk drives. Thereafter, a second subset, adjacent the first subset, is selected and another contiguous block of data is stored thereupon in the striped fashion. This process is repeated for each of the subsets. When all of the subsets have been used to store data, the method returns to the first s
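The subset-based placement described so far can be sketched as a mapping from a data block and a stripe within it to a drive number. This is only an illustration under stated assumptions, not the patented method itself: the function names are hypothetical, "repetitive striped fashion" is read as cycling through the drives of the subset, and wrapping back to the first subset with a modulo is an assumption, since the text here breaks off before completing that step.

```python
def num_subsets(num_drives: int, subset_size: int) -> int:
    """Number of equal subsets the drive array divides into."""
    if subset_size < 1 or num_drives % subset_size != 0:
        raise ValueError("drives must divide evenly into subsets")
    return num_drives // subset_size


def drive_for(block_index: int, stripe_index: int,
              num_drives: int, subset_size: int) -> int:
    """Drive holding stripe `stripe_index` of contiguous block `block_index`.

    Assumptions: block k goes to subset k modulo the subset count
    (wrap-around is assumed), and stripes cycle through the subset's drives.
    """
    subset = block_index % num_subsets(num_drives, subset_size)
    return subset * subset_size + stripe_index % subset_size


if __name__ == "__main__":
    # 500 drives in subsets of 5, as in the example above.
    print(num_subsets(500, 5))
    print([drive_for(0, i, 500, 5) for i in range(7)])
    print([drive_for(1, i, 500, 5) for i in range(5)])
```

With 500 drives and subsets of 5, the first block cycles over drives 0 to 4 and the second block over drives 5 to 9, matching the "adjacent subset" progression in the text.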
Bleidt Robert
Chin Danny
Kaba James Timothy Christopher
Chace C. P.
Kim Matthew
Moser Patterson & Sheridan LLP
Sarnoff Corporation
Method of striping data onto a storage array