Title: Computer system and process for transferring streams of data...
Patent number: 06785768
Type: Reexamination Certificate (active)
Filed: 2002-05-14
Issued: 2004-08-31
Examiner: Bataille, Pierre (Department: 2186)
Classification: Electrical computers and digital processing systems: memory – Storage accessing and control – Specific memory composition
Other classes: C711S170000, C709S233000, C725S092000, C707S793000
BACKGROUND
There are several computer system architectures which support distributed use of data over computer networks. These computer system architectures are used in applications such as corporate intranets, Internet sites, distributed database applications and video-on-demand services.
Video-on-demand services, for example, typically are designed with an assumption that a user requests an entire movie, and that the selected movie has a substantial length. The video-on-demand server therefore is designed to support read-only access by several subscribers to the same movie, possibly at different times. Such servers generally divide data into several segments and distribute the segments sequentially over several computers or computer disks. This technique commonly is called striping, and is described, for example, in U.S. Pat. Nos. 5,473,362, 5,583,868 and 5,610,841. One problem with striping data for movies over several disks is that failure of one disk or server can result in the loss of all movies, because every movie has at least one segment written on every disk.
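The sequential striping described above can be illustrated with a minimal sketch. The disk count, segment size, and function name below are hypothetical assumptions, not taken from the patents cited; the point is only that segment i of a file lands on disk i mod N, so every disk ends up holding part of every file.

```python
# Sketch of sequential striping: segment i of a file goes to disk i mod N.
# NUM_DISKS, SEGMENT_SIZE and the function name are illustrative assumptions.

NUM_DISKS = 4
SEGMENT_SIZE = 64 * 1024  # 64 KB segments (illustrative)

def stripe_sequentially(data: bytes) -> dict[int, list[bytes]]:
    """Split data into fixed-size segments and assign them round-robin to disks."""
    disks: dict[int, list[bytes]] = {d: [] for d in range(NUM_DISKS)}
    segments = [data[i:i + SEGMENT_SIZE] for i in range(0, len(data), SEGMENT_SIZE)]
    for index, segment in enumerate(segments):
        disks[index % NUM_DISKS].append(segment)  # every disk receives part of every file
    return disks
```

Because every sufficiently long file touches every disk, the failure of any single disk damages every file, which is the reliability problem noted above.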
A common technique for providing reliability in data storage is called mirroring. A hybrid system using mirroring and sequential striping is shown in U.S. Pat. No. 5,559,764 (Chen et al.). Mirroring involves maintaining two copies of each storage unit, i.e., having a primary storage and a secondary backup storage for all data. Both copies also may be used for load distribution. Using this technique, however, a failure of the primary storage causes its entire load to be placed on the secondary backup storage.
Another problem with sequentially striping data over several disks is the increased likelihood of what is called a “convoy effect.” A convoy effect occurs because requests for data segments from a file tend to group together at a disk and then cycle from one disk to the next (a “convoy”). As a result, one disk may be particularly burdened with requests at one time while other disks have a light load. Any new requests to a disk also must wait for the convoy to be processed, thus resulting in increased latency for new requests. To overcome the convoy effect, data may be striped in a random fashion, i.e., segments of a data file are stored in a random order among the disks rather than sequentially. Such a system is described in “Design and Performance Tradeoffs in Clustered Video Servers,” by R. Tewari, et al., in Proceedings of Multimedia '96, pp. 144-150. See also “High Availability in Clustered Multimedia Servers,” by R. Tewari, et al., Proceedings of the IEEE International Conference on Data Engineering, February 1996. Such a system still may experience random, extreme loads on one disk, however, due to the generally random nature of data accesses.
None of these systems is individually capable of transferring multiple, independent, high bandwidth streams of data, particularly isochronous media data such as video and associated audio data, between multiple storage units and multiple applications in a scalable and reliable manner. Such data transfer requirements are particularly difficult in systems supporting capture, authoring and playback of multimedia data. In an authoring system in particular, data typically is accessed in small fragments, called clips, of larger data files. These clips tend to be accessed in an arbitrary or random order with respect to how the data is stored, making efficient data transfer difficult to achieve.
It also is common to use one server for high bandwidth data, such as video, and another different server for low bandwidth data, such as text. The problems associated with video or other high bandwidth data typically involve solutions that are considered too complex for other data such as text.
SUMMARY
Data is randomly distributed on multiple storage units connected with multiple applications using a computer network. The data is divided into segments. Each segment is stored on one of the storage units. Redundancy information based on one or more segments also is stored on a different storage unit than the segments on which it is based. The redundancy information may be a copy of each segment or may be computed by an exclusive-or operation performed on two or more segments. The selection of each storage unit on which a segment or redundancy information is stored is random or pseudorandom and may be independent of the storage units on which other segments of the data are stored. Where redundancy information is based on two or more segments, each of the segments is stored on a different storage unit.
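A minimal sketch of the placement rule just described, under assumed names and unit counts: each segment is written to a randomly chosen storage unit, and redundancy information computed as the exclusive-or of two segments is written to a third unit distinct from the units holding those segments.

```python
import random

NUM_UNITS = 8  # number of storage units (illustrative assumption)

def xor_segments(a: bytes, b: bytes) -> bytes:
    """Redundancy information computed as the exclusive-or of two segments."""
    length = max(len(a), len(b))
    a, b = a.ljust(length, b"\0"), b.ljust(length, b"\0")  # pad the shorter segment
    return bytes(x ^ y for x, y in zip(a, b))

def place_pair(seg1: bytes, seg2: bytes, rng: random.Random) -> dict[int, bytes]:
    """Place two segments and their XOR redundancy on three distinct, randomly chosen units."""
    units = rng.sample(range(NUM_UNITS), 3)  # distinct units for seg1, seg2 and the parity
    return {units[0]: seg1, units[1]: seg2, units[2]: xor_segments(seg1, seg2)}
```

Either segment can be reconstructed from the other segment and the parity, and because the three pieces sit on different units, the failure of a single unit loses at most one of them.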
This random distribution of segments of data improves both scalability and reliability. For example, because the data is processed by accessing segments, data fragments or clips also are processed as efficiently as all of the data. The applications may request data transfer from a storage unit only when that transfer would be efficient and may request storage units to preprocess read requests. Bandwidth utilization on a computer network may be optimized by scheduling data transfers among the clients and storage units. If one of the storage units fails, its load also is distributed randomly and nearly uniformly over the remaining storage units. Procedures for recovering from failure of a storage unit also may be provided.
The storage units and applications also may operate independently and without central control. For example, each client may use only local information to schedule communication with a storage unit. Storage units and applications therefore may be added to or removed from the system. As a result, the system is expandable during operation.
When the redundancy information is a copy of one segment, system performance may be improved, although at the expense of increased storage. For example, when an application requests a selected segment of data, the request may be processed by the storage unit with the shortest queue of requests so that random fluctuations in the load applied by multiple applications on multiple storage units are balanced statistically and more equally over all of the storage units. Also, an application may send two requests to randomly selected servers. When one request is accepted by one of the selected servers, the other request to the other selected server is canceled. Both of these ways for requesting data enable transactions among multiple clients and multiple servers without using a centralized queue.
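A minimal sketch of the two request-handling schemes just described, using a hypothetical StorageUnit class whose only visible state is its request queue; neither scheme requires a centralized queue.

```python
import random

class StorageUnit:
    """Hypothetical storage unit exposing only its current request queue."""
    def __init__(self, name: str):
        self.name = name
        self.queue: list[str] = []

    def enqueue(self, request: str) -> None:
        self.queue.append(request)

    def cancel(self, request: str) -> None:
        if request in self.queue:
            self.queue.remove(request)

def read_from_shortest_queue(replicas: list[StorageUnit], request: str) -> StorageUnit:
    """Scheme 1: send the read to whichever replica currently has the shortest queue."""
    target = min(replicas, key=lambda unit: len(unit.queue))
    target.enqueue(request)
    return target

def read_with_duplicate_requests(replicas: list[StorageUnit], request: str,
                                 rng: random.Random) -> StorageUnit:
    """Scheme 2: send the request to two randomly chosen replicas and cancel the loser."""
    first, second = rng.sample(replicas, 2)
    first.enqueue(request)
    second.enqueue(request)
    # Whichever replica accepts the request first wins; here acceptance is modeled
    # by the shorter queue, and the duplicate on the other unit is canceled.
    winner, loser = (first, second) if len(first.queue) <= len(second.queue) else (second, first)
    loser.cancel(request)
    return winner
```

In both cases a client decides using only state it can observe locally, consistent with the local-information scheduling mentioned above.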
This combination of techniques results in a system which can transfer multiple, independent high-bandwidth streams of data between multiple storage units and multiple applications in a scalable and reliable manner.
These techniques also may be used to support all kinds of streams of data. For example, the system may be used as a file system for supporting database servers and for supporting intranet and Internet applications with small files, such as single images and/or text. In particular, smaller files may be supported by using a log-structured file system that combines small files into larger segments of data for storage on a server. Each server maintains and accesses a log for read/write, recovery and archiving operations on small files.
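A minimal sketch, under assumed names and sizes, of how a log-structured approach could combine small files into a larger segment: files are appended to an in-memory buffer, and an index records each file's offset so it can be read back from the log.

```python
SEGMENT_TARGET_SIZE = 256 * 1024  # flush the segment at this size (illustrative assumption)

class LogSegment:
    """Accumulates small files into one larger segment, log-structured style."""
    def __init__(self):
        self.buffer = bytearray()
        self.index: dict[str, tuple[int, int]] = {}  # file name -> (offset, length)

    def append(self, name: str, payload: bytes) -> bool:
        """Append a small file to the log; return True once the segment is large enough to store."""
        self.index[name] = (len(self.buffer), len(payload))
        self.buffer.extend(payload)
        return len(self.buffer) >= SEGMENT_TARGET_SIZE

    def read(self, name: str) -> bytes:
        offset, length = self.index[name]
        return bytes(self.buffer[offset:offset + length])
```

Once full, such a segment could be distributed like any other segment of data, so small files inherit the same random placement and redundancy scheme.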
Accordingly, in one aspect, a distributed data storage system includes a plurality of storage units for storing data, wherein segments of data stored on the storage units are randomly distributed among the plurality of storage units. Redundancy information corresponding to each segment also is randomly distributed among the storage units.
When the redundancy information is a copy of one segment, each copy of each segment may be stored on a different one of the storage units. Each copy of each segment may be assigned to one of the plurality of storage units according to a probability distribution defined as a function of relative specifications of the storage units. The distributed data storage system may include a computer-readable medium having computer-readable logic stored thereon and defining a segment table accessible by a computer using an indication of a segment of data to retrieve indications of the storage units …
Inventors: Jacobs, Herbert R.; Peters, Eric C.; Rabinowitz, Stanley
Assignee: Avid Technology Inc.
Attorney/Agent: Gordon, Peter J.