Category: Electrical computers and digital processing systems: multicomputers
Subcategory: Remote data accessing
Type: Reexamination Certificate
Filed: 2000-06-08
Issued: 2004-05-25
Examiner: Wiley, David (Department: 2143)
U.S. classes: 709/213; 709/224; 707/793
Status: active
Patent number: 06742020
ABSTRACT:
TECHNICAL FIELD OF THE INVENTION
This invention relates in general to distributed computing systems including storage networks and, more specifically, to systems and methods for measuring and managing the flow of data in distributed computing systems and storage networks.
BACKGROUND OF THE INVENTION
Distributed computing systems generally include two or more computers that share data over a decentralized network. Examples of such systems include Automatic Teller Machine (ATM) transaction-processing networks, airline reservation networks, and credit card transaction-processing networks.
Storage networks are a subset of distributed computing systems in which the network resources used for communication between storage resources and computation resources are separate from the network resources used for general network communications. The storage area network is an attempt to provide increased storage performance by separating storage-related traffic from general network communications. This allows general communications to proceed without being blocked by data/storage traffic and, conversely, allows data/storage traffic to occur without interruption from general network communications.
In distributed storage systems, there is a need to store both data and metadata. Metadata is information about the data being stored, such as its size, location, and ownership. Even simple filesystems store metadata in their directory structures.
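By way of illustration only (this sketch and its field names are not taken from the patent), a per-object metadata record might carry fields such as the following:

# Hypothetical sketch of a per-object metadata record; field names are
# illustrative only, not taken from the patent.
from dataclasses import dataclass

@dataclass
class MetadataRecord:
    object_id: str      # identifier of the stored data object
    size_bytes: int     # size of the object
    location: str       # device or node currently holding the object
    owner: str          # owning user or application
    version: int = 0    # incremented on each modification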
In a distributed computing system or storage network, it is desirable that data be accessible, even if a single machine of the system fails. It is also desirable that data be stored and delivered to machines of the system in ways that overcome the differences between various machines and operating systems. It is desirable that system performance be high, and further, that adding machines to the system increase both performance and storage capacity—that the system be scalable. Data should be maintained in a secure manner, with strict control over device access, and the system should be easily managed.
In distributed computing systems and storage networks, multiple copies of at least some data and metadata may exist on multiple machines simultaneously. Coherency requires that these copies be identical, or interlocked such that only the most recently altered version is subject to further modification. It is desirable that data, and especially metadata, be stored with access interlocked to enforce coherency across the multiple machines of the system, preventing such faux pas as assigning a single airline seat to two different passengers.
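One common way to interlock copies so that only the most recently altered version accepts further modification is a version check on update; the following sketch is illustrative and is not the patent's mechanism:

# Illustrative version-check interlock (hypothetical, not the patent's
# mechanism): an update is accepted only if the writer saw the latest
# version, so two concurrent writers cannot both assign the same seat.
class VersionConflict(Exception):
    pass

def reserve_seat(seat_record, expected_version, passenger):
    """seat_record is a copy of shared state held on one machine."""
    if seat_record["version"] != expected_version:
        raise VersionConflict("copy is stale; re-read and retry")
    if seat_record["passenger"] is not None:
        raise ValueError("seat already assigned")
    seat_record["passenger"] = passenger
    seat_record["version"] += 1
    return seat_record

seat = {"seat": "12A", "passenger": None, "version": 0}
reserve_seat(seat, expected_version=0, passenger="A. Traveler")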
In particular, it is desirable that users, and user-level applications, not need to track and select storage devices and partitions thereon. Users or application programs should be able to specify storage and performance requirements for data to be stored, allowing the storage subsystem to select the physical device. These performance requirements for specific data are quality of service (QOS) metrics. Further, the system should ensure that QOS requirements are met insofar as possible.
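As a hypothetical sketch of this idea, a storage request might carry QOS metrics that the storage subsystem matches against the advertised capabilities of its devices; all names below are assumptions, not taken from the patent:

# Hypothetical QOS-driven device selection; device list and metric
# names are illustrative only.
def select_device(devices, qos):
    """Return the first device whose advertised metrics satisfy the
    requested QOS, leaving physical placement to the subsystem."""
    for dev in devices:
        if (dev["free_bytes"] >= qos["capacity_bytes"]
                and dev["throughput_mbps"] >= qos["min_throughput_mbps"]
                and dev["redundancy"] >= qos["min_redundancy"]):
            return dev["name"]
    return None  # the requested QOS cannot currently be met

devices = [
    {"name": "array-1", "free_bytes": 2**40, "throughput_mbps": 80,  "redundancy": 1},
    {"name": "array-2", "free_bytes": 2**41, "throughput_mbps": 200, "redundancy": 2},
]
qos = {"capacity_bytes": 10 * 2**30, "min_throughput_mbps": 100, "min_redundancy": 2}
print(select_device(devices, qos))   # -> "array-2"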
Various methods have been devised for managing data flow in distributed computing systems. For example, U.S. Pat. No. 5,724,575 to Hoover et al. describes a system for managing data flow in a distributed, heterogeneous health care computing system. In the Hoover system, individual heterogeneous computers make their data homogeneous by mapping their heterogeneous data fields to a homogeneous object model through an interface server. The flow of homogeneous data to and from the individual computers is then managed by a central object broker computer capable of managing data presented in accordance with the object model. This approach may suffer if the central object broker or interface server becomes overloaded or fails; data is then inaccessible even if physically located on a functioning machine.
Other methods for enabling data flow in distributed computing systems are disclosed in U.S. Pat. Nos. 5,758,333, 5,675,802, 5,655,154, and 5,412,772. In U.S. Pat. No. 5,675,802 a system for managing distributed software development projects has multiple, “weakly consistent,” copies of a source code database at multiple locations. A “mastership enforcer” at each location enforces access-for-change limitation rules to files of the source-code database. Periodically the “weakly consistent” copies are synchronized, such that updated copies replace outdated copies in other locations.
In U.S. Pat. No. 5,412,772, an object format for operation under several different operating system environments is described. This object format includes, for each object, view-format information that may be accessible under only one or another of the operating system environments. The object format of U.S. Pat. No. 5,412,772 is described without reference to a locking or other data coherency enforcement mechanism.
U.S. Pat. No. 5,758,333 describes an application-independent database system having a central access control system, a central storage system, and a central
These methods appear to be limited to application in a homogeneous distributed computing system, are limited to point-to-point data transactions, or fail to provide the high level of data coherency required for such applications as air travel reservation transaction processing.
It is known that there are systems on the market that provide at least partial solutions to the problems of managing and measuring data flow in distributed computing systems and storage network systems. These systems include Sun Jiro, Sun's device and data management services, which is documented on the Internet at www.jiro.com. The data services portion of Jiro does not manage the network interconnect to control data flow, which could limit performance and the ability to operate in a truly heterogeneous environment with non-StoreX devices.
Accordingly, there is a need in the art for an improved system and method for managing data flow in a distributed computing system.
SOLUTION TO THE PROBLEM
A distributed computing system implements a shared-memory model of storage on a network. The network may be a storage area network. The shared memory model contains a distributed metadata database, or registry, that provides a coherent and consistent image of the state of data activity, including data storage, movement and execution, across the storage network. Upon the same network, but not necessarily in the same machines of the network, is a distributed data database controlled and indexed by the distributed metadata registry.
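As a rough illustration of that relationship (the names and layout below are assumptions, not the patent's structures), one section of the metadata registry might index where each piece of the distributed data database currently resides:

# Hypothetical sketch of a metadata registry section indexing the
# distributed data it controls; names are illustrative only.
registry = {
    "orders/segment-7":   {"locations": ["node-3", "node-9"], "state": "stored"},
    "reservations/log-2": {"locations": ["node-1"],           "state": "moving"},
}

def locate(object_name):
    """Resolve a stored object to the machines currently holding it."""
    return registry[object_name]["locations"]

print(locate("orders/segment-7"))   # -> ["node-3", "node-9"]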
The metadata registry is implemented to provide data availability, reliability, scalability of the system, compatibility, and simplicity of management. The metadata registry also contains information about available resources of the system, including quality-of-service (QOS) metrics of each available resource, and information about transactions in progress over the network. The information about transactions in progress is stored in command objects of the metadata registry.
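By way of a hedged sketch of what a command object recording an in-progress transaction could hold (the fields are assumptions, not taken from the patent):

# Hypothetical command object recording a transaction in progress;
# the fields are illustrative only.
from dataclasses import dataclass, field
from typing import List

@dataclass
class WriteCommandObject:
    command_id: str
    source: str                 # machine issuing the write
    destinations: List[str]     # machines whose copies must be updated
    payload_ref: str            # reference to the data being written
    state: str = "pending"      # pending -> executing -> complete
    qos: dict = field(default_factory=dict)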
The shared-memory model is implemented on top of a network infrastructure and under the file system and operating system; it acts as an abstraction layer that masks out differences between the hardware and software platforms. This allows incorporation of new technologies without redesign of an entire system.
In order to ensure coherency between multiple copies of sections of the distributed metadata registry, an agent may be injected onto a switch of a storage network, onto a router of a general network, or maintained on a system on which a section of the metadata registry is stored. This agent monitors communications between machines that write metadata and machines on which a section of the metadata registry is stored, watching for the creation of write command objects. When a write command object is created, the agent adds additional destination machines to the write command object so that those machines will be updated when the write command executes.
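A minimal sketch of that agent behavior, under the assumption that write command objects look like the hypothetical WriteCommandObject sketched above (this is illustrative, not the patent's implementation):

# When the agent observes a newly created write command object, it
# appends every other machine holding a copy of the affected registry
# section as a destination, so all copies are updated when the write
# command executes.
def on_write_command_created(cmd, registry_copy_holders):
    for machine in registry_copy_holders:
        if machine != cmd.source and machine not in cmd.destinations:
            cmd.destinations.append(machine)
    return cmd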
Command objects ready for execution are evaluated for potential dependencies and
Dimitroff John
Nguyen Minh Chau Alvin
Nguyen Phuoc
Wiley David