Data processing: database and file management or data structures – Database design – Data structure types
Reexamination Certificate
2001-09-25
2004-02-03
Mizrahi, Diane D. (Department: 2175)
active
06687701
ABSTRACT:
FIELD OF THE INVENTION
The present invention generally relates to distributed file systems, and more particularly to management of a namespace in a distributed file system.
BACKGROUND
A partition-based approach to achieve high scalability for access to distributed storage services is currently being explored. The partition-based approach addresses the inherent scalability problems of cluster file systems, which are due to contention for the globally shared resources. In a partition-based approach, the resources of the system are divided into partitions, with each partition stored on a different partition server. Shared access is controlled on a per-partition basis.
All implementations of partition-based distributed storage services must maintain namespaces, which generally are distributed and reference objects that reside in multiple partitions. A namespace provides a mapping between names and physical objects in the system (e.g., files). A user usually refers to an object by a textual name. The textual name is mapped to a lower-level reference that identifies the actual object, including a location and object identifier. The namespace is implemented by means of directories, which are persistent files of <name, reference> pairs.
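The mapping described above can be illustrated with a minimal in-memory sketch. The class and field names (`Reference`, `Directory`, `partition`, `object_id`) are hypothetical and chosen only for illustration; a real directory would be a persistent file rather than a Python dictionary.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Reference:
    """Lower-level reference: where the object lives and which object it is."""
    partition: str   # hypothetical partition-server identifier
    object_id: int   # hypothetical per-partition object identifier


class Directory:
    """A directory modeled as a collection of <name, reference> pairs."""

    def __init__(self) -> None:
        self._entries: Dict[str, Reference] = {}

    def insert(self, name: str, ref: Reference) -> None:
        # A name may be bound to only one object at a time.
        if name in self._entries:
            raise KeyError(f"name already bound: {name}")
        self._entries[name] = ref

    def lookup(self, name: str) -> Optional[Reference]:
        # Resolve a textual name to its reference, or None if unbound.
        return self._entries.get(name)

    def remove(self, name: str) -> Reference:
        # Drop the pair and return the reference it held.
        return self._entries.pop(name)


home = Directory()
home.insert("report.txt", Reference(partition="p2", object_id=17))
resolved = home.lookup("report.txt")
```

Here the directory and the object it names may sit on different partitions: the `partition` field of the reference, not the location of the directory, says where the object resides.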
The requirement for consistency of the namespace can be formalized in terms of four properties:
1. One name is mapped to exactly one object.
2. One object may be referenced by one or more names.
3. If there exists a name that references an object, then that object exists.
4. If an object exists, then there is at least one name in the namespace that references the object.
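As a rough illustration, the four properties can be checked against a simplified model in which the namespace is a dictionary from names to object identifiers and the stored objects are a set of identifiers. The function name `check_namespace` and this flat model are assumptions made for the sketch, not part of the described system.

```python
from typing import Dict, List, Set


def check_namespace(names: Dict[str, int], objects: Set[int]) -> List[str]:
    """Return descriptions of any violations of properties 3 and 4.

    Properties 1 and 2 hold by construction in this model: a dict maps
    each name to exactly one object, and several names may share one.
    """
    violations = []
    # Property 3: every name must reference an object that exists.
    for name, obj in sorted(names.items()):
        if obj not in objects:
            violations.append(f"dangling name {name!r} -> {obj}")
    # Property 4: every existing object must be referenced by some name.
    referenced = set(names.values())
    for obj in sorted(objects - referenced):
        violations.append(f"orphan object {obj}")
    return violations


consistent = check_namespace({"a": 1, "b": 1, "c": 2}, {1, 2})
broken = check_namespace({"a": 1, "x": 9}, {1, 2})
```

In the second call, the name `x` refers to an object that does not exist (property 3) and object 2 has no name (property 4), so both are reported.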
Changes to the global namespace fall into two classes of operations: link operations, which insert a reference to an object (for example, a newly created object), and unlink operations, which remove a reference to an object. Either class of operation may span more than one server in a distributed system, because the server containing the directory (or “namespace object”) and the server containing the referenced object may be physically separate.
Some systems presently use two-phase commit, an atomic commitment protocol, to implement distributed namespace operations. However, to provide recoverability in the event of system failure during a namespace operation, atomic commitment protocols perform synchronous logging in the critical path of the operations, thereby incurring considerable overhead.
In addition to the overhead, atomic commitment protocols lock system resources across all the sites involved in an operation for the duration of the multi-phase commit, thereby increasing contention for resources such as free block lists and block allocation maps. Atomic commitment protocols also follow a conservative approach for recovery from failure: in the presence of failure, incomplete operations are typically aborted rather than attempting to complete the operation.
A system and method that address the aforementioned problems, as well as other related problems, are therefore desirable.
SUMMARY OF THE INVENTION
In various embodiments, the present invention performs namespace operations in a distributed file system. The file system is disposed on a plurality of partition servers, and each partition server controls access to a subset of hierarchically-related, shared storage objects. Each namespace operation involves a namespace object and a target object that are part of the shared storage objects. Namespace operations received at each partition server are serialized. In response to an unlink namespace operation, a reference in the namespace object to the target object is removed, and after removal the target object is modified in accordance with the unlink operation. In response to a link operation, the target object is modified consistent with the link operation. After modification of the target object, a reference to the target object is inserted in the namespace object. A log record is stored in association with each namespace operation when the operation is started, and a log record is deleted upon completion of the associated operation.
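The ordering of steps in the summary can be sketched as follows. This is a single-process stand-in under stated assumptions, not the claimed implementation: the class `PartitionSketch` is hypothetical, a link count stands in for "modifying the target object", one lock stands in for per-server serialization, and the log is an in-memory dictionary rather than persistent storage.

```python
import itertools
import threading
from typing import Dict, Tuple


class PartitionSketch:
    """Single-process stand-in for the namespace and target servers."""

    def __init__(self) -> None:
        self.namespace: Dict[str, int] = {}             # name -> object id
        self.link_count: Dict[int, int] = {}            # object id -> number of names
        self.log: Dict[int, Tuple[str, str, int]] = {}  # records of pending operations
        self._ids = itertools.count(1)
        self._lock = threading.Lock()                   # serializes operations

    def link(self, name: str, obj: int) -> None:
        with self._lock:
            if name in self.namespace:
                raise KeyError(f"name already bound: {name}")
            op = next(self._ids)
            self.log[op] = ("link", name, obj)   # log record stored at start
            # Step 1: modify the target object (assumed: bump its link count).
            self.link_count[obj] = self.link_count.get(obj, 0) + 1
            # Step 2: only then insert the reference in the namespace object.
            self.namespace[name] = obj
            del self.log[op]                     # log record deleted on completion

    def unlink(self, name: str) -> None:
        with self._lock:
            obj = self.namespace[name]           # raises KeyError if name is unbound
            op = next(self._ids)
            self.log[op] = ("unlink", name, obj)
            # Step 1: remove the reference from the namespace object.
            del self.namespace[name]
            # Step 2: only then modify the target (assumed: drop its link count).
            self.link_count[obj] -= 1
            if self.link_count[obj] == 0:
                del self.link_count[obj]         # last name gone, object removed
            del self.log[op]


fs = PartitionSketch()
fs.link("a", 7)
fs.link("b", 7)
fs.unlink("a")
```

One reading of this ordering is that an interruption between the two steps can leave an object temporarily without a name but never a name without an object, and that a log record still present after a failure identifies an operation that was started but not completed.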
Various example embodiments are set forth in the Detailed Description and Claims which follow.
REFERENCES:
patent: 5617568 (1997-04-01), Ault et al.
patent: 5689701 (1997-11-01), Ault et al.
patent: 6502109 (2002-12-01), Aravamudan et al.
patent: 6567398 (2003-05-01), Aravamudan et al.
patent: 1246061 (2002-10-01), None
Anderson, D., Chase, J., and Vahdat, A., “Interposed Request Routing for Scalable Network Storage”, in Proc. of the 4th USENIX Windows Systems Symposium, San Diego, CA, Aug. 2000.
Howard, J., et al., “Scale and Performance in a Distributed File System”, ACM Transactions on Computer Systems, vol. 6(1), pp. 51-81, 1988.
Ji, M., Felten, E.W., Wang, R., and Singh, J.P., “Archipelago: An Island-Based File System for Highly Available and Scalable Internet Services”, in Proc. of the 4th USENIX Windows Systems Symposium, San Diego, CA, Aug. 2000.
NFS Version 4 Technical Brief, Sun Microsystems, Oct. 1999.
Karamanolis Christos
Zhang Zheng
Hewlett-Packard Development Company, L.P.
Mizrahi Diane D.
Mofiz Apu M