Electrical computers and digital processing systems: memory – Storage accessing and control – Hierarchical memories
Reexamination Certificate
2001-05-07
2004-01-06
Sparks, Donald (Department: 2187)
C711S130000, C711S162000, C714S004110, C714S006130, C714S031000
active
06675264
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to data storage systems, and in particular, to a method and apparatus for utilizing cache in a number of storage nodes in a cluster storage subsystem.
2. Description of the Related Art
The ability to manage massive amounts of information in large-scale databases has become increasingly important in recent years. Data analysts are faced with ever larger data sets, some of which are measured in gigabytes or even terabytes. To access such large amounts of data, two or more systems that work together may be clustered. Clustering generally refers to multiple computer systems or nodes (each comprising a central processing unit (CPU), memory, and an adapter) that are linked together in order to handle variable workloads or to provide continued operation in the event that one computer system or node fails. With proper load balancing techniques, clustering also improves throughput. Each node in a cluster may itself be a multiprocessor system. For example, a cluster of four nodes, each with four CPUs, would provide a total of 16 CPUs processing simultaneously.
Practical applications of clustering include unsupervised classification and taxonomy generation, nearest neighbor searching, scientific discovery, vector quantization, time series analysis, multidimensional visualization, and text analysis and navigation. Further, many practical applications are write-intensive, with a high volume of transaction processing. Such applications include fraud detection in credit card processing and account updating at investment houses.
In a clustered environment, the data may be distributed across multiple nodes that communicate with each other. Each node maintains its own resources, such as a data storage device and a processor, to manage and access a portion of the data, which may or may not be shared. When a device is shared, all of the nodes can access it. Such a distributed system, however, requires a mechanism for managing the data across the system and for communicating between the nodes.
To speed up data delivery and access for the nodes, a cache may be used. A cache stores frequently used data in a location that can be accessed more quickly. Caches speed up data transfer and may be either temporary or permanent. Memory and disk caches are used in most computers to speed up instruction execution and data retrieval. These temporary caches serve as staging areas, and their contents can change within seconds or milliseconds.
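As an illustration of the general idea only, the following Python sketch places a small cache in front of a slow lookup. It is not taken from the patent: the class name `ReadCache`, the `fetch` callback, and the least-recently-used eviction policy are all assumptions chosen for the example.

```python
from collections import OrderedDict


class ReadCache:
    """Small cache in front of a slow lookup (hypothetical illustration)."""

    def __init__(self, fetch, capacity: int = 2) -> None:
        self.fetch = fetch            # slow path, e.g. a disk read
        self.capacity = capacity      # maximum number of cached entries
        self.entries = OrderedDict()  # key -> value, oldest use first
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.entries:
            # Fast path: serve from the cache and mark as recently used.
            self.hits += 1
            self.entries.move_to_end(key)
            return self.entries[key]
        # Slow path: fetch the value, then stage it in the cache.
        self.misses += 1
        value = self.fetch(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            # Assumed policy: evict the least recently used entry.
            self.entries.popitem(last=False)
        return value
```

A second `get` for the same key is served from `entries` without calling `fetch`, which is the speedup the paragraph above describes.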
In the prior art, caching and prefetching strategies are often complicated and confusing, base their cache management on scientific workloads, and are designed to guard against file cache corruption caused by application faults and power failures on unreliable file systems. Accordingly, what is needed is a storage and caching system that is efficient, does not require special hardware support, and provides sufficient reliability.
SUMMARY OF THE INVENTION
To address the requirements described above, the present invention discloses a method, apparatus, article of manufacture, and a memory structure that provide a mirrored-cache write scheme in a cluster-based file system. When a user application or host issues a write request from a node, the data is written to the caches of both the receiving node (referred to as node i) and a partner of the receiving node (referred to as node i+1). In one or more embodiments of the invention, node i's partner is always node i+1, except for the last node, whose partner is node 0 instead.
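The partner rule amounts to modular arithmetic. The following Python sketch assumes that nodes are numbered 0 through n-1; the function name `partner_of` is hypothetical and does not come from the patent.

```python
def partner_of(node: int, num_nodes: int) -> int:
    """Return the index of the node that mirrors `node`'s cache writes."""
    if num_nodes < 2:
        raise ValueError("mirroring needs at least two nodes")
    if not 0 <= node < num_nodes:
        raise ValueError("node index out of range")
    # Node i pairs with node i+1; the last node wraps around to node 0.
    return (node + 1) % num_nodes
```

In a four-node cluster, for example, nodes 0, 1, and 2 are paired with nodes 1, 2, and 3, and node 3 is paired with node 0.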
A global cache directory manager (which may or may not be used, depending on the implementation) is embedded in a file system and checks whether the data being written is currently owned by another node (referred to as a remote node). If so, the cache directory manager invalidates the copy in the remote node based on an invalidation protocol. Once invalidation is complete, node i writes the data to its own local file cache. Node i may also write the data to node i+1 and to disk as a nonblocking write (asynchronous write). Once node i receives confirmation of the completed cache write from node i+1, the user/host write can return.
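The write sequence can be sketched as a toy, single-process model in Python. This is an illustration under stated assumptions, not the patented implementation: the class `Cluster`, its attributes, and the use of a thread pool for the nonblocking disk write are hypothetical, dictionaries stand in for node caches and the disk, and a plain assignment stands in for the network message and acknowledgement between node i and node i+1. Only the remote owner's copy is invalidated, as the text describes.

```python
from concurrent.futures import Future, ThreadPoolExecutor


class Cluster:
    """Toy in-memory model of the mirrored-cache write path (hypothetical)."""

    def __init__(self, num_nodes: int) -> None:
        if num_nodes < 2:
            raise ValueError("mirroring needs at least two nodes")
        self.caches = [dict() for _ in range(num_nodes)]  # per-node file cache
        self.owner = {}  # global cache directory: block id -> owning node
        self.disk = {}   # stands in for disk storage
        # One background worker, so disk writes land in submission order.
        self._io = ThreadPoolExecutor(max_workers=1)

    def partner_of(self, node: int) -> int:
        # Node i pairs with node i+1; the last node pairs with node 0.
        return (node + 1) % len(self.caches)

    def write(self, node: int, block: str, data: bytes) -> Future:
        # 1. If a remote node owns the block, invalidate its copy first.
        previous = self.owner.get(block)
        if previous is not None and previous != node:
            self.caches[previous].pop(block, None)
        # 2. Write to the local file cache and record the new owner.
        self.caches[node][block] = data
        self.owner[block] = node
        # 3. Start the disk write without waiting for it to finish.
        pending = self._io.submit(self.disk.__setitem__, block, data)
        # 4. Mirror to the partner's cache. In a real cluster this would be
        #    a network message, and the write would return on its ack.
        self.caches[self.partner_of(node)][block] = data
        return pending  # the caller may wait on the disk write if desired
```

The call returns as soon as both caches hold the data; the returned `Future` lets a caller wait for the disk write with `pending.result()` when durability on disk matters.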
Chen Ying
Young Honesty Cheng
Dinh Ngoc V.
Gates & Cooper LLP
International Business Machines Corporation
Sparks Donald