Reexamination Certificate
1999-05-28
2002-05-14
Breene, John (Department: 2177)
Data processing: database and file management or data structures
Database design
Data structure types
active
06389427
ABSTRACT:
BACKGROUND
The present invention relates to computer system components that perform file system operations.
Modern computers are made up of several different components. Some of these components are physical devices—hardware like the CPU (a central processing unit, such as a microprocessor), main memory (high-speed random access memory), disk drives, keyboard, and so on—and some are software like the applications or programs that the computer executes. One such software component is the operating system, which manages the interaction between the applications that it executes and the physical devices that together make up the computer. Included within virtually every operating system is the concept of a file system. This is the combination of the structure of the data stored on the physical disk drive and the file system driver—a software component that coordinates access to the data.
File systems are generally structured as an inverted tree structure where each node is given a sequence of characters describing it, as well as access restrictions, date and time of creation or access and many other features and information that are specific to the operating system. Each non-terminal node is generally called a directory or folder. Each terminal node is generally called a file. Locating a file to be opened requires parsing a path, which is a string composed of a hierarchical name for the file with each named component separated by some delimiter. For example, on a computer running a Microsoft® Windows operating system (95 or NT), the path “\\Work\Project\Month\Document” indicates the hard disk drive partition (volume) named Work, the directory Project within the root directory of that volume, the directory Month within the Project directory, and the file Document within the directory Month.
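As an illustration only, the parsing of such a path into its named components can be sketched as follows. The function name parse_path and its return shape are hypothetical and chosen for this example; they are not part of any operating system interface, and the sketch assumes the two-delimiter volume prefix used in the example path above.

```python
def parse_path(path, delimiter="\\"):
    """Split a delimited path into (volume, directories, file name).

    Assumes the path begins with two delimiters followed by the volume name,
    as in the example path in the text.
    """
    if not path.startswith(delimiter * 2):
        raise ValueError("expected a path that begins with two delimiters")
    # Drop the leading delimiters, then split on each remaining delimiter.
    components = [c for c in path[2:].split(delimiter) if c]
    if len(components) < 2:
        raise ValueError("expected a volume name and at least one component")
    volume, *directories, name = components
    return volume, directories, name


volume, directories, name = parse_path(r"\\Work\Project\Month\Document")
print(volume, directories, name)
```

Each element of directories names one non-terminal node that must be visited, in order, before the terminal node can be located.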
The contents of a file may be called file data to distinguish it from meta data. Meta data is “data about data”. Meta data is the file system overhead that is used to keep track of everything about all of the files on a volume. For example, meta data tells what allocation units make up the file data for a given file, what allocation units are free, what allocation units contain bad sectors, and so on.
The data that is managed by the file system is generally stored on a mechanical magnetic storage device called a disk drive. For an application program to access a particular file on the disk, a directory lookup must usually be performed. A directory lookup can require: (i) accessing the sectors for each of the directories that are components of the file's path, (ii) retaining the information necessary to access each directory's physical data from disk, and (iii) computing the number of the sector where the file is located on the disk. It is at this point that a request for the operating system to open a file is satisfied. The application can then use operating system functions to read data from the opened file and finally to close the file, releasing any operating system resources being maintained for the file. Because the act of opening a file and reading its contents by performing this directory lookup requires several steps, file systems typically use a variety of techniques to minimize the adverse performance effects of repeating these steps over and over again. The caching of frequently used disk data in memory is one popular technique for minimizing adverse performance effects. Another technique is the indexing of directory data.
File systems commonly use an internal identifier to refer to a directory or file. For some file systems these internal identifiers are long-lived (i.e., persistent) and validly refer to the same file for the life of the file. When accessing a file, the directory lookup determines the internal identifier of each directory in the path, reads each directory's data from the disk, and ultimately locates the internal identifier for the file. The data associated with the file identifier is then read. This data generally includes such attributes as access permissions, file size, file name, and where on the disk the file data is located. Finally, using this location (cluster) information, the file data (the data that a user regards as the contents of the file) is read from the disk.
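The lookup sequence described above can be sketched, under simplifying assumptions, with an in-memory table standing in for the disk. The Node class, the VOLUME table, the identifier values, and the cluster numbers are all hypothetical and serve only to show that one directory read is incurred for each component of the path.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One file system object; the volume table maps identifiers to nodes."""
    name: str
    entries: dict = field(default_factory=dict)   # directories: child name -> identifier
    clusters: list = field(default_factory=list)  # files: where the file data is located


# Hypothetical volume contents: internal identifier -> node.
VOLUME = {
    0: Node("", entries={"Project": 1}),
    1: Node("Project", entries={"Month": 2}),
    2: Node("Month", entries={"Document": 3}),
    3: Node("Document", clusters=[120, 121, 122]),
}


def directory_lookup(volume, components, root_id=0):
    """Resolve path components to a file identifier, counting directory reads."""
    current_id = root_id
    directory_reads = 0
    for name in components:
        directory = volume[current_id]  # stands in for reading a directory from disk
        directory_reads += 1
        if name not in directory.entries:
            raise FileNotFoundError(name)
        current_id = directory.entries[name]
    return current_id, directory_reads


file_id, reads = directory_lookup(VOLUME, ["Project", "Month", "Document"])
print(file_id, reads, VOLUME[file_id].clusters)
```

In this sketch, a file three levels below the root costs three directory reads before its own data can be located, and that cost is repeated on every open unless it is cached or otherwise avoided.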
The overhead associated with directory lookup is both necessary and useful in the general case. However, when an application provides its own mechanism for referring to files, using both the application's lookup mechanism and that of the file system results in duplication of effort. The applications which perform their own mapping to files, and consequently cause this redundancy, are many and diverse. Applications that experience this performance degradation can benefit from the present invention.
SUMMARY OF THE INVENTION
The invention provides methods and apparatus that enhance the performance of computer file systems, and in particular the performance of read-only operations in such file systems. In the principal embodiment that will be described, the methods and apparatus of the invention are implemented in a suite of computer program modules that together make up a performance enhancement product.
In general, in one aspect, the invention provides a product that transparently exists in an operating system after an initial setup is completed. The initial setup involves identifying what directories or files are to be monitored in order to intercept access requests for those files and to respond to those requests with enhanced performance.
Advantageous embodiments of the invention include one or more of the following features. A system administrator can specify what directories or files are to be monitored by the product. The product automatically creates and maintains a high-performance index of monitored directories or files. It transparently and automatically begins enhancing requests for monitored files whenever the application suite starts running.
Most operations are simply forwarded to the underlying file system driver. However, when a monitored file is opened, the open is performed using the file identifier, bypassing any access to directory meta data.
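The following is a minimal sketch of this behavior, not the product's actual implementation. It assumes an index that maps each monitored path to its file identifier; the class MonitoredIndex, the stand-in function slow_lookup, and the sample paths and identifiers are hypothetical.

```python
class MonitoredIndex:
    """Maps monitored paths to file identifiers so opens can skip the lookup."""

    def __init__(self, lookup):
        self._lookup = lookup  # the ordinary directory lookup, used as a fallback
        self._ids = {}         # monitored path -> file identifier
        self.bypassed = 0      # opens answered from the index
        self.forwarded = 0     # opens passed to the ordinary lookup

    def monitor(self, path):
        """Resolve the path once, at setup time, and remember its identifier."""
        self._ids[path] = self._lookup(path)

    def open(self, path):
        """Return the file identifier, bypassing the lookup for monitored files."""
        file_id = self._ids.get(path)
        if file_id is not None:
            self.bypassed += 1
            return file_id
        self.forwarded += 1
        return self._lookup(path)


# Stand-in for the underlying file system's directory lookup (hypothetical data).
TABLE = {r"\\Work\Project\Month\Document": 3, r"\\Work\Other\Notes": 7}
lookups = []


def slow_lookup(path):
    lookups.append(path)  # record each time the full lookup is paid for
    return TABLE[path]


index = MonitoredIndex(slow_lookup)
index.monitor(r"\\Work\Project\Month\Document")
first = index.open(r"\\Work\Project\Month\Document")  # answered from the index
second = index.open(r"\\Work\Other\Notes")            # forwarded unchanged
print(first, second, index.bypassed, index.forwarded, len(lookups))
```

The sketch relies on the identifiers being persistent, as described earlier: an identifier recorded at setup time must still refer to the same file when a later open arrives.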
A further enhancement of the invention is the elimination of access time updates for monitored files, which in turn eliminates write updates to directory contents, file system meta data, and the operating system log file.
In the Windows NT file system NTFS, access to monitored files is enhanced by “pinning” files in the data cache maintained by the NTFS cache manager. Pinning forces the NTFS cache manager to retain file data in memory by leaving an outstanding operation in place (CcMdlRead) until such time as the invention calls the complete operation (CcMdlReadComplete). Maintaining access to the file data in memory for as long as possible increases the likelihood of the file being in memory when the next access request for the file is received.
Replacement of pinned files when available memory is exceeded is performed using a least-recently-used (LRU) selection process. As memory usage for the cache increases, adverse impacts of aggressive memory utilization are mitigated through monitoring of memory usage for other applications and adjusting memory as required.
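A least-recently-used selection process of this kind can be sketched as follows. This is an illustrative model only: the class PinnedFileCache, its byte-based capacity, and the resize method (standing in for an adjustment made when other applications need memory) are assumptions for the example and do not represent the NTFS cache manager interface.

```python
from collections import OrderedDict


class PinnedFileCache:
    """Tracks pinned files and unpins the least recently used when full."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._pinned = OrderedDict()  # file identifier -> size, oldest access first

    def access(self, file_id, size):
        """Record an access; return the identifiers unpinned to make room."""
        evicted = []
        if file_id in self._pinned:
            self._pinned.move_to_end(file_id)  # now the most recently used
            return evicted
        if size > self.capacity_bytes:
            return evicted  # too large to pin at all
        while self.used_bytes + size > self.capacity_bytes:
            old_id, old_size = self._pinned.popitem(last=False)
            self.used_bytes -= old_size
            evicted.append(old_id)
        self._pinned[file_id] = size
        self.used_bytes += size
        return evicted

    def resize(self, capacity_bytes):
        """Shrink or grow the budget, unpinning oldest files if now over it."""
        self.capacity_bytes = capacity_bytes
        evicted = []
        while self.used_bytes > self.capacity_bytes:
            old_id, old_size = self._pinned.popitem(last=False)
            self.used_bytes -= old_size
            evicted.append(old_id)
        return evicted

    def pinned_ids(self):
        return list(self._pinned)


cache = PinnedFileCache(capacity_bytes=100)
cache.access(1, 40)
cache.access(2, 40)
cache.access(1, 40)                   # file 1 becomes the most recently used
evicted_on_add = cache.access(3, 40)  # no room: the least recently used is unpinned
evicted_on_shrink = cache.resize(50)  # memory pressure: the budget is reduced
print(evicted_on_add, evicted_on_shrink, cache.pinned_ids())
```

In this model, unpinning a file only removes it from the pinned set; in the described NTFS implementation the corresponding step would be completing the outstanding read operation so that the cache manager is free to release the data.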
Additionally, for the NTFS implementation, data operations are processed more efficiently in a number of ways explained in more detail later in the document. The product can be configured, and its runtime behavior controlled, through configuration parameters stored either in operating system-provided locations (i.e., the Windows Registry) or in configuration files read at startup. The product also provides a mechanism that allows a system administrator to cause all file operation requests to be directed to the standard file system driver without enhancement.
Among the advantages of the invention are the following. It improves the performance of applications that rely on high volumes of file accesses without resorting to a custom implementation of the file system. It improves the performance of applications that perform large numbers of file
Breene John
Fish & Richardson P.C.
Pham Linh M
Redleaf Group, Inc.