Replacement of substrings in file/directory pathnames with...

Data processing: database and file management or data structures – Database design – Data structure types

Reexamination Certificate

Details

C707S793000, C707S793000, C707S793000

active

06470345

ABSTRACT:

BACKGROUND OF THE INVENTION
The present invention relates generally to data processing systems and, more particularly, to a method and system for the replacement of substrings in file and directory pathnames with numeric tokens.
Most file systems will complete a partial file or directory specification by using the current working directory information along with whatever partial information is given. This process of creating a complete, syntactically correct specification (the canonical form) is sometimes referred to as “canonicalization”. This canonical form is important, since it completely and uniquely identifies the file system resource, whether a file, directory or some other type of resource.
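As an illustration of canonicalization alone, the following minimal Python sketch completes a partial specification from a current working directory without consulting the file system. The function name `canonicalize` and the choice of POSIX-style paths are assumptions made for this example and are not taken from the patent.

```python
import posixpath


def canonicalize(partial: str, cwd: str) -> str:
    """Form a complete pathname from a partial one, purely lexically.

    Hypothetical helper: it never touches the file system, so it says
    nothing about whether the resulting path is valid.
    """
    if not posixpath.isabs(cwd):
        raise ValueError("current working directory must be absolute")
    # An absolute `partial` replaces cwd; a relative one is appended to it.
    joined = posixpath.join(cwd, partial)
    # Collapse '.', '..' and repeated separators into one normalized form.
    return posixpath.normpath(joined)


print(canonicalize("docs/./a.txt", "/home/user"))
print(canonicalize("../b", "/home/user"))
```

Because the result depends only on the two input strings, the same partial name always yields the same complete name for a given working directory.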
Another important task is the semantic validation of a path, made up of the root, intermediate directories, and file or directory specification. All intermediate directories must be valid for a pathname to refer to a valid file system resource. The exception is that the final term, whether a file, directory or other name, might not exist at the time of validation, since the operation requested of the file system may be to create, or indeed, to check whether it exists.
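The validation rule described above can be sketched as follows. This is a simplified model, not the patented method: the set `directories` stands in for the file system's knowledge of existing directories, the root is assumed always to exist, and the name `validate` is hypothetical.

```python
from typing import Set


def validate(canonical: str, directories: Set[str]) -> bool:
    """Return True if every intermediate directory of `canonical` exists.

    The final term is deliberately not checked, since the requested
    operation may be to create it or to test whether it exists.
    """
    parts = [p for p in canonical.split("/") if p]
    prefix = ""
    for name in parts[:-1]:          # every term except the final one
        prefix += "/" + name
        if prefix not in directories:
            return False
    return True


existing = {"/home", "/home/user"}
print(validate("/home/user/new.txt", existing))   # final term may not exist yet
print(validate("/home/ghost/new.txt", existing))  # intermediate directory missing
```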
These two tasks are often intertwined in a single function or set of functions. This makes sense in some file systems, such as UNIX's file system (UFS), where all resources are local and creations, modifications and deletions are all within the same data scope of an operating system process and can be easily synchronized.
Combining these two functions can also yield some efficiency savings. If the current working directory for a given process is taken to be always valid (which assumes some method to prevent other processes from modifying that file system information while a process is “in it”), then validation of a path can start with the partial information specified by the user of the file system.
However useful this method of combining these two functions can be, it should always be remembered that these are two separate tasks. Severe performance penalties can be the cost of forgetting this. During recent development of a Virtual File System (VFS) and related network file system (NFS) work by the inventors, it was found that some NFS clients were sending remote procedure call (RPC) requests to validate each intermediate part of the path (via NFS_LOOKUP) instead of sending the full path as far as it was thought to be valid. This means in many cases 12 to 15 RPCs instead of a single RPC.
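The difference in request counts can be illustrated with a small simulation. Nothing here uses a real NFS or RPC library: `CountingServer`, `prefixes`, `validate_per_component` and `validate_full_path` are hypothetical names, and each call to `lookup` merely stands in for one remote request.

```python
class CountingServer:
    """Stand-in for a remote server; each lookup() counts as one RPC."""

    def __init__(self, directories):
        self.directories = set(directories)
        self.rpc_count = 0

    def lookup(self, path):
        self.rpc_count += 1
        # The server checks every directory along the path in one request.
        return all(p in self.directories for p in prefixes(path))


def prefixes(path):
    """'/a/b/c' -> ['/a', '/a/b', '/a/b/c']"""
    out, prefix = [], ""
    for name in path.split("/"):
        if name:
            prefix += "/" + name
            out.append(prefix)
    return out


def validate_per_component(server, path):
    # One request for each intermediate part of the path.
    return all(server.lookup(p) for p in prefixes(path))


def validate_full_path(server, path):
    # A single request carrying the full path.
    return server.lookup(path)


deep = "/" + "/".join("d%d" % i for i in range(12))  # a 12-level path
slow, fast = CountingServer(prefixes(deep)), CountingServer(prefixes(deep))
validate_per_component(slow, deep)
validate_full_path(fast, deep)
print(slow.rpc_count, fast.rpc_count)
```

For a path that is twelve directories deep, the per-component client issues twelve simulated requests where the full-path client issues one.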
In the design of a file system that is structured around a client/server split, where the client portion keeps track of the current working directory and therefore has to perform the canonicalization, the path validation can often be done efficiently only by the server. The inventors' research has shown that in most cases, even where there is no client/server split, it is advantageous to separate canonicalization from validation and to perform these two operations in close sequence, without interleaving validation of intermediate path information with the formation of the canonical name. This results in a simpler implementation and superior performance, especially in a network environment.
SUMMARY OF THE INVENTION
In a network of computers, there is often a need to extend some operating systems' file systems to accommodate file and directory names that are not supported natively. When implementing Java Virtual Machines (JVMs) on file systems that only support “8.3” names (up to eight characters for the name and up to three characters for extension or type) this becomes very apparent. A trivial example is: “SomeJavaApplication.class”, which violates both the eight character name and the three character extension limits. Special characters, DBCS (Double Byte Character Set), uppercase and lowercase letters, spaces within names and a host of other limitations can cause problems that limit the usefulness of an otherwise desirable file system.
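A rough check for whether a name fits the “8.3” form might look like the sketch below. The permitted character set is a deliberately simplified assumption (uppercase letters, digits, underscore and hyphen); real 8.3 file systems allow a somewhat different set, and the name `fits_8_3` is hypothetical.

```python
import re

# Up to eight characters for the name, optionally followed by a dot and up
# to three characters for the extension. Character set is a simplification.
_EIGHT_DOT_THREE = re.compile(r"[A-Z0-9_-]{1,8}(\.[A-Z0-9_-]{1,3})?")


def fits_8_3(name: str) -> bool:
    """Return True if `name` fits the simplified 8.3 pattern above."""
    return _EIGHT_DOT_THREE.fullmatch(name) is not None


for candidate in ("AUTOEXEC.BAT", "SomeJavaApplication.class", "my file.txt"):
    print(candidate, fits_8_3(candidate))
```

Under this simplified rule, “SomeJavaApplication.class” fails on name length, extension length and lowercase letters at once.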
A virtual file system (VFS) has been implemented that allows clients to map names that use these problem characters and that can far exceed the native limits on the length of a file or directory name or on the total length of a “path”. In general, a VFS is an indirection layer that handles the file-oriented system calls and calls the necessary functions in the physical file system code to perform input/output. The VFS consists of a Name Space Server accessed via TCP/IP sockets and a run-time VFS client. In a sense the run-time client intercepts names that are allowed to exceed the limits of the native file system and sends them to the Name Space Server to be converted into names that are supported natively.
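The conversion role of the Name Space Server can be pictured with an in-process stand-in. The text does not say how native names are chosen, so the generated `V0000000.VFS` scheme, the class name `NameSpace` and its methods are all assumptions for illustration only; the socket transport is omitted entirely.

```python
class NameSpace:
    """In-process stand-in for a name-mapping service (no sockets)."""

    def __init__(self):
        self._to_native = {}   # long name -> generated native name
        self._to_long = {}     # generated native name -> long name

    def to_native(self, long_name: str) -> str:
        native = self._to_native.get(long_name)
        if native is None:
            # Hypothetical scheme: 'V' plus 7 hex digits, 3-character
            # extension. It fits the 8.3 form for up to 16**7 names.
            native = "V%07X.VFS" % len(self._to_native)
            self._to_native[long_name] = native
            self._to_long[native] = long_name
        return native

    def to_long(self, native: str) -> str:
        return self._to_long[native]


ns = NameSpace()
short = ns.to_native("SomeJavaApplication.class")
print(short, "->", ns.to_long(short))
```

A client would send the long name to such a service and use the returned native name for the actual input/output, keeping the mapping stable across repeated requests.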
In dealing with file/directory pathnames, the number of sometimes quite lengthy strings poses a significant problem, especially when these are broken into substrings which are then constantly compared to other substrings. By parsing the strings into their semantically correct substrings and replacing those substrings with unique numeric tokens, the storage needed for the strings is significantly reduced and comparison of those substrings is faster. Since each substring (typically a subdirectory, filename or extension) is replaced with a numeric value, these numeric values can be arithmetically compared (e.g., is a == b) instead of string compared (i.e., are all characters the same, what about uppercase vs. lowercase, etc.). This alone represents a substantial improvement in performance. In addition, by keeping a string dictionary, which the token uniquely indexes, only one copy is kept of any substring. This too can represent a substantial saving in the amount of storage needed to implement a file system.
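The token replacement described above can be sketched as a small string dictionary. This is one possible reading rather than the claimed implementation: the names `TokenDictionary`, `intern`, `lookup` and `tokenize_path` are hypothetical, tokens are simply list indices, and pathnames are split only on “/” (a fuller version might also split the filename from its extension).

```python
class TokenDictionary:
    """Keeps one copy of each substring, indexed by a unique numeric token."""

    def __init__(self):
        self._token_of = {}   # substring -> token
        self._strings = []    # token -> substring (the single stored copy)

    def intern(self, substring: str) -> int:
        token = self._token_of.get(substring)
        if token is None:
            token = len(self._strings)
            self._token_of[substring] = token
            self._strings.append(substring)
        return token

    def lookup(self, token: int) -> str:
        return self._strings[token]

    def __len__(self):
        return len(self._strings)


def tokenize_path(path: str, dictionary: TokenDictionary) -> tuple:
    """Replace each '/'-separated substring of `path` with its token."""
    return tuple(dictionary.intern(p) for p in path.split("/") if p)


d = TokenDictionary()
a = tokenize_path("/usr/lib/java/lib", d)
b = tokenize_path("/usr/lib/python", d)
print(a, b)
print(a[:2] == b[:2])   # shared prefix compared as integers, not strings
print(len(d))           # distinct substrings stored once each
```

Here “lib” appears three times across the two pathnames but is stored once, and checking whether two paths share a prefix reduces to comparing small integers.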


