Data processing: database and file management or data structures – Database design – Data structure types
Reexamination Certificate
1999-03-15
2001-11-20
Black, Thomas (Department: 2171)
Data processing: database and file management or data structures
Database design
Data structure types
C717S152000, C707S793000
Reexamination Certificate
active
06321240
ABSTRACT:
FIELD OF THE INVENTION
This invention relates generally to the field of computer memory management and in particular to optimizing cache utilization by modifying data structures.
REFERENCE TO RELATED APPLICATIONS
This application is related to the co-pending applications Field Reordering to Optimize Cache Utilization, Ser. No. 09/270,124, and Data Structure Partitioning to Optimize Cache Utilization, Ser. No. 09/270,125, assigned to the same assignee as the present application, filed on the same day herewith, and hereby incorporated by reference. U.S. patent application Ser. No. 09/024,248, now U.S. Pat. No. 6,189,069, for OPTIMIZED LOGGING OF DATA ELEMENTS TO A DATA STORAGE DEVICE is hereby incorporated by reference, at least with respect to its teaching of the logging of accesses of data structure elements. U.S. Pat. No. 5,752,038 for METHOD AND SYSTEM FOR DETERMINING AN OPTIMAL PLACEMENT ORDER FOR CODE PORTIONS WITHIN A MODULE is also hereby incorporated by reference for its teaching of the use of bit vectors which contain multiple bits representing unique time intervals.
COPYRIGHT NOTICE/PERMISSION
A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever. The following notice applies to the software and data as described below and in the drawing hereto: Copyright ©1999, Microsoft Corporation, All Rights Reserved.
BACKGROUND
Users are demanding increased performance from the applications running on their computers. Computer hardware, including central processing units (CPUs), is becoming increasingly fast. However, processor performance is limited by the speed at which data can be supplied for processing. Several kinds of devices provide that data. Disk drives, compact disks and other secondary storage devices store great amounts of data cost effectively, but incur long delays in providing it because the physical media on which the data is stored must be moved to a position where it can be read. Such physical motion takes an enormous amount of time compared to processor cycle times. The next fastest common storage device, random access memory (RAM), is much faster. However, as processor speeds have increased, even RAM cannot provide data fast enough to keep up with them.
In a typical computer, Level 1 (L1) and Level 2 (L2) cache memories are similar to RAM but are even faster, and are physically close to the processor so they can supply data at very high rates. Cache memory is typically divided into 32-, 64-, or 128-byte cache lines; the size of a cache line normally corresponds to a common unit of data retrieved from memory. When data required by the processor is not available in the L1 cache, a cache line fault occurs and the data must be loaded from the lower speed L2 cache or from the relatively slow RAM. The application is often effectively stalled while this data is loaded and until it is available to the CPU. By decreasing the number of cache faults, an application will run faster. There is a need to reduce the number of cache line faults and provide data to processors even faster to keep applications from waiting.
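The cost described above can be made concrete with a little arithmetic. The sketch below assumes a 64-byte cache line (one of the common sizes the text names) and simply computes how many lines an aligned object of a given size occupies, which bounds the number of line faults a cold access to that object can trigger:

```python
CACHE_LINE_BYTES = 64  # assumption; the text notes 32, 64, and 128 are common


def cache_lines_spanned(object_bytes: int, line_bytes: int = CACHE_LINE_BYTES) -> int:
    """Number of cache lines a line-aligned object of the given size occupies."""
    return -(-object_bytes // line_bytes)  # ceiling division


# A 200-byte object straddles 4 lines on a 64-byte-line machine, so a cold
# traversal of it can cost up to 4 cache line faults.
print(cache_lines_spanned(200))
```

An unaligned object can touch one additional line, but the ceiling division gives the aligned baseline used in arguments like the one above.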
Computer applications utilize data structures referred to as classes, which are instantiated as objects. Classes define containers of data or information along with code which operates on the data in response to method calls from other users or classes. Some classes can be very large and occupy several cache lines, yet the amount of each class actually used may be significantly less than the amount of data stored in the class. If the entire class is recalled from storage even though only a small part of it is actually needed, many cache line misses will occur because the cache lines are filled with unneeded data from the objects. Since there are a limited number of cache lines available for use by an application, it is important to use them efficiently. If there is insufficient space available for the desired data, time is spent obtaining the data from slower storage and then populating the cache lines so the desired data is more quickly available to the processor.
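A minimal sketch of the problem, using Python's ctypes to get a real C-style memory layout. The `Record` structure and its field names are hypothetical, not from the patent: two small, frequently accessed fields sit alongside a large block of rarely used state, so touching just the hot fields still commits several cache lines' worth of the object to memory traffic when the whole object is loaded:

```python
import ctypes

CACHE_LINE = 64  # assumed cache-line size

# Hypothetical class layout: two hot fields plus bulky cold state.
class Record(ctypes.Structure):
    _fields_ = [
        ("ref_count", ctypes.c_int32),         # hot: touched on every use
        ("next", ctypes.c_void_p),             # hot: list traversal
        ("debug_name", ctypes.c_char * 112),   # cold: rarely read
        ("stats", ctypes.c_int64 * 16),        # cold: rarely read
    ]

size = ctypes.sizeof(Record)
print(size, -(-size // CACHE_LINE))  # object size and cache lines spanned
```

On a typical 64-bit machine the object spans four 64-byte lines while both hot fields fit comfortably within the first, which is exactly the waste the background section describes.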
There is a need for a better way to manage the cache lines so that data commonly needed by applications is available with a minimal amount of cache line misses.
SUMMARY OF THE INVENTION
Data structures are partitioned into heavily referenced and less heavily referenced portions. The partitioning is based on profile information about field access counts. Garbage collection is combined with a cache-conscious object co-location scheme to further improve cache miss ratios. Garbage collection ensures that the splitting algorithm is applied only to longer lived objects which survive scavenges. This helps ensure that cache lines are most effectively utilized for data that is most likely to be needed by a processor, without any required modification to cache line algorithms. In one embodiment, the top 5% most heavily referenced portions, or hot portions, of an object are kept in a hot object, while the remaining portions of the object are placed in a subordinate or cold object which is referenced by the original hot object as needed.
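One plausible reading of the profile-driven partitioning step can be sketched as follows. The function and the threshold interpretation are illustrative assumptions, not the patent's algorithm: here a field is classified hot when it accounts for at least a given fraction of all profiled accesses to its class:

```python
def split_hot_cold(access_counts: dict, threshold: float = 0.05):
    """Partition field names into (hot, cold) sets from profiled access counts.

    Illustrative reading of the "top 5%" embodiment: a field is hot when it
    contributes at least `threshold` of the total accesses to the class.
    """
    total = sum(access_counts.values()) or 1
    hot = {field for field, n in access_counts.items() if n / total >= threshold}
    cold = set(access_counts) - hot
    return hot, cold


# Profile data for the hypothetical Record class used earlier.
counts = {"ref_count": 900, "next": 80, "debug_name": 15, "stats": 5}
print(split_hot_cold(counts))
```

Whatever the precise threshold rule, the output of this step drives the split: hot fields stay in the primary object, and cold fields move to the subordinate object.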
In a further aspect of the invention, the heavily referenced portions of a hot object are then placed next to each other in memory so that they are likely to be combined into common cache lines. The “cold” portions which were extracted from the class are placed in a new class that can be referenced from the original class; accesses to hot portions remain unchanged. Garbage collection ensures that longer lived objects are combined. One aspect of the invention involves the selection of classes that show the greatest potential and fewest compatibility problems. Yet a further aspect of the invention involves applying the partitioning to programs written in languages which produce relatively small object sizes.
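The resulting object shape can be sketched in a few lines. All names are hypothetical; this shows only the structural idea of a small hot object holding a reference to a lazily created cold companion, with hot accesses unchanged and cold accesses paying one extra indirection:

```python
class ColdPart:
    """Subordinate object holding the rarely accessed fields."""
    def __init__(self):
        self.debug_name = ""
        self.stats = [0] * 16


class HotRecord:
    """Primary object: only hot fields plus one reference to the cold part."""
    __slots__ = ("ref_count", "next", "_cold")  # keep the hot object small

    def __init__(self):
        self.ref_count = 0
        self.next = None
        self._cold = None  # created only if a cold field is ever touched

    @property
    def cold(self):
        if self._cold is None:
            self._cold = ColdPart()
        return self._cold


r = HotRecord()
r.ref_count += 1        # hot access: identical to the unsplit class
r.cold.stats[0] = 42    # cold access: one indirection, allocated on demand
```

In the patent's setting the split is applied by the compiler/runtime rather than by hand, and the garbage collector co-locates the surviving hot objects so they share cache lines; the sketch shows only the hot/cold reference structure.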
REFERENCES:
patent: 5752038 (1998-05-01), Blake et al.
patent: 5920720 (1999-07-01), Toutonghi et al.
patent: 5966702 (1999-10-01), Fresko et al.
patent: 6003038 (1999-12-01), Chen
patent: 6105072 (2000-08-01), Fischer
Brown, “Incremental garbage collection in massive object stores”, Proceedings 24th Australasian Computer Science Conference (ACSC 2001), Jan. 2001, pp. 38-46.
Blackburn et al., “Starting with termination: a methodology for building distributed garbage collection algorithms”, Proceedings 24th Australasian Computer Science Conference (ACSC 2001), Jan. 2001, pp. 20-28.
Avvenuti et al., “Supporting remote reference updating through garbage collection in a mobile object system”, Proceedings Ninth Euromicro Workshop on Parallel and Distributed Processing, Feb. 2001, pp. 65-70.
U.S. patent application Ser. No. 09/024,248 entitled Optimized Logging of Data Elements to a Data Storage Device.
Fraser, C.W., et al., “A Retargetable C Compiler: Design and Implementation”, Benjamin/Cummings, Redwood City, California, (1995).
Chilimbi Trishul M.
Larus James R.
Black Thomas
Jung David
Merchant & Gould