Electrical computers and digital processing systems: memory – Storage accessing and control – Memory configuring
Reexamination Certificate
1999-01-29
2001-11-13
Yoo, Do Hyun (Department: 2187)
Electrical computers and digital processing systems: memory
Storage accessing and control
Memory configuring
C711S118000, C707S793000
Reexamination Certificate
active
06317816
ABSTRACT:
FIELD OF THE INVENTION
The invention generally relates to allocating memory storage from a heap, and in particular, to the allocation of cache sized and sub-cache sized objects in a multiprocessor shared cache memory area.
BACKGROUND
JAVA, and other computer programming languages, allocate storage from a memory storage area often referred to as the heap. JAVA also implements the concept of threads. Threads are independent units of work that share a common, process-local address space. Thus, while threads are units that can be individually dispatched, they share the common, local memory storage.
When processing threads in a language which allocates storage from a heap, especially in multiprocessor configurations, there is a potential for “false sharing” that will at least intermittently affect performance. Modern processors typically use cache line sizes of 128 bytes, and sometimes larger powers of two. However, many modern object-oriented languages, especially JAVA, have a much smaller average object size (e.g., sizes under 32 bytes are common). Sometimes, two objects created by separate threads happen to share the same physical cache line. This may happen by chance because the objects' virtual addresses were “close enough” to each other to fall within a single cache line.
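For illustration only (this example is not part of the patent text; the 128-byte line size, the class, and the method names are assumptions), the cache line that an object occupies can be estimated by integer-dividing its address by the line size, so two small objects whose addresses are close enough land on the same line:

```java
// Illustrative sketch: shows how nearby addresses map to the same cache line.
public final class CacheLineMath {
    static final long LINE_SIZE = 128; // assumed typical cache line size

    // Index of the cache line containing the given (virtual) address.
    static long lineIndex(long address) {
        return address / LINE_SIZE;
    }

    public static void main(String[] args) {
        long objA = 0x7f3000; // hypothetical address of thread A's object
        long objB = 0x7f3020; // hypothetical address of thread B's object, 32 bytes away
        // Both addresses fall in the same 128-byte line, so the two
        // otherwise independent objects "falsely" share that line.
        System.out.println(lineIndex(objA) == lineIndex(objB)); // prints true
    }
}
```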
If two objects happen to share the same physical cache line and the two threads access the objects from different processors, then performance suffers. This problem occurs because the processors are forced to move the contents of the cache line from one processor to the other. Even a rare occurrence seriously impacts performance, since certain forms of this event (especially when at least one of the objects is being modified) cause operations that take 100 or so cycles to resolve, whereas a typical load takes about 1 cycle. Worse, if the event occurs once, it is likely to happen repeatedly. This leads to situations that are difficult to reproduce: for example, a program may sometimes run well and sometimes run poorly.
This situation is called “false sharing.” When it occurs, the hardware “sees” a shared cache line and invokes worst-case hardware functions to keep both processors in sync, even though the two objects in the same cache line in fact have no dependency on each other. If the objects happened to fall in separate cache lines, nothing would change except that the program would run faster. Thus, “false sharing” is an artifact of the cache line size being larger than the individual objects.
To avoid this situation, designers of allocation schemes resort to certain practices. First, as a somewhat drastic measure, the minimum object allocation can be made a full cache line (or half a cache line) in size. While this scheme may waste a great deal of main memory storage, it results in little or no false sharing. Second, CPU-local heaps can be used, on the assumption that modern processor schedulers try to keep the same thread running on the same processor whenever possible, which also minimizes other cache performance problems. Thus, assuming threads are not often moved between processors, sub-allocating objects based on the thread's current processor helps minimize false sharing. But there is no guarantee that a thread will stay on a given processor, so even infrequent movement may reintroduce false sharing, and the per-CPU strategy has the additional problem that sub-dividing the heap may introduce other inefficiencies known in the art.
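As a minimal sketch of the first practice described above (the constant, class, and method names are illustrative assumptions, not taken from the patent), rounding every allocation up to a whole cache line guarantees that no two objects share a line, at the cost of wasted storage:

```java
// Illustrative sketch: every request is rounded up to a whole cache line.
public final class LinePaddedAllocator {
    static final int LINE_SIZE = 128; // assumed cache line size

    // Round a requested object size up to the next multiple of the line size.
    static int paddedSize(int requestedBytes) {
        return ((requestedBytes + LINE_SIZE - 1) / LINE_SIZE) * LINE_SIZE;
    }

    public static void main(String[] args) {
        System.out.println(paddedSize(24));  // 128 -- a 24-byte object wastes 104 bytes
        System.out.println(paddedSize(130)); // 256
    }
}
```

The waste shown here for small objects is exactly why a less drastic scheme is desirable for languages whose average object size is well under a cache line.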
From the above, it can be seen that for the JAVA language running on multiple processors, there is a need to allocate storage from a thread-local heap in a way that minimizes or eliminates false sharing.
SUMMARY
In one embodiment, the present invention relates to a method and system of allocating memory storage in a memory storage space commonly shared by multiple processors, the method comprising: allocating a minimum apparent memory storage space equal to one cache line; in response to storing an object, said object occupying a memory storage space equal to less than one cache line, determining the value of a variable; allocating said memory storage space for said object based on the value of said variable; and updating said variable.
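A minimal sketch of one way such a method could behave, assuming the claimed “variable” is a per-thread offset into the cache-line-sized block currently being filled (an interpretation and naming chosen here for illustration, not the patent's definitive implementation):

```java
// Hedged sketch: a per-thread allocator that reserves memory in whole cache
// lines and packs sub-cache-line objects into the current line based on a
// tracked offset variable. All names and addresses are illustrative.
public final class ThreadLocalLineAllocator {
    static final int LINE_SIZE = 128;

    // The "variable" of the method: offset of the next free byte within the
    // cache line currently owned by this thread.
    private int offsetInLine = LINE_SIZE; // forces a fresh line on first use
    private long lineBase = 0;            // base address of the current line (simulated)
    private long nextLine = 0x1000;       // next line-aligned address to hand out (simulated)

    // Allocate space for an object smaller than one cache line.
    long allocate(int size) {
        if (size > LINE_SIZE) throw new IllegalArgumentException("use the large-object path");
        // Determine the value of the variable; if the object does not fit,
        // take a fresh line (minimum apparent allocation = one cache line).
        if (offsetInLine + size > LINE_SIZE) {
            lineBase = nextLine;
            nextLine += LINE_SIZE;
            offsetInLine = 0;
        }
        long address = lineBase + offsetInLine;
        offsetInLine += size; // update the variable
        return address;
    }

    public static void main(String[] args) {
        ThreadLocalLineAllocator a = new ThreadLocalLineAllocator();
        System.out.println(a.allocate(32)); // 4096 (0x1000)
        System.out.println(a.allocate(32)); // 4128 (0x1020) -- same line, same thread
    }
}
```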
In another embodiment, the present invention provides a multiprocessor system, comprising: a plurality of processors, each processor coupled to a cache for storing data; an external common memory storage space; a bus coupled to said cache and said external common memory storage space for transferring data between said cache and said external common memory storage space; and an independent executable module; wherein said independent executable module allocates memory storage space for at least one object.
In yet another embodiment, the present invention provides a program storage device readable by a computer system, the program storage device tangibly embodying a program of instructions executable by the computer system to perform a method of allocating memory storage on a memory storage space commonly shared by multiple processors, the method comprising: allocating a minimum apparent memory storage space equal to one cache line; in response to storing an object, said object occupying a memory storage space equal to less than one cache line, determining the value of a variable; allocating said memory storage space for said object based on the value of said variable; and updating said variable.
In a further embodiment, the present invention provides a method of transmitting a program product to a computer system, the method comprising: (a) establishing a connection with the computer system; and (b) transmitting the program product, the program product being executable by the computer system to perform a method of allocating memory storage on a memory storage space commonly shared by multiple processors, the method comprising: allocating a minimum apparent memory storage space equal to one cache line; in response to storing an object, said object occupying a memory storage space equal to less than one cache line, determining the value of a variable; allocating said memory storage space for said object based on the value of said variable; and updating said variable.
These and various other features of novelty, as well as advantages of the present invention, will become readily apparent to those skilled in the art from the following detailed description and corresponding drawings. As will be realized, the invention is capable of modification without departing from its scope. Accordingly, the drawings and description are to be regarded as illustrative in nature and not as restrictive.
REFERENCES:
patent: 4775955 (1988-10-01), Liu
patent: 5715430 (1998-02-01), Hirayama
patent: 5875461 (1999-02-01), Lindholm
patent: 6052699 (2000-04-01), Huelsbergen et al.
patent: 6141737 (2000-10-01), Krantz et al.
Baker, H., “CONS Should not CONS its Arguments, or, a Lazy Alloc is a Smart Alloc”, ACM SIGPLAN Notices, vol. 27, no. 3, pp. 24-34 (Mar. 1992).
Baker, Jr., H., “Optimizing Allocation and Garbage Collection of Spaces”, Artificial Intelligence, An MIT Perspective, vol. 2, pp. 391-396 (1979).
Bozman, G. et al., “Analysis of Free-Storage Algorithms—Revisited”, IBM Systems Journal, vol. 23, no. 1, pp. 44-64 (1984).
Matsuoka, S. et al., “A Fast Parallel Conservative Garbage Collector for Concurrent Object-Oriented Systems”, IEEE, pp. 87-93 (1991).
McDowell, C.E., “Reducing Garbage in Java”, ACM SIGPLAN Notices, vol. 33, no. 9, pp. 84-86 (Sep. 1998).
Monson, L., “Caching & WeakReferences”, JAVA Developer's Journal, vol. 3, issue 8, pp. 32-36 (Date Unknown).
Park, Y. et al., “Escape Analysis on Lists”, ACM SIGPLAN '92, pp. 116-127 (1992).
Smith, L.C., “Software Mechanism to Reuse Object Instances for Improved Performance”, IBM Technical Disclosure Bulletin, vol. 38, no. 10, pp. 237-241 (Oct. 1995).
Wilson, P. et al., “Dynamic Storage Allocation: A Survey and Critical Review”, Department of Computer Sciences, Universi
International Business Machines Corporation
Merchant & Gould
Portka Gary J.
Truelson Roy W.
Yoo Do Hyun