Memory allocator for multithread environment

Electrical computers and digital processing systems: memory – Storage accessing and control – Memory configuring

Reexamination Certificate


Details

US Classes: C711S171000, C711S147000, C711S152000
Status: active
Patent Number: 06539464

ABSTRACT:

REFERENCE TO MICROFICHE APPENDIX
The appendix contains 4308 lines of source code in the C++ programming language.
BACKGROUND—Field of Invention
The invention relates to memory allocation in computer systems and more particularly to dynamic memory allocation in single-thread and parallel (multi-process/multi-thread) environments.
BACKGROUND—Description of Prior Art
Today, most modern programming languages, including C and C++, allow the user to request memory blocks from the system memory at run-time and release these blocks back to the system memory when the program no longer needs them. This is known as dynamic memory allocation. The part of the programming environment that serves dynamic memory allocation requests is known as a heap manager or dynamic memory allocator. Typically such services are provided by the operating system, by the standard (compiler) run-time libraries, by third-party libraries, or by a combination of the above.
The C programming language provides memory management capability with a set of library functions known as “memory allocation” routines. The most basic memory allocation function is called “malloc”, which allocates a requested number of bytes and returns a pointer to the starting address of the allocated memory. A function known as “free” returns memory previously allocated by “malloc” back to the system so that it can be allocated again later for use by other routines. Another frequently used function is “realloc”, which changes the size and possibly the address of an allocated memory block while preserving its contents.
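For reference, the use of these standard functions can be shown with a short, self-contained C++ example (the sizes here are arbitrary):

#include <cstdlib>  // malloc, realloc, free
#include <cstdio>

int main() {
    // Request a block large enough for 100 integers.
    int* data = static_cast<int*>(std::malloc(100 * sizeof(int)));
    if (data == nullptr) return 1;  // allocation may fail
    data[0] = 42;

    // Grow the block to 200 integers; the contents of the first 100
    // are preserved, but the block address may change.
    int* grown = static_cast<int*>(std::realloc(data, 200 * sizeof(int)));
    if (grown == nullptr) { std::free(data); return 1; }
    data = grown;

    std::printf("%d\n", data[0]);  // still 42

    // Return the block to the allocator for later reuse.
    std::free(data);
    return 0;
}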
The requirements for dynamic memory allocators are these: high speed, low maximal response time, high memory utilization, scalability with CPU count, and independence of thread count. A desirable feature is simplicity of use, preferably through an interface equivalent to the de facto standard C language memory allocation functions.
Many dynamic allocators for single-thread environments have been designed over the years. A representative survey of this work was done by Wilson, Johnstone, Neely and Boles in their paper “Dynamic Storage Allocation: A Survey and Critical Review”, 1995. This survey does not cover the aspects of memory allocation specific to multithread environments. On the other hand, almost every multithread allocator uses some form of underlying monolithic (single-thread) memory allocator in its implementation.
Prior Art—Fixed Size Free Block List Allocators
Some of the memory allocators known in the prior art use deferred free block coalescing and store free blocks in segregated lists of blocks of the same fixed size. They define a plurality of predetermined fixed block sizes. Each block list contains free blocks of the same size, which is one of the predetermined fixed sizes. In prior art terminology these lists are also known as “bins”, “quick lists”, “block queues”, or “fast lists”. In some variants, blocks bigger than a predetermined maximal block size are managed by a separate method. An additional method is used for coalescing and splitting blocks when necessary. A minimal sketch of such segregated lists is given below.
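By way of illustration, the following C++ sketch shows segregated fixed-size free lists; the particular sizes, names, and singly linked layout are assumptions made for the example, not the layout of any specific prior art allocator.

#include <cstddef>

struct FreeBlock {
    FreeBlock* next;  // singly linked list of free blocks of one fixed size
};

// A plurality of predetermined fixed block sizes, one per list ("bin").
constexpr std::size_t kNumBins = 8;
constexpr std::size_t kFixedSizes[kNumBins] = {16, 32, 48, 64, 96, 128, 256, 512};

struct BinAllocator {
    FreeBlock* bins[kNumBins] = {};  // head of each fixed-size free list

    // Deferred coalescing: a freed block is kept in its bin for reuse
    // instead of being merged with its neighbors immediately.
    void deallocate(void* p, std::size_t bin) {
        FreeBlock* b = static_cast<FreeBlock*>(p);
        b->next = bins[bin];
        bins[bin] = b;
    }

    // Pop a block from the bin, or return nullptr so the caller can
    // fall back to the splitting/coalescing part of the allocator.
    void* allocate(std::size_t bin) {
        FreeBlock* b = bins[bin];
        if (b) bins[bin] = b->next;
        return b;
    }
};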
Prior Art—Fixed Size Free Block List Allocators—Determination of the Fixed Size Index
The operation of allocators of the aforementioned type includes a step of determining the fixed size corresponding to a certain block size. Depending on what kind of request is being served (allocation or deallocation), the fixed size to be found is either the smallest one that is not less than the requested size, or the greatest one that is not greater than the requested size.
There are several methods for determining such a size. Some methods store the fixed sizes in an array, e.g. in increasing order, and search the array, either sequentially or by binary search, for the size satisfying the search condition. For instance, the “Hoard” allocator version 2.0.5.1 by Emery Berger uses sequential search.
These search methods require several steps, which take time. This time adds to the time of the allocation request.
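For illustration, a sketch of the sequential variant follows (the sizes and the function name are assumptions for the example):

#include <cstddef>

// Fixed sizes stored in increasing order.
constexpr std::size_t kFixedSizes[] = {16, 32, 48, 64, 96, 128, 256, 512};
constexpr std::size_t kNumSizes = sizeof(kFixedSizes) / sizeof(kFixedSizes[0]);

// For an allocation request: find the smallest fixed size that is not
// less than the requested size. Returns kNumSizes when the request
// exceeds the maximal fixed size.
std::size_t sizeToIndex(std::size_t size) {
    std::size_t i = 0;
    while (i < kNumSizes && kFixedSizes[i] < size) ++i;  // one step per list
    return i;
}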
An obvious way to speed up this search is to use a table of size equal to the maximal fixed size, fill each entry in the table with the fixed size corresponding to the index of the entry, and obtain the fixed size corresponding to a block size by simply reading the value in the table at the position equal to the block size. This method is very fast and typically takes only several CPU instructions. Obviously, the same technique may be used to obtain the index of a fixed size rather than the fixed size itself, if that is more appropriate.
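A sketch of this direct lookup, here returning the index of the fixed size (the sizes and names are illustrative):

#include <cstddef>

constexpr std::size_t kFixedSizes[] = {16, 32, 48, 64, 96, 128, 256, 512};
constexpr std::size_t kMaxFixedSize = 512;

// One table entry per possible request size up to the maximal fixed
// size; entry s holds the index of the smallest fixed size >= s.
unsigned char gSizeToIndex[kMaxFixedSize + 1];

void buildTable() {
    std::size_t bin = 0;
    for (std::size_t s = 0; s <= kMaxFixedSize; ++s) {
        if (s > kFixedSizes[bin]) ++bin;  // advance to the next fixed size
        gSizeToIndex[s] = static_cast<unsigned char>(bin);
    }
}

// The lookup itself is a single indexed load.
inline std::size_t sizeToIndex(std::size_t size) {
    return gSizeToIndex[size];
}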
A drawback of this method is that the required table might be quite big relative to the size of the CPU cache. Typical values are 5 Kbytes for the table, corresponding to a maximal fixed block size of 5 Kbytes, and 32 Kbytes for the data cache of a CPU.
The relatively big table increases the CPU cache miss ratio, which in turn degrades the performance of the memory allocation operations using this technique.
Another method for determining such a fixed size index is presented in U.S. Pat. No. 5,623,654, 1997, Peterman. It uses a table of limited size (e.g. 1024 entries) and stores the free blocks whose sizes have the same residue modulo the table size in the same entry of the table. This method still requires several steps to determine the fixed size corresponding to a block size.
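A hypothetical sketch in the spirit of that method follows; the details are assumptions made for the example, not the actual data structures of U.S. Pat. No. 5,623,654.

#include <cstddef>

struct FreeBlock {
    FreeBlock* next;
    std::size_t size;  // the exact size must be kept, since entries are shared
};

constexpr std::size_t kTableSize = 1024;
FreeBlock* gTable[kTableSize] = {};

// Blocks whose sizes have the same residue modulo the table size are
// stored in the same entry.
void insertFree(FreeBlock* b) {
    std::size_t slot = b->size % kTableSize;
    b->next = gTable[slot];
    gTable[slot] = b;
}

// Several steps may still be needed: blocks of sizes s, s + 1024,
// s + 2048, ... all share the same slot.
FreeBlock* findExact(std::size_t size) {
    for (FreeBlock* b = gTable[size % kTableSize]; b != nullptr; b = b->next)
        if (b->size == size) return b;
    return nullptr;
}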
Prior Art—Fixed Size Free Block List Allocators—Controlling the List Length
The blocks residing in any particular free block list are unavailable for anything else except allocating blocks of size equal to or smaller than the fixed size for this list. It would be desirable for the count of blocks in any given free list to be big enough that most allocation requests are satisfied by taking blocks from it, rather than from the splitting/coalescing part of the allocator. On the other hand, this count should be small, so that the memory overhead is low. It would also be desirable for the aforementioned count to change dynamically in accordance with the current program workload; i.e., provided that there is a sufficiently long period of no allocation/deallocation activity for the particular fixed size, the entire list should be purged to the coalescing level, thereby making the memory available for allocation as blocks of other sizes, or to other programs.
U.S. Pat. No. 5,109,336, 1992, Guenther et al., discloses a method for continuously purging unused blocks from the free lists in order to improve memory usage. The method disclosed, though, does not provide any correlation between the program workload and the counts of the blocks remaining in the free block lists.
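For illustration only, here is one possible policy matching the desideratum above; the counters and the sweep interface are assumptions made for the sketch, not the mechanism of either patent.

#include <cstddef>

struct Bin {
    void*       head  = nullptr;  // the free list itself (details omitted)
    std::size_t count = 0;        // blocks currently in the list
    std::size_t hits  = 0;        // allocations served since the last sweep
};

// Called periodically, e.g. every N allocator operations. Bins that
// served no requests in the last window are purged to the coalescing
// level; busy bins keep their blocks. `returnToCoalescer` hands blocks
// back to the splitting/coalescing layer and is outside this sketch.
void sweep(Bin* bins, std::size_t numBins, void (*returnToCoalescer)(Bin&)) {
    for (std::size_t i = 0; i < numBins; ++i) {
        if (bins[i].hits == 0 && bins[i].count > 0)
            returnToCoalescer(bins[i]);  // idle bin: purge it entirely
        bins[i].hits = 0;                // start a new observation window
    }
}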
In general, the use of segregated fixed size block lists with deferred coalescing has not found wide support in the monolithic (single-thread) memory allocators of the prior art. Some of the reasons for this are the unsolved problems mentioned above.
Prior Art—Multithreading
In a parallel computing environment with shared memory, e.g. multithread programs, more than one thread of execution of program instructions runs simultaneously. Although threads are typically scheduled to take turns on the available CPUs, from the end user's point of view they appear to run in parallel. In multi-processor systems this parallelism is real. Typically, each thread has a set of associated system resources (registers, stack, local thread data, etc.), accessible by this thread only. There are also resources, among which is the dynamically allocated memory, that are shared among the threads.
Prior Art—Access Serialization—Mutexes
In parallel environments, methods are needed to ensure that the threads coordinate their activities, so that one thread does not accidentally change the data (or other type of object) that another thread is working on. This is known as access serialization. Serialization is achieved by providing means that limit the number of threads that access the same object concurrently.
For purposes of serialization, multithread programming environments traditionally provide mutual exclusion objects known as mutexes.
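For illustration, a sketch of mutex-based serialization of a shared free list, written with the modern C++ std::mutex for brevity (the patent itself predates this API):

#include <mutex>

struct FreeBlock { FreeBlock* next; };

// The mutex guarantees that at most one thread manipulates the shared
// list head at any moment; other threads block until it is released.
struct SharedFreeList {
    std::mutex lock;
    FreeBlock* head = nullptr;

    void push(FreeBlock* b) {
        std::lock_guard<std::mutex> guard(lock);  // acquire; auto-release
        b->next = head;
        head = b;
    }

    FreeBlock* pop() {
        std::lock_guard<std::mutex> guard(lock);
        FreeBlock* b = head;
        if (b) head = b->next;
        return b;
    }
};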
