Electrical computers and digital processing systems: processing – Processing control – Branching
Reexamination Certificate
1998-10-20
2001-06-05
Kim, Kenneth S. (Department: 2183)
Electrical computers and digital processing systems: processing
Processing control
Branching
C713S152000, C711S137000, C717S152000
Reexamination Certificate
active
06243807
ABSTRACT:
TECHNICAL FIELD OF THE INVENTION
The present invention relates generally to the field of computer architecture, and in particular, to optimizing the performance of a computer architecture by organizing functions within a loop routine.
BACKGROUND OF THE INVENTION
In a typical computer architecture, a central processing unit (CPU) controls computer operations and provides processing capability. In particular, a CPU can receive instructions and data. Acting under the instructions, the CPU processes the data—i.e., accepts data, performs one or more operations on the data, and returns corresponding results. With modern technology, a CPU can be implemented on a single integrated circuit (IC) device.
A CPU is generally supported by several different types of memory, each of which can store data or instructions. These different types of memory may include cache memory, external IC memory, and mass memory. Cache memory is internal to the CPU (i.e., implemented on the same IC device as the CPU) and comprises high-speed memory for storing frequently used data or instructions. The cache memory enables a processor to get data and instructions much more quickly than if the same information were stored in some other type of memory. External memory is typically implemented on an IC device separate from that on which the CPU is implemented and can be in the form of random access memory (RAM). RAM can be either dynamic RAM (DRAM) or static RAM (SRAM). Individual data stored in an external memory can be accessed directly by the CPU. External memory is slower than cache memory. Mass memory may comprise disc and/or tape storage devices. Mass memory holds more information than either cache memory or external memory, but is generally slower than both.
In typical operation, information which is relevant for current operations by a CPU is held in cache memory. Such information can be either data or instructions. Data is generally information which may be manipulated, operated upon, or otherwise processed. In some cases, data can be defined and structured in arrays, with each array having a separate array operand. An instruction is information which may be used to command, direct, or otherwise control operations in a computer. Instructions are typically executed by performing one or more functions.
If information which is needed for processing is not contained within cache memory, the processor may direct that such information be retrieved from external memory or mass memory. With direct mapped cache, information within a particular part of external memory can only be mapped into a specific part of cache memory. In such a case, the newly retrieved information is brought into that part of the cache memory, where it overwrites whatever information was previously held there. As such, cache memory is constantly overwritten during typical operation of a computer architecture.
In some cases, especially loop operations or routines, the same information (i.e., data or instructions) may be written multiple times into cache memory during execution of the loop routine. That is, the same information is brought into cache memory, overwritten, and then brought in again at another point within the same loop routine. This process of repeatedly retrieving and overwriting the same information in cache memory during a loop routine is extremely inefficient, and thus, adversely impacts the performance of the computer architecture.
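By way of illustration only (the cache parameters and addresses below are invented, not taken from the patent), a direct mapped cache selects the cache location for an address as (address / line size) modulo the number of sets, so two addresses spaced a multiple of the cache size apart land in the same location and evict one another. The following C sketch shows such a collision, which is the mechanism behind the repeated overwriting inside a loop routine:

    #include <stdint.h>
    #include <stdio.h>

    /* Illustrative cache geometry: 32-byte lines x 256 sets = 8 KB direct mapped cache. */
    #define LINE_SIZE 32u
    #define NUM_SETS  256u

    static unsigned set_index(uintptr_t addr)
    {
        /* Direct mapped placement: one possible cache location per address. */
        return (unsigned)((addr / LINE_SIZE) % NUM_SETS);
    }

    int main(void)
    {
        /* Two hypothetical addresses touched inside the same loop.  They lie
           exactly one cache size (8 KB) apart, so they map to the same set
           and overwrite each other on every iteration. */
        uintptr_t addr_a = 0x00010040u;
        uintptr_t addr_b = 0x00012040u;
        printf("set(a) = %u, set(b) = %u\n", set_index(addr_a), set_index(addr_b));
        return 0;
    }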
SUMMARY
The present invention optimizes the performance of a computer architecture having direct mapped cache.
In accordance with one embodiment of the present invention, a method for optimizing the performance of a computer architecture comprises the following steps: identifying all functions for a loop routine, the functions for executing a plurality of instructions; creating a function sequence file in which the functions are arranged according to a calling sequence of the loop routine; and storing the instructions into an external memory according to the function sequence file.
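A minimal sketch of the function sequence file step follows, assuming a hypothetical loop routine whose calling sequence is read_sample, filter_sample, write_sample; the function names and the one-name-per-line file format are illustrative and not specified by the patent:

    #include <stdio.h>

    /* Hypothetical calling sequence identified for one loop routine. */
    static const char *calling_sequence[] = {
        "read_sample",
        "filter_sample",
        "write_sample",
    };

    /* Write a function sequence file: one function name per line, in the
       order the loop routine calls them.  A later build step could use this
       file to store the corresponding instructions contiguously. */
    int write_sequence_file(const char *path)
    {
        FILE *f = fopen(path, "w");
        if (f == NULL)
            return -1;
        for (size_t i = 0; i < sizeof calling_sequence / sizeof calling_sequence[0]; i++)
            fprintf(f, "%s\n", calling_sequence[i]);
        return fclose(f);
    }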
In accordance with another embodiment of the present invention, a method for optimizing the performance of a computer architecture comprises the following steps: identifying all data arrays for a loop routine; determining an existing data structure for each data array, each existing data structure comprising a respective operand; defining a new data structure for each data array, each new data structure not including any operand; redefining the loop routine based upon the new data structures; and storing data into an external memory according to the new data structures.
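A minimal sketch of the data-side method, using an invented example array whose elements each carry an operand: the existing layout repeats the operand with every element, while the new layout gathers the operands into their own structure and leaves the remaining payload as a plain contiguous array with no operand in it.

    /* Existing data structure: the operand is stored alongside every element. */
    struct sample_old {
        double scale;     /* array operand */
        double value;     /* payload */
    };

    /* New data structure holding the operands only ... */
    struct loop_operands {
        double scale;
    };

    /* ... and a new data structure holding the remaining payload, with no
       operand, so it can be brought into cache memory as a single block. */
    struct loop_samples {
        double value[1024];
    };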
An important technical advantage of the present invention includes reorganizing the structure of information before such information is written into an external memory coupled to a processor. Specifically, loops of repeated processing steps are identified, each loop routine operating upon particular data and in response to particular instructions. The instructions and data for these loop routines are organized into structures of information, each of which comprises all instructions or data for one loop routine. Each structure is stored into external memory and can be brought into cache memory as a single block of information.
In one embodiment, a structure for the instruction(s) associated with a loop routine is generated as follows. The executable functions for each instruction are identified and their order of execution is determined. The functions are organized into a text file in the order of execution. This text file is written into external memory as a single block of information. When the respective loop routine is executed by a processor, the entire block of information is brought into cache memory. Because all of the instructions for the loop routine are present at once in cache memory during execution of the loop routine, the cache memory is not continuously overwritten with the same instructions.
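One way such a contiguous instruction block could be produced, sketched here under the assumption of a GCC- or Clang-style toolchain (the section name, function names, and bodies are illustrative, not from the patent): tagging every function the loop routine calls with the same section name asks the linker to emit them back to back, so the whole block can be fetched into cache together.

    /* Place all functions called from the loop routine in one section so
       they are laid out contiguously, in the order they are defined. */
    #define LOOP_TEXT __attribute__((section(".loop_text")))

    LOOP_TEXT static void read_sample(double *s)   { (void)s; }
    LOOP_TEXT static void filter_sample(double *s) { *s *= 0.5; }
    LOOP_TEXT static void write_sample(double s)   { (void)s; }

    void loop_routine(double *buf, int n)
    {
        for (int i = 0; i < n; i++) {
            read_sample(&buf[i]);
            filter_sample(&buf[i]);
            write_sample(buf[i]);
        }
    }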
In another embodiment, a structure for the data associated with a loop routine is generated as follows. The arrays of data operated upon by the loop routine are identified. The respective operands for each array are separated. A new data structure is created using the operands only. Furthermore, for each old data array, a new data structure is created from the portion of the array remaining after separation of the respective operand. The new data structures are then stored in their entirety in respective blocks of external memory. When the respective loop routine is executed, these data structures are retrieved and stored as a whole in cache memory. As such, the same data does not need to be repeatedly written into cache memory.
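Continuing the same invented example (types repeated so the fragment stands alone), a redefined loop could read the operand once from its own structure and then stream through the contiguous payload, so neither the operand nor the payload has to be rewritten into cache memory repeatedly:

    struct loop_operands { double scale; };
    struct loop_samples  { double value[1024]; };

    /* Redefined loop routine: the operand structure and the payload
       structure are each brought in as a whole, then used without
       re-fetching the same information. */
    void scale_samples(const struct loop_operands *ops, struct loop_samples *s, int n)
    {
        double k = ops->scale;          /* operand read once, outside the loop */
        for (int i = 0; i < n; i++)
            s->value[i] *= k;           /* contiguous payload access */
    }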
Thus, in the manner described herein, the present invention optimizes the performance of a computer architecture having direct mapped cache memory.
Other important technical advantages of the present invention are readily apparent to one skilled in the art from the following figures, descriptions, and claims.
REFERENCES:
patent: 4991088 (1991-02-01), Kam
patent: 5113370 (1992-05-01), Tomita
patent: 5303377 (1994-04-01), Gupta et al.
patent: 5797013 (1998-08-01), Mahadevan et al.
patent: 5835776 (1998-11-01), Tirumalai et al.
patent: 5918246 (1999-06-01), Goodnow et al.
patent: 5930507 (1999-07-01), Nakahira et al.
Kim Kenneth S.
PC-Tel, Inc.
Skjerven Morrill & MacPherson