Electrical computers and digital processing systems: memory – Addressing combined with specific memory configuration or... – For multiple memory modules
Reexamination Certificate
2000-10-12
2003-12-16
Kim, Matthew (Department: 2186)
Electrical computers and digital processing systems: memory
Addressing combined with specific memory configuration or...
For multiple memory modules
C711S154000, C711S157000, C711S150000, C711S147000, C711S219000, C711S221000, C365S230030, C365S230040, C365S230050
Reexamination Certificate
active
06665768
ABSTRACT:
BACKGROUND OF THE INVENTION
One approach to implementing a high-performance memory system is to use interleaved banks in the memory. Such a memory is built out of smaller memories referred to as banks, each of which handles part of the data being processed by the system. The addresses of locations in each bank are not sequential within a bank, but, rather, are sequential across the multiple banks. That is, for example, if the memory has four banks, then bank 0 could handle location addresses 0, 4, 8, . . . in sequence; bank 1 could handle addresses 1, 5, 9, . . . ; bank 2 could handle addresses 2, 6, 10, . . . ; and bank 3 could handle addresses 3, 7, 11, . . . Hence, if instructions that are being executed access the memory addresses in sequence, as is often the case, the instructions can access different banks of the memory and can, therefore, be executed simultaneously.
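As a rough illustration only, the mapping from addresses to banks in such a four-bank interleaved memory can be sketched as follows; the bank count and helper names are assumptions for the sketch, not details taken from this patent:

    #include <stdio.h>

    #define NUM_BANKS 4  /* assumed four-bank interleaved memory */

    /* Low-order interleaving: the address modulo the bank count selects
     * the bank, and the quotient selects the location within that bank. */
    static unsigned bank_of(unsigned addr)        { return addr % NUM_BANKS; }
    static unsigned offset_in_bank(unsigned addr) { return addr / NUM_BANKS; }

    int main(void) {
        /* Addresses 0..11 map to banks 0,1,2,3,0,1,2,3,... so a stream of
         * sequential accesses touches the four banks in rotation. */
        for (unsigned addr = 0; addr < 12; addr++)
            printf("address %2u -> bank %u, offset %u\n",
                   addr, bank_of(addr), offset_in_bank(addr));
        return 0;
    }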
In many cases, the addresses of accessed data are independent of one another. This is the case, for example, when a table look-up for the result of some function, such as a complex mathematical function, is being performed. Here the address depends on the input data and may have no correlation to other addresses, i.e., the addresses are not in sequence. If such addresses are applied to an interleaved memory at once, they may spread across all the banks, or they may all hit the same bank. When several accesses hit the same bank, they must be handled one after another, which reduces performance and makes the time to process a table look-up indeterminate. This, in turn, makes it difficult for the system to meet real-time performance constraints.
One prior approach to dealing with these problems is for each processor to have an individual small private memory for holding tables. In this approach, because there is only one unit accessing the memory, it always takes the same time to handle table look-ups. However, this approach can be inefficient because, in general, the private memories are either too small or too big. They are too small if programmers want to store more tables than will fit, and they are too big if tables are not being used. In either case, the use of these individual memories can be inefficient.
SUMMARY OF THE INVENTION
The present invention is directed to an apparatus and method for accessing data in a processing system. A plurality of processing elements are provided for executing program instructions. Each processing element is associated with an identifying value. A memory stores data accessed by the processing elements while executing instructions. An instruction is received for execution. The instruction requires access to the memory and includes information associated with an address in the memory. For example, the information can be an address word containing the base or start address of a table stored in the memory. Each of the processing elements executing the instruction generates an address for accessing the memory by combining at least a portion of the information associated with an address in the memory with at least a portion of the identifying value associated with the processing element. The processing element then accesses the memory using the generated address.
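A minimal sketch of this address-generation step, under the assumption that the combining operation is a simple addition of a masked identifying value to the address word; the names and the add-based combining rule are illustrative, not taken from the patent:

    #include <stdio.h>

    /* Each processing element combines the address word supplied by the
     * instruction (e.g. a table base address) with a portion of its own
     * identifying value.  The combining rule shown here is an assumption. */
    static unsigned generate_address(unsigned address_word,      /* from the instruction  */
                                     unsigned identifying_value, /* unique per element    */
                                     unsigned portion_mask)      /* selects the bits used */
    {
        return address_word + (identifying_value & portion_mask);
    }

    int main(void) {
        /* Four elements each derive their own address from the same
         * address word (here 0x100) by folding in their low id bits. */
        for (unsigned id = 0; id < 4; id++)
            printf("element %u -> address 0x%x\n",
                   id, generate_address(0x100, id, 0x3));
        return 0;
    }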
In one embodiment, the memory is partitioned into a plurality of banks and can be interleaved. The instruction can require a look-up of data stored in a table. The table can be stored at a plurality of locations of the memory. Multiple copies of the table can be stored in respective multiple banks of the memory. Accordingly, multiple accesses to the same data item can be made to the multiple banks simultaneously by multiple processing elements.
In one embodiment, the processing system is a single instruction stream, multiple data stream (SIMD) processing system. Accordingly, instructions can be executed along multiple data paths simultaneously by the system. In this embodiment, each processing element is a data path in the SIMD system.
The address used by the processing element, i.e., data path, to access the memory is generated from the information associated with an address, e.g., the start address of the table, and the identifying value associated with the processing element. The identifying value uniquely identifies the processing element and therefore distinguishes it from the other processing elements. Therefore, adding it to the address used to access the memory ensures that the number of processing elements accessing the relevant memory address at one time can be controlled. In one embodiment, the identifying value is a multi-bit word. Only a predetermined number of the least significant bits of the word are added to the generated address and, therefore, control the number of processing elements that can access the memory location at a time. In one embodiment, the number of banks containing copies of the table being accessed is used to determine the number of bits. Specifically, in one embodiment, the number n of bits used to generate the memory address is defined by N = 2^n, where N is the number of banks and/or the number of copies of the table stored in the banks. Therefore, the number of bits used in the address is the number of bits required to uniquely select one of the banks.
For example, if there are four copies of the table in the banks and four processing elements (data paths), two bits of the data path identifying word are used. The two bits are used to select one of the four banks and/or table copies for each of the four data paths. As a result, a one-to-one correspondence between the data paths and the table copies in the banks is developed, such that no simultaneous accesses by multiple processing elements to the same memory location can occur. As another example, if there are four banks (table copies) and eight data paths, the two least significant bits of the identifying value are used. As a result, four pairs of data paths will access each bank (table copy). Hence, only two simultaneous accesses can occur at each memory location. The accesses will be delayed slightly, but much less than they would be delayed in an uncontrolled setting where all eight data paths could make eight simultaneous accesses to the same location.
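The following sketch shows one plausible concrete layout consistent with the scheme described above: a low-order-interleaved memory with one full copy of the table in each bank and a bank-aligned table base, where the n = log2(N) low bits of the identifying value pick the bank. The layout, constants, and names are assumptions for illustration, not details taken from the patent:

    #include <stdio.h>

    #define NUM_BANKS 4                  /* N: banks, one table copy per bank   */
    #define ID_MASK   (NUM_BANKS - 1)    /* keep n = log2(N) low bits of the id */

    /* Assumed layout: entry `index` of the copy held in bank `b` lives at
     * table_base + index * NUM_BANKS + b, with table_base a multiple of
     * NUM_BANKS so every copy stays aligned to its bank. */
    static unsigned lookup_address(unsigned table_base, unsigned index, unsigned pe_id) {
        unsigned bank = pe_id & ID_MASK;              /* low id bits select the copy */
        return table_base + index * NUM_BANKS + bank;
    }

    int main(void) {
        /* Eight data paths and four banks: ids 0..7 fold onto banks 0..3 in
         * pairs, so at most two paths ever contend for the same table copy. */
        for (unsigned pe_id = 0; pe_id < 8; pe_id++)
            printf("data path %u -> address %u (bank %u)\n",
                   pe_id, lookup_address(0, 5, pe_id), pe_id & ID_MASK);
        return 0;
    }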
The approach of the invention provides numerous advantages over approaches of the prior art. For example, the invention spreads table look-up memory accesses over multiple memory banks in an interleaved memory in a controlled and predictable fashion. As a result, the number of simultaneous accesses to any one location can be limited, and even where simultaneous accesses do occur, their maximum possible number is bounded and known. Therefore, the time required for memory accesses is predictable, which facilitates meeting real-time scheduling constraints.
REFERENCES:
patent: 4604695 (1986-08-01), Widen et al.
patent: 4679140 (1987-07-01), Gotou et al.
patent: 5111389 (1992-05-01), McAuliffe et al.
patent: 5479624 (1995-12-01), Lee
patent: 5497478 (1996-03-01), Murata
patent: 5638533 (1997-06-01), Law
patent: 5684973 (1997-11-01), Sullivan et al.
patent: 5737572 (1998-04-01), Nunziata
patent: 5872987 (1999-02-01), Wade et al.
patent: 5924111 (1999-07-01), Huang et al.
patent: 5924117 (1999-07-01), Luick
patent: 5991857 (1999-11-01), Koetje et al.
patent: 6076136 (2000-06-01), Burroughs et al.
patent: 6381668 (2002-04-01), Lunteren
patent: 6452864 (2002-09-01), Condemi et al.
patent: 6473339 (2002-10-01), De Ambroggi et al.
Hennessy, et al., “Enhancing Vector Performance,” Computer Architecture: A Quantitative Approach, Second Edition, Section B5, pp. B23-B29, 1996.
ChipWrights Design, Inc.
Kim Matthew
Li Zhuo H.
Mills & Onello, LLP