Electrical computers: arithmetic processing and calculating – Electrical digital calculating computer – Particular function performed
Reexamination Certificate
2000-01-12
2003-01-28
Ngo, Chuong Dinh (Department: 2124)
Electrical computers: arithmetic processing and calculating
Electrical digital calculating computer
Particular function performed
C708S211000, C712S225000
Reexamination Certificate
active
06513053
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates generally to a data processing circuit and method for determining locations of a predetermined value in a sequence of data bits.
2. Description of the Prior Art
In data processing there are many circumstances in which it is helpful to know the locations of a predetermined value in a sequence of data bits. For example, the number of leading zeros in a sequence of data bits needs to be determined in many floating point implementations (see for example U.S. Pat. No. 5,040,138). Further, knowing the number of leading zeros, or in other words the position of the first logic one value, in an operand can increase the speed of arithmetic operations performed on the operand. U.S. Pat. No. 5,111,415 discloses an asynchronous leading zero counter for calculating the number of leading zeros of an operand in order to increase arithmetic processing speed. It comprises a plurality of leading zero detector cells of like kind arranged in an array that provides a digital output word having a magnitude indicative of the leading zero count (and thus the position of the first logic one value) on the inputs to the plurality of cells.
Similarly, in an ARM processor (as designed by ARM Limited), a find first one and find next one is a fundamental part of implementing LDM instructions (block data loads from memory) and STM instructions (block data stores to memory). These vector instructions operate with a register list, which is a 16 bit field in the instruction in which each bit corresponds to a register. A logic
1
value in bit
0
of the register field indicates that register R
0
is to be transferred, while a logic
0
value indicates that it is not to be transferred; similarly bit
1
controls the transfer of register R
1
and so on. Thus, to implement these instructions it is necessary to perform a very fast find first
1
, and sometimes find second
1
, on up to 16 bits.
A conventional way of determining find first one followed by find next one for block loads and stores is illustrated in
FIGS. 1 and 2
. The sequence of data bits in which the first one is to be detected is referred to herein as a “vector”, and arrives in the instruction pipeline
10
at the start of the decode cycle. The location of the first one in the vector is needed by the time register reads occur.
FIG. 1
shows an instruction pipeline
10
with a multiplexer
20
connected to the instruction pipeline
10
and a find first
1
circuit
30
connected to the output of the multiplexer
20
. The vector in which the first one is to be detected travels through the latches
12
of the instruction pipeline
10
and during one pipelined stage (in preferred embodiments the decode stage) is taken via the multiplexer
20
to the find-first-one circuit
30
. This circuit finds the first one in the vector and outputs the location as a 4 bit binary number to register bank
40
to identify a particular register in the register bank. Further, the find first one circuit
30
is arranged to mask the first one in order to generate a revised vector, and to return this revised vector to the multiplexer
20
. The instruction containing the vector will also include a base address, which is passed to the address adder
24
, from where it is output to the memory
50
. Hence, this address will identify the memory location in the memory
50
whose contents are to be loaded into the particular register identified by the find first one circuit
30
in the case of an LDM instruction, or will identify the address to which the contents of the register specified by the find first one circuit
30
are to be stored in the case of an STM instruction.
The address adder
24
is also arranged to receive as an input the output of a circuit
22
provided to count the number of logic one values in the original vector. This enables the adder to calculate the memory addresses from which data is to be loaded or to which data is to be stored. During the next iteration of the process, the multiplexer
20
is arranged to pass the revised vector to the find first one circuit
30
, such that the location of the next one is output to register bank
40
to identify a corresponding register in the register bank. Further, on this iteration, the address adder
24
is arranged to increment the base address and to provide the incremented address to the memory
50
. Accordingly, in this next iteration, the next register is identified, and the next memory location is also identified, thereby enabling the load or store process to be repeated based on the new register and new memory address. This sequence is repeated several times until all logic one values have been found in the vector, and accordingly all load or store operations have been performed.
The find first one circuit, giving an example vector, is shown in greater detail in
FIG. 2
, in which like parts have like reference numerals. In the example shown a vector 11101100 is input to the find first one circuit
30
. The location of the first one (bit
2
in this case) which specifies the first register to be transferred is then output to register bank
40
. The find first one circuit
30
also acts to mask the first one that has been found in the original vector and outputs to the latch
34
a revised vector 11101000. This revised vector with the first one masked is then re-input in the next cycle via the multiplexer
20
into the find-first-one circuit
30
and the first one in this vector is then found, this being in effect the second one of the original vector. This step is then repeated until all ones are found. Thus, in each cycle the find-first-one circuit
30
operates on the output vector from the previous cycle.
An example, showing an 8 bit vector for clarity is shown below, where bit
0
is the first bit and bit
7
the last.
Bit position
7
6
5
4
3
2
1
0
for vector 0 0 1 1 0 1 0 1
Input at start
cycle
of decode stage
output
1
00110101
0 and 00110100
2
00110100
2 and 00110000
3
00110000
4 and 00100000
4
00100000
5 and 00000000
As is shown above, vector 00110101 is input to the pipeline at the start of decode. The location of the first logic one value at bit position
0
is determined and this location along with a revised vector, being the original vector with the first one (in bit
0
) masked, i.e. vector 00110100, is output in a single clock cycle. In the next clock cycle the vector output from the previous calculation i.e. 00110100 is input to the find first one circuit and the position of the first logic one value in this vector is found, i.e. 2. This result is then output along with a further revised vector, being the revised vector with the first one masked, this vector being used as the input vector in the next clock cycle. Thus, in this example, at the end of four clock cycles all of the logic one values have been found and their positions output.
This works well provided that you only need to do one find first one per cycle. However, to increase processing speed, it may be desirable to execute instructions which require two logic one value values to be determined in a single cycle. For example, it would be desirable to execute LDM instructions, which can load two registers in a single cycle. Thus, find first one and find second one would need to be performed in one cycle.
FIG. 3
illustrates a circuit for finding first and next one, while
FIG. 4
shows a flow diagram of such a circuit.
FIG. 3
is very similar to
FIG. 2
, except that there are two identical find first one circuits
30
,
32
arranged in series. The original input vector passes from the instruction pipeline
10
through multiplexer
20
to the first find one circuit
30
. The first one in the input vector is found and its location output, this logic one value is then masked and a revised vector with the logic one value masked is sent to the next find first one circuit
32
. This circuit finds the first logic one value in the revised vector (which corresponds to the second logic one value in the original vector), outputs its location and then masks this logi
Arm Limited
Ngo Chuong Dinh
LandOfFree
Data processing circuit and method for determining the first... does not yet have a rating. At this time, there are no reviews or comments for this patent.
If you have personal experience with Data processing circuit and method for determining the first..., we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Data processing circuit and method for determining the first... will most certainly appreciate the feedback.
Profile ID: LFUS-PAI-O-3023582