Electrical computers and digital processing systems: processing – Byte-word rearranging – bit-field insertion or extraction,...
Reexamination Certificate
1999-10-01
2004-11-16
Kim, Kenneth S. (Department: 2111)
Electrical computers and digital processing systems: processing
Byte-word rearranging, bit-field insertion or extraction,...
C711S201000, C712S225000
active
06820195
ABSTRACT:
BACKGROUND OF THE INVENTION
The present invention relates generally to microprocessor or microcontroller architecture, and particularly to an architecture structured to handle unaligned memory references.
In computer architecture over the past decade, RISC (Reduced Instruction Set Computer) devices, in which each instruction is ideally performed in a single operational cycle, have become popular. The RISC architecture has advantages over computers having standard architectures and instruction sets in that RISC devices are capable of much higher data processing speeds due to their ability to perform frequent operations in shorter periods of time. The RISC devices began with 16-bit instruction sets and grew to 32-bit instruction set architectures having graphics capabilities. With such 32-bit instruction set architectures and more complex applications, there was a requirement for larger memory sizes, e.g., words two, four, or eight bytes in length (i.e., words of 16, 32, or 64 bits each). However, certain peripheral devices and applications generate or accept data of only one or two bytes. One result of this type of data is that it produces an unaligned word reference. Other examples include some compressed data streams, which may pack data in ways that require access to unaligned data.
To understand what an unaligned word reference is, an aligned word reference must first be described. If a data object of size N bytes is at address A, then the object is aligned if A mod N = 0. Table 1 shows examples of aligned and unaligned accesses of data, where the byte offsets are specified for the low-order three bits of the address (Computer Architecture: A Quantitative Approach, John Hennessy and David Patterson, Morgan Kaufmann Publishers, Inc., Copyright 1990, page 96, herein referred to as “Hennessy”).
TABLE 1
Object addressed | Aligned at byte offsets | Unaligned at byte offsets
byte (8 bits) | 0, 1, 2, 3, 4, 5, 6, 7 | (never)
word (16 bits) | 0, 2, 4, 6 | 1, 3, 5, 7
long word (32 bits) | 0, 4 | 1, 2, 3, 5, 6, 7
quad-word (64 bits) | 0 | 1, 2, 3, 4, 5, 6, 7
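The alignment rule A mod N = 0 can be checked mechanically. The following is a minimal sketch in Python that regenerates the rows of Table 1 from that rule; the function names `is_aligned` and `aligned_offsets` are illustrative and do not come from the patent.

```python
def is_aligned(address, size):
    """An object of `size` bytes at `address` is aligned when address mod size == 0."""
    return address % size == 0


def aligned_offsets(size, span=8):
    """Byte offsets in 0..span-1 (the low-order three address bits) that are aligned."""
    return [offset for offset in range(span) if is_aligned(offset, size)]


if __name__ == "__main__":
    objects = [("byte", 1), ("word", 2), ("long word", 4), ("quad-word", 8)]
    for name, size in objects:
        aligned = aligned_offsets(size)
        unaligned = [offset for offset in range(8) if offset not in aligned]
        print(name, "aligned:", aligned, "unaligned:", unaligned)
```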
Hence, for a machine capable of handling 4-byte long words, if incoming data is loaded sequentially as 2 bytes of data followed by 2 more bytes of data, the 4 bytes of data cannot be retrieved or stored in a single cycle because they would overlap a word boundary within memory. Thus, some prior art RISC devices either do not accept data in this form, in which case special procedures must be used to ensure that all data is aligned at word boundaries, or require programming that uses at least two consecutive instruction cycles. One way to ensure that all data is aligned at word boundaries would be to add extra bits to data of shorter length, usually known as bit stuffing. Whether bit stuffing is used or the programming is altered, the unaligned references degrade the performance of these prior art RISC devices.
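The cost of an unaligned reference can be illustrated by counting how many aligned memory words an access touches. This is a rough sketch, assuming a 4-byte memory word and one memory access per word touched; the name `words_touched` and the default word size are assumptions made for illustration only.

```python
WORD_BYTES = 4  # assumed memory word size for this illustration


def words_touched(address, size, word_bytes=WORD_BYTES):
    """Number of aligned memory words that an access of `size` bytes at `address` overlaps."""
    first_word = address // word_bytes
    last_word = (address + size - 1) // word_bytes
    return last_word - first_word + 1


if __name__ == "__main__":
    print(words_touched(4, 4))  # aligned 4-byte access stays within one word
    print(words_touched(2, 4))  # 4 bytes starting at offset 2 straddle a word boundary
```

Under these assumptions, an access that touches two words needs at least two memory references, which is the penalty the paragraph above describes.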
To handle the loading and storing of unaligned data words in a system, i.e., a data word which straddles a word boundary in memory (Table 1), prior art machines have also used either an alignment network to load or store bytes in a word or a shifter, which shifts the data only in those cases where alignment is required (Hennessy, ibid., pages 95-97).
FIG. 1 illustrates a prior art alignment network 114. In FIG. 1, memory 100 shows eight consecutive bytes (a byte being 8 bits): Y3, Y2, Y1, D4, D3, D2, D1, and X4. Each byte in memory 100 is given an address ranging from 0 to 7. For example, address 2 in memory 100 has memory contents Y1. The desired data bytes used in this and the following examples are D4 at address 3, D3 at address 4, D2 at address 5, and D1 at address 6. Each of these desired data bytes is to be loaded into, and stored from, register R 110. Register R 110 has four byte positions: P4, P3, P2, and P1. Memory slice 112 of memory 100 shows a desired data byte D4 at address 3. D4 could be loaded from memory slice 112 through the alignment network 114 into register R 115 at position P4, P3, P2, or P1. In this case, D4 is loaded from memory slice 112 at address 3 to P4 in register R 115 through alignment network 114. Similarly, desired data bytes D3, D2, and D1, located at addresses 4, 5, and 6 of memory 100, can be loaded through a similar alignment network to positions P3, P2, and P1 in register R 115 to give register R 110. This type of hardware alignment network 114 could be seen in Intel's 8086 and 8088, which came out in the late 1970s. The Intel 8088 was word and byte addressable. The 8088 used a cross-bar switch to swap bytes (Structured Computer Organization, 3rd Edition, Andrew Tanenbaum, Copyright 1990, pages 215-217, pages 230-237). Note that the Intel 8088 instruction set had separate instructions for shifting and rotating, as these were considered different operations. For example, shifting one bit left would discard the leftmost bit, while rotating left would cycle the leftmost bit around to the rightmost bit.
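The difference between shifting and rotating can be made concrete with a small sketch. The code below models both operations on an 8-bit value; it is a generic illustration, not a model of the 8088 instructions themselves (processor flags, for instance, are ignored), and the function names are assumptions.

```python
def shift_left(value, count, width=8):
    """Shift left; bits pushed past the top of the `width`-bit field are discarded."""
    mask = (1 << width) - 1
    return (value << count) & mask


def rotate_left(value, count, width=8):
    """Rotate left; bits leaving the top of the field re-enter at the bottom."""
    mask = (1 << width) - 1
    count %= width
    value &= mask
    return ((value << count) | (value >> (width - count))) & mask


if __name__ == "__main__":
    value = 0b10010001
    print(format(shift_left(value, 1), "08b"))   # leftmost 1 is lost
    print(format(rotate_left(value, 1), "08b"))  # leftmost 1 wraps to the rightmost bit
```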
FIG. 2 illustrates a prior art example of aligning a misaligned data word using shifting operations. An example can be seen in U.S. Pat. No. 4,814,976, RISC Computer With Unaligned Reference Handling And Method For The Same, Hansen, et al., issued Mar. 21, 1989 (herein referred to as “Hansen”). The contents of memory 100 at addresses 0 to 3 are loaded into register A 120 at locations PA4 to PA1. The contents of memory 100 at addresses 4 to 7 are loaded into register B 130 at locations PB4 to PB1. Register A 120 is then shifted left three places, so that D4 is in position PA4. Register B 130 is shifted right one place, so that D3 is in location PB3, D2 is in PB2, and D1 is in PB1. Register A 122 is merged 144 with register B 132 to give the desired data located in the proper position in register R 110. The merge 144 was done either by overwriting locations PA3 to PA1 in register A 122 with locations PB3 to PB1 in register B 132, or by overwriting the appropriate positions in register B 132 with the appropriate positions in register A 122. In the alternative, the merge 144 may copy the contents of PA4 in register A 122 to position P4 in register R 110 and may copy the contents of PB3, PB2, and PB1 of register B 132 into locations P3, P2, and P1 of register R 110.
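The shift left, shift right, and merge sequence described for FIG. 2 can be sketched as follows. This is an illustration only, not Hansen's implementation: it assumes 4-byte registers in which the byte at the lowest address lands in the leftmost position (PA4 or PB4), it fills vacated positions with zeros where the figure shows don't cares, and the byte values standing in for Y3 to X4 are made up.

```python
MASK32 = 0xFFFFFFFF

# Stand-in values for Y3, Y2, Y1, D4, D3, D2, D1, X4 at addresses 0..7 (hypothetical data).
MEMORY = [0xA3, 0xA2, 0xA1, 0xD4, 0xD3, 0xD2, 0xD1, 0xB4]


def load_word(memory, address):
    """Load an aligned 4-byte word; the byte at `address` becomes the leftmost byte."""
    return int.from_bytes(bytes(memory[address:address + 4]), "big")


def load_unaligned(memory, address):
    """Load 4 bytes starting at `address` using two aligned loads, two shifts, and a merge."""
    offset = address % 4
    base = address - offset
    reg_a = load_word(memory, base)
    if offset == 0:
        return reg_a  # already aligned; one load suffices
    reg_b = load_word(memory, base + 4)
    reg_a = (reg_a << (8 * offset)) & MASK32   # shift left by `offset` byte places
    reg_b = reg_b >> (8 * (4 - offset))        # shift right by the remaining byte places
    keep_a = (MASK32 << (8 * offset)) & MASK32 # positions supplied by register A
    return (reg_a & keep_a) | (reg_b & ~keep_a & MASK32)


if __name__ == "__main__":
    print(hex(load_unaligned(MEMORY, 3)))  # D4, D3, D2, D1 gathered into one register
```

With the desired data starting at address 3, register A is shifted left three byte places and register B right one byte place, matching the walk-through above.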
Thus, unaligned words in memory were loaded and aligned in the microprocessor, and aligned words in the microprocessor were unaligned and stored in memory, using either an alignment network 114 of FIG. 1 or a shift left, shift right, and merge 144 of FIG. 2. These techniques were used, for example, on 32-bit words being loaded and stored in a 32-bit computer architecture. New problems arise in a 64-bit architecture that loads and stores 32, 16, and 8 data bits. A 64-bit memory system requires twice as many alignment paths for bytes and half-words as a 32-bit memory system, as well as two 32-bit alignment paths for word accesses. Thus, the alignment network of the prior art becomes a complicated and expensive solution. Also, in FIG. 2, the merge 144 becomes more complicated, as it must handle many more don't cares 116 that are shifted into the registers. In addition, such prior art as Hansen, et al. does not disclose how sign extension is done in going from 32-bit to 64-bit words. FIG. 2 requires either two M-bit shifters (a shift left and a shift right) or a more complicated M-bit bi-directional shifter. Thus, as computer architectures go from 32 bits to 64 and perhaps 128 bits, there needs to be a better method of handling unaligned data, which includes proper sign extension.
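For readers unfamiliar with the term, sign extension from a 32-bit word to a 64-bit register can be sketched as below. This shows only the generic two's-complement operation; it is not the mechanism claimed by the present invention or by Hansen, and the function name and defaults are assumptions.

```python
def sign_extend(value, from_bits=32, to_bits=64):
    """Widen a two's-complement value by copying its sign bit into the new upper bits."""
    value &= (1 << from_bits) - 1
    sign_bit = 1 << (from_bits - 1)
    extended = (value ^ sign_bit) - sign_bit  # negative when the sign bit was set
    return extended & ((1 << to_bits) - 1)


if __name__ == "__main__":
    print(hex(sign_extend(0x7FFFFFFF)))  # positive value: upper 32 bits stay zero
    print(hex(sign_extend(0x80000000)))  # negative value: upper 32 bits become ones
```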
SUMMARY OF THE INVENTION
The present invention discloses a method for loading unaligned data stored in several memory locations, including a step of loading a first part of the unaligned data into a first storage location and rotating the first part from a first position to a second position in the first m
Kim Kenneth S.
Townsend and Townsend / and Crew LLP