Title: Concurrent method for parallel Huffman compression coding...
Classification: Coded data generation or conversion – Digital code to digital code converters – To or from number of pulses
Type: Reexamination Certificate
Filed: 2000-03-14
Issued: 2001-10-16
Examiner: Young, Brian (Department: 2819)
Status: active
Patent Number: 06304197
CROSS-REFERENCE TO RELATED APPLICATIONS
Not applicable.
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to the processing of digital signals to effect variable-length coding (VLC) wherein execution proceeds over multiple data units concurrently.
2. Description of Related Art
Much of the putative parallel-processing art in VLC and its subclasses, such as Huffman coding, is labeled as such despite lacking the ability to process multiple VLC data units simultaneously. Instead, such methods act on just one unit at a time, producing a result in an accelerated course by examining several limited aspects of the problem at once. True parallel processing is rare in the art as a result of a fundamental incongruity between VLC and parallelism.
The digital signal processing technique, VLC, entails the translation of multiple-bit units of data between fixed-length and variable-length representations, often for the purpose of achieving an aggregate reduction in data in the variable-length domain. The fixed-length units, termed symbols, conveniently partition unencoded data. A symbol's unique variable-length encoded counterpart, the codeword, serves the role of an intermediary, facilitating storage and/or data transmission until it may be decoded back into a usable symbol. In canonical form, a VLC system comprises an encoder, a decoder and an intervening channel providing unidirectional communication of coded data from the encoder to the decoder. An encoder generates a contiguous stream of codewords of uneven extents in ordered correspondence with the input symbol stream, while a decoder reverses this process.
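For purposes of illustration only, the following minimal sketch in Python demonstrates this canonical arrangement with a hypothetical four-symbol prefix code; the table and all names are illustrative assumptions, not drawn from any cited system.

```python
# Toy prefix code for illustration; the symbol-to-codeword table is
# hypothetical and not part of any cited system.
CODE = {"a": "0", "b": "10", "c": "110", "d": "111"}
DECODE = {cw: sym for sym, cw in CODE.items()}

def vlc_encode(symbols):
    """Concatenate codewords of uneven extents in symbol order."""
    return "".join(CODE[s] for s in symbols)

def vlc_decode(bits):
    """Recover symbols; prefix-freeness makes each boundary unambiguous,
    but only after every preceding codeword has been resolved."""
    symbols, cw = [], ""
    for bit in bits:
        cw += bit
        if cw in DECODE:          # a complete codeword has been read
            symbols.append(DECODE[cw])
            cw = ""
    return symbols

assert vlc_decode(vlc_encode("abacd")) == list("abacd")
```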
In VLC, length is either an embedded or an implicitly associated attribute of each codeword. Therefore, apart from knowledge internalized with each codeword, the coded stream has no discernible demarcation between codewords. Only by accounting for the span of all codewords succeeding a known reference can a subsequent codeword boundary be deduced. Hence, a serial dependency is established on the process of resolving codewords within a contiguous stream.
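The dependency may be stated concretely: each codeword's position is the running sum of all preceding codeword lengths, as the brief sketch below illustrates (the lengths shown are hypothetical, and in practice each is discovered only in the course of decoding its codeword).

```python
# Each codeword's start is the running sum of all preceding lengths, so no
# boundary can be computed out of order. Lengths here are hypothetical.
lengths = [1, 2, 1, 3, 3]
starts = [0]
for n in lengths:
    starts.append(starts[-1] + n)   # boundary i+1 depends on boundaries 0..i
assert starts == [0, 1, 3, 4, 7, 10]
```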
As a general principle, parallel processing requires a method of separating data into independent elements for distribution among multiple processing elements (PEs). Inasmuch as the data fail to be independent, processing will be subject to serial dependencies. In VLC, irregular stream delimitation precludes a means of selecting independent data prior to ascertaining the extent of each preceding codeword. Unfortunately, determining codeword lengths is tantamount to decoding if length information is implicit and nearly so if it is embedded. As a result of this dilemma, fully concurrent processing has not heretofore been achieved in a VLC context.
The article, H. D. Lin and D. G. Messerschmitt, “Designing a High-Throughput VLC Decoder Part II,” IEEE Transactions on Circuits and Systems for Video Technology, pp. 197-206, 1992, proposes forcing codeword alignment at fixed intervals. In effect, the codeword stream is divided into equal-sized multiple-codeword blocks in which the first codeword of each block is justified with the block boundary. Rather than processing individual codewords in parallel, the larger blocks are distributed among PEs. Where the final codeword would otherwise extend beyond the block boundary, it is simply detained, becoming the first codeword of the subsequent block. As a result of this practice, voids of varying length are introduced at the end of a majority of the blocks. These lacunae in the coded transmission clearly contravene the principal motivation for VLC, data compression. If blocks are made longer to amortize this compressive inefficiency, more buffering memory is needed; in general, a prohibitive quantity of memory is required to store the large blocks central to this scheme. Moreover, as a result of this buffering, a long latency is incurred between the arrival of input and the emission of output.
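A minimal sketch of the block-alignment idea follows; the block size and zero-padding policy are illustrative assumptions, not parameters taken from the article.

```python
BLOCK_BITS = 16                       # assumed block size, for illustration

def pack_aligned_blocks(codewords, block_bits=BLOCK_BITS):
    """Pack codewords into fixed-size blocks, deferring any codeword that
    would straddle a boundary; the zero padding is the compressive void.
    (Assumes every codeword fits within one block.)"""
    blocks, current = [], ""
    for cw in codewords:
        if len(current) + len(cw) > block_bits:
            current += "0" * (block_bits - len(current))  # void at block end
            blocks.append(current)
            current = ""              # deferred codeword starts next block
        current += cw
    if current:
        blocks.append(current.ljust(block_bits, "0"))
    return blocks
```

Because each block then begins on a codeword boundary, the blocks may be distributed among PEs independently, at the cost of the padding voids and of buffering entire blocks.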
In the same article, a variation on the foregoing method is discussed in which several multiple-codeword blocks are again distributed among as many PEs. However, no attempt is made to align codewords with block boundaries. Instead, each block is examined over many ramifications resulting from all possible codeword initiation sites within the block. Problem growth is typically substantial, since the longest codeword dictates the number of potential codeword origination points. The correct decoding among all alternatives is identified upon receipt of the terminal location within the previous block. Hence, serial dependence between codewords, while not completely eliminated, has been deferred to the final steps. Although no compression-rate overhead is suffered by this second approach, as a consequence of the trial processing of so many ramifications, memory requirements are many multiples of those of the preceding, already memory-intensive approach. Furthermore, all but one of the ramified processing courses undertaken for each block are discarded, resulting in exorbitant computational waste. Within each concurrent block, either every course is computed simultaneously through a more deeply nested level of concurrency, multiplying hardware excessively, or each course is dealt with in turn, slowing computation proportionally. In the latter circumstance, an exceptionally high parallelism factor, exceeding the longest codeword length, is necessary to derive any performance benefit at all from the approach.
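The ramified variant admits the compact sketch below, reusing the toy DECODE table from the earlier sketch; the maximum codeword length and the bit-string block form are assumptions.

```python
MAX_CW_LEN = 3                        # assumed longest codeword length

def decode_from(bits, offset):
    """Decode a block assuming its first whole codeword begins at `offset`;
    returns the symbols plus any partial codeword spilling past the block."""
    symbols, cw = [], ""
    for bit in bits[offset:]:
        cw += bit
        if cw in DECODE:
            symbols.append(DECODE[cw])
            cw = ""
    return symbols, cw

def speculative_block_decode(block):
    # Every ramification is computed up front; once the previous block's
    # terminal position arrives, one entry is kept and the rest discarded.
    return {off: decode_from(block, off) for off in range(MAX_CW_LEN)}
```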
In U.S. Pat. No. 5,841,380, a similarly ramified approach to neutralizing serial dependency is suggested, but on the scale of individual codewords rather than blocks. In that exemplar, two codewords are decoded in tandem, with the first processed in the normal fashion and the second, whose locality within the stream is as yet unidentified, processed from all potential initiation sites in parallel. Upon the conclusion of decoding, the correct result for the second codeword is isolated from among the many provisional decodings in accordance with the discovered length of the first codeword. Although the approach might have been extended to more than two codewords, as was suggested in the previously described block-based method, the restriction to just two codewords in the case cited is in keeping with the undeniably impractical growth in complexity attending such ramified methodologies.
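On the codeword scale, the tandem scheme may be sketched as follows, again against the toy table above; the selection-by-length step conveys the essence, and the details are illustrative rather than drawn from the patent.

```python
def decode_one(bits, offset):
    """Decode a single codeword starting at `offset` in the toy code."""
    cw = ""
    for bit in bits[offset:]:
        cw += bit
        if cw in DECODE:
            return DECODE[cw], len(cw)
    raise ValueError("incomplete codeword")

def tandem_decode(bits, max_cw_len=3):
    sym1, len1 = decode_one(bits, 0)             # first codeword: normal
    trials = {}                                  # second: all initiation sites
    for off in range(1, max_cw_len + 1):
        try:
            trials[off] = decode_one(bits, off)
        except ValueError:
            pass                                 # invalid site; discarded
    sym2, _ = trials[len1]                       # select by first's length
    return [sym1, sym2]
```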
U.S. Pat. No. 5,202,967 attempts to circumvent ramifications by placing format restrictions on every other codeword such that, in a two-processor system, the second processor can identify and thus decode the second codeword without reference to the first. This special format violates the basic principles of VLC, however.
Also in the aforementioned article, as well as in the article, K. K. Parhi, “High-Speed VLSI Architectures for Huffman and Viterbi Decoders,” IEEE Transactions on Circuits and Systems II, vol. 39, no. 6, 1992, the decoding problem is recast in a circuit-dependency framework known as the finite state machine (FSM). By exploiting known parallelizing transformations on FSMs, the computation can be expanded over duplicate nodes, producing a concurrent process. Since codewords are involved directly in the formulation of the FSM, the resultant hardware is not programmable; the FSM design process must be repeated and new hardware fabricated for each new code. As intimated in the former citation and admitted in the latter, the approach is chiefly of academic interest, since it is applicable only to symbols drawn from artificially diminutive sets. Furthermore, as demonstrated in the latter article, the manipulations required to arrive at the final result, and accordingly the final result itself, are quite complicated.
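The FSM recasting itself, though not the parallelizing transformations, may be conveyed by the following table-driven sketch over the toy code above; the table is fixed at construction, mirroring the cited non-programmability, in that a new code demands a new machine.

```python
def build_fsm(code):
    """Compile a prefix code into a bit-at-a-time state machine whose
    states are partial codewords; accepting transitions emit a symbol."""
    fsm = {}
    for sym, cw in code.items():
        for i, bit in enumerate(cw):
            nxt = cw[:i + 1]
            emit = sym if nxt == cw else None    # accepting transition?
            fsm[(cw[:i], bit)] = ("" if emit is not None else nxt, emit)
    return fsm

def fsm_decode(bits, fsm):
    out, state = [], ""
    for bit in bits:
        state, emit = fsm[(state, bit)]
        if emit is not None:
            out.append(emit)
    return out

assert fsm_decode(vlc_encode("abc"), build_fsm(CODE)) == list("abc")
```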
In the article, P. G. Howard and J. S. Vitter, “Parallel Lossless Image Compression Using Huffman and Arithmetic Coding,” in Proceedings of the Data Compression Conference, pp. 299-308, Snowbird, Utah, March 1992, block processing is again suggested. In this case, each codeword in a block is disassembled bit by bit, commencing with the first bit and proceeding to the last. All the fi
Inventors: Robert Allen Freking; Keshab K. Parhi