Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability
Type: Reexamination Certificate
Filed: 2000-12-12
Issued: 2004-09-14
Examiner: Baderman, Scott (Department: 2184)
U.S. Classes: 714/49; 714/733
Status: active
Patent number: 06792560
ABSTRACT:
BACKGROUND TO THE INVENTION
There are many applications for software-based digital processing systems that need to be particularly reliable. The need for high reliability may arise from safety, as in the case of software controlling the flight surfaces of an otherwise inherently unstable aircraft or controlling a potentially hazardous industrial process. High reliability is also required in systems handling financial transactions. The reliability issue can be divided into software reliability and hardware reliability.
Regarding software reliability, there is a difficulty in proving correct operation, since “conventional” programming languages are based upon the destructive assignment statement at the very heart of the von Neumann paradigm of computation. The level of indirection introduced as a consequence of “location addressing”, i.e. that the contents of a given storage location have no relation to its address, results in intractable problems which, arguably, become manifest throughout an entire computer system. This essentially means that entities such as formal system proofs and meaningful system metrics cannot be attained whilst using this class of language. Also, conventional processors tend to have instruction sets which are completely defined, so that every bit pattern decodes to some executable instruction. As a consequence, if there is an error in the code being executed, the processor will still execute something, and the resulting output may be unpredictable even if the error itself is well defined.
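To make the point concrete, here is a minimal Python sketch (our illustration, not part of the patent; the function names are invented) contrasting a destructive update, whose effect depends on hidden aliasing of storage locations, with a declarative equivalent that can be reasoned about locally:

def scale_in_place(values, factor):
    # Destructive assignment: overwrites the storage locations behind `values`.
    for i in range(len(values)):
        values[i] *= factor

def scale_declarative(values, factor):
    # Declarative style: a pure function of its inputs, with no hidden state.
    return [v * factor for v in values]

readings = [1.0, 2.0]
alias = readings                 # a second name for the same storage locations
scale_in_place(readings, 10)
print(alias)                     # [10.0, 20.0], modified "at a distance"

frozen = (1.0, 2.0)
print(scale_declarative(frozen, 10))   # [10.0, 20.0]; `frozen` is untouched

Any proof about `alias` must account for every statement that might touch the same locations, whereas `scale_declarative` can be verified from its definition alone.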
At the same time, it is well known that purely declarative programs are very amenable to formal proof of conformance to a given system specification, and that, with the assistance of various formal toolsets, they can also highlight any inconsistencies or ambiguities contained within that specification.
The inventors have recognised that if the software component of a system is provably correct at the outset, it will remain correct in the future. On the other hand, even if the hardware component of the system is provably correct at the outset, it will eventually fail. This leads to the aim of producing a hardware design which is provably correct at the outset, and whose operational correctness can be checked repeatedly throughout its life.
The VIPER 1 and VIPER 2 projects, undertaken some years ago, were a very serious attempt to realise a formally verified processor, and were used in a railway signalling application. Besides being criticised as too slow and restrictive, they did not gain widespread adoption. VIPER 1 and VIPER 2 were processors based upon the “conventional” location-addressing paradigm. In effect, the degree of proof which they could attain was the consequence of an engineering trade-off in which a severely restricted set of machine code instructions impeded the usefulness of the reduced instruction set computer.
Specialised processors more suited to some declarative languages have been developed. For example, dataflow architectures have been developed over several decades. One is described in an article in Communications of the ACM, January 1985, vol. 28, no. 1, “The Manchester Prototype Dataflow Computer” by Gurd et al. It involves a pipelined ring structure including a token queue, for storing data and instructions to be processed; a combiner (also called a matching unit), for combining instructions and associated data; an instruction store, containing machine code for each instruction; and a number of execution units coupled in parallel for carrying out particular functions. The ring also contains a switch for routing the output of the execution units either to a system output, or back to the start of the ring, the token queue, for further processing.
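As a rough illustration of how such a ring operates, the following Python sketch (our own, greatly simplified; `Token`, `INSTRUCTION_STORE`, `matching_unit` and the two-instruction program are invented for the example, not taken from the Gurd et al. paper) models the token queue, matching unit, instruction store, execution step and switch:

from collections import deque

class Token:
    """A value travelling the ring, addressed to one operand slot of one instruction."""
    def __init__(self, dest, port, value):
        self.dest, self.port, self.value = dest, port, value

INSTRUCTION_STORE = {
    # address: (opcode, destination of the result, number of operands)
    0: ("add", 1, 2),
    1: ("out", None, 1),
}

def matching_unit(waiting, token):
    """Combine a token with its partners; fire once all operands arrive."""
    opcode, dest, arity = INSTRUCTION_STORE[token.dest]
    slots = waiting.setdefault(token.dest, {})
    slots[token.port] = token.value
    if len(slots) < arity:
        return None                       # still waiting for a partner token
    del waiting[token.dest]
    return opcode, dest, [slots[p] for p in sorted(slots)]

def execute(opcode, operands):
    """A stand-in for the parallel execution units."""
    if opcode == "add":
        return operands[0] + operands[1]
    if opcode == "out":
        print("output:", operands[0])

def run(initial_tokens):
    queue = deque(initial_tokens)         # the token queue
    waiting = {}                          # the matching store
    while queue:
        fired = matching_unit(waiting, queue.popleft())
        if fired is None:
            continue
        opcode, dest, operands = fired
        result = execute(opcode, operands)
        if dest is not None:              # the switch: back into the ring...
            queue.append(Token(dest, 0, result))
        # ...otherwise the result has left via the system output

run([Token(0, 0, 2), Token(0, 1, 3)])     # computes 2 + 3, prints "output: 5"

A real machine matches tokens associatively in hardware and runs many execution units concurrently; the sketch only shows how data and instructions are routed around the ring.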
The reliability and provability of correct operation of this type of hardware architecture, or of its component parts, still presents problems. Conventional ways of improving hardware reliability include specifying high-reliability components for mission-critical parts, carrying out burn-in of parts, and providing redundancy at component and/or system level. A drawback with redundancy is the additional cost, and the risk of failure in the very hardware that detects failures and selects which of the redundant systems or components to use in the event of a fault. Such additional complexity makes the task of verifying correct operation, or of being certain of detecting faulty operation, much more difficult. Another conventional way of handling both hardware and software faults, such as radiation-induced errors in stored values, is to include a checking mechanism; a parity check, for example, is the simplest method of detecting the occurrence of a single error. This is used in some random access memories (RAM), which store a parity check bit for each byte of data, then verify the parity bit is correct for that byte when the byte is read out.
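The parity scheme just described is small enough to show in full. In this Python sketch (our illustration; real parity RAM computes the bit in hardware on every access), even parity over each byte detects any single-bit flip, though an even number of flips in the same byte would pass unnoticed:

def parity_bit(byte):
    """Parity bit for one byte: 1 if the byte has an odd number of 1 bits."""
    return bin(byte).count("1") & 1

def store(byte):
    # The RAM keeps nine bits per byte: the data plus its parity bit.
    return byte, parity_bit(byte)

def load(byte, stored_parity):
    if parity_bit(byte) != stored_parity:
        raise ValueError("parity error: single-bit corruption detected")
    return byte

data, p = store(0b10110010)
corrupted = data ^ 0b00001000        # simulate one radiation-induced bit flip
try:
    load(corrupted, p)
except ValueError as err:
    print(err)                       # the single error is detected on read-out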
SUMMARY OF THE INVENTION
It is an object of the invention to provide improved arrangements which address the above-mentioned problems.
According to the invention there is provided a processor for executing instructions, comprising a data store, an instruction store, a combiner for combining instructions and data associated with a respective one of the instructions, and processing elements for carrying out the instructions and outputting results, wherein the processing elements and the combiner comprise trusted circuitry, the trusted circuitry comprising circuitry whose design has been proven to operate correctly and comprising self-checking circuitry for checking that it has not operated incorrectly, the processor further comprising circuitry for checking for errors in data and instructions input to the processing elements and to the combiner.
An advantage of the combination of error checking at the inputs to the processing elements and the combiner, together with self-checking circuitry for these parts, is that the amount of circuitry which needs to be trusted (i.e. of proven design and verified operation) can be limited. This enables other parts of the processor to use circuitry which is not necessarily rigorously verified, and which can therefore be constructed more simply and operate faster.
An advantage of checking the combined data and instruction, rather than, for example, conventional parity checking of individual bytes read out of RAM, is that a wider range of errors, such as addressing errors and multi-bit errors, can be detected before execution.
Furthermore, an advantage of having error checking at the input to the processing elements is that it enables different types of data to be segregated. Different parts of the circuitry can be allocated to handle different types of data, and finer granularity of checking and error confinement can be used to verify that the right type or form of data is being input, as the sketch below illustrates. This can provide a greater guarantee of correct segregation of data, and thus further guarantee the integrity of the system. It is a simpler and more direct method than existing software-based segregation of different types of data.
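A minimal sketch of this idea (our illustration; the tag names and the `Tagged` and `ExecutionUnit` classes are invented, and a real processor would enforce this in trusted circuitry rather than in Python) tags each input and has each processing element refuse data of the wrong class at its input:

from dataclasses import dataclass

@dataclass
class Tagged:
    tag: str        # the class of data, e.g. "flight-critical" vs "telemetry"
    value: float

class ExecutionUnit:
    def __init__(self, accepted_tag):
        self.accepted_tag = accepted_tag

    def execute(self, token: Tagged):
        # Input check: confine errors by refusing the wrong class of data.
        if token.tag != self.accepted_tag:
            raise TypeError(f"segregation violation: {token.tag!r} token "
                            f"routed to {self.accepted_tag!r} unit")
        return token.value * 2.0    # stand-in for the real operation

critical = ExecutionUnit("flight-critical")
print(critical.execute(Tagged("flight-critical", 1.5)))   # accepted: 3.0
try:
    critical.execute(Tagged("telemetry", 0.0))            # wrong class of data
except TypeError as err:
    print(err)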
Preferably the processor comprises circuitry for detecting an error in data output by the processor.
An advantage of this is that it enables untrusted (or possibly flawed) circuitry, such as data storage elements, to be used before the data is output, while still maintaining reassurance that correct data is being output.
Preferably the circuitry for checking the data comprises circuitry for adding error detection information to the data before the data is passed to untrusted circuitry, and circuitry for using the error detection information to detect errors in the data after it has passed through the untrusted circuitry. This is an efficient way of verifying that the data has not been corrupted by the untrusted parts of the circuitry, with little reduction in bandwidth or additional cost in processing time or hardware.
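One concrete instance of this scheme (our sketch; the patent does not specify the error-detecting code, so a CRC-32 is assumed here purely for illustration) has trusted circuitry append check bits before the data enters untrusted storage and re-verify them when it emerges:

import zlib

def protect(payload: bytes) -> bytes:
    """Trusted side: append a CRC-32 before handing data to untrusted parts."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")

def verify(framed: bytes) -> bytes:
    """Trusted side: re-check the CRC after the data re-emerges."""
    payload, crc = framed[:-4], int.from_bytes(framed[-4:], "big")
    if zlib.crc32(payload) != crc:
        raise ValueError("data corrupted while in untrusted circuitry")
    return payload

framed = protect(b"instruction+operand")
# ... `framed` passes through untrusted queues or stores here ...
assert verify(framed) == b"instruction+operand"

The four check bytes cost little bandwidth, and neither end of the path needs to trust the storage in between.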
Preferably the circuitry for detecting an error and adding the error detection information comprises trusted circuitry. An advantage of this is that otherwise any errors in the detecting or adding of this information might not be captured, and the error detection could not be trusted completely.
Inventors: Francis, Patrick J.; Hicks, Richard Charles John
Examiner: Baderman, Scott
Attorney/Agent: Barnes & Thornburg; Lohn, Joshua
Assignee: Nortel Networks Limited