System and method for avoiding deadlock in multi-node network



Filed: 1999-04-02
Issued: 2002-12-03
Examiner: Najjar, Saleh (Department: 2154)
Classification: Electrical computers and digital processing systems: multicomput – Computer-to-computer protocol implementing – Computer-to-computer data transfer regulating

Classification codes: C709S232000, C709S234000, C370S229000, C370S231000, C370S232000
Document type: Reexamination Certificate
Status: active
Patent number: 06490630
ABSTRACT:

CROSS-REFERENCE TO CO-PENDING APPLICATIONS
This application is related to co-pending U.S. patent application Ser. No. 09/041,568, entitled “Cache Coherence Unit for Interconnecting Multiprocessor Nodes Having Pipelined Snoopy Protocol,” filed on Mar. 12, 1998, now pending; co-pending U.S. patent application Ser. No. 09/003,771, entitled “Memory Protection Mechanism for a Distributed Shared Memory Multiprocessor with Integrated Message Passing Support,” filed on Jan. 7, 1998, now pending; co-pending U.S. patent application Ser. No. 09/003,721, entitled “Cache Coherence Unit with Integrated Message Passing and Memory Protection for a Distributed, Shared Memory Multiprocessor System,” filed on Jan. 7, 1998, now pending; and co-pending U.S. patent application Ser. No. 09/281,714, entitled “Split Sparse Directory for a Distributed Shared Memory Multiprocessor System,” filed on Mar. 30, 1999, now pending, all of which are hereby incorporated by reference.
BACKGROUND OF THE INVENTION
1. Technical Field
This invention relates generally to computer network messaging and more particularly to avoiding deadlock while controlling messages in a multi-node computer network.
2. Discussion of Background Art
In multi-node computer networks, nodes communicate with each other by passing network messages through an interconnect. These network messages support different forms of communication between nodes, depending on the nature and requirements of the network. In parallel processing systems, for example, the network messages specifically support cache-coherence communication in shared-memory multiprocessor systems, and support message-passing communication in distributed-memory multi-computer systems. Frequently, a single computer system supports more than one form of message communication.
For a network to operate correctly, it is important to prevent deadlock while controlling network messages. In general, deadlock occurs when all four of the following conditions are met: (1) mutual exclusion, in which a resource is assigned to only one process at a time; (2) hold and wait, in which resources are acquired incrementally and processes may hold one resource while waiting for another; (3) no preemption, in which allocated resources cannot be forcibly acquired by another process; and (4) circular wait, in which two or more processes form a circular chain of dependency with each process waiting for a resource held by another.
In the context of network messaging, “resources” are defined as the buffer spaces available to hold network messages while in transit from one node to another node and “processes” are defined as the nodes which generate and consume the network messages. When deadlock occurs, some nodes in the network are unable to make progress (i.e., service the network messages). Without appropriate recovery measures, the network must initiate a reset or interrupt, which may result in a loss of messages and cause damage to the system as a whole.
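The circular-wait condition can be illustrated with a wait-for graph, in which an edge from node A to node B means that A cannot proceed until B frees buffer space. The following Python sketch is illustrative only and is not part of the original disclosure; the function name `has_circular_wait` and the node labels are hypothetical. It reports whether any circular chain of dependency exists.

```python
from typing import Dict, List


def has_circular_wait(waits_for: Dict[str, List[str]]) -> bool:
    """Return True if the wait-for graph contains at least one cycle.

    waits_for maps each node to the nodes whose buffers it is waiting on.
    (Hypothetical representation, chosen only for this illustration.)
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    nodes = set(waits_for) | {t for ts in waits_for.values() for t in ts}
    colour = {n: WHITE for n in nodes}

    def visit(node: str) -> bool:
        colour[node] = GREY
        for nxt in waits_for.get(node, []):
            if colour[nxt] == GREY:
                # Reached a node already on the current path: a cycle.
                return True
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in nodes)


if __name__ == "__main__":
    # Three nodes, each waiting on the next: a circular wait.
    print(has_circular_wait({"A": ["B"], "B": ["C"], "C": ["A"]}))
    # A simple chain with no cycle.
    print(has_circular_wait({"A": ["B"], "B": ["C"]}))
```

A cycle in this graph is necessary for deadlock but, by itself, says nothing about the other three conditions, which must also hold.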
Deadlock may be dealt with by any of several techniques including prevention, avoidance, and detection and recovery. Prevention techniques remove one of the four conditions described above, thereby making it impossible for a deadlock to occur. Avoidance techniques check for the deadlock conditions before allocating each resource, and allow the allocation only if there is no possibility of deadlock. Detection and recovery techniques do not prevent or avoid deadlock, but detect deadlock situations after they occur and then recover from those deadlock situations.
One common technique for avoiding deadlock provides two separate interconnects, or two separate channels within the same interconnect, for request and reply messages. In this technique, a node guarantees sufficient buffering for its reply messages by limiting the number of requests it has outstanding. An example of this is described in ANSI/IEEE Std. 1596-1992, Scalable Coherent Interface (SCI) (1992). In networks that only allow a simple request-reply messaging protocol, this technique is sufficient to avoid deadlock.
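The request-limiting idea can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the SCI mechanism itself: the class `Requester` and its methods are hypothetical, and each outstanding request is assumed to produce exactly one reply that occupies one reply-buffer slot.

```python
class Requester:
    """Node that never has more requests outstanding than reply slots.

    Hypothetical model: one reply per request, one buffer slot per reply.
    """

    def __init__(self, reply_slots: int) -> None:
        self.reply_slots = reply_slots  # reply buffer capacity
        self.outstanding = 0            # requests sent but not yet answered

    def try_send_request(self) -> bool:
        """Send a request only if a reply slot can be reserved for it."""
        if self.outstanding >= self.reply_slots:
            return False  # sending now could leave a reply with no buffer
        self.outstanding += 1
        return True

    def receive_reply(self) -> None:
        """Consume a reply and release the slot reserved for it."""
        if self.outstanding == 0:
            raise RuntimeError("reply without a matching request")
        self.outstanding -= 1


if __name__ == "__main__":
    node = Requester(reply_slots=2)
    print(node.try_send_request(), node.try_send_request(), node.try_send_request())
    node.receive_reply()
    print(node.try_send_request())
```

Because a slot is reserved before each request leaves the node, every reply is guaranteed to find buffer space on arrival, so replies can always be consumed.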
With more sophisticated messaging protocols, such as those that allow request forwarding, the two-interconnect technique may be extended by increasing the number of interconnects. However, the number of required independent interconnects corresponds to the maximum length of the dependence chains in the messaging protocols.
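To make the dependence-chain relationship concrete, the following Python sketch computes the longest chain in a table of which message types can trigger which others. It is an illustration only; the function name `max_chain_length` and the message-type names are hypothetical, and the dependence table is assumed to be acyclic.

```python
from functools import lru_cache
from typing import Dict, Tuple


def max_chain_length(triggers: Dict[str, Tuple[str, ...]]) -> int:
    """Length of the longest dependence chain, counted in message types.

    triggers maps a message type to the types it may cause to be sent.
    Assumes the table is acyclic; a cyclic table would recurse forever.
    """

    @lru_cache(maxsize=None)
    def depth(msg: str) -> int:
        # A message with no successors forms a chain of length 1.
        return 1 + max((depth(n) for n in triggers.get(msg, ())), default=0)

    names = set(triggers) | {n for ns in triggers.values() for n in ns}
    return max((depth(m) for m in names), default=0)


if __name__ == "__main__":
    simple = {"request": ("reply",)}
    forwarding = {"request": ("forwarded_request",),
                  "forwarded_request": ("reply",)}
    print(max_chain_length(simple))      # request -> reply
    print(max_chain_length(forwarding))  # request -> forwarded_request -> reply
```

Under this model, a simple request-reply protocol has a chain of length two, matching the two interconnects described above, while adding request forwarding lengthens the chain to three.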
Another technique, described by Lenoski and Weber in Scalable Shared-Memory Multiprocessing (1995), allows request-forwarding messaging with two separate interconnect channels, but couples the two channels with a back-off mechanism in the messaging protocol. When a potential deadlock situation is detected, the back-off mechanism reverts to a request-reply transaction by sending a negative acknowledgement reply to all requests that need forwarding until the potential deadlock situation is resolved.
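The back-off decision can be sketched as a single function. This Python fragment is a simplified illustration and not the mechanism as published by Lenoski and Weber: the function name `handle_request` and the return labels are hypothetical, and a "potential deadlock situation" is approximated here, as an assumption, by the absence of free space for forwarded requests.

```python
def handle_request(needs_forwarding: bool, forward_slots_free: int) -> str:
    """Decide how a node answers an incoming request.

    Assumption for this sketch: a potential deadlock is signalled by
    having no free slot in which to place a forwarded request.
    """
    if not needs_forwarding:
        return "REPLY"    # ordinary request-reply transaction
    if forward_slots_free > 0:
        return "FORWARD"  # normal request forwarding
    # Back off: answer with a negative acknowledgement so the transaction
    # collapses to request-reply and the requester can retry later.
    return "NACK"


if __name__ == "__main__":
    print(handle_request(needs_forwarding=False, forward_slots_free=0))
    print(handle_request(needs_forwarding=True, forward_slots_free=3))
    print(handle_request(needs_forwarding=True, forward_slots_free=0))
```

The negative acknowledgement travels on the reply channel, so the node never blocks waiting for space on the request channel.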
However, requiring separate interconnects or interconnect channels for request and reply messages imposes additional overhead on the interconnection network and its management structures. The multiple-interconnect techniques also impose complexity on the messaging protocol: messages on the separate interconnects cannot be ordered with respect to one another, so simplifying assumptions about message ordering cannot be made. Having back-off mechanisms also adds complexity to the messaging protocol.
Another technique, which employs detection and recovery, involves extending the buffer space available when deadlock is detected. This has been implemented in the Alewife machine by Kubiatowicz and Agarwal, “Anatomy of a Message in the Alewife Multiprocessor,” Proceedings of the 7th International Conference on Supercomputing (1993). In the Alewife approach, a network interface chip signals an interrupt to the processor when its output queue has been blocked for some specified period of time. The processor then empties the input queue into local memory.
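The timeout-and-drain behavior described above can be modeled roughly as follows. This Python sketch is an illustration only and does not reproduce the Alewife hardware or software: the class `NetworkInterface`, its fields, and the cycle-based timeout are hypothetical.

```python
from collections import deque


class NetworkInterface:
    """Toy model: interrupt when the output queue stays blocked too long."""

    def __init__(self, timeout_cycles: int) -> None:
        self.timeout_cycles = timeout_cycles
        self.blocked_cycles = 0
        self.input_queue = deque()  # bounded hardware queue in a real design
        self.overflow = deque()     # software-managed extension in local memory

    def tick(self, output_blocked: bool) -> bool:
        """Advance one cycle; return True if the interrupt fired."""
        self.blocked_cycles = self.blocked_cycles + 1 if output_blocked else 0
        if self.blocked_cycles < self.timeout_cycles:
            return False
        # Interrupt handler: the processor empties the input queue into
        # local memory, preserving arrival order.
        self.overflow.extend(self.input_queue)
        self.input_queue.clear()
        self.blocked_cycles = 0
        return True


if __name__ == "__main__":
    ni = NetworkInterface(timeout_cycles=3)
    ni.input_queue.extend(["m1", "m2"])
    print([ni.tick(output_blocked=True) for _ in range(3)])
    print(list(ni.overflow), list(ni.input_queue))
```

Note that the drain step runs on the processor, which is exactly the dependency that makes this approach fragile when the processor itself is stalled.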
The Alewife approach emulates a virtually infinite buffer by augmenting the input queue in local memory whenever it overflows. However, it does not address how the size of this buffer is managed. Moreover, Alewife relies on first detecting a potential deadlock situation and then resolving the deadlock situation in software by having the processor extend the queue into local memory. This is not always feasible because the processor may have outstanding requests that are caught in the same deadlock and, without special abort and fault recovery mechanisms, cannot service an interrupt until this deadlock has been resolved.
What is required, therefore, is a technique to avoid messaging deadlock that does not increase the interconnect-management overhead required to support separate interconnect channels for request and reply messages and that eliminates the complexities of back-off mechanism support and software-managed deadlock recovery.
SUMMARY OF THE INVENTION
The present invention provides a computer architecture for avoiding a deadlock while controlling messages between nodes in a multi-node computer network.
To avoid deadlock, the invention inserts a buffer and associated control circuitry between the output of a node and the network in order to buffer all outgoing network messages from that node. Proper sizing of the buffer, along with the associated flow control circuitry, guarantees sufficient buffering such that the buffer does not overflow and at least one node in a group of nodes involved in a circular wait is always able to service incoming messages, thereby facilitating forward progress and avoiding deadlock.
To effectively manage the buffer, network messages are preferably classified into three types based on their service requirements and messaging protocols. In the preferred embodiment, these message types are called reliable transaction messages, posted messages, and unreliable transaction messages. The invention reserves a quota for each of the message types in the buffer and, based on this quota, controls the number of network messages of each type that are outstanding at any one time. The total buffer size is the sum of the space requirements of these message types.
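The per-type quota scheme summarized above can be sketched in software as follows. This Python model is a minimal illustration under stated assumptions and is not the claimed circuitry: the class `OutputBuffer`, the type labels, and the quota values are hypothetical, and a quota slot is assumed, for simplicity, to be released as soon as the message leaves the buffer.

```python
from collections import deque
from typing import Dict, Tuple


class OutputBuffer:
    """Outgoing-message buffer with a reserved quota per message type."""

    def __init__(self, quotas: Dict[str, int]) -> None:
        self.quotas = dict(quotas)
        self.in_use = {t: 0 for t in self.quotas}
        # Total size is the sum of the per-type space requirements.
        self.capacity = sum(self.quotas.values())
        self.fifo = deque()

    def try_enqueue(self, msg_type: str, payload: str) -> bool:
        """Accept a message only if its type is still within quota."""
        if self.in_use[msg_type] >= self.quotas[msg_type]:
            return False  # sender must hold this message type back
        self.in_use[msg_type] += 1
        self.fifo.append((msg_type, payload))
        return True

    def dequeue(self) -> Tuple[str, str]:
        """Hand the oldest message to the network and free its slot.

        Simplifying assumption: the slot is freed on departure.
        """
        msg_type, payload = self.fifo.popleft()
        self.in_use[msg_type] -= 1
        return msg_type, payload


if __name__ == "__main__":
    buf = OutputBuffer({"reliable_transaction": 2, "posted": 1,
                        "unreliable_transaction": 1})
    print(buf.capacity)
    print(buf.try_enqueue("posted", "p0"), buf.try_enqueue("posted", "p1"))
    buf.dequeue()
    print(buf.try_enqueue("posted", "p1"))
```

Because no type can exceed its own quota, the number of buffered messages can never exceed the total capacity, so the buffer cannot overflow regardless of the mix of message types.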

Affiliated with

Helland Patrick J. (Inventor)
Poon Wing Leong (Inventor)
Shimizu Takeshi (Inventor)
Umezawa Yasushi (Inventor)
Weber Wolf-Dietrich (Inventor)

Also associated with

Carr & Ferrell LLP (Law Firm)
Fujitsu Limited (Corporate Assignee)
Najjar Saleh (Examiner)
