Recovery from failure of a data processor in a network server

Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability

Reexamination Certificate

Details

C714S004110

active

06275953

ABSTRACT:

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates generally to data processing systems, and more particularly to network servers.
2. Background Art
Mainframe data processing, and more recently distributed computing, have required increasingly large amounts of data storage. This data storage is most economically provided by an array of low-cost disk drives integrated with a large semiconductor cache memory. Such cached disk arrays were originally introduced for use with IBM-compatible host computers. A channel director in the cached disk array executed channel commands received over a channel from the host computer. More recently, a network attachment has been proposed for interfacing the cached disk array to a network. The network attachment, for example, is a computer programmed to communicate with clients on a network by following a network communication protocol, and to communicate with the cached disk array by issuing channel commands. Although this approach has the advantage of using a conventional cached disk array, the capabilities of the cached disk array are underutilized in this configuration, because the network attachment is a bottleneck to data access.
Cached disk arrays typically have multiple internal data processors, dual redundant internal data paths, and multiple input channels in order to provide a high degree of data availability in the event of various kinds of failures. If a data access request directed to one input channel is not acknowledged due to a failure of an internal data processor or input channel, then the data access request can be retransmitted to the cached disk array on another input channel with a high probability that the request will be acknowledged. A network attachment in the form of a single conventional digital computer would not provide such a high degree of data availability because a failure of the central processing unit, program memory, or power supply of the single conventional digital computer would block access by all network clients to the cached disk array.
Conventional digital computers known as personal computers or commodity digital computers, however, are far less expensive than cached disk arrays or digital computers designed for high data availability. Therefore, it would be very desirable to construct a network attachment or a network file server using only commodity digital computers in some way that would provide the same high degree of data availability provided by a typical cached disk array. Moreover, it would be desirable to recover from a data processor failure in such a way that a network client would not have to retransmit a data access request to a different network address.
SUMMARY OF THE INVENTION
In accordance with one aspect of the invention, there is provided a method of operating data processors for servicing clients in a network. Each of the data processors has a respective network interface for interfacing to the network. Each network interface has a respective network address. Each network interface is programmable for setting its network address. An operational data processor responds to a failure of a failed data processor by setting the network address of the network interface of the operational data processor to the network address of the network interface of the failed data processor. Then the operational data processor services client requests received by the network interface of the operational data processor.
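The address takeover described in this aspect can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the class names `NetworkInterface`, `DataProcessor`, and `Network`, the `take_over` helper, and the sample addresses are hypothetical, and the toy `Network` only simulates delivery by address rather than programming real interface hardware.

```python
from dataclasses import dataclass, field


@dataclass
class NetworkInterface:
    address: str  # programmable network address (hypothetical value)


@dataclass
class DataProcessor:
    name: str
    interface: NetworkInterface
    operational: bool = True
    served: list = field(default_factory=list)

    def service(self, request: str) -> str:
        # Record and answer a client request.
        self.served.append(request)
        return f"{self.name} handled {request}"


class Network:
    """Toy network: delivers a request to the operational processor
    whose interface currently holds the destination address."""

    def __init__(self, processors):
        self.processors = processors

    def send(self, address: str, request: str):
        for p in self.processors:
            if p.operational and p.interface.address == address:
                return p.service(request)
        return None  # nobody answered at this address


def take_over(operational: DataProcessor, failed: DataProcessor) -> None:
    """Set the operational processor's interface address to the address
    of the failed processor's interface, so clients keep using it."""
    operational.interface.address = failed.interface.address
```

In this sketch a client that keeps sending to the failed processor's address is answered by the surviving processor after `take_over` runs, so the client never needs to learn a different network address.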
In accordance with another aspect of the invention, there is provided a method of operating data processors including a first set of data processors and a second set of data processors for providing clients with read-write access to read-write file systems. Each of the data processors in the first set of data processors receives requests from the clients. Each of the data processors in the second set of data processors is assigned to manage locks on at least one of the read-write file systems. Locks on each of the read-write file systems are managed by an assigned one of the data processors in the second set of data processors. Each data processor in the first set of data processors responds to a client request for access to a respective one of the read-write file systems by accessing stored assignment information indicating the assigned one of the data processors in the second set of data processors presently assigned to manage locks on the respective one of the read-write file systems. Processing for the client request is continued by the assigned one of the data processors in the second set of data processors indicated by the stored assignment information as being presently assigned to manage locks on the respective one of the read-write file systems. Each data processor in the second set of data processors continues processing for a client request for read-write access to a read-write file system on which that data processor is presently assigned to manage locks by performing an access operation that includes management of locks on that read-write file system.
A data processor performs failure recovery of a failed data processor in the second set of data processors by detecting failure of the failed data processor, and upon detecting the failure, re-assigning to an operational data processor each of the read-write file systems on which the failed data processor had been assigned to manage locks at the time of detecting the failure.
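The two-tier arrangement and its recovery step can be sketched in Python as follows. This is only an illustration under stated assumptions: `LockManager`, `FrontEnd`, and `recover` are hypothetical names, the stored assignment information is modeled as a plain shared dictionary, failed file systems are spread round-robin over survivors (the text does not say how the new assignee is chosen), and lock state held by the failed processor is simply lost rather than rebuilt.

```python
class LockManager:
    """Second-set data processor: manages locks for assigned file systems."""

    def __init__(self, name):
        self.name = name
        self.operational = True
        self.locks = {}  # (file_system, path) -> client holding the lock

    def access(self, file_system, path, client):
        # Grant access if the file is unlocked or already locked by this client.
        holder = self.locks.setdefault((file_system, path), client)
        return holder == client


class FrontEnd:
    """First-set data processor: receives client requests and hands them
    to whichever lock manager the stored assignment information names."""

    def __init__(self, assignments, managers):
        self.assignments = assignments  # file system -> manager name (shared)
        self.managers = managers        # manager name -> LockManager

    def handle(self, file_system, path, client):
        manager = self.managers[self.assignments[file_system]]
        return manager.name, manager.access(file_system, path, client)


def recover(assignments, managers, failed_name):
    """Re-assign every file system owned by the failed manager to an
    operational manager (round-robin; an assumption of this sketch)."""
    survivors = [m for m in managers.values()
                 if m.operational and m.name != failed_name]
    if not survivors:
        raise RuntimeError("no operational lock manager available")
    orphaned = sorted(fs for fs, owner in assignments.items()
                      if owner == failed_name)
    for i, fs in enumerate(orphaned):
        assignments[fs] = survivors[i % len(survivors)].name
    return orphaned
```

Because every front-end processor consults the same assignment table on each request, updating that table is enough to redirect later requests to the new lock manager.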
In a preferred embodiment, a file server for servicing clients in a data network includes a cached disk storage subsystem, and a plurality of data mover computers linking the cached disk storage subsystem to the data network for transfer of data between the cached disk storage subsystem and the network. Each data mover computer is programmed to maintain a local cache of file access information including locking information for a respective group of files that the data mover computer has been assigned to directly access, and an index that indicates the group of files that the data mover computer has been assigned to directly access. Each data mover computer is programmed to respond to a request from a client for access to a file by checking the index to determine whether or not the data mover computer has been assigned to directly access the file. When the checking determines that the data mover computer has been assigned to directly access the file, the data mover computer accesses the file. When the checking determines that the data mover computer has not been assigned to directly access the file, the data mover computer forwards the request to another data mover computer that maintains a local cache of file access information for the file. A data processor in the file server is programmed to perform failure recovery of a failed data mover computer by detecting failure of the failed data mover computer, and upon detecting the failure, re-assigning to an operational data mover computer each group of files that the failed data mover computer had been assigned to directly access at the time of detecting the failure.
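The index check, forwarding, and re-assignment in the preferred embodiment can be sketched in Python as below. The names `DataMover`, `assign`, and `reassign`, the shared `routing` dictionary used to find the owning data mover, and the round-robin choice of a new owner are all assumptions of this sketch and are not taken from the text; the cached disk storage subsystem itself is not modeled.

```python
class DataMover:
    def __init__(self, name, routing):
        self.name = name
        self.operational = True
        self.index = set()      # file groups this mover accesses directly
        self.cache = {}         # local file access info: (group, file) -> info
        self.routing = routing  # shared map: group -> owning DataMover

    def request(self, group, filename, client):
        if group in self.index:
            # Assigned to this mover: access directly using the local cache.
            info = self.cache.setdefault((group, filename),
                                         {"lock_holder": None})
            if info["lock_holder"] in (None, client):
                info["lock_holder"] = client
                return f"{self.name}: {client} accessed {group}/{filename}"
            return (f"{self.name}: {group}/{filename} "
                    f"locked by {info['lock_holder']}")
        # Not assigned here: forward to the mover that owns the group.
        return self.routing[group].request(group, filename, client)


def assign(mover, group, routing):
    """Give a mover direct access to a file group and record it for routing."""
    mover.index.add(group)
    routing[group] = mover


def reassign(movers, routing, failed):
    """Move every group owned by the failed mover to an operational mover."""
    survivors = [m for m in movers if m.operational and m is not failed]
    if not survivors:
        raise RuntimeError("no operational data mover available")
    for i, group in enumerate(sorted(failed.index)):
        assign(survivors[i % len(survivors)], group, routing)
    failed.index.clear()
```

A request that arrives at a mover without direct access takes one forwarding hop; after `reassign`, the same request is served by whichever surviving mover now lists the group in its index.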

