Multicomputer with distributed directory and operating system

Electrical computers and digital processing systems: multicomput – Distributed data processing – Client/server

Reexamination Certificate

Details

C709S226000, C709S241000, C709S205000, C712S029000

active

06393459

ABSTRACT:

FIELD OF THE INVENTION
The present invention relates generally to multicomputer systems and, more particularly, to such systems employing a microkernel-based serverized distributed operating system and to associated methods, as well as to such systems with a distributed process directory.
BACKGROUND OF THE INVENTION
Description of the Related Art
Microkernel-based operating system architectures have been employed to distribute operating system services among loosely-coupled processors in a multicomputer system. In an earlier system, a set of modular computer software-based system servers sit on top of a minimal microkernel which provides the system servers with fundamental services such as processor scheduling and memory management. The microkernel may also provide an inter-process communication facility that allows the system servers to call each other and to exchange data regardless of where the servers are located in the system. The system servers manage the other physical and logical resources of the system, such as devices, files and high level communication resources, for example. Often, it is desirable for a microkernel to be interoperable with a number of different conventional operating systems. In order to achieve this interoperability, computer software-based system servers may be employed to provide an application programming interface to a conventional operating system.
The block diagram drawing of FIG. 1 shows an illustrative multicomputer system. The term “multicomputer” as used herein shall refer to a distributed non-shared memory multiprocessor machine comprising multiple sites. A site is a single processor and its supporting environment or a set of tightly coupled processors and their supporting environment. The sites in a multicomputer may be connected to each other via an internal network (e.g., Intel MESH interconnect), and the multicomputer may be connected to other machines via an external network (e.g., Ethernet for workstations). Each site is independent in that it has its own private memory, interrupt control, etc. Sites use messages to communicate with each other. A microkernel-based “serverized” operating system is well suited to provide operating system services among the multiple independent non-shared memory sites in a multicomputer system.
An important objective in certain multicomputer systems is to achieve a single-system image (SSI) across all sites of the system. From the point of view of the user, the application developer and, for the most part, the system administrator, the multicomputer system appears to be a single computer even though it actually comprises multiple independent computer sites running in parallel and communicating with each other over a high speed interconnect. Some of the advantages of an SSI include simplified installation and administration, ease of use, open system solutions (i.e., fewer compatibility issues), exploitation of multisite architecture while preserving conventional APIs, and ease of scalability.
There are several possible component features that may play a part in an SSI, such as a global naming process, global file access, distributed boot facilities and global STREAMS facilities. In one earlier system, an SSI is provided which employs a process directory (or name space) which is distributed across multiple sites. Each site maintains a fragment of the process directory. The distribution of the process directory across multiple sites ensures that no single site is unduly burdened by the volume of message traffic accessing the directory. There are challenges in implementing a distributed process directory. For example, “global atomic operations” must be applied to multiple target processes and may have to traverse process directory fragments on multiple sites in the system. This traversal of directory fragments on different sites in search of processes targeted by an operation can be complicated by the migration of processes between sites in the course of the operation. In other words, a global atomic operation and process migration may progress simultaneously. Thus, there may be a particular challenge involved in ensuring that a global atomic operation is applied at least once, but only once, to each target process.
The problem of a global atomic operation potentially missing a migrating process will be further explained through an example involving the global getdents (get directory entries) operation. The getdents operation is a global atomic operation. The timing diagram of FIG. 2 illustrates the example. At time=t, process manager server “A” (PM A) on site A initiates a migration of a process from PM A on site A to the process manager server “B” (PM B) on site B (dashed lines). Meanwhile, an object manager server (OM) has broadcast a getdents request to both PM A and PM B. At time=t1, PM B receives and processes the getdents request and returns the response to the OM. This response by PM B does not include a process identification (PID) for the migrating process which has not yet arrived at PM B. At time=t2, PM B receives the migration request from PM A. PM B adds the PID for the migrating process to the directory fragment on site B and returns to PM A a response indicating the completion of the process migration. PM A removes the PID for the migrating process from the site A directory fragment. At time=t3, PM A receives and processes the getdents request and returns the response to the OM. This response by PM A does not include the PID for the migrating process since that process has already migrated to PM B on site B. Thus, the global getdents operation missed the migrating process which was not yet represented by a PID in the site B directory fragment when PM B processed the getdents operation, and which already has its PID removed from the site A directory fragment by the time PM A processed the getdents operation.
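The timing in this example can be replayed with a small sketch. It is only an illustration under stated assumptions: each directory fragment is modeled as a plain set of PIDs, the function name getdents and the PID values are hypothetical, and message passing between the OM and the process managers is reduced to sequential statements.

```python
# Hypothetical model: each site's directory fragment is a set of PIDs.
def getdents(fragment):
    """Return a snapshot of the PIDs held in one site's directory fragment."""
    return sorted(fragment)

fragment_a = {101, 102}   # site A; PID 102 is the process that will migrate
fragment_b = {201}        # site B
all_pids = fragment_a | fragment_b

# t1: PM B answers the getdents request before the migrating process arrives.
reply_b = getdents(fragment_b)

# t2: the migration completes: PM B adds the PID, then PM A removes it.
fragment_b.add(102)
fragment_a.remove(102)

# t3: PM A answers the getdents request after the process has left site A.
reply_a = getdents(fragment_a)

# The OM merges both replies; the migrating process appears in neither.
seen = set(reply_a) | set(reply_b)
missed = all_pids - seen
print(missed)
```

The set missed contains PID 102, the process that was in flight between the two replies.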
A prior solution to the problem of simultaneous occurrence of process migrations and global atomic operations involved the use of a “global ticket” (a token) to serialize global operations at the system level and migrations at the site level. More specifically, a computer software-based global operation server issues a global ticket to a site which requests a global operation. A number associated with the global ticket monotonically increases every time a new ticket is issued so that different global operations in the system are uniquely identified and can proceed one after the other.
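A monotonically increasing ticket number can be sketched as follows. This is a minimal single-process illustration, not the server described in the text: the class name GlobalOperationServer, the method issue_ticket and the site labels are assumptions introduced here, and a real global operation server would answer requests arriving as messages from sites.

```python
import itertools
import threading

class GlobalOperationServer:
    """Hypothetical ticket issuer: every ticket number is larger than the last."""

    def __init__(self):
        self._counter = itertools.count(1)  # 1, 2, 3, ...
        self._lock = threading.Lock()       # one ticket is handed out at a time

    def issue_ticket(self, requesting_site):
        # requesting_site is kept only to mirror the description; the number
        # itself is global and does not depend on which site asked.
        with self._lock:
            return next(self._counter)

server = GlobalOperationServer()
first = server.issue_ticket("site A")
second = server.issue_ticket("site B")
print(first, second)
```

Because each number is handed out exactly once and in increasing order, two global operations never share a ticket and can be ordered by comparing their numbers.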
Global tickets are used to serialize all global atomic operations so that they do not conflict among themselves. However, a problem remains between global operations and process migrations. A prior solution makes global operations result in a multicast message carrying the global ticket to process managers on each site. Each process manager would then acquire the lock to the process directory fragment of its own site and iterate over all entries. The global operation is performed on an entry's corresponding process only if the global ticket number marked on the entry is lower than the global ticket number of the current iteration. A global ticket number marked on a process directory fragment entry is carried over from the site the process migrates from (origin site) to the site the process migrates to (destination site). It represents the last global operation ticket that process has seen before the migration.
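The per-site iteration with the ticket comparison might look like the sketch below. The names Entry, ProcessManager and apply_global_op are hypothetical, and one step is an assumption not spelled out in the text: after the operation is applied, the entry is stamped with the current ticket number so that it records the last global operation the process has seen.

```python
import threading
from dataclasses import dataclass

@dataclass
class Entry:
    pid: int
    last_ticket: int = 0   # last global operation ticket this process has seen

class ProcessManager:
    """Hypothetical per-site process manager holding one directory fragment."""

    def __init__(self, entries):
        self.lock = threading.Lock()                 # fragment lock
        self.fragment = {e.pid: e for e in entries}

    def apply_global_op(self, ticket, op):
        applied = []
        with self.lock:                              # lock this site's fragment
            for entry in self.fragment.values():
                # Apply only if the entry has not yet seen this ticket.
                if entry.last_ticket < ticket:
                    op(entry.pid)
                    entry.last_ticket = ticket       # assumption: stamp after applying
                    applied.append(entry.pid)
        return applied

# PID 3 is modeled as having migrated in after already seeing ticket 7.
pm_a = ProcessManager([Entry(pid=1), Entry(pid=2)])
pm_b = ProcessManager([Entry(pid=3, last_ticket=7)])

log = []
applied_a = pm_a.apply_global_op(7, log.append)
applied_b = pm_b.apply_global_op(7, log.append)
repeat_a = pm_a.apply_global_op(7, log.append)
print(applied_a, applied_b, repeat_a)
```

In this run the operation reaches PIDs 1 and 2 once each, skips PID 3 because its carried-over ticket is not lower than 7, and a repeated delivery of the same ticket to site A applies nothing further.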
The migration of a process is a bit more complex. The process being migrated acquires the process directory fragment lock on its origin site first. It then marks the corresponding process directory entry as being in the process of migration. The migration procedure stamps the process' process directory entry with the present global operation ticket number, locks the process directory on the migration destination site and transmits the process directory entry contents to the destination site. The global operation ticket number on the destination site is then copied back in the reply message to the migration origin site. The migration procedure on the o
