Electrical computers and digital processing systems: multicomput – Computer-to-computer data routing – Least weight routing
Reexamination Certificate
1999-08-12
2004-04-13
Burgess, Glenton B. (Department: 2153)
active
06721775
ABSTRACT:
TECHNICAL FIELD
This invention relates to the monitoring and control of concurrent processes in a multiprocessing, multiprogramming computing environment, and more particularly, to detection and monitoring of resource contention between multiple processes thereof.
BACKGROUND OF THE INVENTION
As used herein, the term “computing environment” includes any single system or multi-system computing environment as available or known in the art. A “task” or “process” means an independent unit of work that can compete for the “resources” of a computing environment. A “task control block” is a consolidation of control information pertaining to a task including any user-assigned priority and its state, i.e., active or waiting. The “wait state” is a condition of a task that is dependent upon the execution of other tasks in order for the “waiting” task to become “active”.
Also in this specification, a “resource” is any facility of a computing environment or of an “operating system” running thereon which is required for the execution of a task. Typical resources include main store, input/output devices, the central processing unit (CPU), data sets, and control or processing programs. In this regard, an “operating system” is a set of supervisory routines running on a computing system for providing, for example, one or more of the following functions: determining the order in which requesting tasks or their computations will be carried out, providing long-term storage of data sets including programs, protecting data sets from unauthorized access or usage, and/or system logging and recovery.
“Multiprogramming”, which pertains to the concurrent execution of two or more programs by a computing environment, can be managed on a computer running under, for example, OS/390 offered by International Business Machines Corporation. Modern operating systems, by permitting more than one task to be performed concurrently, make possible more efficient use of resources. For example, if a program that is being executed to accomplish a task must be delayed (for instance, until more data is read into the CPU), then performance of some other completely independent task may proceed. The CPU can execute another program or even execute the same program so as to satisfy another task.
In today's computing environments, mutual exclusion (or resource serialization) is often provided within the operating system itself. With IBM's OS/390 system, a customer has the option of configuring a multi-image environment to increase capacity and enhance availability. To allow these images to co-exist, resources shared between systems need to be serialized to ensure integrity. OS/390 uses a Global Resource Serialization (GRS) component to serialize both single system and multi-system resources. These resources can number in the thousands, if not millions. For more information on GRS, see the IBM publication entitled “OS/390 MVS Planning: Global Resource Serialization”; doc. #GC28-1759-OS (September, 1998) (6th edition), the entirety of which is hereby incorporated herein by reference.
In the allocation and use of these resources, contention for a resource can occasionally cause progress of the workload to be negatively impacted for a number of reasons. For example: (1) a resource allocation deadlock might occur; (2) a long-running task might hold a resource (resource starvation); or (3) a task holding resources may have ceased to respond (“enabled hang”).
A task is said to be “deadlocked” if its progress is blocked indefinitely because it is stuck in a “circular wait” upon other tasks. In this circumstance, each task is holding a “non-preemptable” resource which must be acquired by some other task in order to proceed, i.e., each task in the circle is waiting upon some other task to release its claim on a resource. The characteristics of deadlock then are mutual exclusion, non-preemption, and resource waiting. In the case of resource starvation, a long-running task or job holds one or more critical resources, in which case workload also requiring those resources must wait until the job ends. In severe cases, software errors can cause tasks that hold resources to fail without ending, causing the resource to be permanently held, thereby blocking workload that requires the resource.
In view of the above, resource contention monitoring and analysis can be significant functions in today's computing environments.
DISCLOSURE OF THE INVENTION
In certain systems, resource serialization managers have an ability to report on resource contention, and document blocking requests and waiting requests for resources. However, such systems do not provide for any intelligent ordering of the assembled information. For example, the current GRS implementation assembles the contended resources in alphabetical order of resource name. Thus, provided herein is an enhanced approach wherein blocking requests and waiting requests are explicitly listed in a time-based manner.
Briefly summarized then, this invention comprises a method for analyzing resource contention in a computing environment. This method includes: selecting a current waiting request for a resource; using a resource queue for the resource, chaining to a current top blocker request for the resource; chaining to a task related waiter queue (TRWQ) for the current top blocker request, wherein any requests waiting for a computer environment resource are listed in a first-in/first-out manner; and searching the TRWQ for any waiting request made by a task generating the current top blocker request, and if there are no waiting requests associated with the current top blocker, dependency analysis is complete.
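The chaining steps summarized above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the `Request` record, the `holders` mapping (standing in for the per-resource queue, whose first entry is taken to be the top blocker), and the `waiters` list (standing in for the TRWQ) are hypothetical names. Repeating the steps for the blocker's own waiting request, and reporting a deadlock when the chain returns to a task already visited, are likewise assumptions about how the dependency chain would be followed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    task: str      # task that made the request
    resource: str  # resource being requested or held
    since: int     # time the request was made (illustrative units)


def dependency_chain(start, holders, waiters):
    """Follow waiter -> top blocker -> that blocker's own waiting request.

    holders: resource name -> list of holding requests, oldest first
             (hypothetical stand-in for the resource queue).
    waiters: all waiting requests, oldest first (stand-in for the TRWQ).
    Returns (links, deadlocked), where each link is
    (waiting task, resource, blocking task).
    """
    links = []
    seen = {start.task}
    current = start
    while True:
        # Chain from the waiting request to the top blocker of its resource.
        blocker = holders[current.resource][0]
        links.append((current.task, current.resource, blocker.task))
        if blocker.task in seen:
            # The chain has come back to a task already on it: circular wait.
            return links, True
        seen.add(blocker.task)
        # Search the waiter queue for a request made by the blocking task.
        nxt = next((w for w in waiters if w.task == blocker.task), None)
        if nxt is None:
            # The top blocker is not waiting on anything: analysis is complete.
            return links, False
        current = nxt


# Illustrative data: A waits on R1 (held by B), B waits on R2 (held by C),
# and C waits on R3 (held by A).
holders = {
    "R1": [Request("B", "R1", 1)],
    "R2": [Request("C", "R2", 2)],
    "R3": [Request("A", "R3", 3)],
}
waiters = [Request("A", "R1", 4), Request("B", "R2", 5), Request("C", "R3", 6)]
links, deadlocked = dependency_chain(waiters[0], holders, waiters)
```

With the data above the chain closes on task A, so `deadlocked` is true. Dropping the last waiting request (C on R3) ends the chain at C, the task whose completion the others depend on.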
In a further aspect, a method for analyzing contention in a computing environment is provided. This method includes identifying at least one of a longest blocking process or a longest waiting process for a resource of the computing environment; and wherein the identifying comprises examining one of a blocking queue or a waiting queue for the resource, wherein the blocking queue comprises a time-ordered listing of all currently blocking processes requesting the resource, and wherein the waiting queue comprises a time-ordered listing of all currently waiting processes requesting the resource.
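A time-ordered queue of this kind can be sketched as follows. The class name `TimeOrderedQueue`, its methods, and the sample task and resource names are hypothetical, and the sketch assumes entries are added with non-decreasing timestamps so that insertion order equals time order. Under that assumption the longest blocker or waiter is simply the head of the queue, and no scan of the other entries is needed.

```python
from collections import OrderedDict


class TimeOrderedQueue:
    """Entries kept in arrival order, so the head is the oldest entry."""

    def __init__(self):
        # (task, resource) -> time the task began blocking or waiting
        self._entries = OrderedDict()

    def add(self, task, resource, now):
        # Assumes 'now' never decreases between calls; a re-added entry
        # is moved to the tail so the queue stays in time order.
        key = (task, resource)
        self._entries.pop(key, None)
        self._entries[key] = now

    def remove(self, task, resource):
        # Called when the request is granted, released, or withdrawn.
        self._entries.pop((task, resource), None)

    def longest(self):
        # Head of the queue is the longest-standing entry, or None if empty.
        if not self._entries:
            return None
        (task, resource), since = next(iter(self._entries.items()))
        return task, resource, since


# Illustrative usage with made-up task and resource names.
waiting = TimeOrderedQueue()
waiting.add("JOB7", "SYSDSN.PAYROLL", 100)
waiting.add("JOB2", "SYSDSN.ALPHA", 130)
oldest = waiting.longest()
```

Here `oldest` refers to JOB7 even though SYSDSN.ALPHA sorts first alphabetically, which is the contrast with a name-ordered contention display.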
Systems and computer program products corresponding to the above-summarized methods are also described and claimed herein.
To restate, provided herein is an enhanced resource contention analysis technique which provides an ability to readily report on: (1) tasks (and resources) that have been blocking requests for the longest period of time; (2) tasks (and resources) that have been waiting for the longest period of time; and (3) tasks (and resources) involved in a request dependency chain, and whether or not that chain represents a deadlock. With the information provided by the enhanced contention analysis disclosed herein, an installation can determine if a high volume of contention is actually a problem or not. If the contention is a problem, then the tasks involved in that contention are apparent, allowing the installation to take action against a task, subsystem, or system, to alleviate the problem. With the current art, a customer would have to take the output from multiple instances of the contention display to determine whether or not the systems are making progress, then build the dependency graph by hand and determine which resources and tasks are at fault. Obviously, the problem is nearly insolvable when hundreds of resources and tasks are involved in contention.
As noted, the blocker and waiter lists disclosed herein comprise lists sorted by the time of the event (i.e., the longest blocker or waiter is at the head of the list). The advantages of this approach are that:
(1) Finding the most affected resource/request is simplified. The element at the front of the list is the request that has been blocking/waiting (depending on the list) for the longest period of time. This means that an analysis of the resources does not have to query the state of all resources. Generally, only a
Fagen Scott A.
Nick Jeffrey M.
Burgess Glenton B.
Flynn Kimberly
Kinnaman, Jr. Esq. William A.
Radigan, Esq. Kevin P.