Reexamination Certificate
2000-03-06
2004-12-14
Nguyen-Ba, Antony (Department: 2122)
Data processing: software development, installation, and managem
Software program development tool
Testing or debugging
C717S124000
active
06832367
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to computer processing systems, and more particularly to tools, techniques and processes, such as debugging tools and visualization tools, for recording and replaying the execution of distributed programs on such computer processing systems.
2. Description of the Related Art
Distributed programming is a form of information processing in which work is performed by separate computers linked through a communication network.
Typically, a complex set of software services and hardware services that implement a standardized set of communication protocols, such as Transmission Control Protocol (TCP)/Internet Protocol (IP), is used to communicate information over the communication network. A more detailed description of exemplary communication protocols used in today's communication networks can be found in Tanenbaum, “Computer Networks,” Prentice-Hall, Inc., Third Edition, 1996, herein incorporated by reference in its entirety.
The JAVA™ programming language reduces many of the complexities of distributed programming by providing many programmer-friendly features including language-level support for multiple threads of execution within a single program.
Further, standard and relatively simple Application Programming Interfaces (APIs) have been provided in JAVA™ for defining a set of interfaces to the complex set of software services and hardware services used in communicating information over today's communication networks. The core communication APIs in JAVA™ are centered around communication end points called “sockets”. The concepts exported and options supported by the JAVA™ Socket API are essentially a set of higher-level abstractions and operations that can be mapped onto a simple blocking subset of the low-level, but more powerful, socket-based interfaces offered by operating systems such as the UNIX®, Microsoft Windows®, and Microsoft NT® operating systems.
In the JAVA Socket API, three types of sockets are supported: 1) a point-to-point stream socket that supports reliable, streaming delivery of bytes; 2) a point-to-point datagram or packet-based socket on which message packets can be lost or received out of order; and 3) a multicast (e.g., point-to-multiple-points) socket on which a datagram may be sent to multiple destination sockets. For more details, see “Java Language Specification”, J. Gosling, B. Joy and G. Steele, Addison Wesley and “Java 1.1 Developer's Handbook”, P. Heller, S. Roberts, with P. Seymour and T. McGinn, Sybex. These features have resulted in the growing use of JAVA for creating application components that communicate over the network.
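The three socket types listed above correspond to the standard java.net classes Socket/ServerSocket, DatagramSocket and MulticastSocket. The following is a minimal sketch only, assuming a working loopback interface; the class name SocketKindsSketch and its method names are hypothetical and do not come from the patent text.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.MulticastSocket;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of the patent.
final class SocketKindsSketch {

    // 1) Stream socket: reliable, ordered delivery of bytes over a connection.
    static String streamRoundTrip(String text) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, server.getLocalPort());
             Socket accepted = server.accept()) {
            byte[] sent = text.getBytes(StandardCharsets.UTF_8);
            client.getOutputStream().write(sent);
            client.getOutputStream().flush();
            // One read() may return fewer bytes than were written, so loop
            // until the expected number of bytes has arrived.
            InputStream in = accepted.getInputStream();
            byte[] buf = new byte[sent.length];
            int got = 0;
            while (got < buf.length) {
                int n = in.read(buf, got, buf.length - got);
                if (n < 0) {
                    break; // peer closed the stream early
                }
                got += n;
            }
            return new String(buf, 0, got, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // 2) Datagram socket: each packet is delivered whole or not at all, and
    //    in general packets may be lost or reordered.
    static String datagramRoundTrip(String text) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (DatagramSocket receiver = new DatagramSocket(0, loopback);
             DatagramSocket sender = new DatagramSocket()) {
            receiver.setSoTimeout(2000); // fail instead of blocking forever if the packet is lost
            byte[] sent = text.getBytes(StandardCharsets.UTF_8);
            sender.send(new DatagramPacket(sent, sent.length, loopback, receiver.getLocalPort()));
            DatagramPacket in = new DatagramPacket(new byte[1024], 1024);
            receiver.receive(in);
            return new String(in.getData(), 0, in.getLength(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // 3) Multicast socket: a datagram socket that can also join a group so
    //    that one send reaches several receivers. Joining a group depends on
    //    the local network configuration, so this sketch only opens the socket.
    static boolean openMulticastSocket() {
        try (MulticastSocket socket = new MulticastSocket()) {
            return socket.isBound();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("stream:    " + streamRoundTrip("hello"));
        System.out.println("datagram:  " + datagramRoundTrip("ping"));
        System.out.println("multicast: " + openMulticastSocket());
    }
}
```

The read loop in streamRoundTrip reflects the fact that a stream read may return only part of what was written, whereas the datagram receive returns one whole packet.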
However, the factors of non-determinism introduced by the presence of concurrent threads of execution, operating system scheduling, variable network delays and potentially variable delivery of network messages make understanding and testing the execution of multi-threaded distributed JAVA applications a difficult and laborious process.
Moreover, repeated execution of a program is common while debugging a program and non-determinism may result in a “bug” appearing in one execution instance of the program and not appearing in another execution instance of the same program.
Further, the performance behavior can be different from one execution instance of a program to another execution instance of the same program. Given the size and the number of execution sequences possible in the completion of these distributed programs, it is an extremely difficult task for a programmer to solve correctness and performance problems since it is difficult to reproduce an execution instance.
For example, as mentioned above, replay is a widely accepted technique for debugging deterministic sequential applications. Replay for debugging, however, fails to work for non-deterministic applications, such as distributed and multithreaded Java applications. BUGNET's handling of non-deterministic messages sent and received by processes is similar to the handling of User Datagram Protocol (UDP) datagrams (e.g., see R. Curtis and L. Wittie, “BUGNET: A debugging system for parallel programming environments”, Proceedings of the 3rd IEEE International Conference on Distributed Computing Systems, pages 394-399, 1982). It logs the received message identifications during the record phase, and consumes the received messages according to the log during the replay phase while buffering yet-to-be-consumed messages. However, it does not address the issue of non-deterministic events due to multithreading within a process that interact with non-deterministic message receives, nor does it address non-deterministic partial receives of messages through “reliable” connections.
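The record-then-replay handling of message receives described above (log message identifications while recording, then consume messages in logged order while buffering early arrivals) can be sketched in memory as follows. This is an illustration of the general idea only, not BUGNET's implementation; the class ReceiveOrderReplaySketch, its nested types and the integer message identifiers are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical illustration class; not part of the patent or of BUGNET.
final class ReceiveOrderReplaySketch {

    /** Hypothetical message: an identifier plus a payload. */
    static final class Message {
        final int id;
        final String payload;
        Message(int id, String payload) {
            this.id = id;
            this.payload = payload;
        }
    }

    /** Record phase: remember the identifier of each message as it is received. */
    static final class Recorder {
        private final List<Integer> log = new ArrayList<>();

        void onReceive(Message m) {
            log.add(m.id);
        }

        List<Integer> log() {
            return new ArrayList<>(log);
        }
    }

    /** Replay phase: release messages to the application in logged order only. */
    static final class Replayer {
        private final Queue<Integer> expected;
        private final Map<Integer, Message> buffered = new HashMap<>();

        Replayer(List<Integer> log) {
            this.expected = new ArrayDeque<>(log);
        }

        /** Buffers the arrival, then returns every message that is now deliverable. */
        List<Message> onArrival(Message m) {
            buffered.put(m.id, m);
            List<Message> released = new ArrayList<>();
            while (!expected.isEmpty() && buffered.containsKey(expected.peek())) {
                released.add(buffered.remove(expected.poll()));
            }
            return released;
        }
    }

    /** Records one arrival order, replays under another, returns the delivered ids. */
    static List<Integer> replayIds(List<Integer> recordedOrder, List<Integer> replayArrivalOrder) {
        Recorder recorder = new Recorder();
        for (int id : recordedOrder) {
            recorder.onReceive(new Message(id, "m" + id));
        }
        Replayer replayer = new Replayer(recorder.log());
        List<Integer> delivered = new ArrayList<>();
        for (int id : replayArrivalOrder) {
            for (Message m : replayer.onArrival(new Message(id, "m" + id))) {
                delivered.add(m.id);
            }
        }
        return delivered;
    }

    public static void main(String[] args) {
        // Recorded run received 2, 1, 3; the replay run sees them arrive as 1, 3, 2.
        System.out.println(replayIds(Arrays.asList(2, 1, 3), Arrays.asList(1, 3, 2)));
    }
}
```

Because the sketch treats every message as an atomic unit, it shares the limitation noted above: it says nothing about thread interleavings inside a process or about partial receives on a stream connection.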
Additionally, replay systems based on Instant Replay (e.g., see Thomas J. LeBlanc and John M. Mellor-Crummey, “Debugging parallel programs with instant replay”, IEEE Transactions on Computers, C-36(4):471-481, April 1987; and J. Sienkiewicz and T. Radhakrishnan, “DDB: A distributed debugger based on replay”, Proceedings of the IEEE Second International Conference on ICAPP, pages 487-494, June 1996) address non-determinism due to both shared variable accesses and messages. Each access of a shared variable, however, is modeled after interprocess communication similar to message exchanges. When the granularity of the communication is very small, as is the case with multithreaded applications, the space and time overhead for logging the interactions becomes prohibitively large. Instant Replay also addresses only atomic network messages like the UDP datagram.
Russinovich and Cogswell's approach (e.g., see Mark Russinovich and Bryce Cogswell, “Replay for concurrent non-deterministic shared memory applications”, Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 258-266, May 1996) specifically addresses multithreaded applications running only on a uniprocessor system. They modified the Mach operating system to capture the physical thread scheduling information. This makes their approach highly dependent on an operating system.
Another scheme for event logging (e.g., see L. J. Levrouw, K. M. R. Audenaert and J. M. Van Campenhout, “Execution replay with compact logs for shared-memory systems,” Proceedings of the IFIP WG 10.3 Working Conference on Applications in Parallel and Distributed Computing, IFIP Transactions A-44, pages 125-134, April 1994) computes consecutive accesses for each object, using one counter for each shared object.
As described in detail below, the unique and unobvious structure and method of the present invention differ from theirs in that the present invention computes a logical thread schedule, using a single global counter. Thus, the inventive scheme is much simpler and more efficient than the conventional techniques on a uniprocessor system.
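To make the contrast with per-object counters concrete, the sketch below numbers every critical event with one global counter and stores, for each thread, the runs of consecutive counter values that the thread executed. It is a simplified single-process illustration under stated assumptions: the interval encoding, the class GlobalCounterScheduleSketch and the string thread names are hypothetical and are not taken from the text above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; the interval encoding is an assumption.
final class GlobalCounterScheduleSketch {

    /** Closed interval [first, last] of global counter values executed by one thread. */
    static final class Interval {
        final long first;
        long last;

        Interval(long first) {
            this.first = first;
            this.last = first;
        }

        @Override
        public String toString() {
            return "[" + first + "," + last + "]";
        }
    }

    /**
     * Record phase: walk the observed order of critical events (one thread
     * name per event), tick a single global counter per event, and extend or
     * open an interval for the thread that executed the event.
     */
    static Map<String, List<Interval>> record(List<String> observedOrder) {
        Map<String, List<Interval>> schedule = new LinkedHashMap<>();
        long globalCounter = 0;
        String previous = null;
        for (String thread : observedOrder) {
            List<Interval> intervals = schedule.computeIfAbsent(thread, k -> new ArrayList<>());
            if (thread.equals(previous)) {
                // Same thread kept running: extend its current interval.
                intervals.get(intervals.size() - 1).last = globalCounter;
            } else {
                // A thread switch happened: open a new interval.
                intervals.add(new Interval(globalCounter));
            }
            previous = thread;
            globalCounter++;
        }
        return schedule;
    }

    /**
     * Replay phase: for each counter value, the only thread allowed to run is
     * the one whose recorded interval contains that value.
     */
    static List<String> replay(Map<String, List<Interval>> schedule) {
        long total = 0;
        for (List<Interval> intervals : schedule.values()) {
            for (Interval i : intervals) {
                total += i.last - i.first + 1;
            }
        }
        List<String> order = new ArrayList<>();
        for (long counter = 0; counter < total; counter++) {
            for (Map.Entry<String, List<Interval>> entry : schedule.entrySet()) {
                for (Interval i : entry.getValue()) {
                    if (i.first <= counter && counter <= i.last) {
                        order.add(entry.getKey());
                    }
                }
            }
        }
        return order;
    }

    public static void main(String[] args) {
        List<String> observed = Arrays.asList("T1", "T1", "T2", "T1", "T2", "T2");
        Map<String, List<Interval>> schedule = record(observed);
        System.out.println(schedule);
        System.out.println(replay(schedule));
    }
}
```

In this encoding the log grows with the number of thread switches rather than with the number of shared-variable accesses, which is one way a single counter can keep logs compact.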
Further, Netzer et al. address the issue of how to balance the overhead of logging during the record phase with the replay time (e.g., see R. H. B. Netzer, S. Subramanian, and X. Jian, “Critical-path-based message logging for incremental replay of message-passing programs”, Proceedings of the 14th IEEE International Conference on Distributed Computing Systems, June 1994). Even for a closed world system (e.g., where all components of the distributed application are being replayed), contents of messages are stored selectively to avoid executing the program from the start. Combined with checkpointing (e.g., see Y. M. Wang and W. K. Fuchs, “Optimistic message logging for independent checkpointing in message-passing systems”, Proceedings of IEEE Symposium on Reliable Distributed Systems, pages 147-154, October 1992), storing contents of messages allows for bounded-time replay to arbitrary program points.
Accordingly, it is highly advantageous to have methods for recording and replaying a distributed JAVA application so that programmers can easily reproduce application behavior and focus their efforts towards analyzing and solving the problems in application execution. However, hitherto
Choi Jong-Deok
Konuru Ravi
Srinivasan Harini
Ludwin, Esq. Richard M.
McGinn & Gibb PLLC
Nguyen-Ba Antony
Steelman Mary J.