Reexamination Certificate
1998-09-03
2001-11-27
Maung, Zarni (Department: 2152)
Electrical computers and digital processing systems: multicomput
Computer-to-computer session/connection establishing
Session/connection parameter setting
C709S241000
active
06324580
BACKGROUND
This invention relates to the field of computer systems. More particularly, a system and methods are provided for load balancing among replicated services using policies.
In many computing environments, clients such as computer systems and users connect to computer servers offering a desired service, such as electronic mail or Internet browsing. One computer server may, however, only be capable of efficiently satisfying the needs of a limited number of clients. In such a case, an organization may employ multiple servers offering the same service, and the client may be connected to any of them in order to satisfy the client's request.
A service offered simultaneously on multiple servers is often termed “replicated” in recognition of the fact that each instance of the service operates in substantially the same manner and provides substantially the same functionality as the others. The multiple servers may, however, be situated in various locations and serve different clients. In order to make effective use of a replicated service offered by multiple servers (e.g., to satisfy clients' requests for the service), there must be a method of distributing clients' requests among the servers. This process is often known as load balancing.
In one method of load balancing, clients' requests are assigned to the servers offering the replicated service on a round-robin basis. In other words, client requests are routed to the servers in a rotational order. Each instance of the replicated service may thus receive substantially the same number of requests as the other instances. Unfortunately, this scheme can be very inefficient.
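The rotational assignment described above can be sketched in a few lines. This is only an illustration of the prior-art round-robin scheme, not code from the patent; the server names are hypothetical.

```python
from itertools import cycle


def round_robin(servers):
    """Return an endless iterator that hands out servers in rotational order.

    Load, capacity, and distance are deliberately ignored, which is the
    weakness of this scheme.
    """
    return cycle(servers)


# Hypothetical server names, for illustration only.
servers = ["server-a", "server-b", "server-c"]
assigner = round_robin(servers)

# Six incoming requests are spread evenly, two per server.
assignments = [next(assigner) for _ in range(6)]
print(assignments)
```

Each server receives the same number of requests regardless of how quickly it can process them.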
Because the servers that offer the replicated service can be geographically distributed, a client's request may be routed to a relatively distant server, thus increasing the transmission time and cost incurred in submitting the request and receiving a response. In addition, the processing power of the servers may vary widely. One server may, for example, be capable of handling a larger number of requests or be able to process requests faster than another server. As a result, the more powerful server may periodically be idle while the slower server is overburdened.
In another method of load balancing, specialized hardware is employed to store information concerning the servers offering the replicated service. In particular, this method stores information, on a computer system other than the system that initially receives client requests, about which of the servers has the smallest load (e.g., fewest client requests). Based on that information a user's request is routed to the least-loaded server. In a web-browsing environment, for example, when a user's service access request (e.g., a connection request to a particular Uniform Resource Locator (URL) or virtual server name) is received by a server offering Domain Name Services (DNS), the DNS server queries or passes the request to the specialized hardware. Based on the stored information, the user's request is then forwarded to the least-loaded server offering the requested service.
This method is also inefficient because it delays and adds a level of complexity to satisfying access requests. In particular, one purpose of a DNS server is to quickly resolve a client's request for a particular service to a specific server (e.g., a specific network address) offering the service. Requiring the DNS server to query or access another server in order to resolve the request is inefficient and delays the satisfaction of the request.
In yet other methods of balancing requests among multiple instances of a replicated service, client requests are randomly assigned to a server or are assigned to the closest server. Random assignment of client requests often results in requests being routed to geographically distant servers or servers that are more burdened than others, thus resulting in unnecessary delay. Assigning requests to the closest server is also inefficient because a faster response may be available from a server that, although farther from the client, has less of a load.
In addition to the above disadvantages of present load balancing techniques, present techniques are limited in scope. For example, in the methods described above, load-balancing decisions are made solely on the basis of operational statistics concerning the servers offering a replicated service, not the status of the service itself. In other words, present techniques do not provide for the collection or consideration of information concerning the status of individual applications or services executing on the servers. Thus, a client's request for a particular application or service may be routed to a first server that has less of an overall load than a second server, even though the specific application request could be more efficiently and/or rapidly handled by the second server.
SUMMARY
In one embodiment of the invention, a system and methods are provided for balancing client (e.g., user) requests among multiple instances of a replicated service or application in accordance with a selected policy. In this embodiment, instances of the replicated service execute on separate computer servers.
A load balancing policy is selected to specify one or more factors to be used in determining the server (e.g., one of multiple servers offering a replicated service) that is to receive a client request. The identity of the “preferred” server is periodically updated in order to distribute requests for the service or application among the multiple servers. Illustrative policies include selecting the least-loaded or closest server. Illustratively, the least-loaded server is the server having the shortest response time or fewest pending client requests and the closest server is the server that can be reached in the fewest network hops or connections.
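One way to picture a selectable policy is as a named function that picks the preferred server from collected status data. The sketch below is an assumption-laden illustration: the `ServerStatus` fields, the policy names, and the tie-breaking rule (fewest pending requests first, then shortest response time) are hypothetical choices, not details taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class ServerStatus:
    """Hypothetical snapshot of one server offering the replicated service."""
    name: str
    pending_requests: int
    response_time_ms: float
    hops: int


def least_loaded(statuses):
    # Assumption: fewest pending requests wins; response time breaks ties.
    return min(statuses, key=lambda s: (s.pending_requests, s.response_time_ms))


def closest(statuses):
    # Fewest network hops wins.
    return min(statuses, key=lambda s: s.hops)


# Hypothetical policy registry keyed by policy name.
POLICIES = {"least_loaded": least_loaded, "closest": closest}


def preferred_server(policy, statuses):
    """Apply the selected policy and return the preferred server's name."""
    return POLICIES[policy](statuses).name


statuses = [
    ServerStatus("server-a", pending_requests=12, response_time_ms=40.0, hops=2),
    ServerStatus("server-b", pending_requests=3, response_time_ms=55.0, hops=5),
    ServerStatus("server-c", pending_requests=3, response_time_ms=30.0, hops=7),
]
print(preferred_server("least_loaded", statuses))
print(preferred_server("closest", statuses))
```

Re-running `preferred_server` whenever fresh status data arrives corresponds to the periodic update of the preferred server's identity.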
Depending upon the selected policy, status objects or modules are created to collect information from each server offering the replicated service or application that is being load-balanced. The information collected from each server may include the number of requests held and/or processed by the server or service, the response time and/or operational status (e.g., is it up or down) of the server or service, the distance (e.g., the number of network hops) to the server, etc.
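As a rough illustration of how a status object might measure response time and operational status from the outside, the sketch below times a TCP connection attempt. The function name `connect_probe`, its return format, and the use of a plain TCP connect are assumptions made for this example; the patent text does not specify them here.

```python
import socket
import time


def connect_probe(host, port, timeout_s=1.0):
    """Try to open a TCP connection and report whether it succeeded.

    Returns a dict with the operational status and, when the server is up,
    the time taken to establish the connection in milliseconds.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            elapsed_ms = (time.monotonic() - start) * 1000.0
            return {"is_up": True, "response_time_ms": elapsed_ms}
    except OSError:
        # Refused, unreachable, or timed out: treat the server as down.
        return {"is_up": False, "response_time_ms": None}
```

A call such as `connect_probe("192.0.2.10", 80)` (a documentation-only address) would return the status dictionary for that server; counting pending requests or network hops would need different probes.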
Each instance of a replicated service or application is associated with its own status object(s). In one embodiment of the invention, multiple status objects having different functions are associated with one instance. Each instance of the replicated service is also associated with an individual monitor object (IMO) or module. Each IMO thus collects and saves information from the status object(s) of one service instance. Illustratively, the IMO queries its status object(s) on a periodic basis and stores the information that is returned.
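The relationship between an IMO and its status objects can be sketched as a small polling loop. The class names, the callable-based probes, and the fixed number of polling rounds are hypothetical simplifications chosen for illustration.

```python
import time


class StatusObject:
    """Hypothetical status object: wraps one probe that returns one measurement."""

    def __init__(self, name, probe):
        self.name = name
        self.probe = probe  # zero-argument callable supplying the measurement

    def collect(self):
        return self.probe()


class IndividualMonitor:
    """Sketch of an IMO: polls the status objects of one service instance."""

    def __init__(self, server, status_objects):
        self.server = server
        self.status_objects = status_objects
        self.latest = {}  # most recent value returned by each status object

    def poll_once(self):
        for obj in self.status_objects:
            self.latest[obj.name] = obj.collect()
        return self.latest

    def run(self, interval_s, rounds):
        """Poll periodically; a real monitor would loop until stopped."""
        for i in range(rounds):
            self.poll_once()
            if i < rounds - 1:
                time.sleep(interval_s)


# Fixed-value probes stand in for real measurements.
imo = IndividualMonitor(
    "server-a",
    [StatusObject("pending_requests", lambda: 4), StatusObject("is_up", lambda: True)],
)
imo.run(interval_s=0, rounds=2)
print(imo.latest)
```

Keeping one status object per measurement lets a single service instance be watched by several objects with different functions, as the text describes.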
A replicated monitor object (RMO) or module is employed to collect information from the IMOs associated with the various instances of the replicated service. The RMO stores this information, which is then processed to identify a preferred server (e.g., least-loaded or closest).
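A minimal sketch of the RMO's role follows, assuming each IMO hands over a dictionary with `is_up` and `pending_requests` keys. The class name, the report format, and the choice of a least-loaded rule that skips servers reported as down are all assumptions for illustration.

```python
class ReplicatedMonitor:
    """Sketch of an RMO: stores per-server reports and picks a preferred server."""

    def __init__(self):
        self.reports = {}  # server name -> latest report from that server's IMO

    def collect(self, server, report):
        # Copy the report so later changes by the IMO do not alter stored data.
        self.reports[server] = dict(report)

    def preferred(self):
        """Return the operational server with the fewest pending requests."""
        candidates = {s: r for s, r in self.reports.items() if r.get("is_up")}
        if not candidates:
            return None  # no server is currently reported as up
        return min(candidates, key=lambda s: candidates[s]["pending_requests"])


rmo = ReplicatedMonitor()
rmo.collect("server-a", {"is_up": True, "pending_requests": 9})
rmo.collect("server-b", {"is_up": True, "pending_requests": 2})
rmo.collect("server-c", {"is_up": False, "pending_requests": 0})
print(rmo.preferred())
```

Here `server-c` is skipped despite its empty queue because it is reported as down.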
In an embodiment of the invention in which clients access the replicated service through a system such as a Domain Name Service (DNS) server, a DNS updater object or module updates a DNS zone file to identify the preferred server (e.g., by its network address). A DNS zone file may be used to resolve a virtual server name (e.g., a virtual identity of a service replicated on multiple servers) to a particular server. When a client requests a replicated service accessed via a virtual name, the DNS server directs the request to the server indicated in the zone file.
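The zone file update can be pictured as rewriting the address record for the virtual server name. The sketch below works on simplified zone text with one record per line in the form `name TTL IN A address`; the function name, the default TTL, and the example addresses are hypothetical. It omits steps a real updater would likely also need, such as incrementing the zone's serial number and having the DNS server reload the zone.

```python
def update_zone(zone_text, virtual_name, preferred_address, ttl=60):
    """Return zone text in which virtual_name's A record points at preferred_address."""
    new_record = f"{virtual_name} {ttl} IN A {preferred_address}"
    out = []
    replaced = False
    for line in zone_text.splitlines():
        fields = line.split()
        is_match = (
            len(fields) >= 3 and fields[0] == virtual_name and fields[-2] == "A"
        )
        if not is_match:
            out.append(line)  # unrelated record: keep unchanged
        elif not replaced:
            out.append(new_record)  # first matching record: point at preferred server
            replaced = True
        # any further A records for the virtual name are dropped
    if not replaced:
        out.append(new_record)  # the virtual name had no A record yet
    return "\n".join(out)


# Addresses come from the 192.0.2.0/24 documentation range.
zone = "www 60 IN A 192.0.2.10\nmail 3600 IN A 192.0.2.20"
updated = update_zone(zone, "www", "192.0.2.11")
print(updated)
```

A short TTL is assumed here so that resolvers re-query soon after the preferred server changes; the appropriate value depends on the deployment.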
In one embodiment of the invention the status objects, IMOs, the RMO and the DNS updater are co-located (e.g., on a DNS server). Illustratively, the servers and replicated services need not be modified in this non-intrusive mode of operation. The status objects use network functions or commands (e.g., Ping, Connect
Chang Whei-Ling
Jindal Anita
Lim Swee Boon
Radia Sanjay
Cardone Jason D.
Maung Zarni
Park Vaughan & Fleming LLP
Sun Microsystems Inc.