Method and system for load balancing and management

Error detection/correction and fault detection/recovery – Data processing system error or fault handling – Reliability and availability

Reexamination Certificate

Details

C714S047300, C709S241000

active

06560717

ABSTRACT:

FIELD OF THE INVENTION
This invention relates to computer systems and, more particularly, to servers for Internet web sites.
BACKGROUND OF THE INVENTION
In the use of the Internet, users may contact an Internet web site to view or obtain information. The user's contact with the web site is typically with a web server, or Hypertext Transfer Protocol (HTTP) server. Behind and supporting the web server is an application server. A web site intended to handle heavy demand may use multiple web servers and/or multiple application servers.
To a point, adding an application server allows the system to be scaled to handle increased use. Theoretically, the system would scale linearly. For example, by doubling the hardware for the application servers, the system capacity would be doubled.
When using multiple servers, it is useful to apply some form of load balancing to the servers. One way to perform load balancing would be to use a round-robin approach, where each new session is assigned, in turn, to the next server. An alternative technique in the prior art is to use a “fair share” load balancing approach. With a fair share approach, each server is assigned an equal portion of a range in which a random number may fall. For example, with 4 servers and the selection of a random number less than one, the first may be assigned 0-0.24, the second may be assigned 0.25-0.49, the third may be assigned 0.5-0.74, and the fourth may be assigned 0.75-0.99. A random number is then selected, and the server within whose range the number falls is assigned the session. That server then hosts the session for its duration.
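The two prior-art approaches described above can be sketched as follows. This is only an illustration under stated assumptions, not code from the patent: the function names `round_robin` and `fair_share_pick` and the server labels are hypothetical, and the fair share ranges are treated as four contiguous quarters of the interval from 0 up to (but not including) 1.

```python
import itertools
import random


def round_robin(servers):
    """Return an endless iterator that hands out servers in turn."""
    return itertools.cycle(servers)


def fair_share_pick(servers, r):
    """Map a random number r in [0, 1) to a server using equal-width ranges."""
    if not 0.0 <= r < 1.0:
        raise ValueError("r must be at least 0 and less than 1")
    # Each server owns a slice of width 1/len(servers); min() guards rounding.
    index = min(int(r * len(servers)), len(servers) - 1)
    return servers[index]


servers = ["app1", "app2", "app3", "app4"]  # hypothetical server names

rr = round_robin(servers)
first_five = [next(rr) for _ in range(5)]  # the fifth session wraps to app1

chosen = fair_share_pick(servers, random.random())
```

Neither approach looks at how busy a server actually is, which is the gap the invention addresses.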
However, it would be useful to provide load balancing based on measurements, estimates, or predictions of past, present, and/or future load on a server.
SUMMARY OF THE INVENTION
According to the present invention, load balancing of World Wide Web sessions is achieved by taking into account metrics of application server performance. A load manager collects load information from each application server. A new session is assigned to an application server according to a probabilities table, where each application server is assigned a probability by a load balancer and that probability is used by a module within the web or HTTP server to determine the application server assigned to the new session. The load balancer considers measurements, estimates, or predictions of past, present, and future load on a server. In one embodiment, the load balancer considers both latency (for example, the amount of time it takes the server to serve a request) and the number of active sessions running on the server. The load balancer can consider the average latency of requests over a predetermined, but adjustable, polling interval and the number of currently active sessions at the end of the polling interval in assigning the probabilities.
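A probabilities table of this kind might be built as in the sketch below. The summary does not give a formula, so everything here is an assumption made for illustration: the names `build_probabilities` and `assign_session`, the default weights, and the choice to make each server's probability inversely proportional to a weighted mix of its share of total latency and its share of total active sessions.

```python
def build_probabilities(stats, latency_weight=0.5, session_weight=0.5):
    """Return {server: probability} from {server: (avg_latency, active_sessions)}.

    Assumed formula (not from the patent): probability is proportional to
    1 / load, where load mixes each server's share of latency and sessions.
    Every latency and session count is assumed to be positive.
    """
    total_latency = sum(latency for latency, _ in stats.values())
    total_sessions = sum(sessions for _, sessions in stats.values())
    inverse_load = {}
    for name, (latency, sessions) in stats.items():
        load = (latency_weight * latency / total_latency
                + session_weight * sessions / total_sessions)
        inverse_load[name] = 1.0 / load
    norm = sum(inverse_load.values())
    return {name: value / norm for name, value in inverse_load.items()}


def assign_session(probabilities, r):
    """Pick the server whose cumulative range contains r, with r in [0, 1)."""
    cumulative = 0.0
    chosen = None
    for name, p in probabilities.items():
        if p <= 0.0:
            continue  # servers with probability 0 never receive new sessions
        chosen = name
        cumulative += p
        if r < cumulative:
            break
    return chosen


# Hypothetical measurements: app2 is three times as loaded as app1.
table = build_probabilities({"app1": (0.1, 10), "app2": (0.3, 30)})
```

With these inputs the less loaded server receives the larger share of new sessions; changing the two weights shifts how much latency counts relative to the session count.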
Measurements from prior polling intervals can be factored into the load balancing algorithm, in user-adjustable ways, in order to dampen the effects of short term changes. The weights assigned to the average latency relative to the number of active sessions also can be adjusted. The effects of changing the weights also can be examined.
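One common way to dampen short-term changes is a weighted blend of the previous smoothed value and the newest measurement. The sketch below assumes that technique; the summary only says that prior intervals are factored in "in user-adjustable ways", so the function names and the `history_weight` parameter are hypothetical.

```python
def dampen(previous, current, history_weight=0.5):
    """Blend the prior smoothed value with the newest measurement."""
    if not 0.0 <= history_weight <= 1.0:
        raise ValueError("history_weight must be between 0 and 1")
    return history_weight * previous + (1.0 - history_weight) * current


def dampen_series(samples, history_weight=0.5):
    """Smooth a list of per-interval measurements, seeded with the first one."""
    smoothed = []
    value = samples[0]
    for sample in samples:
        value = dampen(value, sample, history_weight)
        smoothed.append(value)
    return smoothed


# Hypothetical average latencies (seconds) with a one-interval spike.
latencies = [0.1, 0.1, 0.9, 0.1]
smoothed = dampen_series(latencies)
```

A `history_weight` of 0 ignores history entirely, while values closer to 1 make the balancer react more slowly to a spike such as the third sample.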
The load balancer may adjust for extreme high and low loads. In order to avoid distortions when latency or the number of active sessions is relatively low, the load balancer uses a minimum latency value for a server when the server's actual latency falls below a user-defined minimum number. Similarly, the load balancer uses a minimum number of active sessions value for a server when the server's actual number of sessions falls below a user-defined minimum number.
At the other extreme, if the latency exceeds an adjustable maximum level, the application server is considered to be overloaded and assigned a probability of 0, so that future sessions are not assigned to it. Similarly, if the number of active sessions for an application server exceeds an adjustable maximum level, the application server is considered to be overloaded and is assigned a probability of 0.
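The handling of both extremes can be sketched as a single adjustment step applied before probabilities are computed. The `Limits` fields, the function name, and the threshold values below are hypothetical; only the floor and overload rules themselves come from the text.

```python
from dataclasses import dataclass


@dataclass
class Limits:
    min_latency: float   # floor applied when measured latency is lower
    max_latency: float   # above this the server counts as overloaded
    min_sessions: int    # floor applied when the session count is lower
    max_sessions: int    # above this the server counts as overloaded


def adjust_for_extremes(latency, sessions, limits):
    """Return (latency, sessions) with floors applied, or None if overloaded.

    A caller would give an overloaded server (None) a probability of 0.
    """
    if latency > limits.max_latency or sessions > limits.max_sessions:
        return None
    return (max(latency, limits.min_latency),
            max(sessions, limits.min_sessions))


limits = Limits(min_latency=0.05, max_latency=5.0,
                min_sessions=5, max_sessions=500)  # illustrative values only
```

Applying the floors keeps a nearly idle server from appearing many times better than its peers merely because its measurements are close to zero.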
When the load on an application server is sufficiently high or the performance of an application server is sufficiently degraded, requests related to existing sessions (as opposed to new sessions) are routed to a different server. This failover mechanism may be triggered, for example, in the following three situations. First, if an application server is configured to have a fixed number of handler threads, all of those threads are handling requests, and all of those threads have neither received nor sent a packet in a configurable time interval, then the failover mechanism is triggered. Second, if the memory usage of a process exceeds a configurable limit, the failover mechanism is triggered. Third, if an attempt to connect to an application server times out after a configurable limit, the failover mechanism is triggered.
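The three example triggers can be expressed as one check, sketched below. The `ServerStatus` fields, the function name, and the returned labels are hypothetical, and how the status values are gathered from a real server is outside the scope of this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ServerStatus:
    handler_threads: int              # configured fixed number of handler threads
    thread_idle_seconds: List[float]  # per busy thread: seconds since any packet
    memory_bytes: int                 # memory usage of the server process
    connect_timed_out: bool           # last connection attempt timed out


def failover_reason(status, idle_limit_seconds, memory_limit_bytes) -> Optional[str]:
    """Return which trigger fired, or None if the server looks healthy."""
    if status.connect_timed_out:
        return "connect-timeout"
    if status.memory_bytes > memory_limit_bytes:
        return "memory"
    all_busy = (status.handler_threads > 0
                and len(status.thread_idle_seconds) == status.handler_threads)
    # Every thread is busy and none has sent or received a packet recently.
    if all_busy and all(t >= idle_limit_seconds
                        for t in status.thread_idle_seconds):
        return "threads-stalled"
    return None
```

Note that the first trigger requires every handler thread to be both busy and silent; a single thread that is still exchanging packets is enough to keep the server in service.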
The failover mechanism can be implemented in different ways. For example, in the first two failover situations described above, the failover of requests to the application server may be disabled when the conditions for triggering the failover no longer exist. In the third failover situation, attempts to connect may be made after a configurable back-off interval.
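For the third situation, the back-off behaviour might look like the sketch below, which assumes a fixed interval and takes the current time as an argument so the logic is easy to follow. The class name `BackoffGate` and its methods are hypothetical.

```python
class BackoffGate:
    """Block connection attempts until a back-off interval has elapsed."""

    def __init__(self, backoff_seconds):
        self.backoff_seconds = backoff_seconds
        self.failed_at = None  # time of the most recent timeout, if any

    def record_timeout(self, now):
        self.failed_at = now

    def record_success(self):
        self.failed_at = None

    def may_attempt(self, now):
        """Return True if a new connection attempt is allowed at time `now`."""
        if self.failed_at is None:
            return True
        return now - self.failed_at >= self.backoff_seconds


gate = BackoffGate(backoff_seconds=30)  # illustrative interval
gate.record_timeout(now=100)
```

While the gate is closed, requests for existing sessions would continue to be routed to a different server.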
As a further mechanism for managing server load, an application server can be restarted automatically under appropriate conditions. For example, an external monitoring process can be used to connect to the application server and request a predefined monitoring page. If the external process fails to receive the page a configurable number of times, it sends a message to the application server to force it to restart. While the server is restarting, it is not available to handle requests.
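The monitoring process described above can be sketched as a small counter. The text does not say whether the failures must be consecutive; this sketch assumes they must, and the class name `Watchdog` and the `send_restart` callback are hypothetical stand-ins for the real page request and restart message.

```python
class Watchdog:
    """Count failed monitoring-page requests and force a restart when needed."""

    def __init__(self, max_failures, send_restart):
        self.max_failures = max_failures
        self.send_restart = send_restart  # callable that messages the server
        self.failures = 0

    def record(self, page_received):
        """Record one probe result; return True if a restart was requested."""
        if page_received:
            self.failures = 0  # assumption: only consecutive failures count
            return False
        self.failures += 1
        if self.failures >= self.max_failures:
            self.send_restart()
            self.failures = 0
            return True
        return False


restarts = []
dog = Watchdog(max_failures=3, send_restart=lambda: restarts.append("restart"))
results = [dog.record(ok) for ok in [False, False, True, False, False, False]]
```

In this run the successful third probe resets the count, so the restart is only requested after the sixth probe.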
As yet a further mechanism for managing server load, if a web (or HTTP) server cannot access any of the application servers to which it is connected, it directs the browser which is requesting information to a different web server.


