6 Request distribution strategies in Nginx

Nginx is popular open-source web server software that can be used for web serving, reverse proxying, caching, load balancing, media streaming, and more!

One common use of Nginx is as an HTTP load balancer to distribute traffic across multiple application servers, which can significantly improve resource availability and efficiency.

In this article, I'll delve into the request distribution strategies (also known as load-balancing methods) supported in Nginx and Nginx Plus, the commercial version of Nginx, which offers additional enterprise-grade features and support services not available in the open-source version.

1. Round Robin (the default)

The round-robin scheme in Nginx load balancing selects each server in a cyclical order based on the order specified in the configuration file.

Nginx uses a round-robin algorithm by default if no other method is defined.

http {
    upstream myapp {
        server srv1.example.com;
        server srv2.example.com;
        server srv3.example.com;
        server srv4.example.com;
    }
    server {
        listen 80;
        location / {
            proxy_pass http://myapp;
        }
    }
}

This method ensures equal distribution of incoming requests among available servers but does not ensure equal distribution of the load among servers.
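To make the cycle concrete, here is a minimal Python sketch of the selection order described above. It is an illustration only, not Nginx's actual implementation, and the round_robin helper is a hypothetical name introduced for this example.

```python
from collections import Counter
from itertools import cycle, islice

# Same server names as in the configuration above.
SERVERS = [
    "srv1.example.com",
    "srv2.example.com",
    "srv3.example.com",
    "srv4.example.com",
]


def round_robin(servers, n_requests):
    """Return the server chosen for each request, cycling in config order."""
    return list(islice(cycle(servers), n_requests))


assignments = round_robin(SERVERS, 8)
print(assignments)
# Every server receives the same number of requests,
# no matter how expensive each individual request is.
print(Counter(assignments))
```

The counts come out equal, but nothing in the selection looks at how busy a server is, which is why equal request counts do not imply equal load.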

What if one of the servers is twice as capable as the others?

Weighted Round-robin

In the example above, the server weights are not configured, which means that all specified servers are treated as equally qualified to serve requests.

When the weight parameter is specified for a server, the weight is taken into account in the load-balancing decision.

As you can see in the preceding figure, Nginx forwards two out of every four incoming requests to server 1 (because its weight=2), one to server 2, and one to server 3 before it starts the cycle again.

This is how you would configure it:

  [...]
  upstream myapp {
      server srv1.example.com weight=2;
      server srv2.example.com;
      server srv3.example.com;
  }
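The effect of the weights can be sketched in Python. The version below uses a "smooth" weighted round-robin, which is how open-source Nginx is commonly described as interleaving weighted servers. Treat it as a simplified model under that assumption: it ignores failures and runtime weight adjustments, and the function name is hypothetical.

```python
def smooth_weighted_round_robin(weights, n_requests):
    """Pick a server per request so that picks are proportional to weight.

    weights: dict mapping server name -> configured weight.
    """
    current = {name: 0 for name in weights}  # running score per server
    total = sum(weights.values())
    picks = []
    for _ in range(n_requests):
        # Each round, every server gains its own weight...
        for name, weight in weights.items():
            current[name] += weight
        # ...the highest score wins (first one listed wins ties)...
        chosen = max(current, key=current.get)
        # ...and the winner is pushed back by the total weight.
        current[chosen] -= total
        picks.append(chosen)
    return picks


weights = {"srv1": 2, "srv2": 1, "srv3": 1}
print(smooth_weighted_round_robin(weights, 8))
```

Over any four consecutive requests, srv1 is picked twice and the other two servers once each. In this model the two picks for srv1 are spread out rather than sent back to back.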

2. Least Connections

With the least connections method, each incoming request is routed to the server with the fewest active connections, taking server weights into consideration.

This method ensures a more even distribution of the workload among servers and can be useful in situations where some requests take longer to complete than others.

By balancing the load based on the number of active connections, this method can help prevent overloading a single server and improve the overall performance and availability of the application.

If there are several servers with the same weight and number of active connections, they are tried in turn using a weighted round-robin balancing method.

You can configure it with the least_conn directive:

  [...]
  upstream myapp {
      least_conn;

      server srv1.example.com;
      server srv2.example.com;
      server srv3.example.com;
  }
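As a rough illustration of the selection rule, the Python sketch below picks the server with the lowest ratio of active connections to weight. The connection counts are made-up sample data, the function name is hypothetical, and ties are simply resolved by taking the first listed server rather than by the weighted round-robin fallback described above.

```python
def least_connections(active, weights=None):
    """Return the server with the fewest active connections per unit of weight.

    active:  dict mapping server name -> current active connections.
    weights: optional dict mapping server name -> weight (defaults to 1).
    """
    weights = weights or {}
    return min(active, key=lambda name: active[name] / weights.get(name, 1))


# Hypothetical snapshot of active connections per server.
active = {"srv1": 5, "srv2": 2, "srv3": 7}

print(least_connections(active))               # unweighted choice
print(least_connections(active, {"srv1": 5}))  # srv1 weighted 5x
```

With equal weights the least busy server wins. Giving srv1 a weight of 5 makes its five connections count as one per unit of weight, so it becomes the preferred target.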

3. IP Hash

IP hash is a method where requests are distributed between servers based on clients' IP addresses. The first three octets of the client IPv4 address, or the entire IPv6 address, are used as a hashing key.

This method is helpful if your web application is stateful.

The method ensures that requests from the same client are always passed to the same server, except when that server is unavailable. In that case, client requests are passed to another server, which will most probably always be the same server as well.

You can configure it with the ip_hash directive:

  [...]
  upstream myapp {
      ip_hash;

      server srv1.example.com;
      server srv2.example.com;
      server srv3.example.com;
  }
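The key derivation described above (first three octets for IPv4, the full address for IPv6) can be sketched in Python. The hash function here is zlib.crc32 combined with a modulo, which is only a stand-in chosen for the example and not the hash Nginx uses internally. Both function names are hypothetical.

```python
import ipaddress
import zlib

SERVERS = ["srv1.example.com", "srv2.example.com", "srv3.example.com"]


def ip_hash_key(client_ip):
    """Return the bytes used as the hashing key for a client address."""
    addr = ipaddress.ip_address(client_ip)
    if addr.version == 4:
        return addr.packed[:3]  # first three octets only
    return addr.packed          # entire 16-byte IPv6 address


def pick_server(client_ip, servers):
    """Map a client address to a server (crc32 is an assumed stand-in hash)."""
    return servers[zlib.crc32(ip_hash_key(client_ip)) % len(servers)]


# Two clients in the same /24 network share a key, so they share a server.
print(pick_server("192.168.1.10", SERVERS))
print(pick_server("192.168.1.200", SERVERS))
```

One consequence worth keeping in mind is that all IPv4 clients behind the same /24 network land on the same backend, which can skew the distribution when many users share an address range.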

4. Generic Hash

Like IP hash, this method is based on hashing, but instead of depending on the client's IP address, it hashes a key that you define. The key can contain text, variables, and their combinations.

For example, the key may be a paired source IP address and port, or a URI as in this example:

  [...]
  upstream myapp {
    hash $request_uri;

    server srv1.example.com;
    server srv2.example.com;
    server srv3.example.com;
  }

Or the key can be a custom value, such as a session cookie, as in this example:

  [...]
  upstream myapp {
    hash $cookie_session_id;

    server srv1.example.com;
    server srv2.example.com;
    server srv3.example.com;
  }

Note that adding or removing a server from the group may result in remapping most of the keys to different servers. The method is compatible with the Cache::Memcached Perl library.
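The remapping effect can be estimated with a small Python experiment. It assumes a plain "hash modulo number of servers" scheme with MD5 as the hash, which is a simplification for illustration and not Nginx's exact algorithm. The URIs and function names are made up for the example.

```python
import hashlib


def pick(key, servers):
    """Map a key to a server using hash-modulo (an assumed, simplified scheme)."""
    digest = hashlib.md5(key.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


def fraction_remapped(keys, before, after):
    """Share of keys whose server changes between two server lists."""
    moved = sum(1 for key in keys if pick(key, before) != pick(key, after))
    return moved / len(keys)


keys = [f"/page/{i}" for i in range(1000)]  # hypothetical request URIs
before = ["srv1.example.com", "srv2.example.com", "srv3.example.com"]
after = before[:2]                          # srv3 removed from the group

print(fraction_remapped(keys, before, after))
```

In this model roughly two thirds of the keys move when going from three servers to two, including many keys that were never on the removed server. If that matters for you (for example with caches), the hash directive also accepts a consistent parameter, which switches to ketama consistent hashing so that only a small share of keys is remapped.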

5. Least Time (Nginx Plus only)

In this method, Nginx Plus selects the server with the lowest average latency and the lowest number of active connections. How the average latency is calculated depends on which of the following parameters is passed to the least_time directive:

  • header – Time to receive the first byte from the server

  • last_byte – Time to receive the full response from the server

  • last_byte inflight – Time to receive the full response from the server, taking into account incomplete requests.

  [...]
  upstream myapp {
    least_time header;

    server srv1.example.com;
    server srv2.example.com;
    server srv3.example.com;
  }

6. Random

In this method, each request will be passed to a randomly selected server.

Simply configure it by adding the random directive.

  [...]
  upstream myapp {
    random;

    server srv1.example.com;
    server srv2.example.com;
    server srv3.example.com;
  }

The two parameter

The random directive has an optional parameter called two. It instructs Nginx to randomly pick two servers from the list, taking server weights into account, and then choose one of them using the specified method:

  • least_conn – The least number of active connections

  • least_time=header (NGINX Plus) – The least average time to receive the response header from the server ($upstream_header_time)

  • least_time=last_byte (NGINX Plus) – The least average time to receive the full response from the server ($upstream_response_time)

For example:

  [...]
  upstream myapp {
    random two least_time=last_byte;

    server srv1.example.com;
    server srv2.example.com;
    server srv3.example.com;
  }
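The idea behind the two parameter, often called "the power of two choices", can be sketched in Python with least_conn as the tie-breaking method. This is a simplified model: it ignores server weights, uses made-up connection counts, and the function name is hypothetical.

```python
import random


def random_two_least_conn(active, rng=random):
    """Pick two distinct servers at random, then keep the less busy one.

    active: dict mapping server name -> current active connections.
    rng:    random source (pass random.Random(seed) for repeatable runs).
    """
    first, second = rng.sample(list(active), 2)
    return first if active[first] <= active[second] else second


# Hypothetical snapshot of active connections per server.
active = {"srv1": 3, "srv2": 9, "srv3": 1}

rng = random.Random(0)
picks = [random_two_least_conn(active, rng) for _ in range(10)]
print(picks)
```

Because the busiest server always loses the comparison against whichever server it is paired with, srv2 is never selected in this snapshot. No global view of all requests is needed to get that result.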

The Random load balancing method should be used for distributed environments where multiple load balancers are passing requests to the same set of backends. For environments where the load balancer has a full view of all requests, use other load balancing methods, such as round-robin, least connections, and least time.

Conclusion

Nginx offers various load-balancing methods to distribute incoming requests across multiple servers, including round-robin, weighted round-robin, least connections, IP hash, generic hash, least time (Nginx Plus only), and random. Each method has its benefits and can be configured based on the specific needs of the application. By choosing the right strategy, organizations can improve resource availability, efficiency, and overall application performance.