Tuesday, October 7, 2014

How to configure HTTP load balancer with HAProxy on Linux

http://xmodulo.com/haproxy-http-load-balancer-linux.html

Increased demand on web-based applications and services is putting more and more weight on the shoulders of IT administrators. When faced with unexpected traffic spikes, organic traffic growth, or internal challenges such as hardware failures and urgent maintenance, your web application must remain available, no matter what. Even modern DevOps and continuous delivery practices can threaten the reliability and consistent performance of your web service.
Unpredictable or inconsistent performance is not something you can afford. But how can we eliminate these downsides? In most cases a proper load balancing solution will do the job. Today I will show you how to set up an HTTP load balancer using HAProxy.

What is HTTP load balancing?

HTTP load balancing is a networking solution responsible for distributing incoming HTTP or HTTPS traffic among servers hosting the same application content. By balancing application requests across multiple available servers, a load balancer prevents any application server from becoming a single point of failure, thus improving overall application availability and responsiveness. It also allows you to easily scale an application deployment in or out by adding or removing application servers as workloads change.
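To make the idea of distributing requests concrete, here is a minimal sketch of round-robin selection, the simplest balancing strategy. It is only an illustration, not how HAProxy is implemented: the function name pick_backend is hypothetical, and the two addresses are the backend servers assumed later in this tutorial.

```shell
# pick_backend N: print the backend that request number N (0-based) would be
# sent to under plain round-robin rotation. Illustration only.
pick_backend() {
    n=$1
    # Hypothetical pool: the two web servers assumed later in this tutorial.
    set -- 192.168.100.2 192.168.100.3
    # Skip (N mod pool size) entries; the next one is the chosen backend.
    shift $(( n % $# ))
    echo "$1"
}

# Show where the first four requests would go.
for i in 0 1 2 3; do
    echo "request $i -> $(pick_backend "$i")"
done
```

Requests alternate between the two servers. Adding or removing an address in the pool changes the rotation without the clients having to know about it.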

Where and when to use load balancing?

As load balancers improve server utilization and maximize availability, you should use one whenever your servers start to come under high load. If you are planning the architecture for a bigger project, it is a good habit to plan for a load balancer up front. It will prove useful later when you need to scale your environment.

What is HAProxy?

HAProxy is a popular open-source load balancer and proxy for TCP/HTTP servers on GNU/Linux platforms. Built on a single-threaded, event-driven architecture, HAProxy can easily handle 10G NIC line rate and is used extensively in production environments. Its features include automatic health checks, customizable load balancing algorithms, HTTPS/SSL support, and session rate limiting.

What are we going to achieve in this tutorial?

In this tutorial, we will go through the process of configuring an HAProxy-based load balancer for HTTP web servers.

Prerequisites

You will need at least one, or preferably two web servers to verify functionality of your load balancer. We assume that backend HTTP web servers are already up and running.

Install HAProxy on Linux

On most distributions, you can install HAProxy using your distribution's package manager.

Install HAProxy on Debian

In Debian we need to add backports for Wheezy. To do that, please create a new file called "backports.list" in /etc/apt/sources.list.d, with the following content:
deb http://cdn.debian.net/debian wheezy-backports main
Refresh your repository data and install HAProxy.
# apt-get update
# apt-get install haproxy
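The file creation step can also be scripted. The sketch below assumes root privileges. The helper name write_backports_list is hypothetical, and the optional directory argument exists only so the function can be tried against a scratch directory first.

```shell
# write_backports_list [DIR]: create backports.list in DIR
# (default: /etc/apt/sources.list.d) with the Wheezy backports entry.
write_backports_list() {
    dir=${1:-/etc/apt/sources.list.d}
    echo "deb http://cdn.debian.net/debian wheezy-backports main" \
        > "$dir/backports.list"
}

# Typical use, as root:
#   write_backports_list
#   apt-get update
#   apt-get install haproxy
```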

Install HAProxy on Ubuntu

# apt-get install haproxy

Install HAProxy on CentOS and RHEL

# yum install haproxy
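After installation it can be useful to confirm that the binary is present. Here is a minimal check, assuming haproxy ends up on the PATH of the current user (on some systems it may only be on root's PATH). The function name haproxy_version is hypothetical.

```shell
# haproxy_version: print the installed HAProxy version, or a hint if the
# binary cannot be found on PATH.
haproxy_version() {
    if command -v haproxy >/dev/null 2>&1; then
        haproxy -v
    else
        echo "haproxy not found on PATH"
        return 1
    fi
}
```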

Configure HAProxy

In this tutorial, we assume that there are two HTTP web servers up and running with IP addresses 192.168.100.2 and 192.168.100.3. We also assume that the load balancer will be configured on a server with IP address 192.168.100.4.
To make HAProxy functional, you need to change a number of items in /etc/haproxy/haproxy.cfg. These changes are described in this section. Where the configuration differs between GNU/Linux distributions, this is noted in the relevant paragraph.
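Since the following steps overwrite parts of the packaged configuration, keeping a copy of the original file is a sensible precaution. This is a sketch with a hypothetical helper name, backup_cfg. The commented commands show how it might be combined with HAProxy's configuration check mode (-c) once the edits are done.

```shell
# backup_cfg FILE: copy FILE to FILE.bak-<timestamp> and print the backup path.
backup_cfg() {
    cfg=$1
    backup="$cfg.bak-$(date +%Y%m%d%H%M%S)"
    # -p preserves the mode and timestamps of the original file.
    cp -p "$cfg" "$backup" && echo "$backup"
}

# Typical use, as root:
#   backup_cfg /etc/haproxy/haproxy.cfg
#   ... edit the file ...
#   haproxy -c -f /etc/haproxy/haproxy.cfg   # syntax check only, does not start HAProxy
```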

1. Configure Logging

One of the first things you should do is to set up proper logging for your HAProxy, which will be useful for future debugging. Log configuration can be found in the global section of /etc/haproxy/haproxy.cfg. The following are distro-specific instructions for configuring logging for HAProxy.
CentOS or RHEL:
To enable logging on CentOS/RHEL, replace:
log         127.0.0.1 local2
with:
log         127.0.0.1 local0
The next step is to set up separate log files for HAProxy in /var/log. For that, we need to modify our current rsyslog configuration. To make the configuration simple and clear, we will create a new file called haproxy.conf in /etc/rsyslog.d/ with the following content.
$ModLoad imudp
$UDPServerRun 514 
$template Haproxy,"%msg%\n"
local0.=info -/var/log/haproxy.log;Haproxy
local0.notice -/var/log/haproxy-status.log;Haproxy
local0.* ~
This configuration writes HAProxy messages to separate log files in /var/log, formatted according to the Haproxy $template. Now restart rsyslog to apply the changes.
# service rsyslog restart
Debian or Ubuntu:
To enable logging for HAProxy on Debian or Ubuntu, replace:
log /dev/log        local0
log /dev/log        local1 notice
with:
log         127.0.0.1 local0
Next, to configure separate log files for HAProxy, edit the file called haproxy.conf (or 49-haproxy.conf in Debian) in /etc/rsyslog.d/ so that it has the following content.
$ModLoad imudp
$UDPServerRun 514 
$template Haproxy,"%msg%\n"
local0.=info -/var/log/haproxy.log;Haproxy
local0.notice -/var/log/haproxy-status.log;Haproxy
local0.* ~
This configuration writes HAProxy messages to separate log files in /var/log, formatted according to the Haproxy $template. Now restart rsyslog to apply the changes.
# service rsyslog restart
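The three selector lines in the rsyslog file route messages by severity. local0.=info matches only the info level, and local0.notice matches notice and anything more severe. The final local0.* ~ rule stops further processing of local0 messages, so on a typical setup they do not also appear in other system logs and debug-level messages are dropped. The function below is a sketch that mirrors this routing for illustration. log_target is a hypothetical name, not an rsyslog command.

```shell
# log_target SEVERITY: print where a local0 message of the given syslog
# severity ends up under the rules in haproxy.conf.
log_target() {
    case $1 in
        info)
            echo /var/log/haproxy.log ;;           # local0.=info
        notice|warning|err|crit|alert|emerg)
            echo /var/log/haproxy-status.log ;;    # local0.notice (and above)
        debug)
            echo discarded ;;                      # only local0.* ~ matches
        *)
            echo "unknown severity: $1" >&2
            return 1 ;;
    esac
}
```

To exercise the real rules after restarting rsyslog, a command such as logger -p local0.info "haproxy log test" should, on a typical setup, produce a line in /var/log/haproxy.log.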

2. Setting Defaults

The next step is to set default variables for HAProxy. Find the defaults section in /etc/haproxy/haproxy.cfg, and replace it with the following configuration.
defaults
       log     global
       mode    http
       option  httplog
       option  dontlognull
       retries 3
       option  redispatch
       maxconn 20000
       contimeout      5000
       clitimeout      50000
       srvtimeout      50000
The configuration above is recommended for HTTP load balancer use, but it may not be optimal for your environment. In that case, feel free to explore the HAProxy man pages and tweak it.
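The three timeout values in the defaults section are in milliseconds, so it allows 5 seconds to connect to a backend and 50 seconds of client or server inactivity. Newer HAProxy releases document contimeout, clitimeout, and srvtimeout as deprecated in favour of timeout connect, timeout client, and timeout server. The sketch below rewrites the legacy keywords read from standard input. modernize_timeouts is a hypothetical helper, so check the documentation for your HAProxy version before using the result.

```shell
# modernize_timeouts: read an HAProxy configuration on stdin and rewrite the
# legacy timeout keywords to their "timeout ..." equivalents on stdout.
# Leading indentation is preserved; all other lines pass through unchanged.
modernize_timeouts() {
    sed -E \
        -e 's/^([[:space:]]*)contimeout[[:space:]]+/\1timeout connect /' \
        -e 's/^([[:space:]]*)clitimeout[[:space:]]+/\1timeout client /' \
        -e 's/^([[:space:]]*)srvtimeout[[:space:]]+/\1timeout server /'
}

# Typical use (writes a candidate file and leaves the original untouched):
#   modernize_timeouts < /etc/haproxy/haproxy.cfg > /tmp/haproxy.cfg.new
```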

3. Webfarm Configuration

The webfarm configuration defines the pool of available HTTP servers. Most of the settings for our load balancer will be placed here. We will now create a basic configuration in which our nodes are defined. Replace all of the configuration from the frontend section to the end of the file with the following code:
listen webfarm *:80
       mode http
       stats enable
       stats uri /haproxy?stats
       stats realm Haproxy\ Statistics
       stats auth haproxy:stats
       balance roundrobin
       cookie LBN insert indirect nocache
       option httpclose
       option forwardfor
       server web01 192.168.100.2:80 cookie node1 check
       server web02 192.168.100.3:80 cookie node2 check
The line "listen webfarm *:80" defines on which interfaces our load balancer will listen. For the sake of the tutorial, I've set that to "*", which makes the load balancer listen on all interfaces. In a real-world scenario, this might be undesirable, and "*" should be replaced with the address of an interface that is accessible from the Internet.
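Once HAProxy is running with this section, the "cookie LBN insert" directive should cause responses to carry an LBN cookie naming the chosen node (node1 or node2). The sketch below extracts that value from response headers. lbn_cookie is a hypothetical helper, and the curl invocation in the comment assumes the load balancer address 192.168.100.4 used in this tutorial.

```shell
# lbn_cookie: read HTTP response headers on stdin and print the value of the
# LBN cookie (the cookie name configured in the listen section).
lbn_cookie() {
    # Strip carriage returns from the CRLF line endings, then pick the value
    # between "LBN=" and the first ";" of the Set-Cookie header.
    tr -d '\r' | sed -n 's/^[Ss]et-[Cc]ookie:[[:space:]]*LBN=\([^;]*\).*/\1/p'
}

# Typical use from a client machine:
#   curl -s -o /dev/null -D - http://192.168.100.4/ | lbn_cookie
```

Repeating the request without sending the cookie back should show the value alternating between the two nodes under the roundrobin balance setting, while a browser that keeps the cookie stays on one node.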
