Monitoring HAProxy

This is a work in progress. I have been given the task of monitoring an HAProxy load balancer, and I plan to collect some of my best tricks here.

Hatop

This is a cool ncurses-based tool that gives you an overview of which backends are considered "up" and which are "down". It also gives you a few ways of interacting with HAProxy; for instance, you can put a backend into maintenance mode.

First add the following to your "global" block in haproxy.cfg:


stats socket /run/haproxy/admin.sock mode 660 level admin
stats timeout 30s
And then, as root:
apt-get install hatop
And finally:
hatop -s /run/haproxy/admin.sock
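
The same admin socket can also be queried from a script, which is handy when you want the up/down overview without the ncurses screen. Here is a minimal Python sketch, assuming the socket path from above and HAProxy's "show stat" command, which replies with CSV. The function names query_socket and parse_show_stat are made up for this example and are not part of hatop or HAProxy.

```python
import csv
import socket


def query_socket(path, command="show stat"):
    """Send one command to the HAProxy stats socket and return the reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        sock.sendall(command.encode("ascii") + b"\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # HAProxy closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("ascii", errors="replace")


def parse_show_stat(csv_text):
    """Return (proxy, server, status) tuples from 'show stat' CSV output."""
    lines = [line for line in csv_text.splitlines() if line.strip()]
    if not lines:
        return []
    # The header line starts with "# ", e.g. "# pxname,svname,...".
    lines[0] = lines[0].lstrip("# ")
    reader = csv.DictReader(lines)
    return [(row["pxname"], row["svname"], row["status"]) for row in reader]
```

Something like parse_show_stat(query_socket("/run/haproxy/admin.sock")) should then give one tuple per frontend, backend and server, provided the user running it has access to the socket.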

Log files:

By default all log lines are stashed away in the same file, which means that up and down messages end up in the middle of a lot of POST and GET requests, so I made a few changes to the rsyslog configuration:
$AddUnixListenSocket /var/lib/haproxy/dev/log

# Request lines (POST/GET) go to the main log and are then discarded
if $programname startswith 'haproxy' and $msg contains 'POST' then /var/log/haproxy.log
if $programname startswith 'haproxy' and $msg contains 'POST' then ~
if $programname startswith 'haproxy' and $msg contains 'GET' then /var/log/haproxy.log
if $programname startswith 'haproxy' and $msg contains 'GET' then ~
# Everything else from haproxy goes to its own file; stop only for these messages
if $programname startswith 'haproxy' then /var/log/haproxy-extra.log
& stop
And then I can monitor /var/log/haproxy-extra.log for everything but the POST and GET requests. I realise that an error message containing either "POST" or "GET" will end up in haproxy.log instead and be missed, but so far I haven't discovered any false negatives.
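
To see which file a given line would land in, the filter rules above can be mirrored in a few lines of Python. This is only a sketch for reasoning about the rules, not something rsyslog runs. The function name route and the sample messages are invented for illustration and are not real HAProxy log lines.

```python
def route(programname, msg):
    """Return the log file a message would land in under the rules above."""
    if not programname.startswith("haproxy"):
        return None  # not handled by these rules
    # Like rsyslog's 'contains', the 'in' check is case-sensitive.
    if "POST" in msg or "GET" in msg:
        return "/var/log/haproxy.log"
    return "/var/log/haproxy-extra.log"


# Hypothetical sample messages, only to illustrate the routing.
samples = [
    '"GET /index.html HTTP/1.1"',
    "Server website/web1 is DOWN",
    "error while handling GET request",  # the caveat: lands in haproxy.log
]
for msg in samples:
    print(route("haproxy", msg), "<-", msg)
```

The third sample shows the weakness mentioned above: anything that happens to contain "GET" or "POST" is treated as a request line.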

Nagios

I'm not quite done here yet. I found this plugin (https://exchange.nagios.org/directory/Plugins/Clustering-and-High-2DAvailability/check_haproxy_backend/details) and got it to work:
# ./check_haproxy_backend
HAPROXY OK: Proxy = website. Current sessions = 1. Max sessions = 17. Maximum limit = 200|scur=1 smax=17 slim=200
I had to comment out "use warnings;" in the Perl file (line 19) and install some modules:
apt-get install libwww-perl libdbi-perl libdbd-mysql-perl libgd-gd2-perl libtext-csv-xs-perl
But I haven't done any long-term testing yet ... I'm going to, though :-)
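
The part of the plugin output after the "|" is performance data, which is what Nagios can graph over time. Here is a small Python sketch of how that line splits up, assuming plain unitless numbers like in the output above. The function name parse_plugin_output is my own and has nothing to do with the plugin itself.

```python
def parse_plugin_output(line):
    """Split a Nagios plugin line into its status text and perfdata metrics."""
    text, _, perfdata = line.partition("|")
    metrics = {}
    for item in perfdata.split():
        label, _, value = item.partition("=")
        # Perfdata may carry ";warn;crit;min;max" after the value; keep the value only.
        # Assumes unitless numbers, as in the check_haproxy_backend output above.
        metrics[label] = float(value.split(";")[0])
    return text.strip(), metrics


line = ("HAPROXY OK: Proxy = website. Current sessions = 1. "
        "Max sessions = 17. Maximum limit = 200|scur=1 smax=17 slim=200")
text, metrics = parse_plugin_output(line)
# Share of the session limit currently in use, as a percentage.
usage = 100 * metrics["scur"] / metrics["slim"]
print(text)
print(metrics, usage)
```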