distrib logger
For some time now David and I have been experimenting with clusters, mainly LAMP clusters. Monitoring many machines can be troublesome. At some point we needed something that gathers the log messages from all machines and combines them on one, so that we can, for example, use grep to look something up.
We’ve tried many network filesystems and other workarounds but found nothing faster than NFS. Since all workers in a cluster have to see the same data, NFS always has to be in sync, which is slow, especially with many small reads and writes. We also found that logging all those Apache requests over it is very slow, produces a large amount of traffic and eventually blocks the interfaces for far more important data. Simply unacceptable.
After some days I thought about having something which simply reads logs, pushes them to a server, which then rebuilds the logs in a local folder. The idea of distrib logger was born.
After half a day of hacking and testing, a first prototype of a client-server-application was successfully reading and rebuilding my local log files. It also preserved any directory structure, owner and mode bits.
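To give an idea of what "rebuilding" means on the server side, here is a minimal sketch, not the actual distrib logger code. It assumes Python 3 (the tool itself targets Python 2.5), and the function name `append_record` and its parameters are made up for illustration. It appends one received line to the matching file below the target folder, creates missing directories, and applies the mode bits if they were sent along. Restoring the owner would need `os.chown` and root privileges, so it is left out here.

```python
import os


def append_record(root, rel_path, line, mode=None):
    """Append one log line to root/rel_path, creating directories as needed.

    Hypothetical helper: 'rel_path' is the path of the log file relative to
    the directory the forwarder watches, 'mode' the original permission bits.
    """
    root_abs = os.path.abspath(root)
    target = os.path.abspath(os.path.join(root_abs, rel_path.lstrip("/")))

    # Refuse paths that would end up outside the target folder.
    if not target.startswith(root_abs + os.sep):
        raise ValueError("path escapes target folder: %r" % rel_path)

    os.makedirs(os.path.dirname(target), exist_ok=True)
    with open(target, "a") as fh:
        fh.write(line if line.endswith("\n") else line + "\n")

    if mode is not None:
        os.chmod(target, mode)  # restore the mode bits of the source file
    return target
```

Calling it for the same `rel_path` from two different forwarders simply appends both lines to the same file, which is the combining effect described above.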
After the second day I had encryption, compression and a nifty command line interface up and running.
As of now it is installed on our development cluster, reading the Apache logs of 2 workers and pushing everything to the balancer, which then recreates the directory structure and files as soon as data arrives. It uses multiple threads for maintaining a queue, importing and exporting data.
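The compression part can be sketched with the `pickle` and `bz2` modules from the standard library. This is an assumption about how a batch of log rows could be packed for the wire, not the real wire format of distrib logger; the names `pack_batch` and `unpack_batch` and the 4-byte length prefix are invented for this example, and it is written for Python 3. Encryption is not shown.

```python
import bz2
import pickle
import struct


def pack_batch(records):
    """Serialize and compress a list of (relative_path, line) tuples.

    The result is prefixed with its length (4 bytes, network byte order)
    so a receiver reading from a socket knows where the frame ends.
    """
    payload = bz2.compress(pickle.dumps(records, protocol=2))
    return struct.pack("!I", len(payload)) + payload


def unpack_batch(frame):
    """Reverse of pack_batch: check the length prefix, decompress, unpickle."""
    (length,) = struct.unpack("!I", frame[:4])
    payload = frame[4:4 + length]
    if len(payload) != length:
        raise ValueError("incomplete frame")
    return pickle.loads(bz2.decompress(payload))
```

Keep in mind that unpickling data from the network is only acceptable between machines you trust, since `pickle.loads` can execute arbitrary code.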
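The thread and queue setup could look roughly like the following. Again this is only a sketch under my own assumptions (Python 3, one reader thread, one sender thread, a hypothetical `run_pipeline` function), not the code shipped in the tarball: one thread puts rows into a queue, another one takes them out and hands them to whatever does the delivery.

```python
import queue
import threading


def run_pipeline(lines, deliver):
    """Pass every item of 'lines' through a queue to the 'deliver' callable.

    'lines' stands in for the rows read from log files, 'deliver' for the
    code that would push a row to the server. Both are placeholders.
    """
    q = queue.Queue()
    done = object()  # sentinel that tells the sender to stop

    def reader():
        for line in lines:
            q.put(line)
        q.put(done)

    def sender():
        while True:
            item = q.get()
            if item is done:
                break
            deliver(item)

    threads = [threading.Thread(target=reader), threading.Thread(target=sender)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

With a single reader and a single sender the rows arrive in the order they were read, because `queue.Queue` is first in, first out.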
We’ve also combined it with standard log rotation so that those aggregated files won’t get too large.
Grepping through the log files of multiple machines at once is finally as simple as grepping through only one: no NFS, no SSH, no FTP. Nice.
Required:
- Python 2.5 (including threading, hashlib, pickle, socket, bz2)
Optional:
- Python-mcrypt
Let’s take the example above: You have one file server and two machines with Apache installed. You have separate access and error log files defined for every vhost. You want to see all logs combined (by vhost and separated by access/error) on the file server, but you don’t want to stress NFS for it or don’t want to use a network filesystem at all. Thus you’re fine with messages coming in every x seconds rather than in real time. Both Apache machines have a host entry called “server” which points to the IP address of your file server.
On the file server you call:
./server.py -H all -P 9000 -L log/
On the Apache machines you call:
./forwarder.py -H server -P 9000 -L /var/log/apache2/
After some time you should see the forwarders find files and forward every row to the server, which then combines everything into the correct directory structure. This means that everything under /var/log/apache2/* on the forwarders is replicated into log/ on the server, where each file contains the combined entries of all forwarders. You might want to write your own init scripts and redirect stdout somewhere else.
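How a forwarder can pick up only the new rows on each pass is easiest to show in code. The sketch below is my assumption of one possible approach (Python 3, a hypothetical `read_new_lines` function and a plain dict of byte offsets), not necessarily what forwarder.py does: it remembers how far each file was read, returns only complete lines added since then, and starts over when a file got shorter, as happens after log rotation.

```python
import os


def read_new_lines(path, offsets):
    """Return the complete lines appended to 'path' since the last call.

    'offsets' maps file paths to the byte position already forwarded and
    is updated in place. Names and structure are illustrative only.
    """
    start = offsets.get(path, 0)
    if os.path.getsize(path) < start:
        start = 0  # file was truncated or rotated, read from the beginning

    with open(path, "rb") as fh:
        fh.seek(start)
        data = fh.read()

    # Forward only complete lines; a half-written row waits for the next pass.
    end = data.rfind(b"\n") + 1
    offsets[path] = start + end
    return data[:end].decode("utf-8", "replace").splitlines()
```

Calling this for every file found below the watched directory every few seconds gives the "every x seconds rather than in real time" behaviour from the example.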
You can download everything from here:
distrib_logger-1.0.tar.gz