Gathering System Statistics in Linux, Part One (Gentoo and CentOS)

When designing a system, one of the first things I like to do is set up tools for monitoring.  Without proper logs and statistics about what is going on on a server, when something happens you are usually left scrambling to figure out two things: what the anomaly was, and what caused it.

With system monitoring in place, you can quickly gauge what a normal load looks like for a server by watching its statistics over time.  A server may have issues, but without knowing what is normal, you can’t really tell how out of the ordinary they are.

Enter collectd, a service that runs on Linux to gather the statistics of not only the system, but some other programs as well, such as Apache and MySQL.  The upstream developers describe what collectd does better than I could, so I’ll just let them explain it:

collectd gathers statistics about the system it is running on and stores this information. Those statistics can then be used to find current performance bottlenecks (i.e. performance analysis) and predict future system load (i.e. capacity planning). Or if you just want pretty graphs of your private server and are fed up with some homegrown solution you’re at the right place, too ;).

Usually one graph says more than a thousand words, so here’s a graph showing the CPU utilization of a system over the last 60 minutes:

 

Since you need something to display these graphs, and not just generate them, I’ll go into explaining how to use CGP, a PHP-driven website application that displays them in part two of this howto.  For now, we’ll just stick to installing and configuring collectd.

Here at Digital Trike, we use Gentoo Linux as well as CentOS Linux, so I’ll be covering how to install collectd on both systems.  With Gentoo, all the packages necessary for collectd are included in the portage tree, while CentOS users will have to install a few packages from source.

First of all, let’s get the package installed.

Gentoo Linux

As of this writing, collectd is marked unstable for x86 and amd64 architectures.  That’s not going to pose a problem.  Generally speaking, there are many reasons why packages remain in an unstable state in the portage tree.  A package may have many dependencies that are also marked unstable, or it could just be that no one has ever requested that it be stabilized.  Either way, don’t let the status put you off installing the package in this case.

You will need to keyword the packages, and with recent versions of portage (>=2.1.1.11), this is easily done with some emerge foo.  Before we do that, though, you should update your make.conf file with the plugins that you would like to install.

Collectd ships with a number of plugins.  In this scenario, I’m going to focus just on the ones that would be applicable for a LAMP stack.  Add this to your make.conf file in /etc:

COLLECTD_PLUGINS="apache cpu curl disk dns filecount fscache logfile mysql network processes uptime users swap syslog load csv conntrack interface memory netlink rrdtool rrdcached table tcpconns unixsock vmem df protocols"

This is actually a small number of the available plugins, but it is aligned to support the ones that CGP draws graphs for.  You can always use another frontend to display the statistics, but I’m going to stick to these two applications.

Next, emerge the application.  Since the package is keyword-masked, we can have portage unmask it automatically, instead of tracking down the dependencies ourselves:

emerge --autounmask --autounmask-write "=app-admin/collectd-5.0*"

You’ll probably get some output similar to this:

The following keyword changes are necessary to proceed:
#required by =app-admin/collectd-5.0* (argument)
>=app-admin/collectd-5.0.0-r2 ~amd64
#required by app-admin/collectd-5.0.0-r2[collectd_plugins_rrdtool], required by =app-admin/collectd-5.0* (argument)
>=net-analyzer/rrdtool-1.4.5-r1 ~amd64

At this point, go ahead and run dispatch-conf, etc-update, or your choice of program to update portage configuration files, and approve the change.  Then you can re-run the same emerge command, and portage will install the packages for you.

CentOS Linux

As I mentioned before, for CentOS, you will need to install some programs from scratch.  While that may seem a little daunting if you have never done this before, in this case, the packages necessary are easy to compile and install with minimal fuss.  You will need to be logged in as user root to install new packages.

The first step on the system is to install the development packages so that you can compile programs from source.  In CentOS, these are packaged in two group installs.  We will also need an additional development package for libpcap.  The commands to install them using yum are:

yum groupinstall 'Development Tools'
yum groupinstall 'Development Libraries'
yum install libpcap-devel

The yum package manager will display the packages and their dependencies that will be installed, and ask you to confirm that you want to proceed.

Once that is done, you only have two packages to install: rrdtool, which stores the statistics in round-robin database files, and collectd, which gathers them.

First, download rrdtool from the maintainer’s website.  As a general rule of thumb, it’s best practice to choose the latest version available, provided it is considered stable by the developers.  In this case, the latest release as of this writing is 1.4.5.  You can download it using wget easily.

# wget http://oss.oetiker.ch/rrdtool/pub/rrdtool-1.4.5.tar.gz

Next, unpack the file, and run configure using /usr as the prefix, then just make and make install as normal.

# tar -zvxf rrdtool-1.4.5.tar.gz
# cd rrdtool-1.4.5
# ./configure --prefix=/usr
# make
# make install

Normally, I would advocate installing external packages into /usr/local instead of /usr, but on some systems, I had difficulty using that prefix, so here I just use /usr instead.

At this point, rrdtool should be installed correctly.  You can verify that it installed by running “rrdtool --help” or “which rrdtool”.
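
As a minimal sketch of that check, the shell function below reports whether a program can be found on the PATH.  The name check_installed is a hypothetical helper invented for this example; it is not part of rrdtool or collectd.

```shell
#!/bin/sh
# check_installed PROGRAM...: report whether each program is on the PATH.
# (check_installed is a hypothetical helper name used only in this sketch.)
check_installed() {
    rc=0
    for prog in "$@"; do
        if command -v "$prog" >/dev/null 2>&1; then
            echo "$prog: found at $(command -v "$prog")"
        else
            echo "$prog: not found"
            rc=1
        fi
    done
    # Non-zero exit status if anything was missing.
    return "$rc"
}

# Example: check_installed rrdtool
```

The same function can be reused once collectd has been built, by passing it both program names.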

Next, you’ll need to download collectd and install it.  The default configuration will work fine for plugins, installing a decent amount of them, so I’m not going to tinker with the configure script at all.

# wget http://collectd.org/files/collectd-5.0.0.tar.gz
# tar -zvxf collectd-5.0.0.tar.gz
# cd collectd-5.0.0
# ./configure --prefix=/usr
# make
# make install

Now that the OS-specific instructions are complete, we’re ready to move on to configuring some dependencies for collectd to run properly.

Set up MySQL for monitoring

Since one of the nifty features of collectd is monitoring MySQL, I like to set up a separate user solely for this task.  This, of course, is optional, but I find it a good policy to separate users by their specific privileges.

The MySQL user will need special privileges to access all the necessary data from the database server.  You can set up this user using the mysql console.

These commands will create the user, grant its privileges, and then reload the user access so the collectd user can start connecting:

# mysql -u root
mysql> CREATE USER 'collectd'@'localhost' IDENTIFIED BY 'password';
mysql> GRANT SELECT, PROCESS, SHOW DATABASES, SUPER ON *.* TO 'collectd'@'localhost';
mysql> FLUSH PRIVILEGES;

Set up Apache server-status

Apache 2.2, by default, ships with a module named mod_status which can display specific statistics about the server load that the web server is experiencing.

Gentoo Linux Apache

In Gentoo, this module is not enabled by default.  You will need to add “-D STATUS” to the APACHE2_OPTS variable in /etc/conf.d/apache2 and then restart the Apache server.

# /etc/init.d/apache2 restart

CentOS Linux Apache

On CentOS, the module is installed by default, but you will need to enable its usage.  Edit /etc/httpd/conf/httpd.conf and uncomment the lines starting with <Location /server-status> until </Location>.  Change the “Allow from” directive from example.com to 127.0.0.1.  Restart the httpd server when finished.

# service httpd restart

You can test the installation after restarting the web server with curl:

$ curl http://localhost/server-status?auto
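
The “?auto” form of the status page returns plain “Name: value” lines, which makes it easy to pick out a single number.  Here is a small sketch that does so; status_field is a hypothetical helper name, and the field names in the comments (BusyWorkers, IdleWorkers) are ones that mod_status reports.

```shell
#!/bin/sh
# status_field NAME: read "?auto" output on stdin and print the value of one field.
# (status_field is a hypothetical helper name used only in this sketch.)
status_field() {
    awk -F': ' -v key="$1" '$1 == key { print $2 }'
}

# Example against a live server, assuming the URL used in this howto:
#   curl -s 'http://localhost/server-status?auto' | status_field BusyWorkers
#   curl -s 'http://localhost/server-status?auto' | status_field IdleWorkers
```

If the curl command prints nothing, or an HTML error page, the Location block is probably still commented out or the “Allow from” address does not match.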

collectd: global configuration

Now that we have monitoring set up for Apache and MySQL, collectd will handle the rest.  It’s time to move on to configuring the collectd daemon.  You can find the config file at /etc/collectd.conf.

The configuration file is easy to understand, and when it comes to modules, is split into two parts: enable the module, configure the module.  Before that, though, let’s look at setting up the global configuration and the logging.

You’ll see at the top of the file the “global settings for the daemon”.  The defaults here are fine.  The only thing worth changing is the Hostname, which should be set to the hostname of your box.  You can run “hostname --fqdn” as any user if you’re not sure what it is.

In the next section, Logging, all I usually do is uncomment the line “LoadPlugin syslog”, since I prefer to have logging sent to the system log, which will then show up in “/var/log/messages” for both CentOS and Gentoo Linux.

collectd: plugins

Now let’s look at the plugins themselves.  Again, I’m only going to cover a small part of the ones that are available to look after a box running as a LAMP server.  You will need to uncomment each LoadPlugin line for the modules you want to run.  One really nice thing collectd does is that if a plugin was not built when you installed it, its line will be commented out with ##.  If the plugin is available, it will only be commented out with one #.
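
If you would rather not edit each line by hand, here is a rough sketch of doing it with sed.  It assumes the lines look exactly like “#LoadPlugin cpu”, and that GNU sed is available for the -i option; enable_plugins is a hypothetical helper name.  Try it on a copy of the config file first.

```shell
#!/bin/sh
# enable_plugins FILE PLUGIN...: uncomment "#LoadPlugin <name>" lines in FILE.
# Lines starting with "##" (plugins that were not built) are left untouched,
# because the pattern requires exactly one "#" before "LoadPlugin".
# (enable_plugins is a hypothetical helper name used only in this sketch.)
enable_plugins() {
    conf=$1
    shift
    for plugin in "$@"; do
        sed -i "s/^#\(LoadPlugin $plugin\)[[:space:]]*\$/\1/" "$conf"
    done
}

# Example, working on a copy rather than the live file:
#   cp /etc/collectd.conf /tmp/collectd.conf
#   enable_plugins /tmp/collectd.conf cpu df disk interface
```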

Going down the list, these are the ones to enable, or uncomment:

Next, let’s look at setting up the plugins individually.  Again, like before, most of the default configuration is fine and won’t need any modification.  I’ll only be covering the ones that need to be setup. 

collectd plugins: apache

Documentation

Look for the line starting with <Plugin apache>.  Remember when you set up mod_status to give details about the server processes before?  This is where you’ll need that URL.  Just uncomment the URL line, and put that in there, and you’re finished with this module.

Here’s what mine looks like:

<Plugin apache>
  <Instance "local">
    URL "http://localhost/server-status?auto"
#    User "www-user"
#    Password "secret"
#    CACert "/etc/ssl/ca.crt"
  </Instance>
</Plugin>

collectd plugins: df

Documentation

This one monitors the free disk space on the system.  Just set up the Device, the MountPoint, and the FSType.  I leave IgnoreSelected set to false so that only the filesystem selected here is reported, rather than every mounted filesystem.

<Plugin df>
        Device "/dev/root"
#       Device "192.168.0.2:/mnt/nfs"
        MountPoint "/"
        FSType "ext3"
        IgnoreSelected false
#       ReportByDevice false
#       ReportReserved false
#       ReportInodes false
</Plugin>

Finding the device name is pretty simple.  Just run ‘df’, and it’s most likely the first one listed.  On Gentoo, it is commonly /dev/root.  On CentOS, it could be /dev/sda1.  On some VPS systems, it may be named something different, like /dev/vzfs for example.
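
As a sketch, the three values the df plugin asks for can be pulled out for the root filesystem in one go.  This assumes GNU df, whose -T option adds the filesystem type column; root_fs_info is a hypothetical helper name.

```shell
#!/bin/sh
# root_fs_info: print the device, filesystem type and mount point for "/".
# Assumes GNU df: -T adds the type column, -P keeps each entry on one line.
# (root_fs_info is a hypothetical helper name used only in this sketch.)
root_fs_info() {
    df -P -T / | awk 'NR == 2 { print $1, $2, $NF }'
}

# Example: root_fs_info
```

The three fields printed correspond to Device, FSType, and MountPoint, in that order.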

collectd plugins: disk

Documentation

This is one you want to enable only if your system is the one hosting the hard drive.  In other words, on a system like a shared host or VPS, you won’t need this.

<Plugin disk>
        Disk "/^[hs]d[a-f][0-9]?$/"
        IgnoreSelected false
</Plugin>

The configuration for Disk is a regular expression.  It matches whole disks and single-digit partitions on the first six IDE or SCSI/SATA drives: hda through hdf and sda through sdf, each with an optional partition number from 0 to 9 (for example sda or hdc2).  Chances are you can leave that alone, as that’s the normal naming scheme for Linux devices.
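
To see what that pattern does and does not cover, here is a quick sketch that runs a few device names through it.  It uses grep -E as a stand-in for collectd’s own regex matching, which is an assumption; the device names are just examples, and matches_disk_regex is a hypothetical helper name.

```shell
#!/bin/sh
# matches_disk_regex NAME: succeed if NAME matches the Disk pattern shown above.
# grep -E stands in for collectd's regex engine here (an assumption).
# (matches_disk_regex is a hypothetical helper name used only in this sketch.)
matches_disk_regex() {
    printf '%s\n' "$1" | grep -Eq '^[hs]d[a-f][0-9]?$'
}

# Try a handful of example device names.
for dev in sda sda1 hdc2 sdg1 sda10 vda1; do
    if matches_disk_regex "$dev"; then
        echo "$dev: collected"
    else
        echo "$dev: skipped"
    fi
done
```

Names such as sda10 or vda1 fall outside the pattern, so a box with more than nine partitions on a disk, or with virtio disks, would need the expression adjusted.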

collectd plugins: interface

Documentation

This plugin will monitor your Ethernet devices.

<Plugin interface>
        Interface "eth0"
        IgnoreSelected false
</Plugin>

Again, on a VPS the device name may be different, such as venet0.  Run ifconfig to see what your device interface names are.

collectd plugins: mysql

Documentation

For this plugin, you can monitor as many database servers as you like, by adding one <Database> section per server.  In this scenario, I’m only monitoring the one database server running on the Linux server.  You can enable the MasterStats configuration if you like, but it is only necessary if you are running MySQL in replication.  If you don’t know what that means, then you’re not using it. 🙂

<Plugin mysql>
        <Database mysql>
                Host "localhost"
                User "collectd"
                Password "Secrets of the Universe with Philo"
                Database "mysql"
                # MasterStats true
        </Database>
</Plugin>

Make sure you have set up your MySQL user first, as documented earlier.  Also, find another password. 🙂

collectd plugin: rrdtool

Documentation

You’ll have to scroll a few pages down after mysql to find and configure this one.  The DataDir directive sets where the rrdtool data is stored.

<Plugin rrdtool>
        DataDir "/var/lib/collectd/rrd"
#       CacheTimeout 120
#       CacheFlush   900
</Plugin>
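
Once collectd has been running for a little while, .rrd files should start appearing under that directory.  Here is a small sketch for counting them; count_rrd_files is a hypothetical helper name, and the path in the example is simply the DataDir shown above.

```shell
#!/bin/sh
# count_rrd_files DIR: print how many .rrd files exist anywhere under DIR.
# (count_rrd_files is a hypothetical helper name used only in this sketch.)
count_rrd_files() {
    find "$1" -type f -name '*.rrd' | wc -l | tr -d ' '
}

# Example, using the DataDir from the config above:
#   count_rrd_files /var/lib/collectd/rrd
```

A count of zero after a few minutes usually means the rrdtool plugin was not loaded, or that DataDir points somewhere collectd cannot write.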

collectd plugin: vmem

Documentation

For this one, I just disable verbosity.

<Plugin vmem>
        Verbose false
</Plugin>

And that’s the last plugin I configure!  Finally. 🙂

Setting up collectd as a service

The last thing we’ll need to do, for setting up collectd, is to get it to run as a service.  Or in other words, make it so it starts up automatically on boot.  Then start it up.

On Gentoo, this is a simple set of commands:

# rc-update add collectd default
# /etc/init.d/collectd start

For CentOS, you need to create an init script.  I have one I use already that you can download here, that I extracted from an RPM (md5sum: 6986adbc79b647399aa3db52d48aedc7).  To install it, you need to download it, set it as executable, and then tell CentOS to load it by default.  Finally, start it up!

# wget http://dev.digitaltrike.com/~steve/blog/centos/init.d/collectd -O /etc/init.d/collectd
# chmod +x /etc/init.d/collectd
# chkconfig collectd on
# service collectd start
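
Since the checksum for the init script is given above, it is worth comparing it against the downloaded file before running it.  A minimal sketch of that check follows; verify_md5 is a hypothetical helper name.

```shell
#!/bin/sh
# verify_md5 FILE EXPECTED: succeed only if FILE's MD5 checksum equals EXPECTED.
# (verify_md5 is a hypothetical helper name used only in this sketch.)
verify_md5() {
    actual=$(md5sum "$1" | awk '{ print $1 }')
    [ "$actual" = "$2" ]
}

# Example with the checksum quoted in this howto:
#   verify_md5 /etc/init.d/collectd 6986adbc79b647399aa3db52d48aedc7 \
#       && echo "checksum OK" || echo "checksum MISMATCH"
```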

You can verify that it was added to the boot process by running ‘ntsysv’ or something similar.

Wrap-up

Well, congratulations, you made it this far.  Collectd should be running and gathering stats for you.  To take it from here, there are lots of frontends out there that you can use to see the stats that are collected.  In my next blog post, I’ll be discussing one of these, and how to set it up: CGP or Collectd Graph Panel.

Until then, have fun. 🙂

I’ll continue to keep this howto updated as necessary and reasonable. 🙂

Document version: 1.0.1.  Last update: 2011-10-21