Original computing articles by a systems administrator

OpenVZ Bean Counters Nagios Script

“OpenVZ is container-based virtualization for Linux. OpenVZ creates multiple secure, isolated containers (otherwise known as VEs or VPSs) on a single physical server enabling better server utilization and ensuring that applications do not conflict.”

Each of these containers, or VEs, has resource limits. The pseudo-filesystem /proc exposes process and kernel information, and the OpenVZ kernel adds the file /proc/user_beancounters, which shows (among other things) whether any of these limits has been hit. This matters because a process (e.g. Tomcat) may fail to start once a limit has been reached. I wrote a Python script, designed to be run by Nagios on the OpenVZ host machine, to check for this.

The script parses /proc/user_beancounters and exits with the appropriate Nagios exit status if one of these limits has been reached. If you don’t want to run the script as root, I recommend writing a small shell script that copies the user_beancounters file and changes its owner to an unprivileged user, compiling that shell script with shc, and making the resulting binary setuid root. (Linux usually won’t honor setuid on shell scripts, which is why shc is used to compile it. Does anyone think this might be dangerous if the script only copies the file to /tmp?) This is what the script expects in its current configuration. The script is also easy to modify to check parameters other than the fail count (failcnt).
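To make the parsing step concrete, here is a minimal sketch of the idea, not the actual nagios_vz_bean.py. It assumes the usual user_beancounters layout (a "Version:" line, a column header, then rows ending in a failcnt column, with the container ID only on the first row of each container). The function names parse_beancounters and check are hypothetical, and as a simplification it reports CRITICAL for any non-zero failcnt rather than using warning and critical ranges.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def parse_beancounters(text):
    """Return a list of (uid, resource, failcnt) tuples (hypothetical helper)."""
    rows = []
    uid = None
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the "Version:" line, and the column header.
        if not fields or fields[0] in ("Version:", "uid"):
            continue
        # The first row of each container starts with "<uid>:".
        if fields[0].endswith(":"):
            uid = fields[0].rstrip(":")
            fields = fields[1:]
        # Expected remaining fields: resource held maxheld barrier limit failcnt
        if len(fields) != 6:
            continue
        rows.append((uid, fields[0], int(fields[5])))
    return rows


def check(text):
    """Return (exit_status, message); any non-zero failcnt is CRITICAL here."""
    failed = [row for row in parse_beancounters(text) if row[2] > 0]
    if not failed:
        return OK, "OK: no beancounter failures"
    details = ", ".join("%s/%s failcnt=%d" % row for row in failed)
    return CRITICAL, "CRITICAL: " + details
```

A plugin built on this would read the file (or the unprivileged copy of it), print the message, and pass the status to sys.exit() so that Nagios can pick it up.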

You can get the script here: nagios_vz_bean.py

One Response to OpenVZ Bean Counters Nagios Script

  1. Sourygna says:

    Hello,

    Seems to me that the script has two bugs.

    First, a string/int type bug from cmd_parser.add_option (by default, the add_option method uses the “string” type).
    Here are the changes I’ve made:
    19,21c19,21
    < cmd_parser.add_option('-w', action='store', dest='warning_range', type="int", nargs=1, default=10, help="Set increment for warning response, default=10")
    < cmd_parser.add_option('-c', action='store', dest='critical_range', type="int", nargs=1, default=20, help="Set increment for critical response, default=20")
    ---
    > cmd_parser.add_option('-w', action='store', dest='warning_range', nargs=1, default=10, help="Set increment for warning response, default=10")
    > cmd_parser.add_option('-c', action='store', dest='critical_range', nargs=1, default=20, help="Set increment for critical response, default=20")
    > cmd_parser.add_option('-f', action='store', dest='check_file', nargs=1, default='/tmp/bean_check.txt',

    The second is a math bug. For instance, if error[1] is equal to options.critical_range, the script will never report the error.
    So I’ve made these changes:
    82c82
    < if error[1] >= options.warning_range and error[1] < options.critical_range:
    ---
    > if error[1] > options.warning_range and error[1] < options.critical_range:
    87c87
    < if error[1] >= options.critical_range:
    ---
    > if error[1] > options.critical_range:

    Once changed, the script works great. Thanks!
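The two fixes described in the comment above can be illustrated with a short, self-contained sketch. It is not taken from nagios_vz_bean.py: the helper names parse_ranges and classify are hypothetical, and only the -w and -c options are reproduced. It shows optparse converting both ranges to integers, and inclusive comparisons so that a value exactly equal to a threshold is still reported.

```python
from optparse import OptionParser

# Nagios exit codes used in this sketch.
OK, WARNING, CRITICAL = 0, 1, 2


def parse_ranges(argv):
    """Parse -w/-c as integers so they compare cleanly with integer counts."""
    parser = OptionParser()
    # Without type="int", a value given on the command line stays a string
    # (optparse's default type), while the integer default is left as an int.
    parser.add_option("-w", action="store", dest="warning_range",
                      type="int", default=10)
    parser.add_option("-c", action="store", dest="critical_range",
                      type="int", default=20)
    options, _args = parser.parse_args(argv)
    return options.warning_range, options.critical_range


def classify(value, warning, critical):
    """Map a count to a status; a value equal to a threshold triggers it."""
    if value >= critical:
        return CRITICAL
    if value >= warning:
        return WARNING
    return OK
```

For example, with the default ranges a count of exactly 20 is classified as CRITICAL, which is the boundary case the strict comparison missed.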
