Introduction
cMon was conceived when I wanted two things: statistics on the machines I run, and alerts when those statistics do stupid things. It turns out there are several good tools for this. Nagios provides alerts when hosts are down, disks are full, etc. Then there are Ganglia, Cacti, Cricket, and several others that provide statistics about the currently running machines. My problem is that these are the same problem handled, often poorly, by several different programs. Then I ran across the need for High Availability at work, and with it another set of programs that monitor when a machine is down. Then I realized that most queueing systems also monitor the status of machines and perform actions based on statistics.
So, I started thinking about what was good about each and what sucked. Some required configuration through the web (bad), but were very configurable (good). One was simple to install (good), but had to be altered when you actually wanted to do anything with it (duh). Ganglia had one really good idea: simplification through the use of multicast networking. However, it did it exactly backwards. In Ganglia, every machine is a member of the multicast group, so every machine receives every metric sent out, even though most of them don't care. Then Ganglia's collector connects to one machine in each group to poll the data. cMon's cmetric, by contrast, sends a UDP packet to a multicast address where only cmond instances may be listening.
With cMon, each machine will send statistics to a multicast group. Only those machines that actually care about the stats will be listening to that group. To begin, metrics will be sent with cmetric and events with cevent, and both will be caught by cmond. The latter will be able to take actions based on metrics and events.
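To make the "send to a group, only interested hosts listen" idea concrete, here is a minimal Python sketch. It is not cMon's actual code: the group address, port, JSON wire format, and the function names send_metric and listen are all assumptions made up for illustration. The point is only that the sender never joins the group, while a cmond-like listener does.

```python
import json
import socket
import struct
import time

# Hypothetical values: the text does not give cMon's real group, port, or wire format.
GROUP = "239.192.0.1"  # an administratively scoped multicast address (assumed)
PORT = 9999            # assumed port


def encode_metric(host, name, value, timestamp=None):
    """Pack one metric into a JSON datagram (assumed format, not cMon's)."""
    if timestamp is None:
        timestamp = time.time()
    record = {"host": host, "name": name, "value": value, "ts": timestamp}
    return json.dumps(record).encode("utf-8")


def decode_metric(payload):
    """Unpack a datagram produced by encode_metric."""
    return json.loads(payload.decode("utf-8"))


def send_metric(name, value, group=GROUP, port=PORT, ttl=1):
    """Roughly what a cmetric-like sender does: fire one datagram and exit.

    The sender does not join the group, so it never receives other hosts' metrics.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        # TTL 1 keeps the packet on the local network segment.
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, struct.pack("b", ttl))
        sock.sendto(encode_metric(socket.gethostname(), name, value), (group, port))
    finally:
        sock.close()


def listen(handler, group=GROUP, port=PORT):
    """Roughly what a cmond-like listener does: join the group, handle each metric."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # Joining the group is what marks this host as one that cares about the stats.
    mreq = struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton("0.0.0.0"))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    while True:
        payload, _addr = sock.recvfrom(65535)
        handler(decode_metric(payload))
```

Under these assumptions, a monitored machine would call send_metric("load1", 0.42) from a script or cron job, and only hosts running something like listen(print) would ever see the packet. Taking actions based on metrics would then live inside the handler passed to listen.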
While not really useful, the current code can be found at svn.lauricha.com.