Everybody uses rrdtool. Everybody loves rrdtool.

I love rrdtool as well. Sometimes it even pleasantly surprises me.

For example, it used to be the case that bulk updates of a large number of RRD archives were a bit on the slow side (the operation is I/O-bound, you see).

Quite unexpectedly, I found that this inconvenience can be easily circumvented by throwing more RAM at the problem. By itself, this statement is not at all surprising - after all, we all know about the effects of disk caching. It is the magnitude of the improvement that simply astonished me: on a moderately busy box whose primary purpose is entirely different from doing RRD updates, I am now routinely getting around 18,500 RRD archives updated in under 1 second. The large majority of the archives have 40+ data sources each, and the rest have 140+ data sources each. The machine has a measly 8 GB of RAM.

Sometimes, when the box is busy doing something with its disks, it might take 3 minutes to finish the updates. In very rare instances it can take up to 15 minutes. But in more than 80% of cases it takes under 1 second.

The setup is such that the actual data collection is completely separated from the RRD update step. In fact, collection occurs on a different machine, for reasons I am not going to delve into in this post. The collector generates an "RRD update order" file, which is transferred to the target machine and parsed there by a simple Perl script; the script then executes the corresponding RRDs::update calls, with a small degree of parallelism in this last step.
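
For the curious, the update step looks roughly like the sketch below. I should stress this is a simplified illustration: the one-line-per-archive format of the order file and the use of Parallel::ForkManager for the parallelism are assumptions made for the example, not a literal copy of the production script.

    #!/usr/bin/perl
    # Sketch of the update step. Assumes (for illustration only) that the
    # "RRD update order" file has one line per archive:
    #   <path-to-rrd> <timestamp:v1:v2:...>
    use strict;
    use warnings;
    use RRDs;
    use Parallel::ForkManager;   # assumed here; any fork-based scheme works

    my $order_file = shift @ARGV or die "usage: $0 order-file\n";

    # Read all pending updates up front.
    open my $fh, '<', $order_file or die "cannot open $order_file: $!";
    my @orders;
    while (my $line = <$fh>) {
        chomp $line;
        my ($rrd, $update) = split /\s+/, $line, 2;
        push @orders, [ $rrd, $update ] if defined $update;
    }
    close $fh;

    # A small degree of parallelism: a handful of worker processes,
    # each taking every Nth order from the list.
    my $workers = 4;
    my $pm = Parallel::ForkManager->new($workers);
    for my $w (0 .. $workers - 1) {
        $pm->start and next;     # parent: spawn next worker
        for my $i (grep { $_ % $workers == $w } 0 .. $#orders) {
            my ($rrd, $update) = @{ $orders[$i] };
            RRDs::update($rrd, $update);
            my $err = RRDs::error;
            warn "update of $rrd failed: $err\n" if $err;
        }
        $pm->finish;             # child exits when its slice is done
    }
    $pm->wait_all_children;

Forking a few long-lived workers (rather than one process per archive, or shelling out to the rrdtool binary each time) keeps the per-update overhead low; after that, the heavy lifting is down to the kernel's page cache, which is exactly where the extra RAM pays off.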

The application has been running for some months now, and every time I recall the numbers involved, I cannot help but say "wow!".