How big is “too big” for documents in CouchDB: Some biased and totally unscientific test results!

I have been storing documents somewhat heuristically in CouchDB. Without doing any rigorous tests, and without keeping track of versions and the associated performance enhancements, I have a general rule that tiny documents are too small, and really big documents are too big.

To illustrate the issues, consider a simple detector that collects data every 30 seconds. One approach is to create one document per observation. Over a day, this will create 2880 documents (except for those pesky daylight savings time days, of course). Over a year, this will create over a million documents. If you have just one detector, then this is probably okay, but if you have thousands or millions of them, this is a lot of individual documents to store, and disk size becomes an issue.
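The arithmetic behind those counts is simple enough to sketch:

```javascript
// Documents produced by one detector sampling every 30 seconds,
// with one document per observation.
const samplesPerDay = (24 * 60 * 60) / 30;   // 2880 observations per day
const samplesPerYear = samplesPerDay * 365;  // ignoring DST and leap years

console.log(samplesPerDay);   // 2880
console.log(samplesPerYear);  // 1051200 — over a million per detector
```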

Without doing careful tests, I found that CouchDB's compaction seems to operate per document. Because compaction algorithms often achieve their gains by eliminating duplicated chunks of text, a very small document won't give the algorithm much chance to remove duplicates, while a very large document will contain lots of them. This is especially true if your documents are verbose and look like:

{ "volume": 10, "occupancy": 0.01, "timestamp": "2007-01-16 08:02:30 UTC" }

The compact algorithm can’t do much with just one such entry in a document. But if a document contains hundreds or thousands of such entries, the compactor can usually do a pretty good job of making those verbose labels only a minor hit on the total disk storage size.
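To make that concrete, here is a sketch of what a batched document might look like; the _id scheme and the "observations" field name are my own illustration, not the actual schema:

```javascript
// A sketch of one day's observations batched into a single document.
// The _id scheme and "observations" field are illustrative.
const dayDoc = {
  _id: "detector-1201558:2007-01-16",
  observations: [
    { volume: 10, occupancy: 0.01, timestamp: "2007-01-16 00:00:00 UTC" },
    { volume: 12, occupancy: 0.02, timestamp: "2007-01-16 00:00:30 UTC" }
    // ... 2878 more entries to fill out the day
  ]
};

// The repeated keys ("volume", "occupancy", "timestamp") are exactly the
// redundancy that gets cheap once many entries share one document.
console.log(dayDoc.observations.length);
```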

Given that I have years of old data lying around, I next tried to store a year of data per document. I figured this would give the most bang for my compaction buck, and indeed it did. However, the documents were too big. The new problem I faced was that CouchDB views would mysteriously fail with OS process timeout errors.

My middle ground choice was to use one day per document, which also fit fairly well with my usage patterns. While there were some cases in which I wanted to be able to draw just a few hours of a day out of the database, most of the time I wanted at least a day.

One day of data for my application consists of about 800 to 900 KB. If I run gzip on a single day’s JSON file, the size will shrink by a factor of 4. I created one database per detector, and each database (after compaction) eats up only about 150 to 200 MB.

Another issue with large documents is view generation. Because I really want details on each observation, my views emit one row per 30-second observation. This means each document is sliced into thousands of little rows in the view. Here I suspect I have no choice but to swallow the resulting file size: a typical view before compaction is about 49 MB, and after view compaction that same view is still 43 MB, which isn't much of a reduction.
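A map function in this style might look like the following. This is a sketch assuming the batched day-document layout; the field names are illustrative, and the emit stub stands in for the one CouchDB's view server provides:

```javascript
// Stub for CouchDB's emit(); collects rows so the sketch runs standalone.
var rows = [];
function emit(key, value) { rows.push({ key: key, value: value }); }

// Map function: slice a day-sized document into one view row per observation.
var map = function (doc) {
  if (!doc.observations) return;
  doc.observations.forEach(function (obs) {
    // One row per 30-second observation -> thousands of rows per document.
    emit([doc.day, doc.detector, obs.timestamp], obs.volume);
  });
};

map({
  day: "2007-01-01",
  detector: "1201558",
  observations: [
    { volume: 10, occupancy: 0.01, timestamp: "2007-01-01 00:19:30 UTC" },
    { volume: 12, occupancy: 0.02, timestamp: "2007-01-01 00:20:00 UTC" }
  ]
});
console.log(rows.length); // 2
```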

The last issue that is only somewhat related to the size of the document is the performance penalty from using a non-trivial reduce function. Typically, my views would have a map that would do something (apply some model, etc) to the raw data, and then emit one row per observation. Those rows would then be picked up by the reduce function and collated in some way. Until yesterday, my semi-standard reduce function would find the minimum, the maximum, the summed value, and the total count of values, and return that four element array.
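That semi-standard reduce might have looked roughly like this; it's a reconstruction, not my exact code. Note that a hand-written reduce also has to handle the rereduce case, where the incoming values are its own previous outputs:

```javascript
// Hand-written reduce returning [min, max, sum, count] over scalar map values.
var reduce = function (keys, values, rereduce) {
  if (!rereduce) {
    var sum = values.reduce(function (a, b) { return a + b; }, 0);
    return [Math.min.apply(null, values),
            Math.max.apply(null, values),
            sum,
            values.length];
  }
  // Rereduce: combine previously reduced [min, max, sum, count] arrays.
  return values.reduce(function (acc, v) {
    return [Math.min(acc[0], v[0]),
            Math.max(acc[1], v[1]),
            acc[2] + v[2],
            acc[3] + v[3]];
  });
};

console.log(reduce(null, [1, 5, 3], false)); // [ 1, 5, 9, 3 ]
```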

While I know that using the built-in functions is faster, I used to believe that the built-in Erlang reduce functions (_sum, _count, and _stats) could only be used on single-value outputs, and about three-quarters of my views needed to emit arrays of numbers, not just a single value. However, the cold, hard empirical fact that my views were too slow led me to try something else. Because I have thousands of databases, I've written a node.js program to apply views and trigger view generation. Over the long MLK weekend, I ran the program applying 3 different views to a single year of data, and when I checked in on the process Tuesday morning, none of the 3 view-application jobs had finished.

On a whim, I checked out the latest version of the CouchDB source, grepped for _sum and looked at the code. I don’t know Erlang from Lisp, but it certainly seemed like there was a special case or two set up to handle lists of numbers, so I removed my long reduce and rereduce code and replaced it with _sum. And it worked!

For the record, my views emit as keys some identifying characteristics, and as values the output of some function run on the raw data. The output without running reduce looks like this:

["2007-01-01", "1201558", "2007-01-01 00:19:30 UTC"]  [0.000003690180399121373, 0.000006166469216729739, 0.000007345562901578842, 0.000005175674086671691]
["2007-01-01", "1201558", "2007-01-01 00:20:00 UTC"]  [0.000003627454559477772, 0.000006325518959705681, 0.00000759499639553572, 0.000005182026970214938]
["2007-01-01", "1201558", "2007-01-01 00:20:30 UTC"]  [0.000003599300956965342, 0.00000656187359400101, 0.000006868382743588068, 0.000005119872197890806]

I emit both the day and the timestamp so that I can group by day in my reduce, or get the output for each timestamp if I do not run reduce. My reduce function is simply the plain, vanilla _sum function. Reducing the above view output with group_level = 2 gives:

["2007-01-01", "1201558"]  [0.01011209205149, 0.01385255704737369, 0.02016985702928209, 0.009153296503432225]
["2007-01-02", "1201558"]  [0.01065079552287105, 0.01443181962454101, 0.01961418238906556, 0.01053126869622315]
["2007-01-03", "1201558"]  [0.01132393733829414, 0.01471054680281484, 0.01937133599383705, 0.0092336843625161]
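With the builtin doing the reducing, the whole design document shrinks to a map plus a one-word reduce, something like this sketch (the names "models" and "daily" are made up):

```javascript
// A design document using the builtin Erlang reduce. The map is stored as a
// string; "_sum" handles numbers and arrays of numbers element-wise.
var designDoc = {
  _id: "_design/models",
  views: {
    daily: {
      map: "function (doc) { /* emit([day, detector, timestamp], [v1, v2, v3, v4]) per observation */ }",
      reduce: "_sum"
    }
  }
};

console.log(designDoc.views.daily.reduce); // "_sum"
```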

While the views still take a few minutes each to apply, they are definitely faster than the old way. In just 18 hours I was able to apply a single model to 3 years of data, whereas with the old view I wasn't able to apply three views to one year of data even after a long weekend of computer time. Not quite apples to apples, but the new approach is going much faster.
