Just a quick post so that I remember to elaborate on this later. I have found that whenever I have a large project to do in CouchDB I go through several iterations of designing the documents and the views.
My latest project is typical.
- First design was to push in really big documents. The idea was to run map reduce, copy the reduce output to a second db, and map reduce that for the final result. But the view generation was too slow, I never got around to designing the second db, and the biggest documents triggered a bug/memory issue.
I have an application that is taxing my PostgreSQL install, and I’ve been taking a whack at using CouchDB to solve it instead.
On the surface, it looks like a pretty good use case, but I’m having trouble getting it to move fast enough.
In a nutshell, I am storing the output of a multiple imputation process. At the moment my production system uses PostgreSQL for this. I store each imputation output, one record per row. I have about 360 million imputations stored this way.
Each imputation represents an estimate of conditions at a mainline freeway detector. That is done in R using the excellent Amelia package. While the imputation is done for all lanes at the site, because I am storing the data in a relational database with a schema, I decided to store one row per lane.
The replicator database in CouchDB is cool, but one needs to be mindful when using it.
I like it better than sending a message to CouchDB to replicate dbx from machine y to machine z, because I can be confident that even if I happen to restart CouchDB, the replication is going to finish up.
The problem is that for replications that are not continuous, I end up with a bunch of replication entries in the replicator database. Thousands sometimes. Until I get impatient and just delete the whole thing.
For the way I use it, the best solution is to write a view into the db to pick off all of the replications that are not continuous and that have completed successfully, and then do a bulk delete of those documents. But I’m never organized enough to get that done.
Here’s hoping such a function finds its way into Futon some day.
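The cleanup described above could be sketched roughly as follows. This is only a sketch under some assumptions: it reads `_all_docs` with `include_docs=true` and filters on the client instead of defining a view, it assumes finished replications carry `_replication_state` set to `"completed"`, and the `base_url` default, the helper names, and the lack of authentication are all placeholders of mine rather than anything from a real setup.

```python
import json
import urllib.request


def finished_one_shot(doc):
    """True for a replication doc that is not continuous and has completed."""
    return (
        not doc.get("continuous", False)
        and doc.get("_replication_state") == "completed"
    )


def bulk_delete_payload(docs):
    """Build a _bulk_docs body marking each finished one-shot replication deleted."""
    return {
        "docs": [
            {"_id": d["_id"], "_rev": d["_rev"], "_deleted": True}
            for d in docs
            if finished_one_shot(d)
        ]
    }


def purge_finished(base_url="http://localhost:5984", db="_replicator"):
    # base_url and db are assumptions; adjust for your install (no auth handled here).
    url = f"{base_url}/{db}/_all_docs?include_docs=true"
    with urllib.request.urlopen(url) as resp:
        rows = json.load(resp)["rows"]
    payload = bulk_delete_payload([row["doc"] for row in rows])
    if not payload["docs"]:
        return []  # nothing to clean up
    req = urllib.request.Request(
        f"{base_url}/{db}/_bulk_docs",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Design documents and replications that are still running or have errored have no `"completed"` state, so the filter leaves them alone. With thousands of entries, a proper view keyed on state would avoid pulling every document over the wire, which is closer to what I actually have in mind.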