Contour Line

March 6, 2009

A lot of data is a lot of data

Filed under: couchdb, research, transportation — jmarca @ 10:35 am

I can’t seem to get an efficient setup going for storing loop data in CouchDB. On the surface it seems pretty simple: every loop is independent of every other loop, so every observation can be its own document. But for this application, that approach is more limiting than I first thought. The problem is that after storing just a few days’ worth of data, the single CouchDB database expands to 35GB. I tried running my carefully crafted map/reduce to get mean and variance stats, and the process went out to lunch for days before I killed it off.
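For readers who have not met this kind of computation, here is a minimal sketch of how mean and variance can be written as a reduce step that also works on already-reduced partial results. It is not the map/reduce from this post, which is not shown here; it is plain Python using the standard pairwise-merge formula for count, mean, and sum of squared deviations, and the names `Stats`, `from_value`, `merge`, and `summarize` are made up for illustration.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Stats:
    """Mergeable summary: count, mean, and sum of squared deviations (m2)."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0

    @property
    def variance(self) -> float:
        # Sample variance; undefined (NaN) for fewer than two observations.
        return self.m2 / (self.n - 1) if self.n > 1 else float("nan")


def from_value(x: float) -> Stats:
    """The 'map' side: turn one observation into a one-element summary."""
    return Stats(1, float(x), 0.0)


def merge(a: Stats, b: Stats) -> Stats:
    """The 'reduce' side: combine two summaries without revisiting raw data."""
    if a.n == 0:
        return b
    if b.n == 0:
        return a
    n = a.n + b.n
    delta = b.mean - a.mean
    mean = a.mean + delta * b.n / n
    m2 = a.m2 + b.m2 + delta * delta * a.n * b.n / n
    return Stats(n, mean, m2)


def summarize(values) -> Stats:
    """Fold a sequence of observations into a single summary."""
    return reduce(merge, map(from_value, values), Stats())
```

Because `merge` accepts summaries rather than raw values, a list of observations can be split into chunks, each chunk summarized on its own, and the partial results merged afterwards; the combined count, mean, and variance match a single pass over all the data up to floating-point rounding.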

