I started testing out Tokyo Cabinet and Tokyo Tyrant last Friday, and got my initial test program running this morning. The documentation is pretty good, but I’m still floundering about a little bit. I’m not sure what parameters to pass to the B+ tree database file to make it work well for my data, not sure how to set up multiple databases for sharding, and so on. On the plus side, my Perl code that loads the data is running at about 50% CPU, so it is doing something rather than waiting around for writes. On the down side, I now have to write a small program to check on the progress of those writes to make sure that I am actually writing something!
Update: I am comparing storing in Tokyo Tyrant with storing in CouchDB. CouchDB, it turns out, is faster for me out of the box because of the way Erlang takes advantage of the multi-core processor. The Tokyo Tyrant server just maxes out one core, so my loading programs wait around for the server to process the data. CouchDB, on the other hand, will use a lot more cores (I’ve seen the process go to about 400% in top). Loading a year of data with one reader process per month, all running simultaneously, the Tokyo Tyrant loaders are only up to day 6 of each month, while my CouchDB loaders are all up to about day 14 of each month.
I’m sure there is a way to set up Tokyo Tyrant to use multiple CPUs, but I haven’t found it yet.
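
One workaround that does not depend on finding such a setting is to run several server processes, one per core, and split the keys across them on the client side. The sketch below is in Python rather than the Perl loader used here, and everything in it is an assumption for illustration: the four-server layout, the ports, and the helper names `shard_for` and `group_by_shard` are hypothetical, and it only picks a server for each key without talking to any server.

```python
import zlib

# Hypothetical layout: four server processes on one machine, one per core.
# The hosts and ports here are assumptions; adjust to however the servers
# are actually started.
SHARDS = [
    ("localhost", 1978),
    ("localhost", 1979),
    ("localhost", 1980),
    ("localhost", 1981),
]


def shard_for(key, shards=SHARDS):
    """Return the (host, port) pair that should store this key.

    CRC32 is used because it gives the same result in every process,
    unlike Python's built-in hash(), which is randomized per run for strings.
    """
    checksum = zlib.crc32(key.encode("utf-8"))
    return shards[checksum % len(shards)]


def group_by_shard(keys, shards=SHARDS):
    """Bucket a batch of keys by destination so each server gets one batch."""
    groups = {shard: [] for shard in shards}
    for key in keys:
        groups[shard_for(key, shards)].append(key)
    return groups


if __name__ == "__main__":
    # Made-up keys, just to show the spread across servers.
    sample = ["2008-01-%02d:detector-%d" % (day, d)
              for day in range(1, 8) for d in range(3)]
    for shard, keys in group_by_shard(sample).items():
        print(shard, len(keys))
```

Readers have to use the same `shard_for` function to find a key again, and changing the number of servers later moves most keys to a different server, so the shard count is worth settling before loading a full year of data.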