Contour Line

December 22, 2009

PUT problem solved

Filed under: Uncategorized — jmarca @ 12:50 am

I had a problem linking up dojo/xhrPut and Catalyst::Controller::REST. As always, the answer was in the documentation, but I didn’t see it.

Catalyst::Controller::REST docs say that:

The HTTP POST, PUT, and OPTIONS methods will all automatically deserialize the contents of $c->request->body based on the requests content-type header. A list of understood serialization formats is below.

And the docs for dojo/xhrPut point to those for dojo/xhrGet for parameters, which include:

headers

A JavaScript object of name/string value pairs. These are the headers to send as part of the request. For example, you can use the headers option to set the Content-Type, X-Method-Override, or Content-Encoding headers of the HTTP request.

This parameter is optional

So all I had to do in my JavaScript code was set the Content-Type header:

    var xhrArgs = {
        url: ajaxurls.sort + '/' + editing.id,
        putData: dojo.toJson(data),
        handleAs: "json",
        // tells Catalyst::Controller::REST how to deserialize the body
        headers: {'Content-Type': 'application/json'},
        load: function(data){
            // don't really need to do anything here
            // uncomment for testing
            // console.log("new sort order put to server");
        },
        error: function(error){
            alert('warning, edits were not saved properly.  Proceed with caution');
        }
    };
    // Call the asynchronous xhrPut
    var deferred = dojo.xhrPut(xhrArgs);

And the controller magically started to work as expected. Hooray, git commit and all that, but it is time to go to sleep and actually get things working some other day.

December 15, 2009

It’s a razor thin line, but obvious which side you’re on

Filed under: Uncategorized — jmarca @ 9:45 am

So yesterday was the first day of Secret Santas at Grace’s school. Same dreadful drill as when we were kids… pick names out of a hat, get somebody you’re not friends with, and then try to think up gifts all week long. What with Nutcracker rehearsals and performances, I didn’t hear about it until Sunday night, but the girls and I had made cookies over the weekend so that seemed like an appropriate gift, fitting the “small, homemade” type of requirements. So we got our act together, and Grace gave a decent gift. (more…)

December 10, 2009

Finally on the air

Filed under: civil war history — jmarca @ 2:49 pm

My mother-in-law just got notified that her interview with Tavis Smiley is finally going to get aired Dec 11, and will subsequently be posted forevermore on-line!

Hello Ms. Tomblin,

I just wanted to reach out to you and let you know that your interview
with Tavis will be running on our show this weekend. You can likely
hear it on your local public radio station, or hear it on demand at our
website at www.tavissmileyradio.com.
It should be available by 12 noon PT on Friday (tomorrow).

I’m remembering the old days when we would tape interviews and songs off of the radio on warbly cassette tapes. Good times.

A Flash-related browser crash ate my bug report to MooseX::Declare

Filed under: code, perl — jmarca @ 1:56 pm

ggghhhhaaaa. I hate Flash. I really like http://proquest.safaribooksonline.com/, or rather, I used to love it, but now they’ve switched to Flash and it is hateful hateful hateful. But I still can’t stop using it because the information is so awesome and handy and because UCI has an account and it is right there waiting for me whenever I have a question. But then *bang* one page too many and Firefox just blinks off my desktop.

(more…)

December 2, 2009

Tedious but necessary

Filed under: code, couchdb, research, transportation — jmarca @ 3:17 pm

I’ve found that I prefer making things to maintaining things. My wife will testify that tidying up is not my forte, but that I don’t mind the most laborious cooking task.
(more…)

November 11, 2009

RJSONIO to process CouchDB output

Filed under: couchdb, research, transportation — jmarca @ 2:23 pm

I have an idea.  I am going to process the 5 minute aggregates of raw detector data I’ve stored in monthly CouchDB databases using R via RCurl and RJSONIO.  So, even though my data is physically split into months, I can use RCurl to pull from each of the databases, use RJSONIO to parse the JSON, and then use bootstrap methods to estimate the expected value and confidence bounds and, perhaps more importantly, try to identify outliers and unusual events. (more…)
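To make the bootstrap step concrete, here is a minimal sketch of a percentile bootstrap for the mean. It is written in JavaScript rather than R only to match the other code on this blog, and the function name, option names, and defaults are all assumptions made up for illustration, not anything from RCurl or RJSONIO.

```javascript
// Hypothetical helper: plain arithmetic mean of an array of numbers.
const mean = (xs) => xs.reduce((a, b) => a + b, 0) / xs.length;

// Percentile bootstrap for the mean (names and defaults are illustrative).
// rng is injectable so a seeded generator can make runs repeatable.
function bootstrapMean(values, { resamples = 1000, alpha = 0.05, rng = Math.random } = {}) {
  if (values.length === 0) throw new Error('need at least one value');
  const means = [];
  for (let b = 0; b < resamples; b++) {
    // Draw a sample of the same size, with replacement.
    const sample = [];
    for (let i = 0; i < values.length; i++) {
      sample.push(values[Math.floor(rng() * values.length)]);
    }
    means.push(mean(sample));
  }
  means.sort((x, y) => x - y);
  // Read the confidence bounds off the sorted resampled means.
  const at = (q) => means[Math.min(resamples - 1, Math.floor(q * resamples))];
  return { estimate: mean(values), lower: at(alpha / 2), upper: at(1 - alpha / 2) };
}
```

An observation that falls far outside the bounds for its time slot would then be a candidate outlier or unusual event, though how far is "far" is a judgment call this sketch does not make.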

November 2, 2009

Tokyo Tyrant Throwing a Tantrum

Filed under: couchdb, tokyocabinet — jmarca @ 9:32 pm

Well, last Friday I posted “So, slotting 4 months of data away.  I’ll check it again on Monday and see if it worked.”

It didn’t.  Actually I checked later that same day and all of my jobs had died due to recv errors.  I’ve tried lots of hacky things but nothing seems to do the trick.  From some Google searching, it seems that perhaps it is a timeout issue, but I can’t see how to modify the Perl library to allow for a longer timeout.

So, I wrote a little hackity hack thing to stop writing for 5 seconds, make a new connection, and go on writing.  Now it only crashes out of the loop if that new connection also fails to write.  And I also don’t crash until I save my place in the CSV file, so I don’t repeat myself.  So I’m not getting a complete failure, but it is still super slow.
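The shape of that hack, sketched in JavaScript rather than the Perl I actually used.  Everything named here (`connect`, `put`, `saveCheckpoint`) is a hypothetical stand-in passed in by the caller, not the Tokyo Tyrant client API.

```javascript
// Resolve after ms milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// rows: array of { key, value } parsed from the CSV file.
// connect(): hypothetical, returns a connection with an async put(key, value).
// saveCheckpoint(i): hypothetical, records the index of the first unwritten row.
async function loadRows(rows, connect, saveCheckpoint, pauseMs = 5000) {
  let conn = await connect();
  for (let i = 0; i < rows.length; i++) {
    try {
      await conn.put(rows[i].key, rows[i].value);
    } catch (firstError) {
      // First failure: stop writing for a bit, reconnect, retry the same row once.
      await sleep(pauseMs);
      conn = await connect();
      try {
        await conn.put(rows[i].key, rows[i].value);
      } catch (secondError) {
        // The new connection failed too: save our place, then give up.
        await saveCheckpoint(i);
        throw secondError;
      }
    }
  }
  await saveCheckpoint(rows.length);
  return rows.length;
}
```

A restarted job would read the checkpoint and skip that many rows of the CSV file before writing again.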

While the documentation for Tokyo Tyrant and Tokyo Cabinet is super great in general, it seems thin on use cases and examples for stuffing a lot of data into the table db at once.

Interesting, probably unrelated fact:  the crashing only started when I recomputed my target bnum, and boosted it from 8 million to 480 million.

Anyway, I had time today to tweak the data load script, and also to finalize my CouchDB loading script.  Having started two jobs each, with Tokyo Tyrant started first, it looks like CouchDB is going to finish first (the January CouchDB job is completing three days for every one day in the Tokyo Tyrant job;  the March jobs are closer together, but that Tyrant job started about an hour before everything else).

I guess there is still a way for Tokyo Tyrant to win this race.  I am planning to set up a map/reduce type of view on my CouchDB datastore to collect hourly summaries of the data.  It might be that computing that view is slow, and that computing similar summaries on the Tokyo Cabinet table is faster.  We’ll see.
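A rough sketch of what such a view might look like.  The document fields (`sensor_id`, `ts`, `volume`) are assumptions made up for illustration, not my actual schema, and `emit` is stubbed out here so the functions can be run outside CouchDB; inside CouchDB the view server supplies `emit`, and only the bodies of the map and reduce functions would go into the design document.

```javascript
// Stand-in for the emit() that CouchDB's view server provides to map functions.
var emitted = [];
function emit(key, value) { emitted.push({ key: key, value: value }); }

// Map: bucket each 5-minute record by sensor and hour.
// Assumes ts is an ISO 8601 string, so "2009-01-05T08:05:00Z".slice(0, 13)
// gives the hour bucket "2009-01-05T08".
function mapHourly(doc) {
  if (doc.sensor_id && typeof doc.ts === 'string' && typeof doc.volume === 'number') {
    emit([doc.sensor_id, doc.ts.slice(0, 13)], { sum: doc.volume, count: 1 });
  }
}

// Reduce: add up partial { sum, count } pairs.  Map output and reduce output
// have the same shape, so the rereduce case needs no special handling.
function reduceHourly(keys, values, rereduce) {
  var total = { sum: 0, count: 0 };
  values.forEach(function (v) {
    total.sum += v.sum;
    total.count += v.count;
  });
  return total;
}

// Local imitation of querying the view with group=true.
function hourlySummaries(docs) {
  emitted = [];
  docs.forEach(mapHourly);
  var groups = {};
  emitted.forEach(function (row) {
    var k = JSON.stringify(row.key);
    (groups[k] = groups[k] || []).push(row.value);
  });
  return Object.keys(groups).map(function (k) {
    return { key: JSON.parse(k), value: reduceHourly(null, groups[k], false) };
  });
}
```

The hourly mean is then just `sum / count` for each key.  Carrying both numbers through the reduce, rather than a running average, is what keeps the result correct when CouchDB combines partial reductions.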

 

October 30, 2009

Tokyo Tyrant is cool

Filed under: couchdb, tokyocabinet — jmarca @ 10:30 pm

Just to have a recollection of this later, some notes.

Setting up Tokyo Tyrant instances, one per month.  I expect about 4 million records a day, so that is 120 million a month, so I set bnum to 480 million (4 times that), which seems insane, but worth a shot.
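The arithmetic, spelled out.  The 30-day month is my rounding, and the multipliers are just the ones implied by the numbers in this post (2 for the one-day tests, 4 for the monthly databases); `bnumFor` is a made-up helper, not part of any Tokyo Cabinet tool.

```javascript
// Hypothetical helper: bucket count = expected records times a safety factor.
function bnumFor(expectedRecords, factor) {
  return expectedRecords * factor;
}

const recordsPerDay = 4e6;                      // rough estimate from the post
const recordsPerMonth = recordsPerDay * 30;     // 120 million, assuming 30 days
const dailyBnum = bnumFor(recordsPerDay, 2);    // 8 million, used for one-day tests
const monthlyBnum = bnumFor(recordsPerMonth, 4); // 480 million, one db per month
```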

One thing I noticed in shifting from one-day tests to populating a whole month, with the bump up of bnum from 8 million (2 times 4 million) to 480 million, is a significant speed drop when populating the data from four simultaneous processes (one for each of 4 months).

There is a write delay of course, and that may be all of it, since the files are big now.

Perhaps there is a benefit from wider tables, rather than one row per data record?  Like one row per hour of data per sensor, or one row per 5 minutes, etc?

Also, as I wrapped up my initial one-day tests, I got some random crashes on my perl script stuffing data in.  Not sure why.  Could be because I was tweaking parameters and stuff.

One final point: the size of the one day of data in Tokyo Cabinet is about the same as the size of one day of data in CouchDB.  I was hoping to get a much bigger size advantage (smaller file).  The source data is about a 100M unzipped CSV file, and it balloons to 600M with bnum set at 8 million in a table database.  Of course, it isn’t strictly the same data… I am splitting the timestamp into parts so I can do more interesting queries without a lot of work (give me an average of data on Mondays in July; Tuesdays all year; 8 am to 9 am last Wednesday, etc.).
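To show what I mean by splitting the timestamp, here is a sketch in JavaScript (my loader is Perl).  The column names and the `sensor_id`/`volume` fields are invented for the example, and it uses UTC throughout, which is an assumption; real detector data would want local time handled properly.

```javascript
// Break an ISO 8601 timestamp into separately queryable parts.
function splitTimestamp(iso) {
  const d = new Date(iso);
  if (Number.isNaN(d.getTime())) throw new Error('bad timestamp: ' + iso);
  return {
    year: d.getUTCFullYear(),
    month: d.getUTCMonth() + 1, // 1-12
    day: d.getUTCDate(),
    hour: d.getUTCHours(),
    minute: d.getUTCMinutes(),
    dow: d.getUTCDay(),         // 0 = Sunday ... 6 = Saturday
  };
}

// Build the wider stored row from a raw record (hypothetical field names).
function toRow(record) {
  return Object.assign(
    { sensor_id: record.sensor_id, volume: record.volume },
    splitTimestamp(record.ts)
  );
}

// Average volume over the rows matching a predicate, or null if none match.
function averageWhere(rows, predicate) {
  const hits = rows.filter(predicate);
  if (hits.length === 0) return null;
  return hits.reduce((sum, r) => sum + r.volume, 0) / hits.length;
}
```

With rows stored this way, "Mondays in July" becomes a plain equality test on two columns, for example `averageWhere(rows, (r) => r.month === 7 && r.dow === 1)`, with no date parsing at query time.  The cost is the extra columns, which is part of why the file balloons.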

So, slotting 4 months of data away.  I’ll check it again on Monday and see if it worked.

And by the way, I’m sure I’m not the best at Tokyo Tyrant because I haven’t used it much, but it is orders of magnitude faster to use the COPY command via DBIx::Class to load CSV data into PostgreSQL.  Of course, I don’t want to have all of that data sitting in my relational database, but I’m just saying…

 

 

October 26, 2009

Putting stuff away

Filed under: couchdb, tokyocabinet — jmarca @ 8:34 am

Started testing out TokyoCabinet and TokyoTyrant last Friday, and got my initial test program running this morning.  The documentation is pretty good, but I’m still floundering about a little bit.  Not sure what parameters to pass to the B+ tree database file to make it work well for my data; not sure how to set up multiple databases for sharding; etc.  On the plus side, my Perl code that loads the data is running at about 50% CPU, so it is doing something rather than waiting around for writes.  On the down side, now I have to write a small program to check on the progress of those writes to make sure that I am actually writing something!

Update.  I am comparing storing in TokyoTyrant with storing in CouchDB.  CouchDB, it turns out, is faster for me out of the box because of the way Erlang takes advantage of the multi-core processor.  The Tokyo Tyrant server just maxes out one core, and so my loading programs wait around for the server to process the data.  CouchDB, on the other hand, will use up lots more cores (I’ve seen the process go to about 400% in top).  So loading a year of data with one data reading process per month simultaneously, TokyoTyrant is only up to day 6 of each month, while my CouchDB loader programs are all up to about day 14 in each month.

I’m sure there is a way to set up TokyoTyrant to use multiple CPUs, but I can’t find it yet.

October 23, 2009

Related to someone getting more and more almost famous by the day

Filed under: civil war history — jmarca @ 2:14 pm

Well, my mother-in-law’s interview with Tavis Smiley still hasn’t been broadcast (perhaps they are saving it for February?), but she got a very good review from the Washington Times dated Oct 8, 2009.  Of course, the internet being the internet, it has totally fallen off the front page of the book review section and even the Military History section, but lives on in the hard disk cache in the sky.  If you google “Escaped Slaves and the Union Navy” you get right to the review page by Gordon Berg.

It is interesting to me that it takes a third book to start getting positive buzz that goes beyond friends and acquaintances.   While the topic helps a little bit, in that with Obama in the White House people are taking a fresh look at black history in our nation, I don’t think that is all of it.  Her book on “G.I. Nightingales” was also pretty good, and should have been just as popular, but didn’t get the buzz.  Nor is it just that after three books one’s writing is bound to improve.  Perhaps it is just that with three books reviewers are more likely to review a book, and the publisher is more likely to get traction marketing the book.

Maybe the next book will be optioned by Hollywood, then we’ll really be related to somebody famous!

Or maybe I will write four books on transportation engineering and get a movie made.

Or maybe one of the girls will finally write the book with the title “The Moon is the Nightime Sun” that they’ve been on about since they were 5…
