Consistency isn’t a design goal for CouchDB

Okay, so figure 2.1 of the CouchDB book says that consistency isn’t a goal of CouchDB. So the worry in my prior post, that having no foreign keys or FK constraints could result in inconsistent statements, isn’t something I should worry about. Instead, I should expect that data from the database may be internally inconsistent, one record to the next, and try to minimize my reliance upon the DB to maintain consistency.

I understand that the figure probably doesn’t refer to consistency in the same way that I am, but so what? If I have data in a PostgreSQL db, then I can make sure that the city of Orange is in Orange County, which is in District 12, by using join tables and foreign key constraints. The join table data will always be consistent across db nodes, and will prevent me from making false statements about what district the city of Orange is in. At the same time, as a side effect, these foreign keys allow me to do joins in queries that let me get all of the cities in District 12. Or all of the VDS detectors inside of a city, and so on.

In CouchDB, the solution for the second problem (query all detectors in D12, etc.) is to stuff the path data you might want to search on into the document. This is bad from a design standpoint, because it forces the app to maintain the consistency of each node’s path data. Apps make mistakes. And frankly, I don’t need that sort of path searching capability. I would like to select every node in District 12, but not by city or by county or whatnot. If I really want to build the full tree, the best way is to store just the parent node in each document, and then rebuild the tree with a series of recursive queries. The join-table side effect of consistent databases isn’t available, so I need to stop trying to use it.
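Here is a rough sketch of the parent-only idea in plain JavaScript. The documents sit in an in-memory object rather than in CouchDB, and the `_id` values, the `parent` field, and the `pathTo` helper are all names I made up for illustration. In a real app each lookup would be a separate fetch by id.

```javascript
// Hypothetical documents: each one stores only its immediate parent.
const docs = {
  "district:12": { _id: "district:12", name: "District 12", parent: null },
  "county:orange": { _id: "county:orange", name: "Orange County", parent: "district:12" },
  "city:orange": { _id: "city:orange", name: "Orange", parent: "county:orange" }
};

// Walk up the parent links recursively and return the path from the root down.
// Here docs[id] is a plain object lookup; against a database it would be a fetch.
function pathTo(id) {
  const doc = docs[id];
  if (!doc) throw new Error("missing document: " + id);
  if (doc.parent === null) return [doc.name];
  return pathTo(doc.parent).concat(doc.name);
}

console.log(pathTo("city:orange").join(" > "));
// prints "District 12 > Orange County > Orange"
```

Because the district is written down in exactly one place, a city can’t disagree with its county about which district it is in. The price is one lookup per level of the tree.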

Possibly inconsistent data

One of the things I am trying to figure out with CouchDB is how to structure data so that it can’t be internally inconsistent. What is that called, normalized, I guess.

So suppose I have Caltrans District, County, and City, all of which are cleanly delimited, etc. etc. In a relational database, I’d enforce consistency by using foreign key constraints, so District 12 links to Orange County, and there can only be one link from a county to a district, etc. But in CouchDB you don’t get foreign keys. So if I want to include data on the district, etc., I have to shove it into the document. But that means I can make mistakes, and no one will stop me.

So I can have one document that says:

{
  "City": "Costa Mesa",
  "County": "Orange",
  "District": 12
}

and another that says

{
  "City": "Newport Beach",
  "County": "Orange",
  "District": 7
}

Nothing stops me from saving both, even though the county of Orange should never be understood to be in District 7. Storing just the one-level-up parent in each document doesn’t help either, because then I can’t sort on

[District,County,City]
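Since the database won’t catch the mismatch in the two documents above, the app has to look for it. Here is a minimal sketch of such a check in plain JavaScript, run over an in-memory array rather than a CouchDB view. The `findConflicts` helper is a hypothetical name, and the field names are the ones from the example documents.

```javascript
// The two example documents from above.
const docs = [
  { City: "Costa Mesa", County: "Orange", District: 12 },
  { City: "Newport Beach", County: "Orange", District: 7 }
];

// Collect the set of districts claimed for each county, and report
// every county that shows up under more than one district.
function findConflicts(docs) {
  const seen = new Map(); // county name -> Set of district numbers
  for (const doc of docs) {
    if (!seen.has(doc.County)) seen.set(doc.County, new Set());
    seen.get(doc.County).add(doc.District);
  }
  const conflicts = {};
  for (const [county, districts] of seen) {
    if (districts.size > 1) {
      conflicts[county] = Array.from(districts).sort((a, b) => a - b);
    }
  }
  return conflicts;
}

console.log(JSON.stringify(findConflicts(docs)));
// prints {"Orange":[7,12]}
```

This only reports the disagreement after the fact. It can’t say which document is wrong, and it doesn’t prevent the bad write the way a foreign key constraint would.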

And while I am on the subject of sorting, I can’t yet figure out how to get a numerical sort of districts. They are called 1, 2, 3, …, 12, but when I sort them on District_id in the view I get “1”, “10”, “11”, etc., which is alpha sorting, not numeric ordering, presumably because the ids are stored as text. I figure I’ll get that one sorted eventually. I saw something about how to sort on dates, so I suppose it is a similar hack, or else writing javascript to convert text to numbers in the view function before emitting the key.
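A sketch of the convert-before-emitting idea follows. CouchDB’s view collation orders numeric keys numerically and string keys as text, so the conversion is what should change the ordering. The `emit` function here is a stand-in I wrote so the map function can run outside the database, and the sample ids are assumptions. Only the `District_id` field name comes from my own view.

```javascript
// Stand-in for CouchDB's emit(), so the map function can run outside the database.
const rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map function: convert the text id to a number before emitting it as the key.
function map(doc) {
  if (doc.District_id !== undefined) {
    emit(parseInt(doc.District_id, 10), null);
  }
}

// Sample documents with text ids, in no particular order.
["1", "10", "11", "2", "12", "3"].forEach(function (id) {
  map({ District_id: id });
});

// CouchDB would order the numeric keys itself; this sort mimics that locally.
rows.sort(function (a, b) { return a.key - b.key; });

console.log(JSON.stringify(rows.map(function (r) { return r.key; })));
// prints [1,2,3,10,11,12]
```

Inside a real design document only the `map` function would be stored, since CouchDB supplies `emit` and does the ordering.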

Knitting diamonds in the round, hm?

Blog stats are funny. My incomplete and possibly incorrect posting for my original diamond lace hat is by far the most popular thing I’ve written (popular being a relative term, with only like 300 views). Looking at the stats, it is mostly one-off Google searches for diamond lace knitted in the round, etc. etc. So I really will make an effort to post the actual chart that I used for my second hat, which came off without any glitches and flew off my needles in two evenings.

I’m actually working right now on a cabled hat (v2) and am taking notes on the decreases.  The cables themselves are pretty easy (using Barbara Walker’s second knitting treasury as source, and her advice that fisherman’s sweaters are vertical cable samplers—so this hat is just a cable sampler).

But first I’ve got another monster project (a scarf) to finish. My goal is 60 rows a night, but so far I’ve only done 18 one night and then 24 the next. At 24 rows a night, I will finish by Christmas, but with no time to spare.