Half-baked post here, mostly just a note to self.
I want to store data and time together, like in the description of Datomic. In fact I *have* been doing this for some time, but I want to be more formal about it.
First, all my attempts within PostgreSQL suck. I end up with massive tables that need to be reduced after the fact and that don’t really do what I want.
CouchDB’s document approach is interesting, in that you can store a timestamp in the doc, and/or use the timestamp as part of the doc id, but it is also annoying…for a case like VDS, I just want to record that the detector started at some point and stopped at some other point, but since the stop isn’t announced, I need to record every time the
Suppose I have a batch of VDS detectors, announced by a file that says that as of that day, all of those detectors are on. When the next file is received, only the detectors in that file are still on; all the ones not in the file are implicitly off.
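To make the "implicitly off" idea concrete, here is a rough Python sketch that compares two successive listings. The function name `implicit_changes` and the detector ids are made up for illustration, and I am assuming each listing file can be boiled down to a set of detector ids.

```python
def implicit_changes(previous_ids, current_ids):
    """Compare two successive detector listings (any iterables of ids).

    Hypothetical helper: a listing only announces what is on, so
    anything in the previous listing but not the current one is
    implicitly off.
    """
    previous_ids = set(previous_ids)
    current_ids = set(current_ids)
    return {
        "still_on": previous_ids & current_ids,
        "turned_on": current_ids - previous_ids,
        "implicitly_off": previous_ids - current_ids,
    }


changes = implicit_changes({"vds1", "vds2"}, {"vds2", "vds3"})
```

Here `changes["implicitly_off"]` holds `vds1`, which is exactly the set I would rather not have to go back and stamp with an end date.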
But if I am also storing data from all of those detectors at the same time, then in theory at least I should know whether or not the detector is on for any given day.
Okay, but what about the other metadata, like the segment length, that might change because of other changes in the system while leaving the core facts about the detector (it is on the freeway, at a point in space, collecting data for a bunch of lanes) intact? Those facts need to be written down each time they change. If a detector has no entry in the new detector listing, then the detector is off, its last recorded state is really the penultimate state (the final state being "off"), and the last day that the detector might have been active with those characteristics is the last day it collected data.
I’m trying to talk myself out of having to write a date stamp to “all the detectors not in this file that do not yet have a final active date”, which seems stupid and inefficient. And I don’t think I have to do that. The facts as of the last update are true until those facts are changed. The fact that the detector is not in the new detector list is irrelevant because I know that it isn’t collecting data anymore.
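A minimal sketch of "facts are true until they change", assuming listings arrive in date order and each one maps a detector id to a dict of facts. The names `record_listing` and `segment_length` and the sample values are hypothetical. The point is that a detector missing from a listing gets no write at all.

```python
from datetime import date


def record_listing(history, listing_day, listing):
    """Append a (day, facts) entry only when a detector's facts changed.

    history: dict of detector id -> list of (day, facts), oldest first.
    listing: dict of detector id -> facts for the detectors in the file.
    Detectors absent from the listing are deliberately left untouched.
    """
    for det_id, facts in listing.items():
        entries = history.setdefault(det_id, [])
        if not entries or entries[-1][1] != facts:
            entries.append((listing_day, facts))
    return history


history = {}
record_listing(history, date(2012, 1, 1),
               {"vds1": {"segment_length": 1.2},
                "vds2": {"segment_length": 0.8}})
# vds2 is missing from this file, so nothing is written for it.
record_listing(history, date(2012, 2, 1),
               {"vds1": {"segment_length": 1.5}})
# Same facts again for vds1, so no new entry.
record_listing(history, date(2012, 3, 1),
               {"vds1": {"segment_length": 1.5}})
```

After those three listings, `vds1` has two entries (January and February) and `vds2` still has just its January entry, with no end date written anywhere.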
To get the descriptive facts about a detector for a particular day’s data, all I need to do is get the most recent document from the detector state db whose date is less than or equal to the day in question.
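That lookup is a "latest entry on or before this day" search. Here is a sketch in plain Python using `bisect`, assuming one detector's state entries are kept as a date-sorted list of `(day, facts)` pairs. The name `state_as_of` and the sample facts are mine, not anything from a real schema.

```python
from bisect import bisect_right
from datetime import date


def state_as_of(entries, day):
    """Return the facts from the latest entry dated on or before `day`.

    entries: list of (day, facts) pairs sorted by day, oldest first.
    Returns None if `day` is earlier than the first recorded state.
    """
    days = [entry_day for entry_day, _ in entries]
    index = bisect_right(days, day)  # count of entries with date <= day
    return entries[index - 1][1] if index else None


vds1_states = [
    (date(2012, 1, 1), {"segment_length": 1.2}),
    (date(2012, 2, 1), {"segment_length": 1.5}),
]
facts = state_as_of(vds1_states, date(2012, 1, 20))
```

Here `facts` is the January state, because the February change had not happened yet. In CouchDB I think the equivalent would be a view keyed on something like `[detector id, date]`, queried with `descending=true`, `limit=1`, and a `startkey` of the detector id plus the day in question, but I have not tried that yet.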