Booknotes 6.
I’m writing a guide to building your own database server. Here’s what I’ve been doing:
Progress
This week I’ve hit the milestone of having a complete sample implementation!
Grouping and aggregation
Grouping and aggregation were the most complex parts to implement. This complexity comes from how grouping changes the structure of the data it acts on and how that interacts with type checking and the operations that run after the grouping.
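To illustrate how grouping interacts with type checking, here is a minimal sketch of one check a type checker might make after a GROUP BY. It is not the guide's implementation: the function name `check_select_after_grouping` and the tuple shapes for select items are hypothetical, and the rule is simplified (real databases may also accept columns that are functionally dependent on the grouping keys).

```python
def check_select_after_grouping(select_items, group_by):
    """Return error messages for select items that are invalid after grouping.

    select_items: list of ("column", name) or ("aggregate", func, name)
                  tuples (a hypothetical representation for this sketch).
    group_by:     set of column names used as grouping keys.
    """
    errors = []
    for item in select_items:
        kind = item[0]
        # After grouping, each output row stands for many input rows, so a
        # bare column only has a single value if it is a grouping key.
        if kind == "column" and item[1] not in group_by:
            errors.append(
                f"column '{item[1]}' must appear in GROUP BY "
                "or be used in an aggregate"
            )
    return errors


# SELECT city, SUM(sales) ... GROUP BY city
valid = check_select_after_grouping(
    [("column", "city"), ("aggregate", "sum", "sales")], {"city"}
)

# SELECT city, sales ... GROUP BY city
invalid = check_select_after_grouping(
    [("column", "city"), ("column", "sales")], {"city"}
)
```

In the second query `sales` has many values per group, so the check reports it, while the same column wrapped in an aggregate passes.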
I looked at a few databases to see how they implement grouping. SQL database implementations tend to have speed as their goal rather than simplicity and readability of the code, which can make them hard to follow. H2 is one of the few that isn’t written in a low-level C-like language.
An insight that helps when implementing aggregation is that databases that process data in rows usually use an accumulator, calculating the result one row at a time rather than trying to process a whole column of data at once.
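The accumulator idea can be sketched roughly as follows. This is a simplified illustration rather than the guide's sample implementation: the class names, the `group_and_aggregate` helper and the example rows are all made up for the sketch, and rows are represented as plain dictionaries.

```python
class SumAccumulator:
    """Running total. NULLs (None) are skipped, as SQL's SUM does."""

    def __init__(self):
        self.total = None  # SUM over no non-NULL values is NULL

    def add(self, value):
        if value is None:
            return
        self.total = value if self.total is None else self.total + value

    def result(self):
        return self.total


class CountAccumulator:
    """Counts non-NULL values, like COUNT(column)."""

    def __init__(self):
        self.count = 0

    def add(self, value):
        if value is not None:
            self.count += 1

    def result(self):
        return self.count


def group_and_aggregate(rows, key, column, make_accumulator):
    """Feed rows one at a time into a per-group accumulator."""
    accumulators = {}
    for row in rows:
        group = row[key]
        if group not in accumulators:
            accumulators[group] = make_accumulator()
        accumulators[group].add(row[column])
    return {group: acc.result() for group, acc in accumulators.items()}


# Hypothetical example data
rows = [
    {"city": "London", "sales": 10},
    {"city": "Paris", "sales": 5},
    {"city": "London", "sales": None},
    {"city": "London", "sales": 7},
]

totals = group_and_aggregate(rows, "city", "sales", SumAccumulator)
counts = group_and_aggregate(rows, "city", "sales", CountAccumulator)
```

Each accumulator only holds a small amount of state per group, so the rows never need to be collected into a full column before the result can be calculated.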
The SQL standard
I have wanted to take a look at the SQL standard for a while to clarify some behaviour, but didn’t want to pay the hundreds it costs to purchase the standard. I went to the library at the IET (The Institution of Engineering and Technology), which lets you view many standards through its network. It is open to the public for free and was a nice place to work from for the day.
The standard was also helpful in confirming the exact meaning of certain terms and making it clear where implementations have the freedom to do their own thing.
HYTRADBOI conference
I spent Friday watching talks from HYTRADBOI. Since the talks were about databases and programming languages, I can’t imagine a more relevant conference for me.
My favourite talks were “A quick ramp-up on ramping up quickly” (about the tradeoff between optimizing for initial run time and later runs in JavaScript engines), and “A case for feminism in programming language design”.
I also learnt about the pipe syntax that is now available to use in BigQuery, and about the process of building things by throwing them away.
My lightning talk on the odd bits of SQL is also available.
Updated chapter list
Going through the sample implementation again has helped me break up the guide into sensible chunks. I’ve updated the chapter list on the landing page for the guide. Most of my time in the next few weeks will be spent writing the remaining chapters for part one of the guide.