Using Metakit as an embedded database is quite robust for a number of reasons:
- commits are saved using Stable Storage, which "flips" a single page
- if anything bad happens before a commit completes, changes don't persist
- the original data file is mapped into memory in read-only mode
- after commit, the (possibly larger) file is remapped, again read-only
- so stray pointer writes anywhere in the application cannot damage the file
That leaves the following possibilities for damaging datafiles:
- catastrophic disk hardware failures
- a bug in the Metakit software
- opening the file r/w twice and committing both
- shutting power off, causing the disk to write a partial or damaged block
- corrupting in-memory data managed by Metakit and then doing a commit
The power-off damage can be prevented with a UPS (i.e. by making sure shutdowns never happen in an uncontrolled manner).
In-memory data corruption
The last case is the one where client-server databases have a clear advantage: if the database code is in a separate process and if all requests are properly checked for validity, then memory corruption can never damage the db.
With Metakit it is possible for the calling code to write into memory managed by Metakit, in a way which causes a subsequent commit to fail... slightly! - anything from changing a byte so a wrong value is written, to altering memory in such a way that the commit completes, but with a damaged data structure on disk.
A solution
The solution is to run the commit code in a separate process. One could call it a "half client/server" approach: clients can still access/read data as before, at maximum speed from their memory-mapped files in their own address space, but they must pass all modification requests and commits on to a backend process.
This offers a range of benefits beyond the memory protection guarantee:
- with a second process, dual-core processors can run both ends in parallel
- the front end need not be Metakit (I'm now actually using Ratcl instead)
- the write stream could be logged, producing an accurate database journal
- multiple-readers/single-writer is now feasible, with the backend arbitrating
- different Tcl/Tclkit executables can be used, even an 8.4 / 8.5 mix
- the back-end could handle more tasks, such as stored procedures and locking
- with changes saved up, isolation (ACID's "I") can be fully supported
Full timing tests have not been performed, but basic socket-based I/O within a single machine is very fast (I see up to 6 Kreq/sec sync and 50 Kreq/sec async on a Core 2 Duo Mac).
Implementation
It turns out that this is all very easy to implement in pure Tcl: roughly 150 lines of code, evenly split between frontend and backend, is all it takes.
Here's how this is being used in this first experimental version:
- a "backer" package provides three entry points: connect, send, and call
- connect sets up the back-end (more on this later)
- send asynchronously sends and executes a command on the backend
- call is a synchronous version, which waits and returns the result
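To make the send/call distinction concrete, here is a minimal sketch of how such a pair could work over a socket. This is not the actual backer code: the demo namespace, the {kind command} wire format and the readMessage helper are all made up for illustration. It sticks to Tcl 8.4 constructs so the same code could run on either side.

```tcl
# Hypothetical sketch, not the real "backer" package.
# Wire format (an assumption): one Tcl list per request, {kind command},
# where kind is "send" (no reply) or "call" (reply is {status result}).

namespace eval demo {
    variable chan ""   ;# socket to the backend, set by demo::connect
}

proc demo::connect {host port} {
    variable chan
    set chan [socket $host $port]
    fconfigure $chan -buffering line
}

# Asynchronous: write the request and return right away.
proc demo::send {args} {
    variable chan
    puts $chan [list send $args]
    return
}

# Synchronous: write the request, then block until the reply arrives.
proc demo::call {args} {
    variable chan
    puts $chan [list call $args]
    foreach {status result} [readMessage $chan] break
    if {$status ne "ok"} { error $result }
    return $result
}

# Collect lines until they form one complete Tcl list.
proc demo::readMessage {chan} {
    set msg ""
    while {[gets $chan line] >= 0} {
        append msg $line
        if {[info complete $msg]} { return $msg }
        append msg \n
    }
    error "connection closed"
}

# Backend side: run one request, return the reply ("" means no reply).
proc demo::handle {request} {
    foreach {kind cmd} $request break
    set failed [catch {uplevel #0 $cmd} result]
    if {$kind ne "call"} { return "" }
    return [list [expr {$failed ? "error" : "ok"}] $result]
}
```

Because requests are handled in order, a call issued after a series of sends also acts as a barrier: once it returns, all earlier sends have been executed.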
The trick is to separate the different mk::* commands into different categories:
- commands which do not change the database are not needed in the backend
- changes which can be sent asynchronously (mk::set, mk::row append, ...)
- changes which require waiting, in particular "mk::file commit"
It all depends a bit on whether the frontend will use Metakit or Ratcl calls.
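The categories above could be captured in a small routing table. The sketch below is only an illustration: the Route array and the route proc are hypothetical names, the table lists just a handful of mk::* commands, and anything it does not know is treated as a synchronous call to stay on the safe side.

```tcl
# Hypothetical routing table: how the frontend treats each mk:: command.
#   local = read-only, no need to involve the backend
#   send  = change, can go to the backend asynchronously
#   call  = must wait for the backend's answer
array set Route {
    mk::get            local
    mk::select         local
    mk::set            send
    {mk::row append}   send
    {mk::row insert}   send
    {mk::row delete}   send
    {mk::file commit}  call
}

# Look up "command subcommand" first, then the bare command.
# Unknown commands default to "call" (an assumption: slow but safe).
proc route {cmd args} {
    global Route
    set key "$cmd [lindex $args 0]"
    if {[info exists Route($key)]} { return $Route($key) }
    if {[info exists Route($cmd)]} { return $Route($cmd) }
    return call
}
```

For example, "route mk::row append db.people name Joe" yields send, while "route mk::file commit db" yields call.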
Using Metakit
With Metakit in both processes, the basic idea is to open the same datafile in both. In the front-end, it is opened read-only. In the backend: read-write.
When making changes, you basically apply the same mk::* commands to both. The nice detail is that all change commands sent to the backend can be sent asynchronously, which is substantially quicker.
On commit, the following must be done:
- send a synchronous "mk::file commit" to the backend
- wait for the result (errors get thrown back to the frontend if they occur)
- changes are now on disk
- then, in the frontend, an "mk::file rollback" (!)
- what happens, is that the frontend will resync itself with the on-disk state
- which is exactly what we want: we're updating it to the latest changes
While I have not yet explored this scenario, it looks like it should be simple.
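As an untested sketch of that commit sequence: the proc name mkcommit is hypothetical, and it assumes that the backer package and Mk4tcl are both loaded in the frontend and that the datafile is open under the same tag in both processes.

```tcl
# Hypothetical helper for a Metakit frontend (untested sketch).
proc mkcommit {tag} {
    # Synchronous: returns only after the backend has committed.
    # If the backend reports an error it is raised here, and the
    # rollback below is skipped.
    backer call mk::file commit $tag

    # Drop the frontend's own uncommitted copy of the changes, so the
    # read-only frontend resyncs with what is now on disk.
    mk::file rollback $tag
}
```

The order matters: rolling back before the backend has confirmed the commit would resync the frontend with the old on-disk state.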
Using Ratcl
With Ratcl, a similar approach is taken, but a bit more work is needed since changes cannot be made in the same way. There are two choices here:
- do not make changes in Ratcl, just send appropriate changes to the backend
- (in this case, the changes will not appear in the frontend until commit)
- make changes in Ratcl, using Ratcl's operators, and use different ones for MK
- (this is more work, you have to implement database change code twice)
For now, I've only tried the first approach, making changes by "sending" them to the backend, and assuming I'll get to see the effects after the next commit.
It's trivial to do this. I've defined a few helper procs for this:
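# open the datafile on first use (or after a commit), then run a Ratcl pipeline
proc db {args} {
    global Db Datafile
    if {![info exists Db]} {
        set Db [view $Datafile open]
    }
    uplevel [list view $Db get 0 {*}$args]
}
# send an mk::* change command to the backend, asynchronously
proc mkdo {cmd args} {
    backer send mk::$cmd {*}$args
}
# forget the cached view, then commit synchronously in the backend
proc commitdb {} {
    global Db
    unset -nocomplain Db
    backer call mk::file commit db
}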
The db proc is a convenience command for Ratcl: it opens (and re-opens!) the datafile whenever it needs to. So for example, a datafile with people's names would be accessed as follows in "plain" Ratcl:
set Datafile myfile.db
view $Datafile open | get 0 people | loop { puts "name: $(name)" }
With the "db" utility definition, this becomes:
db people | loop { puts "name: $(name)" }
The mkdo proc is again a convenience. It is used in place of the ordinary mk::* commands. So for example this:
mk::row append db.people name Joe
becomes this when using the backend:
mkdo row append db.people name Joe
Note that mkdo calls cannot return a value since they are asynchronous. If a return value is needed, you'll need to use this instead:
set row [backer call mk::row append db.people name Joe]
It's worth trying to avoid that, since it prevents some parallelism.
And finally, when it's time to commit the changes:
commitdb
That's it. The side-effect of commitdb is that it undefines the Db global variable storing the database view for Ratcl. So on next use, "db" will re-open the datafile and automatically pick up all the changes (opening a datafile is very fast in Ratcl).
The Ratcl-front / Metakit-back approach looks very promising so far. It not only separates functionality, it actually makes it unnecessary to have Metakit in the frontend. I'm currently using an 8.5 Tclkit Lite build for the frontend and a "classical" 8.4 Tclkit for the backend, and so far it's going nicely.
The backend
The backend process needs to be managed. The current code makes this quite easy for a very specific scenario, running one backend for one frontend: it creates the backend process whenever the frontend starts, and stops it when the frontend exits. All combinations are handled, i.e. when either side crashes, and there is logic in the frontend to transparently restart its backend if ever needed.
The backend server socket is bound to the loopback interface by default, so that outside access is prevented. Because that still leaves access open within the same machine and since the backend has r/w access to datafiles, a simple passkey check is enforced when clients connect.
Still some rough edges to work out, but not bad for a couple of hours' work!
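A loopback-only listener with a passkey check might look roughly like the sketch below. All names in it (Passkey, keyMatches, accept, firstLine, startBackend) are hypothetical, as is the convention that the passkey is the first line a client sends; the real code may do this differently.

```tcl
# Hypothetical sketch of a loopback-only server with a passkey check.
set Passkey "change-me"   ;# assumption: would be generated per run

# An empty expected key never matches, so a missing key cannot open the door.
proc keyMatches {expected got} {
    return [expr {$expected ne "" && $got eq $expected}]
}

proc accept {chan addr port} {
    fconfigure $chan -buffering line -blocking 0
    fileevent $chan readable [list firstLine $chan]
}

# The first line from a client must be the passkey, else we hang up.
proc firstLine {chan} {
    global Passkey
    if {[gets $chan line] < 0} {
        if {[eof $chan]} { close $chan }
        return   ;# no complete line yet, wait for more input
    }
    if {![keyMatches $Passkey $line]} {
        close $chan
        return
    }
    # Key accepted: a real backend would now install its request
    # dispatcher as the readable handler for this channel.
    fileevent $chan readable {}
    puts $chan ok
}

proc startBackend {port} {
    # -myaddr 127.0.0.1 binds the listener to the loopback interface only
    return [socket -server accept -myaddr 127.0.0.1 $port]
}
```

The server only starts accepting once the event loop runs, for instance via "startBackend 0" (port 0 lets the system pick a free port) followed by "vwait forever".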
-jcw, 2007-06-17
