''As of October 2007, I'm exploring some fundamental enhancements to Ratcl -jcw''

** Missing values **

Until now, Ratcl has not really dealt with missing values, or only in some hackish way at best. Metakit did not support missing values at all. Without wishing to open a can of worms on this issue, which is hotly debated among database experts, there are some problems in Ratcl with this approach:

   * inserting a view into another one with different columns leads to undefined content
   * only inner joins can be ungrouped ("flattened")
   * a decent SQL layer cannot be implemented on top of Ratcl

That first problem is fairly serious: you can't insert "empty space" in a view first, and then fill in the cells with data. In particular, you can't easily insert N empty slots by specifying the number "N" as the view to be inserted. In terms of planned change propagation, such a capability would simplify things by allowing this operation to be implemented in two separate valid steps. As for the other two problems: SQL is a de-facto standard, like it or not.

The planned enhancement to Ratcl is to fully support missing values (sometimes these are called "NULL values", which is actually a contradiction in terms...), not as an add-on but as a design property which is deeply integrated into Ratcl's column-wise data structures. It turns out that there are some very nice vectorized tricks to be played when not all items exist. Hopefully, some advantages will become apparent later in terms of performance and memory use. The assumption in this new design is that in the most common case either only a few values are missing or vectors are mostly empty, i.e. very sparse.

To allow missing values, columns need to be defined with a lowercase type. This parallels the uppercase convention used in Metakit, which does not support missing values. For all types I, S, etc. you can define columns using "i", "s" instead to indicate that this column can track the presence / absence of values.
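
The two-step idea of inserting N empty slots first and filling in cells later can be sketched in plain Tcl. This only illustrates the concept and is not Ratcl's implementation: the procs `col_new`, `col_set` and `col_empty`, and the representation of a column as a pair of lists, are assumptions made up for this example.

```tcl
# Hypothetical sketch: a column is a two-element list {present values},
# where "present" holds one 0/1 flag per row.

# Create a column with n missing slots (n >= 1).
proc col_new {n} {
    return [list [lrepeat $n 0] [lrepeat $n {}]]
}

# Return a new column with a value stored in the given row.
proc col_set {col row value} {
    lassign $col present values
    lset present $row 1
    lset values $row $value
    return [list $present $values]
}

# Return 1 if the row holds no value, 0 otherwise.
proc col_empty {col row} {
    return [expr {![lindex $col 0 $row]}]
}

set c [col_new 3]          ;# step 1: insert three empty slots
set c [col_set $c 1 ""]    ;# step 2: fill one cell, here with an empty string
puts [col_empty $c 0]      ;# prints 1: row 0 is still missing
puts [col_empty $c 1]      ;# prints 0: row 1 holds "", which is not "missing"
```

Keeping the presence flags separate from the values is what lets an empty string and an absent value stay distinguishable in this sketch.
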
The default column type remains "S", i.e. strings with no missing value support. This nicely matches Tcl's EIAS (Everything Is A String) mantra. Note that for type "s", the empty string and the absence of a string will use distinguishable internal representations (similarly for binary data and subviews).

A new test operator "empty" will be added to test for the absence of a value in the specified row/column position of a view, along with an "unset" operator. When extracting a row in tagged format, i.e. using "view $v get $row", missing values will be omitted along with their corresponding tags, following the idea that missing values are best represented in Tcl as dicts with missing entries.

** Mutable views **

The concept of mutable views has so far been hard to fit into Tcl. On the one hand, Ratcl is fully value-based for its views, which is great for automatic garbage collection. On the other hand, Tcl does not allow "values" to change, only containers, i.e. variables and arrays. The whole point of sharing data internally hinges on the principle of COW (Copy On Write).

The solution is to see nested views not as data collections but as ''recipes''. A view value never changes, even in the presence of mutable views. What changes is contained in a variable, with derived views using variable ''references'' to pick up changes. In a way this is simply an indirection: the view does not contain mutable data, it contains a reference to a variable which contains that data. When a mutable view is "changed", a modified value is stored in the variable.

This means that "view $v set ..." becomes illegal (ignoring subviews for now); "view @v set ..." must be used instead. Note that when the set is performed, the string "@v" does not change, so value immutability is maintained. Any number of views can be built on top, using "@v" to refer to the contents in $v.
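
The indirection behind "@v" can be illustrated in a few lines of plain Tcl. This is a minimal sketch, not Ratcl code: the proc `deref` is hypothetical, and it assumes that the referenced variable is a global.

```tcl
# Hypothetical helper: resolve a recipe. A string of the form "@name"
# stands for "whatever the global variable name contains right now".
proc deref {recipe} {
    if {[string index $recipe 0] eq "@"} {
        upvar #0 [string range $recipe 1 end] var
        return $var
    }
    return $recipe
}

set v {1 2 3}
set recipe "@v"          ;# this string is a value and never changes

puts [deref $recipe]     ;# prints: 1 2 3
lappend v 4              ;# the variable changes, the recipe does not
puts [deref $recipe]     ;# prints: 1 2 3 4
```

The recipe stays the same immutable string throughout; only the lookup performed at "get" time sees the new contents.
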
When $v changes, it is ''not'' the derived view that changes: that derived view is indeed just a recipe which can be used to get at the real data. When at some point a "view ... get ..." is done on something which might contain a mutable view, the dereferencing will take care of using the latest data.

The view-as-recipe approach does not prevent the use of some serious caching and elaborate change propagation and invalidation logic. This is of course the whole point: views ''act'' like values with automatic tracking and cleanup, but when used as views they are still just a facade in front of efficient internal data representations and vector-oriented algorithms.

Recent changes to the Ratcl core show that the whole mechanism works as intended. Some refinements are still needed to deal with subviews and memory-mapped files, but these all appear to be solvable.

One very nice outcome of the recent experiments is that Tcl's "value explosion" problem can probably be contained. If a file is mapped through a variable and always used through a reference, then all views built on top will have smallish list representations, since only the actual operators are part of the recipe, not the (potentially huge) dataset mapped into memory from file.

** Translucency **

There is a lot more to be said about mutable views. In the latest experiments, they have gained an extremely powerful capability: the ability to "overlay" an underlying view structure and to ''selectively'' replace parts of that original view.

This means that mutable views are in essence a differential mechanism: they maintain an elaborate data structure which only stores ''differences'' with the original view. So one way to look at mutable views is as being simply a "change set". There are lots of things you can do with change sets, as all version control systems illustrate. For one, you can save them - and stack them. Such a stack of saved change sets is equivalent to a transaction log.
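
The "stack of change sets as a transaction log" idea can be sketched in plain Tcl. The representation is an assumption chosen for brevity and not Ratcl's internal structure: a view is a list of rows, a change set is a dict mapping a row index to its replacement, and `apply_changes` is a hypothetical helper.

```tcl
# Hypothetical sketch: apply one change set (dict: row index -> new row)
# to a base list and return the result. The base itself is not modified.
proc apply_changes {base changes} {
    set result $base
    dict for {row value} $changes {
        lset result $row $value
    }
    return $result
}

set base {alpha beta gamma}
set log  [list {1 BETA} {0 Alpha 2 Gamma}]   ;# two saved change sets

# Replaying the log in order reconstructs the latest state.
set state $base
foreach cs $log {
    set state [apply_changes $state $cs]
}

puts $state   ;# prints: Alpha BETA Gamma
puts $base    ;# prints: alpha beta gamma
```

Because each change set is an ordinary Tcl string, the log can be saved to a file or inspected like any other value.
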

Since everything in Tcl can be a string, so can mutable views. Such strings / change sets can be saved, sent around, inspected, even modified. There is no need to adopt any database format to implement a fully transactional database in pure Tcl once this new mechanism in Ratcl has been exposed. Even so, the Metakit file format will remain considerably more efficient and provide memory-mapped on-demand loading, which can scale much further. This means that you can stick with plain-text storage during development, and then switch to the binary Metakit format later for better performance and lower memory use.

You could also merge change sets, or "reverse" them - i.e. generate change sets which have the same effect as the originals, but applied in the reverse order. So you could have a database with the latest state, and a set of change sets which let you go back in time: applying each one takes you to a prior version.

Translucency has been implemented in experimental form. It is fairly tricky in combination with missing values. You have to be able to "unset" a value even though it exists in the original, and you have to be able to do so without any change to that original view. So internally, the translucency implemented for mutable views must handle filled cells, missing cells, and transparent cells, in addition to dealing with row insertions and deletions. All of this happens without a single change to the original: translucent mutable views form a separate ''layer''.

Mutable views and translucency bring the goal of ACID-compliant transactions a major step forward.

''Onwards!''
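
The three cell states mentioned above (filled, missing, transparent) can be illustrated for a single row in plain Tcl. This is a sketch under assumptions, not the experimental Ratcl implementation: rows are dicts in which a missing cell is an absent key, the overlay format `{set value}` / `unset` is invented for this example, and `cell_get` is a hypothetical helper that returns a pair {exists value}.

```tcl
# Hypothetical sketch: look up one cell through a translucent overlay.
# Columns not mentioned in the overlay are transparent.
proc cell_get {base overlay col} {
    if {[dict exists $overlay $col]} {
        lassign [dict get $overlay $col] op value
        if {$op eq "unset"} {
            return {0 {}}                      ;# missing, whatever the base says
        }
        return [list 1 $value]                 ;# filled by the overlay
    }
    if {[dict exists $base $col]} {
        return [list 1 [dict get $base $col]]  ;# transparent: base shows through
    }
    return {0 {}}                              ;# transparent, missing in the base
}

set base    {name Ann age 42}
set overlay {age unset city {set Utrecht}}

puts [cell_get $base $overlay name]   ;# prints: 1 Ann
puts [cell_get $base $overlay age]    ;# prints: 0 {}
puts [cell_get $base $overlay city]   ;# prints: 1 Utrecht
```

Note that "age" is unset in the overlay while `$base` still contains it, which is the property that makes the overlay a separate layer.
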