The Metakit package is a very exciting product. Yeah, well, I mean that... :o)
Things that have worked out well include:
- Performance has been going up in big jumps over the various releases
- On-the-fly restructuring is working out nicely, and is transacted like everything else
- The headers are small, internals are very nicely tucked away (as a result, very substantial changes were feasible without affecting the API that much)
- Things like new datatypes and columns-with-a-gap optimizations were added with full backward compatibility
- Datafiles are compatible, the latest release still reads 1.0 files just fine
- Portability has turned out to be excellent, from 16- to 64-bit platforms
- Support for memory mapped files was added later on, though it now plays a central role
- The class hierarchy is very flat, there are very few virtual members
- Modularity is good, apps which do not call all functions will be considerably smaller than those that do
- The quality seems to work out ok, very few bugs tend to come up nowadays
- The "strategy" class is extremely flexible for non-standard I/O contexts
There are also some things which have not been addressed, or are still weak:
- There is no multi-user support, other than many-readers-no-writer
- Threading support is very basic: one thread per open storage object
- Though performance is amazing in some cases, it doesn't really scale well enough beyond say 100 MB datafiles
- On platforms without MMF, even that is too optimistic, a "few dozen MB" on the Mac is probably a more reasonable limit
- Likewise, performance degrades with say 100,000 rows, or 10,000 subviews, or even well before that in the case of string fields
- Datafile opening performance is not optimal (proportional to file complexity), and string fields have to be scanned
- Commits should use a finer granularity so small changes imply quick commits; right now they do too much too soon (free space management could be delayed)
- The file format is good once open, but could be improved to support more refined "staged" opening (this would dramatically speed up file opens)
- Inserts/deletes in views with >50 properties are not optimal (there's a fixed parameter set to 50, it should be adaptive)
- There is a good opportunity to introduce B-trees transparently
- Need better (read: more fundamental) support for compression and encryption
- Large strings are still copied more often than needed (i.e. those straddling a 4k boundary)
- Memo fields need an API to read/write portions, and could easily be extended to also allow inserts/deletes of data bytes in any position
- Data should be aligned on file, so MMF works better
- Remove two known limitations of 32-bit file addressing
- Expand adaptive integer sizing to 1..64 bits, instead of just 1..32 bits
- Better support for shrinking files, also add explicit reorganization calls
- Add locking (as implemented experimentally in Mk4tcl)
- Add hierarchical/heterogeneous data storage (see experimental e4Tree module)
- The Mk4tcl offset-data trick should be incorporated into Metakit
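To make the adaptive integer sizing item a bit more concrete, here is a rough sketch of what "1..64 bits" could look like. This is not Metakit's actual code: the names bitsNeeded, columnWidth and columnBytes are made up for illustration, and it assumes that widths are restricted to 0, 1, 2, 4, 8, 16, 32 or 64 bits, with the sub-byte widths used only for small non-negative values.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: smallest width in bits, taken from
// {0, 1, 2, 4, 8, 16, 32, 64}, that can hold the value v.
// Assumption: widths below 8 bits are unsigned, widths of 8 bits
// and up are signed two's complement.
int bitsNeeded(std::int64_t v) {
    if (v >= 0 && v < 16) {
        if (v == 0) return 0;   // an all-zero column needs no data at all
        if (v == 1) return 1;
        if (v < 4) return 2;
        return 4;
    }
    if (v < 0) v = ~v;          // fold negatives onto the non-negative range
    if (v >> 31) return 64;     // does not fit in a signed 32-bit slot
    if (v >> 15) return 32;
    if (v >> 7) return 16;
    return 8;
}

// Width for a whole column: the widest value decides.
int columnWidth(const std::vector<std::int64_t>& values) {
    int width = 0;
    for (std::int64_t v : values)
        width = std::max(width, bitsNeeded(v));
    return width;
}

// Bytes needed to store `rows` values at `width` bits each, rounded up.
std::size_t columnBytes(std::size_t rows, int width) {
    assert(width >= 0 && width <= 64);
    return (rows * static_cast<std::size_t>(width) + 7) / 8;
}
```

With this scheme a column of 1000 flags that only ever hold 0 or 1 would take 125 bytes, while a single value of 2^31 or more in the column would push every entry to 64 bits. That trade-off (one outlier widens the whole column) is part of what any real implementation would have to weigh.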
Last but not least, there are some very fundamental issues which need to be addressed:
- The "attached" vs. "unattached" dichotomy must be resolved and removed
- Cache coherence with multi-user access must be solved (propagate diffs)
- Change propagation for sorting is flawed, it needs to be fixed to handle all cases (this requires a fundamental rewrite of the notification mechanism)
- Change propagation for the newer view operators must be implemented
- A few more operations are needed to offer full relational functionality.
A lot of the above - as well as some brand new functionality - will be addressed in the next-generation Metakit software. I have quite a few answers to many (but definitely not all!) of the above issues.