[Metakit] Corrupted database over network - WinNT
Pat Thoyts
patthoyts at users.sourceforge.net
Tue Feb 13 21:29:42 CET 2007
<Hector at coollector.com> writes:
>Thank you very much for your help, Pat.
>
>Does it mean you made the simple test I described and nothing happened ?
>
>What did you change to c4_FileStrategy ?

I asked and it turns out I am not being given permission to post the
code for my c4_Win32FileStrategy implementation. Quite frankly I'm
rather disgusted with my bosses about this decision. If we are to
build upon an open source foundation the very least we could do is
share information freely. At least I can describe what was done
without explicit reference to the code. Maybe later I can clue them up
a bit.

We implemented a file format based upon metakit for storing
application specific data and discovered that when accessed over
network shares we sometimes obtained corrupt metakit databases. The
symptoms were generally swapped columns (the size data would appear in
the timestamp column and vice versa for instance). After a lot of
searching I was able to show the problem is to do with the handling of
memory mapped files over the network share.

I then examined carefully the way the data looked in the
ResetFileMapping function when saving. Metakit doesn't use memory
mapping the way that most applications do and I think this confuses
the windows network layer that handles this. Metakit maps the file as
a read-only memory mapping and when it saves it re-serializes into the
underlying file. It then drops the current mapping and re-maps the
file. What I observed was that when it re-mapped it got the original
data and did not in fact re-map the newly written data at all. I found
it useful to dump the mapped memory image to a sequence of temporary
files to compare the various products when testing this.
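
One way to do the comparison described above is to write each mapped image to its own file and then look for the first byte where two images differ. The following is a minimal, portable sketch and not the code used for the original testing; DumpImage and FirstMismatch are hypothetical helper names, and the buffers stand in for the mapped memory before the save and after the re-map.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: write a memory image (for example the mapped view)
// to a file. Returns true if the stream is still good after flushing.
bool DumpImage(const std::string& path, const unsigned char* data, std::size_t size) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    out.write(reinterpret_cast<const char*>(data), static_cast<std::streamsize>(size));
    return static_cast<bool>(out.flush());
}

// Hypothetical helper: offset of the first differing byte between two images,
// or -1 if they are identical. If one image is a prefix of the other, the
// mismatch is reported at the length of the shorter image.
long FirstMismatch(const unsigned char* a, std::size_t aSize,
                   const unsigned char* b, std::size_t bSize) {
    const std::size_t n = aSize < bSize ? aSize : bSize;
    for (std::size_t i = 0; i < n; ++i) {
        if (a[i] != b[i]) return static_cast<long>(i);
    }
    return aSize == bSize ? -1 : static_cast<long>(n);
}
```

If the image dumped after the re-map is identical to the one dumped before the save, even though the save changed the file, the mapping is returning stale data.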

Now metakit by default doesn't provide much control over how a file
gets accessed. You pass in a filename and a mode (readonly or
readwrite) and it gets opened for you. If you look in src/fileio.cpp you
can find the DataOpen function and see it is using the C library to
open the file and uses fseek/fwrite/fread to manipulate the file. I
wrote a new implementation that eliminates the use of the C library
functions and instead allows us to use CreateFile, ReadFile, WriteFile
and SetFilePointer. The primary advantage of this is that
instead of having metakit open a filename we use CreateFile to open
the file and then have metakit attach to the file handle. In use it
goes something like:

    HANDLE hFile = CreateFile(....);
    c4_Win32FileStrategy *pStrategy = new c4_Win32FileStrategy();
    // attach metakit to the handle we opened ourselves
    if (!pStrategy->DataOpen(hFile, true)) {
        return STG_E_ACCESSDENIED;
    }
    c4_Storage *pStorage = new c4_Storage(*pStrategy, false, 1);

This permits us to specify much more completely the Win32 flags and
permissions used to open the file. It also lets us provide our own
file prefix before the metakit data section which is another reason we
use this method.

Using the above we can open the file with dwShareMode set to 0
for network files or FILE_SHARE_READ for local files. We can also add
some flags -
FILE_ATTRIBUTE_NORMAL | FILE_FLAG_RANDOM_ACCESS | FILE_FLAG_WRITE_THROUGH
seems about right. We also use FILE_FLAG_OVERLAPPED but I don't think
that has anything to do with the corruption issues.

The key point for me was taking absolute control of how the file was
opened. If you have exclusive access to the file then you don't need
to worry about anyone else - virus checker or not - until you close
the file handle. Some of the other flags (overlapped or random access)
may or may not contribute to solving the problem, but I think I tested
a number of combinations before settling with exclusive access.
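
The share mode and flags described above could be put together roughly as follows. This is only a sketch under stated assumptions: ShareModeFor, OpenFlags and OpenDatabaseFile are hypothetical names, the GENERIC_READ | GENERIC_WRITE access and OPEN_EXISTING disposition are guesses that the post does not confirm, and FILE_FLAG_OVERLAPPED is left out because it would require an OVERLAPPED structure on every read and write. The non-Windows branch exists only so the flag logic can be compiled elsewhere.

```cpp
#include <cstdint>

#ifdef _WIN32
#include <windows.h>
#else
// Stand-ins so the flag logic compiles off Windows; the values mirror
// the definitions in the Windows SDK headers.
using DWORD = std::uint32_t;
constexpr DWORD FILE_SHARE_READ         = 0x00000001;
constexpr DWORD FILE_ATTRIBUTE_NORMAL   = 0x00000080;
constexpr DWORD FILE_FLAG_RANDOM_ACCESS = 0x10000000;
constexpr DWORD FILE_FLAG_WRITE_THROUGH = 0x80000000;
#endif

// Hypothetical helper: a share mode of 0 means nobody else can open the
// file until the handle is closed (used for network files); local files
// still allow other readers.
inline DWORD ShareModeFor(bool onNetworkShare) {
    return onNetworkShare ? 0 : FILE_SHARE_READ;
}

// Hypothetical helper: the flag combination quoted in the post.
inline DWORD OpenFlags() {
    return FILE_ATTRIBUTE_NORMAL | FILE_FLAG_RANDOM_ACCESS | FILE_FLAG_WRITE_THROUGH;
}

#ifdef _WIN32
// Assumption: read/write access to a file that already exists.
// Returns INVALID_HANDLE_VALUE on failure, as CreateFileW does.
inline HANDLE OpenDatabaseFile(const wchar_t* path, bool onNetworkShare) {
    return CreateFileW(path, GENERIC_READ | GENERIC_WRITE,
                       ShareModeFor(onNetworkShare), nullptr,
                       OPEN_EXISTING, OpenFlags(), nullptr);
}
#endif
```

The resulting handle is what would then be handed to the strategy's DataOpen in place of a filename.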
--
Pat Thoyts http://www.patthoyts.tk/
PGP fingerprint 2C 6E 98 07 2C 59 C8 97 10 CE 11 E6 04 E0 B9 DD