Friday, June 1, 2007

XPe tip #49: Why is HORM restricted to EWF-RAM?

There was an interesting question posted to the newsgroup recently that I attempted to answer. I thought I would add the ideas I posted there to my blog as well.

The question was why HORM (a feature of XPe SP2) is limited to EWF RAM mode. HORM stands for Hibernate-Once-Resume-Many, an embedded enabling feature.
In FP2007, a couple of new commands/functions were added to the ewfmgr console tool (as well as to the EWF API) that are directly related to HORM and basically allow you to activate and deactivate the feature at run time.

You will also find a lot of internals on how HORM works in this thread, where Slobodan and I discussed implementations of the HORM feature way before it was called HORM :-)

Now back to the original question. Why are we limited to the RAM overlay when HORM is used?
I guess the main "limitation" of EWF Disk mode vs. EWF RAM Reg is that, by default, the overlay data is "persistent" in Disk mode.
The major reason for using EWF RAM for HORM scenarios is that the changes you make while working with the image are redirected to RAM and are lost on reboot. This way, whatever the OS does to the hibernation file (hiberfil.sys) is lost as well, and you can always go back to the original state of the device, which is basically the main purpose of HORM.
With EWF Disk, the changes are redirected to the Disk overlay (another hidden partition on the disk), and EWF will pick them up at the next boot until you clear the overlay or commit the changes (there is also a minor difference in behavior depending on which restore level you selected for Disk mode, but it is irrelevant to this discussion).
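To make the difference concrete, here is a minimal Python sketch of the two overlay behaviors described above. It is a toy model only, not EWF code: the `ToyVolume` class, its method names and the sector numbering are all made up for illustration.

```python
class ToyVolume:
    """Toy model of a protected volume with a write overlay (hypothetical, not the EWF API)."""

    def __init__(self, persistent_overlay):
        self.protected = {0: "original"}  # sectors on the protected partition
        self.overlay = {}                 # redirected writes
        # True models EWF Disk (overlay survives reboot), False models EWF RAM
        self.persistent_overlay = persistent_overlay

    def write(self, sector, data):
        # Writes never touch the protected partition; they go to the overlay
        self.overlay[sector] = data

    def read(self, sector):
        # Overlay data wins over the protected data
        return self.overlay.get(sector, self.protected.get(sector))

    def reboot(self):
        # A RAM overlay is lost on reboot; a Disk overlay is picked up again
        if not self.persistent_overlay:
            self.overlay.clear()


ram = ToyVolume(persistent_overlay=False)
ram.write(0, "changed")
ram.reboot()

disk = ToyVolume(persistent_overlay=True)
disk.write(0, "changed")
disk.reboot()

print(ram.read(0), disk.read(0))
```

In this model the RAM volume reads back its original data after the reboot, while the Disk volume still shows the change because nothing cleared or committed its overlay.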

So, imagine the following scenario:

1) HORM is enabled. The hibernation file holds a valid state of the system saved at some point in time.

2) EWF is enabled too. The Disk overlay is used. Initially the overlay is *clean* (not really possible with EWF Disk mode when you enable HORM, but assume it for the sake of this explanation).

3) You boot the image and it boots fast from the hiberfil.sys file.

4) You work. FS (actually disk-level) writes are redirected to the Disk overlay; no files on the protected partition, including hiberfil.sys, are actually changed on the disk.

5) You reboot the system (gracefully or not, it doesn't matter since you are EWF protected).

6) The system boots from the same (old!) hiberfil.sys file. This means it restores all driver and application states to whatever they were in memory at the moment the golden image was hibernated. The EWF driver also has some (old) state there in which it remembers what the Disk overlay content was at that time, i.e. it thinks the overlay is *clean* (see step #2).
However, your actual Disk overlay is not clean, since it was modified in an earlier OS session (see step #4). Now you get a discrepancy between the Disk overlay and its state in RAM (the EWF state machine). This will likely lead to an exception within the EWF driver and, since EWF is a kernel driver, to a BSOD.
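The steps above can be sketched as a small Python simulation. This is only a toy model under my own assumptions: `ToyDevice`, `driver_view` and the other names are hypothetical, and the real EWF driver state is certainly more involved than a set of sector numbers. The point is just to show where the in-memory state and the overlay drift apart.

```python
class ToyDevice:
    """Toy model of HORM with a RAM or Disk overlay (hypothetical names, not EWF code)."""

    def __init__(self, disk_mode):
        self.disk_mode = disk_mode
        self.overlay = set()      # sectors actually held in the overlay
        self.driver_view = set()  # overlay contents as the in-memory driver state sees them
        self.snapshot = None      # stands in for hiberfil.sys

    def hibernate_once(self):
        # hiberfil.sys captures RAM: the driver state always,
        # the overlay data only when the overlay itself lives in RAM
        self.snapshot = {
            "driver_view": set(self.driver_view),
            "ram_overlay": None if self.disk_mode else set(self.overlay),
        }

    def write(self, sector):
        # Step 4: the write is redirected and the driver tracks it
        self.overlay.add(sector)
        self.driver_view.add(sector)

    def reboot_and_resume(self):
        # Steps 5-6: the old RAM image replaces the current driver state
        self.driver_view = set(self.snapshot["driver_view"])
        if not self.disk_mode:
            self.overlay = set(self.snapshot["ram_overlay"])
        # In Disk mode the overlay partition keeps whatever was written to it

    def is_consistent(self):
        return self.driver_view == self.overlay


def run_scenario(disk_mode):
    dev = ToyDevice(disk_mode)
    dev.hibernate_once()     # steps 1-2: golden image, overlay assumed clean
    dev.reboot_and_resume()  # step 3
    dev.write(100)           # step 4
    dev.write(101)
    dev.reboot_and_resume()  # steps 5-6
    return dev.is_consistent()


print("RAM overlay consistent: ", run_scenario(disk_mode=False))
print("Disk overlay consistent:", run_scenario(disk_mode=True))
```

With the RAM overlay the model stays consistent, because the overlay is restored together with the driver state. With the Disk overlay the driver believes the overlay is clean while it actually holds sectors 100 and 101, which is the discrepancy described in step 6.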

The situation is even worse if we get real and don't assume a *clean* initial state of the EWF overlay. Since EWF was already enabled at the time we hibernated the image (it must be enabled), some changes to the disk (often unintentional ones such as logs, registry hive changes, etc.) had already been redirected to the overlay.

Another issue is that if you need to commit the changes (and stop HORM), you won't be able to. The actual commit in Disk overlay mode occurs at the next boot after you issue the commit command. At boot, the EWF driver reads all the changes off the hidden data partition and applies them to the disk. Obviously, this happens much later than the point where the OS (the loader and the kernel) loads the RAM image from hiberfil.sys (i.e. restores the system from the hibernation file). Since the hiberfil.sys file is the old one, the EWF driver in it (its state machine) doesn't know the commit command was issued, and you will never get out of the loop.
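Here is a rough Python sketch of that commit loop. Again, it is a toy model with made-up names (`commits_applied`, `commit_pending`); how the real driver records a pending commit is not something I am describing here. I simply assume the request is visible to the driver on a normal boot and gets overwritten by the old RAM image on a HORM resume.

```python
def commits_applied(horm_active, boots=3):
    """Count commits applied over several boots after one commit request (toy model)."""
    # Driver state frozen in hiberfil.sys: no commit was pending back then
    snapshot = {"commit_pending": False}

    # Current session: the user issues the commit command
    state = dict(snapshot)
    state["commit_pending"] = True

    applied = 0
    for _ in range(boots):
        if horm_active:
            # Resume: the old RAM image replaces the driver state
            # before the driver gets a chance to act on the request
            state = dict(snapshot)
        # Without HORM, the pending request is assumed to survive the reboot
        if state["commit_pending"]:
            applied += 1
            state["commit_pending"] = False
    return applied


print("commits applied with HORM:   ", commits_applied(horm_active=True))
print("commits applied without HORM:", commits_applied(horm_active=False))
```

In this model the commit is applied exactly once without HORM and never with HORM, no matter how many times the device reboots.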

The only way to test the above scenario is to use one of the "hacks" Slobodan and I discussed in the newsgroup (see the link above). Otherwise, HORM will be disabled, as ewfmgr reports. I also discovered that hibernation won't work at all while EWF Disk mode is enabled and protecting the system partition that holds the hiberfil.sys file.

The actual bug, in my opinion, is in the latest Enhanced Write Filter component UI in Target Designer (TD). Microsoft added a checkbox for HORM there, a very nice feature, but neglected to disable the option when the settings are changed to use the Disk overlay.


