
treatment of on-disk segments as "what was written by programs" can cause areas of 0 to not be written by bmaptool copy #3

Open
codyps opened this issue Mar 9, 2024 · 0 comments


codyps commented Mar 9, 2024

See intel#75 for the original issue. I'm migrating it here because all issues on that repo were closed.

Content of a comment I left explaining the issue:

  1. I (well, wic internally) was running bmaptool create /file/on/zfs/disk.img.
  2. The target file (on ZFS) stores a full disk image, which included some squashfs and raw binary images (but could include anything).
  3. When bmaptool copy was run with the bmap generated from that disk.img on ZFS, the raw images that consisted entirely of zero bytes were not written.
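
A minimal sketch of how one might check whether a given filesystem behaves this way. It writes a block of zeros on purpose, then asks the filesystem (via lseek with SEEK_HOLE) whether that block is still reported as data. The function name, the 1 MiB block size, and the three-block layout are assumptions made for illustration; this is not how bmaptool itself probes files, and the result depends on the filesystem and its settings.

```python
import os
import tempfile

BLOCK = 1 << 20  # 1 MiB; arbitrary size chosen for this sketch


def explicit_zeros_reported_as_data(directory):
    """Write data, explicit zeros, data; report whether the zeros stay 'data'.

    Returns True if the filesystem reports no hole before end of file,
    False if it reports a hole somewhere inside the written region.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"\xff" * BLOCK)  # block 0: non-zero payload
            f.write(b"\x00" * BLOCK)  # block 1: zeros written on purpose
            f.write(b"\xff" * BLOCK)  # block 2: non-zero payload
            f.flush()
            os.fsync(f.fileno())
            # First hole at or after offset 0. End of file counts as an
            # implicit hole, so "no holes" means the result is the file size.
            first_hole = os.lseek(f.fileno(), 0, os.SEEK_HOLE)
        return first_hole >= 3 * BLOCK
    finally:
        os.unlink(path)
```

Running it against a directory on the filesystem that will hold the image (for example `explicit_zeros_reported_as_data("/file/on/zfs")`) gives a quick hint: a False result means deliberately written zeros can be indistinguishable from never-written space there.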

So:

  1. It's entirely possible that other segments of the file that are written as zeros (or found to be zero on read and therefore never written) by, for example, filesystem generators (squashfs) or filesystem drivers mounted in loopback (ext4, etc.) would also be missed by bmaptool create, and those zeroed segments would then not be copied by bmaptool copy.
  2. bmaptool essentially wants to distinguish "don't care" bytes from all other bytes, but the filesystem APIs in use (FIEMAP, lseek SEEK_HOLE/SEEK_DATA) don't expose this information. They only expose runs of zeros, which may or may not be "don't care" bytes.
  3. To resolve this, bmaptool probably needs something like a FUSE filesystem, a block device, or a network block device that tracks reads and writes to a file, so it can determine which bytes are actually "don't care" bytes (never read or written). Determining this after the fact generally relies on implementation details of the filesystem driver or generator.
  4. Even a FUSE filesystem (or another mechanism that captures all reads and writes) may not be sufficient for filesystem generator programs, which know they are creating a new file and may therefore presume default content for unwritten regions of that file. In other words, one would need to examine the implementation of each filesystem generator program (those which take a filesystem tree and generate a block image file) to ensure it doesn't make that assumption. To be clear: assuming things about unwritten space in a new file is permitted by the standards, so requiring generators not to make that assumption in order to support bmaptool would be an additional requirement.
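
To make point 2 concrete, here is a rough sketch of what the SEEK_DATA/SEEK_HOLE interface returns: a list of byte ranges the filesystem considers "data", and nothing more. The function name `data_ranges` is made up for this example and this is not bmaptool's actual code. Everything outside the returned ranges reads back as zeros, with no indication of whether a program intended those zeros.

```python
import errno
import os


def data_ranges(path):
    """Return [(start, end), ...] half-open byte ranges reported as data."""
    ranges = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            try:
                # Next offset >= `offset` that the filesystem calls data.
                start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError as exc:
                if exc.errno == errno.ENXIO:
                    break  # no more data before end of file
                raise
            # End of that data run; end of file is an implicit hole.
            end = os.lseek(fd, start, os.SEEK_HOLE)
            ranges.append((start, end))
            offset = end
    finally:
        os.close(fd)
    return ranges
```

On a filesystem without hole support, the kernel falls back to reporting the whole file as a single data range, so the output is only as informative as the underlying filesystem driver makes it.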

To use bmaptool successfully today, one needs to audit the filesystem driver (and VFS, etc.) of the filesystem storing the image file that bmaptool create will be run on, and also audit every program, driver, etc. used to write data into that file, to ensure the expected behavior: that holes/unallocated regions always correspond to "don't care" bytes.
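
One partial, after-the-fact check along these lines, assuming the image builder can say which byte ranges it considers significant (for example, partition extents), is to compare those ranges with the ranges the filesystem reports as mapped. The helper below is a hypothetical sketch: the name `unmapped_parts` and both inputs are assumptions for illustration, not part of bmaptool. It only catches cases where a significant range is known up front; it does not replace the audit described above.

```python
def unmapped_parts(required, mapped):
    """Return the parts of `required` ranges not covered by `mapped` ranges.

    Both arguments are lists of half-open (start, end) byte ranges.
    Any range returned here would be skipped by a copy that trusts `mapped`.
    """
    mapped = sorted(mapped)
    gaps = []
    for start, end in sorted(required):
        pos = start  # first byte of this required range not yet accounted for
        for m_start, m_end in mapped:
            if m_end <= pos:
                continue  # mapped range lies entirely before `pos`
            if m_start >= end:
                break  # mapped range lies entirely after this required range
            if m_start > pos:
                gaps.append((pos, m_start))  # uncovered stretch before it
            pos = max(pos, m_end)
            if pos >= end:
                break
        if pos < end:
            gaps.append((pos, end))  # uncovered tail
    return gaps
```

For example, `unmapped_parts([(0, 100), (200, 300)], [(0, 50), (250, 400)])` yields `[(50, 100), (200, 250)]`: two stretches that matter to the image but that the filesystem does not report as mapped.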
