This repository has been archived by the owner on Apr 20, 2022. It is now read-only.

Support for reading snapshot metadata (WIP) #34

Open: wants to merge 2 commits into base: master

Conversation

hakanai
Contributor

@hakanai hakanai commented May 8, 2018

Many unknown values, still investigating.

  • unknown_0 = 1548
  • unknown_8 = 1570
  • unknown_16 = timestamp for event?
  • unknown_24 = same timestamp
  • unknown_32 = 185
  • unknown_40 = 0x40000002 (this thing again)
  • unknown_44 = 0x00000000

@tempelmann
Contributor

So, have you figured out how to locate the snapshots? Is there an anchor you can find, with its own btree? Or are snapshots just mixed-in with the main tree, and with different tags or versions?

@hakanai
Contributor Author

hakanai commented May 8, 2018

They're present in the same omap tree alongside all the other versions, so you get, for instance, a (oid=1026,xid=67) mapped to one block and a (oid=1026,xid=81) mapped to another block.

So essentially, all my existing logic for locating even the most recent copy of objects is wrong if there are any snapshots present, so I'm glad I tried testing this. I had a TODO on the code where I'm mapping oid to block number about the version number being important, and it's more important than I thought.

I just don't know how to do the lookup efficiently without reading it all into a giant hashmap up front.

(The naïve implementation is: read all the rows, and take the highest xid which is not higher than the xid of the snapshot. How many entries can be in an omap table anyway? :))
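
For illustration, the naïve scan described above might look like the following minimal sketch. The `OmapEntry` type and `resolve_naive` function are hypothetical names, not code from this repository, and the block numbers in the sample rows are made up; only the (oid, xid) pairs come from the example earlier in this comment.

```python
from typing import Iterable, NamedTuple, Optional


class OmapEntry(NamedTuple):
    """One omap row: object id, transaction id, and the block it maps to."""
    oid: int
    xid: int
    block: int


def resolve_naive(entries: Iterable[OmapEntry], oid: int,
                  snapshot_xid: int) -> Optional[OmapEntry]:
    """Scan every row and keep the highest xid that does not exceed snapshot_xid."""
    best: Optional[OmapEntry] = None
    for entry in entries:
        if entry.oid != oid or entry.xid > snapshot_xid:
            continue  # wrong object, or written after the snapshot was taken
        if best is None or entry.xid > best.xid:
            best = entry
    return best


# Sample rows: the block numbers here are placeholders, not real data.
rows = [OmapEntry(1026, 67, 1000), OmapEntry(1026, 81, 2000)]
```

This is linear in the number of rows per lookup, which is exactly the cost the comment above is worried about.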

@tempelmann
Copy link
Contributor

tempelmann commented May 8, 2018

Have you had a look at the apfs-fuse code? Maybe you can learn some tricks from it.

@hakanai
Copy link
Contributor Author

hakanai commented May 8, 2018

Solving part of the confusion - omap_key doesn't actually contain oid - the oid is in the header already parsed.

Actually, in my own model, the omap keys aren't mixed in with the file table keys, but I've been unable to figure out whether they should or should not be. What I do know is that history keys have the exact opposite structure of omap keys, meaning that under the structure currently in this ksy file, both of those would be kind=0x0.

After removing the oid from omap_key, the table looks like:

0 [NodeEntry]: (OMAP) #1028 ID v67 → Blk 1561, len 4096
1 [NodeEntry]: (OMAP) #1028 ID v83 → Blk 1819, len 4096
2 [NodeEntry]: (OMAP) #1030 ID v67 → Blk 1569, len 4096
3 [NodeEntry]: (OMAP) #1030 ID v82 → Blk 1789, len 4096
4 [NodeEntry]: (OMAP) #1031 ID v67 → Blk 1558, len 4096
5 [NodeEntry]: (OMAP) #1031 ID v83 → Blk 1821, len 4096
6 [NodeEntry]: (OMAP) #1033 ID v64 → Blk 1532, len 4096
7 [NodeEntry]: (OMAP) #1033 ID v83 → Blk 1816, len 4096
8 [NodeEntry]: (OMAP) #1034 ID v60 → Blk 1487, len 4096
9 [NodeEntry]: (OMAP) #1034 ID v83 → Blk 1825, len 4096
10 [NodeEntry]: (OMAP) #1035 ID v67 → Blk 1568, len 4096
11 [NodeEntry]: (OMAP) #1035 ID v84 → Blk 1834, len 4096
12 [NodeEntry]: (OMAP) #1037 ID v67 → Blk 1567, len 4096
13 [NodeEntry]: (OMAP) #1037 ID v84 → Blk 1828, len 4096
14 [NodeEntry]: (OMAP) #1038 ID v67 → Blk 1566, len 4096
15 [NodeEntry]: (OMAP) #1038 ID v73 → Blk 1697, len 4096
16 [NodeEntry]: (OMAP) #1041 ID v67 → Blk 1562, len 4096
17 [NodeEntry]: (OMAP) #1041 ID v83 → Blk 1820, len 4096

So all the oid values are grouped together and the xids are in numerical order, meaning you can still binary search if that helps performance.
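
As a rough sketch of that binary search, assuming the rows have already been read into a flat list sorted by (oid, xid): a real implementation would descend the B-tree nodes instead, and the names `ROWS`, `KEYS`, and `resolve` are hypothetical. The sample rows are copied from the listing above.

```python
from bisect import bisect_right
from typing import Optional

# (oid, xid, block) rows taken from the listing above, sorted by (oid, xid).
ROWS = [
    (1028, 67, 1561), (1028, 83, 1819),
    (1030, 67, 1569), (1030, 82, 1789),
    (1033, 64, 1532), (1033, 83, 1816),
]
KEYS = [(oid, xid) for oid, xid, _block in ROWS]


def resolve(oid: int, max_xid: int) -> Optional[int]:
    """Return the block for the newest version of oid with xid <= max_xid."""
    # Index of the first key strictly greater than (oid, max_xid).
    i = bisect_right(KEYS, (oid, max_xid))
    if i == 0:
        return None  # every key sorts after the one we asked for
    hit_oid, _hit_xid, block = ROWS[i - 1]
    # The predecessor may belong to a smaller oid, meaning no version
    # of this object existed yet at max_xid.
    return block if hit_oid == oid else None
```

With these rows, `resolve(1028, 70)` picks the v67 entry and `resolve(1030, 60)` returns `None`, because object 1030 has no version at or before xid 60.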

@hakanai
Contributor Author

hakanai commented May 8, 2018

The snapshot metadata table is a bit of a weird one though.

* 0 [NodeEntry]: (SNAPSHOT_INFO) #68  -> TODO "com.apple.TimeMachine.2018-05-08-140522"
    * keyOffset = 0x0 = 0
    * keyLength = 0x8 = 8
    * dataOffset = 0x5A = 90
    * dataLength = 0x5A = 90
    * keyHdr [KeyHdr]: (SNAPSHOT_INFO) #68
        * keyLow = 0x44 = 68
        * keyHigh = 0x10000000 = 268435456
        * objId = 0x44 = 68
        * kind = SNAPSHOT_INFO (0x1 = 1)
    * key [EmptyKey]
    * val [SnapshotInfoVal]: TODO "com.apple.TimeMachine.2018-05-08-140522"
        * unknown8 = 0x622 = 1570
        * unknown16 = 0x152C92ED860EC800 = 1525755922625775600
        * unknown24 = 0x152C92ED860EC800 = 1525755922625775600
        * unknown32 = 0xB9 = 185
        * unknown40 = 0x40000002 = 1073741826
        * unknown44 = 0x0 = 0
        * nameLength = 0x28 = 40
        * name = com.apple.TimeMachine.2018-05-08-140522
* 1 [NodeEntry]: (SNAPSHOT_NAME) #4563402750 "com.apple.TimeMachine.2018-05-08-140522" -> #68
    * keyOffset = 0x8 = 8
    * keyLength = 0x32 = 50
    * dataOffset = 0x62 = 98
    * dataLength = 0x8 = 8
    * keyHdr [KeyHdr]: (SNAPSHOT_NAME) #4563402750
        * keyLow = 0xFFFFFFFF = 4294967295
        * keyHigh = 0xBFFFFFFF = 3221225471
        * objId = 0x10FFFFFFE = 4563402750
        * kind = SNAPSHOT_NAME (0xB = 11)
        * key [SnapshotNameKey]: "com.apple.TimeMachine.2018-05-08-140522"
            * nameLength = 0x28 = 40
            * name = com.apple.TimeMachine.2018-05-08-140522
        * val [SnapshotNameVal]: #68
            * snapshotId = 0x44 = 68
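
One way to check the guess that `unknown16` and `unknown24` are timestamps is to decode the raw value as nanoseconds since the Unix epoch. That unit is an assumption, not something confirmed in this thread, and `decode_ns_timestamp` is a hypothetical helper name.

```python
from datetime import datetime, timezone

RAW = 0x152C92ED860EC800  # unknown16 / unknown24 from the dump above


def decode_ns_timestamp(raw: int) -> datetime:
    """Interpret raw as nanoseconds since 1970-01-01 UTC (assumed unit)."""
    seconds, _nanos = divmod(raw, 1_000_000_000)
    # Sub-second precision is dropped; datetime only stores microseconds anyway.
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


print(decode_ns_timestamp(RAW).isoformat())
```

Under that assumption the value decodes to 05:05:22 UTC on 2018-05-08, which would line up with the `2018-05-08-140522` in the snapshot name if the machine's local time zone was nine hours ahead of UTC.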

What's weird is this:

* keyLow = 0xFFFFFFFF = 4294967295
* keyHigh = 0xBFFFFFFF = 3221225471
* objId = 0x10FFFFFFE = 4563402750

I would have thought that objId would be 0x0FFFFFFFFFFFFFFF.

If I change the calculation to this:

    value: key_low | ((key_high & 0x0FFFFFFF) << 32)

Now I get -1, even though the top 4 bits wouldn't be set. :) I know JavaScript is bad at arithmetic, but I somehow thought that kaitai IDE would have to have worked around that in order to do anything correctly at all.
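
A possible explanation for both numbers, assuming the IDE evaluates the expression with plain JavaScript numbers (not verified here): JavaScript's bitwise operators work on 32-bit signed integers and mask the shift count to 5 bits, so `<< 32` is a no-op. The sketch below emulates those rules in Python and compares them with exact integer arithmetic. All function names are hypothetical, and treating `kind` as the top 4 bits of `key_high` is inferred from the dump above.

```python
def split_key_header(key_low: int, key_high: int) -> tuple:
    """Exact version: Python ints are arbitrary precision, so << 32 is safe."""
    kind = key_high >> 28                        # top 4 bits (inferred from the dump)
    obj_id = key_low | ((key_high & 0x0FFFFFFF) << 32)
    return kind, obj_id


def to_int32(x: int) -> int:
    """Wrap x to a signed 32-bit integer, as JavaScript bitwise operators do."""
    x &= 0xFFFFFFFF
    return x - 0x1_0000_0000 if x & 0x8000_0000 else x


def js_and(a: int, b: int) -> int:
    return to_int32(to_int32(a) & to_int32(b))


def js_or(a: int, b: int) -> int:
    return to_int32(to_int32(a) | to_int32(b))


def js_shl(a: int, count: int) -> int:
    # Only the low 5 bits of the shift count are used, so 32 behaves like 0.
    return to_int32(to_int32(a) << (count & 31))


key_low, key_high = 0xFFFFFFFF, 0xBFFFFFFF       # SNAPSHOT_NAME key from the dump

shifted = js_shl(js_and(key_high, 0x0FFFFFFF), 32)   # stays 0x0FFFFFFF
with_plus = key_low + shifted                    # '+' is ordinary number addition
with_or = js_or(key_low, shifted)                # '|' wraps to signed 32 bits

print(hex(with_plus), with_or)
print(split_key_header(key_low, key_high))
```

The emulated `+` result is 0x10FFFFFFE and the emulated `|` result is -1, matching the two values reported above, while the exact computation gives kind 11 and the expected 0x0FFFFFFFFFFFFFFF.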

* Fixing contents of `omap_val` - it doesn't contain an oid at all (but the way the values were packed hid the problem really well!)
* Maths for computing `obj_id` for a file key have to use `|`, not `+`, otherwise negative values break, and negative values are used in the snapshot info block.
* Naming a value which is the reference to the volume superblock for the snapshot.
* Marking some other unknown stuff which is clearly obj references as `ref_obj`.
@hakanai
Contributor Author

hakanai commented May 23, 2018

More stuff that is semi-known:

  • unknown_0 is a reference to some other block which is always present but which I haven't found any purpose for reading yet.
  • unknown_20 if interpreted as a block number points at a HISTORY root node. I wasn't sure whether that made sense for volumes or not.

@tempelmann
Contributor

I am now working on updating my code to use the latest ksy files. Is this pull request still valid, or have later changes superseded it?

@cugu
Owner

cugu commented Apr 26, 2020

I did some refactoring, so these PRs now have conflicts. I never tested them, but they might still be valid.


3 participants