
02 Know your files in Hex

Boban Spasic edited this page Apr 25, 2023 · 3 revisions

Hex? What is hex?
It is another number system, where the base is not 10 but 16. The digits are: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E and F. Why hex numbers? Well, they go hand in hand with binary numbers. For example, take the 8-bit binary number 00001111. It translates to the hex number $0F, where the $0 comes from the upper nibble (0000) and the $F from the lower nibble (1111). A 16-bit binary number translates to a 4-digit hex number, like $A5B6. Another nice feature: hex numbers line up neatly, which makes for a clean visual presentation.
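To make the nibble idea concrete, here is a small Python sketch (not part of ImHex; the helper names `nibbles` and `to_hex` are made up for this illustration). It splits a byte into its upper and lower nibble and prints values in the $-prefixed notation used on this page.

```python
def nibbles(byte: int) -> tuple:
    """Split an 8-bit value into (upper nibble, lower nibble)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value between 0 and 255")
    return byte >> 4, byte & 0x0F


def to_hex(value: int, bits: int = 8) -> str:
    """Format a value in the $-prefixed hex notation used on this page."""
    digits = bits // 4  # one hex digit covers exactly 4 bits (one nibble)
    return "$" + format(value, f"0{digits}X")


upper, lower = nibbles(0b00001111)
print(upper, lower)                      # 0 15
print(to_hex(0b00001111))                # $0F
print(to_hex(0b1010010110110110, 16))    # $A5B6
```

Each hex digit corresponds to exactly one nibble, which is why a byte always needs two hex digits and a 16-bit value needs four.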

I'll use the program ImHex here because it has a nice feature: you can define your data structures, and ImHex will paint them in different colors, so it is easier to work with your binary files. Be warned: ImHex crashes a lot when you write your own patterns and there are errors in the pattern.

So, let's take a look at one VMEM file in ImHex: [screenshot: ImHex_VMEM_01]

You can arrange your view in ImHex as you like. In my screenshot, the VMEM data is on the left side, and my pattern definition for a VMEM is on the right side. The pattern is what produces the colorful painting of the data in the left pane.
In my pattern, I have defined the header, and I have defined every voice in the VMEM as one block, with only the voice name as a separate sub-block. This lets you check the header for correctness: in the pattern struct in the right pane, you can see which values are expected. The sc byte is usually $01, but it can also be anything up to $0F.
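The same header check can also be done outside ImHex. The following Python sketch is only an illustration: the expected values are copied from the comments in the pattern script on this page, and the names `EXPECTED_HEADER`, `FIELD_NAMES` and `check_header` are invented for this example.

```python
# Expected header bytes, taken from the comments in the pattern on this page.
# None marks the byte that is allowed to vary (the sc byte, Ch_S in the pattern).
EXPECTED_HEADER = [0xF0, 0x43, None, 0x09, 0x20, 0x00]
FIELD_NAMES = ["SysEx", "Yamaha", "Ch_S", "F", "MSB_7", "LSB_7"]


def check_header(data: bytes) -> list:
    """Return a list of human-readable problems found in the 6-byte header."""
    if len(data) < len(EXPECTED_HEADER):
        return ["file is shorter than the 6-byte header"]
    problems = []
    for name, expected, actual in zip(FIELD_NAMES, EXPECTED_HEADER, data):
        if expected is None:
            # Per the text above, this byte should not be larger than $0F.
            if actual > 0x0F:
                problems.append(f"{name}: ${actual:02X} is above $0F")
        elif actual != expected:
            problems.append(f"{name}: expected ${expected:02X}, found ${actual:02X}")
    return problems


good = bytes([0xF0, 0x43, 0x00, 0x09, 0x20, 0x00])
bad = bytes([0xF0, 0x43, 0x00, 0x09, 0x10, 0x00])
print(check_header(good))  # []
print(check_header(bad))   # ['MSB_7: expected $20, found $10']
```

For a real file you would pass its contents instead, for example `check_header(open("bank.syx", "rb").read())`, where `bank.syx` is a placeholder file name. The two size bytes are 7-bit values, so $20 and $00 together mean (0x20 << 7) + 0x00 = 4096 data bytes, which matches 32 voices of 128 bytes each.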

So, which corruptions are we looking for?

  • are all the voice names inside the colored blocks? By the way, the voice name is at the end of the data block for one voice. The next voice in the VMEM begins right after the voice name.
  • is there any other string inside the data part of the voice (outside the voice name)?
  • are there null bytes ($00) inside the voice name? I have also seen $7F bytes inside voice names (another corruption).
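The voice-name checks can be automated as well. Below is a hedged Python sketch that walks over the 32 name slots, using the layout from the pattern script on this page (6 header bytes, then 32 voices of 118 data bytes plus a 10-byte name). The function name `suspicious_names` is made up for this example, and treating every non-printable byte as suspicious is my own assumption, not a rule of the format. It does not try to detect stray strings in the data part.

```python
HEADER_SIZE = 6
VOICE_DATA_SIZE = 118
VOICE_NAME_SIZE = 10
VOICE_SIZE = VOICE_DATA_SIZE + VOICE_NAME_SIZE  # 128 bytes per voice
VOICE_COUNT = 32


def suspicious_names(bank: bytes) -> list:
    """Return (voice number, raw name bytes, reason) for names that look corrupted."""
    findings = []
    for index in range(VOICE_COUNT):
        start = HEADER_SIZE + index * VOICE_SIZE + VOICE_DATA_SIZE
        name = bank[start:start + VOICE_NAME_SIZE]
        if len(name) < VOICE_NAME_SIZE:
            findings.append((index + 1, name, "file ends inside the name"))
        elif 0x00 in name:
            findings.append((index + 1, name, "null byte in name"))
        elif 0x7F in name:
            findings.append((index + 1, name, "$7F byte in name"))
        elif any(b < 0x20 or b > 0x7E for b in name):
            # Assumption: a healthy name contains only printable ASCII.
            findings.append((index + 1, name, "non-printable byte in name"))
    return findings


# Synthetic demo bank: 32 identical voices, then damage the name of voice 1.
voice = bytes(VOICE_DATA_SIZE) + b"GOOD NAME "
bank = bytearray(bytes(HEADER_SIZE) + voice * VOICE_COUNT + bytes(2))
bank[HEADER_SIZE + VOICE_DATA_SIZE + 2] = 0x00
for number, name, reason in suspicious_names(bytes(bank)):
    print(number, name, reason)
```

If many consecutive voices are reported, starting at some voice and continuing to the end of the bank, that pattern hints at a misalignment rather than at individually damaged names.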

So, if the alignment inside the file does not match the normal VMEM alignment, you have either a bank with missing bytes or possibly a bank with inserted bytes. You can either insert bytes (if some are missing) or delete bytes (if extra ones were inserted) until the data is aligned. You'll get at least one corrupted voice in the bank this way, but you'll save the rest that you got aligned.
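A quick first hint for missing or inserted bytes is the file size. According to the pattern script on this page, a bank consists of a 6-byte header, 32 voices of 128 bytes and a 2-byte footer, so 4104 bytes in total. The Python sketch below (the function name `diagnose_size` is invented for this example) compares the actual size with that number. It assumes the file holds exactly one bank dump and nothing else.

```python
HEADER_SIZE = 6
VOICE_SIZE = 118 + 10   # voice data + voice name
VOICE_COUNT = 32
FOOTER_SIZE = 2
EXPECTED_SIZE = HEADER_SIZE + VOICE_COUNT * VOICE_SIZE + FOOTER_SIZE  # 4104


def diagnose_size(bank: bytes) -> str:
    """Describe how the file size differs from a well-formed bank dump."""
    diff = len(bank) - EXPECTED_SIZE
    if diff == 0:
        return "size matches (4104 bytes)"
    if diff < 0:
        return f"{-diff} byte(s) missing"
    return f"{diff} byte(s) too many"


print(diagnose_size(bytes(4104)))  # size matches (4104 bytes)
print(diagnose_size(bytes(4101)))  # 3 byte(s) missing
print(diagnose_size(bytes(4106)))  # 2 byte(s) too many
```

A matching size does not prove the bank is fine, because missing bytes in one place and inserted bytes in another can cancel out. You still have to look at the alignment in the hex view.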

Take a look at the following bank: [screenshot: ImHex_VMEM_02]

The first voice seems to be OK, but the second one has missing bytes, and all the voices after that are misaligned. Insert as many bytes as needed somewhere inside the 2nd voice until its name is aligned with the voice-name sub-block. Sure, the 2nd voice stays corrupted, but there is hope that the 3rd voice and the rest are recovered by this small operation.
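The same repair can be sketched in Python. This is only an illustration of the idea, not a finished tool: the helper names `insert_filler` and `name_of` are invented, the filler goes at the start of the damaged voice (the exact position inside the voice is my arbitrary choice), and the demo bank is synthetic.

```python
HEADER_SIZE = 6
VOICE_DATA_SIZE = 118
VOICE_NAME_SIZE = 10
VOICE_SIZE = VOICE_DATA_SIZE + VOICE_NAME_SIZE  # 128
VOICE_COUNT = 32


def insert_filler(bank: bytes, voice_number: int, count: int, filler: int = 0x00) -> bytes:
    """Insert `count` filler bytes at the start of the given voice (1-based)."""
    offset = HEADER_SIZE + (voice_number - 1) * VOICE_SIZE
    return bank[:offset] + bytes([filler]) * count + bank[offset:]


def name_of(bank: bytes, voice_number: int) -> bytes:
    """Read the 10 bytes at the position where the voice name is expected."""
    start = HEADER_SIZE + (voice_number - 1) * VOICE_SIZE + VOICE_DATA_SIZE
    return bank[start:start + VOICE_NAME_SIZE]


# Build a synthetic bank whose voices are named "VOICE 01  " to "VOICE 32  ".
voices = [bytes([n]) * VOICE_DATA_SIZE + f"VOICE {n:02d}  ".encode("ascii")
          for n in range(1, VOICE_COUNT + 1)]
intact = bytes(HEADER_SIZE) + b"".join(voices) + bytes(2)

# Simulate the damage: drop 3 bytes from the data part of voice 2.
cut = HEADER_SIZE + VOICE_SIZE + 10
damaged = intact[:cut] + intact[cut + 3:]
print(name_of(damaged, 3))   # misaligned, no longer b'VOICE 03  '

repaired = insert_filler(damaged, 2, 3)
print(name_of(repaired, 3))  # b'VOICE 03  '
```

Note that this sketch does not recalculate the checksum byte in the footer, and the data of the 2nd voice remains damaged.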

And here is my pattern script for ImHex:

//Yamaha DX7 Bank dump
//Usual file extension is .syx

struct VMEM_HEADER {
  u8 SysEx;   //0xF0
  u8 Yamaha;  //0x43
  u8 Ch_S;    //sub-status and channel (the sc byte)
  u8 F;       //0x09
  u8 MSB_7;   //7-bit MSB of data size = 0x20
  u8 LSB_7;   //7-bit LSB of data size = 0x00
};

struct VOICE {
  u8 data[118];
  char voicename[10];
};

struct VMEM_FOOTER {
  u8 Checksum;
  u8 SysEx_End;  //0xF7
};

struct VMEM_FILE {
  VMEM_HEADER header;
  VOICE voices[32];
  VMEM_FOOTER footer;
};

VMEM_FILE vmem @ 0x00;

Load your file, copy and paste my script into the pattern editor, and click on "Play" (the little triangle under the console window). I still haven't figured out how to save and load my own patterns in ImHex, so I copy and paste the script into the pattern editor every time.
