[split #103] edit text #107

Open
wants to merge 1 commit into
base: main

@xBZZZZ xBZZZZ commented Jan 9, 2023

Splitting #103.
This one is about edit text.
I think this is the last one.

@nekitdev
Member

I need confirmation as to why the last bytes can be garbage.

@nekitdev
Member

The save is not padded at all, therefore closing this PR.

@xBZZZZ
Author

xBZZZZ commented Jan 11, 2023

> I need confirmation as to why the last bytes can be garbage.

I have only ever seen 0, 1, or 2 bytes of garbage at the end of CCLocalLevels.dat. I think they are garbage because the save file loads fine without them.

Here are CCLocalLevels.dat files with garbage:

| size % 4 | link |
| --- | --- |
| 1 | https://gdccdated.glitch.me/s/CCLocalLevels_windows_1byte_of_garbage.dat |
| 2 | https://gdccdated.glitch.me/s/CCLocalLevels_windows_2bytes_of_garbage.dat |

> The save is not padded at all

If you are talking about macOS, you are wrong (maybe your AES decryptor automatically removes the padding).

If the above is wrong, please send the last 32 bytes of your save file (the one without padding).

@nekitdev
Member

Those bytes are likely Base64 padding, and no, I'm not talking about macOS saves.

@xBZZZZ
Author

xBZZZZ commented Jan 12, 2023

> Those bytes are likely Base64 padding

Base64 is padded with = (character code 0x3D).
Base64 encoded string length is always divisible by 4.

Here are the last 2 bytes of https://gdccdated.glitch.me/s/CCLocalLevels_windows_2bytes_of_garbage.dat in hex:

0b 7f

Here they are XORed with 0x0b:

00 74

(0x74 is the t character)

Neither of these is the = character.

Also, the NUL (0x00) character is not a Base64 character.
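
The check above can be reproduced with a short sketch. It assumes only what the comment states: the two trailing bytes are `0b 7f` and the save is XORed with `0x0b`. The names `XOR_KEY`, `tail`, and `BASE64_CHARS` are made up for illustration, and the alphabet deliberately includes both the standard and the URL-safe characters so the result does not depend on which variant the game uses.

```python
import string

XOR_KEY = 0x0B  # XOR key mentioned in the comment above

# Last two bytes of the 2-bytes-of-garbage file, as quoted in hex above.
tail = bytes.fromhex("0b7f")

# Undo the XOR byte by byte.
decoded = bytes(b ^ XOR_KEY for b in tail)

# Every character a Base64 string may contain (standard and URL-safe
# alphabets combined, plus the "=" padding character).
BASE64_CHARS = set((string.ascii_letters + string.digits + "+/-_=").encode("ascii"))

for byte in decoded:
    kind = "Base64 character" if byte in BASE64_CHARS else "not a Base64 character"
    print(f"0x{byte:02x}: {kind}")
```

The first byte decodes to NUL, which is outside every Base64 alphabet, so the tail cannot be Base64 padding.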

@Cvolton
Collaborator

Cvolton commented Jan 12, 2023

The code for calculating Windows save file sizes is as follows:
size_t length = (-(inLength != (inLength / 3) * 3) & 4) + (inLength * 4) / 3;
which, translated to something more readable, is:
size_t length = ((inLength % 3 == 0) ? 0 : 4) + ((inLength * 4) / 3);
where inLength is the number of bytes before encoding with base64.

Basically, if the length before encoding is divisible by 3, the final save file size is just (inLength * 4) / 3; otherwise another 4 characters are added on top of that. One of them is used for the last character of the base64 segment itself, and up to 2 can be used for the equal signs. Any allocated but unused characters end up as garbage in the final file.

The character right after the actual content should always be 0x0B in the final result, as that's the null terminator after XOR. Anything following that is random leftover bytes from memory.
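
To see how this formula leads to 1 or 2 extra bytes, here is a rough sketch that compares the allocated size with the length of ordinary padded Base64. It assumes the encoder otherwise emits standard padded Base64. The function names `allocated_length` and `padded_base64_length` are made up for illustration and are not taken from the game.

```python
import base64


def allocated_length(in_length: int) -> int:
    """Size computed by the formula quoted above (integer division as in C)."""
    return (0 if in_length % 3 == 0 else 4) + (in_length * 4) // 3


def padded_base64_length(in_length: int) -> int:
    """Length of standard padded Base64 for in_length input bytes."""
    return 4 * ((in_length + 2) // 3)


for n in range(1, 8):
    # Sanity check of the helper against Python's own encoder.
    assert padded_base64_length(n) == len(base64.b64encode(bytes(n)))
    extra = allocated_length(n) - padded_base64_length(n)
    print(f"inLength={n}: allocated={allocated_length(n)}, "
          f"base64={padded_base64_length(n)}, extra={extra}")
```

Under that assumption, the number of extra bytes equals inLength % 3, and it also equals the allocated size modulo 4. That is consistent with the files above, where size % 4 is 1 or 2.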

@Cvolton
Collaborator

Cvolton commented Jan 15, 2023

I am re-opening this PR and #106, as this makes the observation that "If file size is not divisible by 4, last file_size % 4 bytes are garbage." correct.

@Cvolton Cvolton reopened this Jan 15, 2023
@Cvolton
Collaborator

Cvolton commented Jan 15, 2023

In fact, the included base64 padding (the equal signs) is mandated by the standard just to make sure the length of the final content is divisible by 4. I believe the cited reason for that is easy concatenation of multiple base64 strings of various lengths without the need to decode them.

@xBZZZZ
Author

xBZZZZ commented Jan 15, 2023

> easy concatenation of multiple base64 strings of various lengths without the need to decode them

b64encode(b'Geometry') + b64encode(b'Dash') is R2VvbWV0cnk=RGFzaA==, which is not a valid base64 string

@Cvolton
Collaborator

Cvolton commented Jan 15, 2023

The ability to decode concatenated strings like that depends on the decoder. It's true that a lot of decoders cannot recover strings concatenated like this, but many can, for example the website https://www.base64decode.com/ or the GNU base64 utility.

[cvolton@hpgaming ~]$ echo R2VvbWV0cnk=RGFzaA== | base64 -d
GeometryDash

I have no idea how the implementation in GD itself behaves in this regard, not that it really matters for this purpose
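
Both points can be illustrated with a small sketch using Python's standard `base64` module. A strict decode of the concatenated string is rejected, while a decoder that works through the input one 4-character group at a time recovers both parts. The helpers `strict_decode_ok` and `decode_concatenated` are made up for illustration. The sketch says nothing about how GD or GNU base64 are implemented, and it relies on each padded Base64 string having a length divisible by 4.

```python
import base64
import binascii

joined = base64.b64encode(b"Geometry") + base64.b64encode(b"Dash")


def strict_decode_ok(data: bytes) -> bool:
    """Return True if data is accepted as a single strictly valid Base64 string."""
    try:
        base64.b64decode(data, validate=True)
    except binascii.Error:
        return False
    return True


def decode_concatenated(data: bytes) -> bytes:
    """Decode concatenated padded Base64 strings one 4-character group at a time."""
    out = bytearray()
    for i in range(0, len(data), 4):
        out += base64.b64decode(data[i:i + 4], validate=True)
    return bytes(out)


print(joined)
print(strict_decode_ok(joined))
print(decode_concatenated(joined))
```

Decoding group by group works here only because padding makes every encoded piece a multiple of 4 characters long, so the groups never straddle two strings.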
