Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v2.0: Supports deserializing accounts lt hash in snapshots (backport of #2994) #3014

Closed
wants to merge 3 commits into from

Conversation

mergify[bot]
Copy link

@mergify mergify bot commented Sep 27, 2024

Problem

The accounts lattice hash will be serialized to/deserialized from snapshots. In order to maintain snapshot compatibility between adjacent versions (edge <-> beta, and beta <-> stable), we must be able to deserialize (and ignore) the new field in adjacent version before we serialize the new field into snapshots.

Summary of Changes

Add support for deserializing the accounts lt hash in snapshots.

Note, the field is ignored (if present) for now.

Note 2, this PR will be backported to v2.0.

Justification to Backport

We're adding a new field to the snapshot. We also must guarantee that snapshots can be read by adjacent versions (e.g. a snapshot created by v2.1 must be loadable by v2.0). In order to maintain compatibility, v2.0 must be able to read this new field (and ignore it). v2.1 will be the first version to write to this new field.


This is an automatic backport of pull request #2994 done by [Mergify](https://mergify.com).

@mergify mergify bot requested a review from a team as a code owner September 27, 2024 19:20
@mergify mergify bot added the conflicts label Sep 27, 2024
Copy link
Author

mergify bot commented Sep 27, 2024

Cherry-pick of 690fad0 has failed:

On branch mergify/bp/v2.0/pr-2994
Your branch is up to date with 'origin/v2.0'.

You are currently cherry-picking commit 690fad08d4.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Changes to be committed:
	modified:   Cargo.lock
	modified:   programs/sbf/Cargo.lock
	modified:   runtime/Cargo.toml
	new file:   runtime/src/serde_snapshot/types.rs

Unmerged paths:
  (use "git add <file>..." to mark resolution)
	both modified:   accounts-db/src/accounts_hash.rs
	both modified:   runtime/src/serde_snapshot.rs

To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally

(cherry picked from commit 690fad0)

# Conflicts:
#	accounts-db/src/accounts_hash.rs
#	runtime/src/serde_snapshot.rs
@willhickey
Copy link

Rebased to pick up the increased CI timeout from #3017

HaoranYi
HaoranYi previously approved these changes Sep 29, 2024
Copy link

@HaoranYi HaoranYi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm.

@bw-solana
Copy link

If some future version adds lattice hash to snapshot and some node doesn't have this code, what happens?

I thought it would just never get read in but we would otherwise continue to function. I don't understand how that's different from reading/deserializing it and dropping

jeffwashington
jeffwashington previously approved these changes Sep 30, 2024
Copy link

@jeffwashington jeffwashington left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@@ -46,6 +46,7 @@ regex = { workspace = true }
serde = { workspace = true, features = ["rc"] }
serde_derive = { workspace = true }
serde_json = { workspace = true }
serde_with = { workspace = true }

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ideally we aren't introducing dependencies in backports to what will imminently be stable branch. is this strictly necessary?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This dependency is already used in the monorepo, just not in the accounts-db crate. It likely is not strictly necessary. Would it be preferable to do a manual implementation vs use serde_with? I'll code that up and see what it looks like.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit 1910669 removes the use of serde_with.

The code is based on the discussion in serde-rs/serde#1937, and also the code from expanding the original serde_with macros.

I've done additional testing on this commit by creating a snapshot with master that serializes a Some to this new snapshot field, and then ensured this PR can deserialize-and-ignore the field and load from the snapshot successfully.

I think overall I'm more in favor of the version using serde_with, but I do understand the concerns in general of backports adding dependencies.

@brooksprumo
Copy link

If some future version adds lattice hash to snapshot and some node doesn't have this code, what happens?

Then the node will panic during snapshot loading, due to encountering an unexpected field.

I thought it would just never get read in but we would otherwise continue to function. I don't understand how that's different from reading/deserializing it and dropping

We have to be explicit about all fields, even if we intend to ignore them.

pub [u16; 1024],
);

impl<'de> Deserialize<'de> for SerdeAccountsLtHash {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yuck. I do not like to review this. @t-nelson thoughts?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we aren't willing to review logic, we have no business using a dependency that to hide it from us. especially the case for a backport

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in this case the dependency was already used by runtime, just not by accounts-db, as I understand it. So, yes, we can review it, but it sure feels nicer reviewing the simple thing than this. But, you certainly have a point. The fact we're adding a dependency feels more like a code modularization byproduct than it does a new risk.

Copy link

@HaoranYi HaoranYi Oct 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I review, "serde_as" procedure macro is already being used for packet class, which is u8[1232]. And here we are just using it again in a similar way for u16[1024]. Therefore, I think it should be fine.

#[serde_as]

Also, looking at the source repo for "serde_as", there is already sufficient tests coverage of "serde_as" for the fixed size array case that we used here.
https://github.com/jonasbb/serde_with/blob/f1b79f27cac7545513c4f14f7c295db2b843cc4f/serde_with/tests/serde_as/lib.rs#L321

Here, instead of using "serde_as", we basically implement just deserialize specially for the u16[1024] by hand, which I have reviewed and believe it is correct. And I trust that books has already tested it manually. But as far as the test for this manually written the code goes, the coverage may not be as good as those tests already in the "serde_as" crate. Beside, with "serde_as" we also get the 'serialize' for free.

Therefore, IMHO, I prefer use "serde_as" in this case (less code). But I am fine with the hand implementation too. They don't make much difference.

@willhickey
Copy link

I'd like to get another v2.0 release out (to get the gossip changes soaking in testnet)...

if v2.0.12 doesn't include this change that just means that nodes running v2.0.12 can't load snapshots produces by v2.1, right? And the solution would be upgrading to a later version of v2.0 that does include this change?

If those assumptions are correct I'm going to tag a release soon, and this can go in the next release (probably on Friday).

@brooksprumo
Copy link

if v2.0.12 doesn't include this change that just means that nodes running v2.0.12 can't load snapshots produces by v2.1, right?

I haven't pushed any changes into master w.r.t. writing the new field to the snapshot, so v2.0 won't be broken. I won't write the new field in master until this backport PR is merged.

So feel free to tag a new release; this PR does not need to hold that process up. Thanks for checking!

Copy link

@jeffwashington jeffwashington left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@brooksprumo
Copy link

After a discussion on the backport review meeting, it was decided to not backport this PR. It was the combination of how soon v2.0 is going to stable/mnb PLUS not having an approved SIMD for the feature itself PLUS the original backport touching a cargo.lock file.

Assuming we continue with ~3 releases per year, timing may work out just fine. We may also investigate putting the accounts lt hash value in a sysvar (or other account) instead of in the snapshot.

@brooksprumo brooksprumo closed this Oct 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants