Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

svm: advance nonce for fee-only transactions sooner #2741

Merged
merged 15 commits into from
Sep 11, 2024

Conversation

2501babe
Copy link
Member

@2501babe 2501babe commented Aug 26, 2024

Problem

as part of SIMD-83, transaction loading needs to track state changes made by previous transactions in the same batch. RollbackAccounts is a struct used to manage state changes that must be committed in the case of transaction processing failure. successful transactions, on the other hand, hold the data to be committed on their loaded transaction accounts struct

presently, RollbackAccounts is set up with the prior nonce account state and the expected fee-payer account state. for failed transactions, the nonce data here is not advanced until state changes are being committed, necessitating that the account saver mutate TransactionProcessingResult objects

Summary of Changes

store the already advanced nonce data in RollbackAccounts, in line with how the fee-payer is handled. this means transaction post-processing can reuse the account saver method collect_accounts_to_store without cloning TransactionProcessingResult because the account saver will no longer need mutable references to them

@2501babe 2501babe force-pushed the 20240826_txbatchnonce branch 2 times, most recently from 88a5da5 to 9768c4e Compare August 27, 2024 07:41
@2501babe 2501babe force-pushed the 20240826_txbatchnonce branch 2 times, most recently from 558e361 to aee85d9 Compare August 27, 2024 16:48
@2501babe
Copy link
Member Author

50a1e08 is the important commit, and all code changes are isolated to there

its a relatively small change on its own. code that advances nonces has been deleted from the account saver, so it now accepts the transaction processing results as immutable references (my goal for this, so i can reuse the functions inside the transaction processing loop without clone() or fear). instead we advance the nonce when setting up rollback accounts. we keep nonce and fee-payer separate as already done but overwrite fee-payer data if a fee-paying nonce account is used. also this all means commit_transactions() in bank.rs doesnt need a blockhash anymore

the rest of the commits are either fixing existing unit tests, or (most of the loc) improving the new svm integration test setup to handle nonce transactions and the new fee-only transaction state. we test all possible nonce outcomes with and without the feature

@2501babe 2501babe marked this pull request as ready for review August 27, 2024 19:34
@2501babe 2501babe requested a review from jstarry August 27, 2024 19:34
Copy link

@jstarry jstarry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks awesome, haven't looked at tests too closely yet but like what I see there too so far

core/src/banking_stage/consumer.rs Outdated Show resolved Hide resolved
runtime/src/bank.rs Show resolved Hide resolved
svm/src/account_saver.rs Outdated Show resolved Hide resolved
svm/src/rollback_accounts.rs Outdated Show resolved Hide resolved
svm/src/transaction_processor.rs Outdated Show resolved Hide resolved
@2501babe
Copy link
Member Author

had to rebase for conflicts so the incremental diff is not useful but most changes are in 15ee97b

i moved nonce advancing into the check transaction step. thought a lot about the cleanest way to do this, since i didnt want to break the encapsulation of NonceInfo by adding new methods to muck around with the data potentially bypassing the authority check. pretty happy with how it looks, i solved that question by returning a triple from load_message_nonce_account and deferring construction NonceInfo until after we advance the nonce, so we dont create one just to take it apart and make a new one again

also removed the unnecessary timing metric and transaction result mut method

runtime/src/bank/check_transactions.rs Outdated Show resolved Hide resolved
runtime/src/bank/check_transactions.rs Outdated Show resolved Hide resolved
svm/src/rollback_accounts.rs Show resolved Hide resolved
runtime/src/bank/check_transactions.rs Outdated Show resolved Hide resolved
self.load_message_nonce_account(message)
{
let lamports_per_signature = nonce_data.get_lamports_per_signature();
let next_nonce_state = NonceState::new_initialized(
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This optimization can come later if we want it but I think we can delay the serialization of the advanced nonce by including the advanced NonceData in the NonceInfo and only serialize that data into the account shared data when NonceInfo::account is called. I think this would make the same fee payer / nonce account case a little cleaner too because we wouldn't need to copy the nonce account's data over during rollback accounts creation, we could delay it until nonce.account() is called when we collect accounts for storage.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

im going to leave this as is for now but may end up doing it in #2702 because ill need to reverify nonces mid-batch

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me. And just in case it wasn't fully clear, delaying the serialization will be nice because if the transaction succeeds, we won't need to waste any time serializing a rollback nonce account that is never used.

svm/src/transaction_processor.rs Outdated Show resolved Hide resolved
svm/tests/mock_bank.rs Outdated Show resolved Hide resolved
svm/tests/integration_test.rs Outdated Show resolved Hide resolved
@2501babe
Copy link
Member Author

ok added the new tests and other cleanups

let (nonce_address, mut nonce_account, nonce_data) =
self.load_message_nonce_account(message)?;

let lamports_per_signature = self.get_lamports_per_signature();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't the correct lamports_per_signature value. Instead of getting it from the fee calculator, it needs to come from the blockhash queue. You should grab the value from the hash_queue in check_age

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you for catching this, i used an unwrap as this is identical to last_blockhash_and_lamports_per_signature() but let me know if youd rather i zip and match on an option

btw what is the difference between the lamports_per_signature on the bank and the one on the blockhash queue? im not familiar with how variable fees would function if they were ever turned on/employed

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw what is the difference between the lamports_per_signature on the bank and the one on the blockhash queue? im not familiar with how variable fees would function if they were ever turned on/employed

In register_recent_blockhash we would have recorded the derived lps for the blockhash of the current bank's parent block. Then in the current bank's constructor (_new_from_parent) we derive a new lps based on the signature count of the parent block's signature count (FeeRateGovernor::new_derived). So it's important that we are using the lps value from the parent bank when populating the nonce, not the value from the current bank.

Now, let's look into how lps could change inside the fee rate governor.. I pulled the fee rate governor fields from a mainnet beta snapshot:

[snapshot-tool/src/main.rs:200:13] bank.fee_rate_governor = FeeRateGovernor {
    lamports_per_signature: 5000,
    target_lamports_per_signature: 10000,
    target_signatures_per_slot: 20000,
    min_lamports_per_signature: 5000,
    max_lamports_per_signature: 100000,
    burn_percent: 50,
}

Based on the logic for FeeRateGovernor::new_derived it looks like a parent bank would need to have over 10,000 signatures in order to increase the desired lps beyond 5000. So this code definitely is enabled but...

Also it's important to call out that for fee calculation we don't really use lps from the FeeRateGovernor anymore, we get it from FeeStructure (see how get_fee_for_message_with_lamports_per_signature only uses lps from fee rate governor to disable fees for tests). We only use the fee rate governor's lps value for storing in the nonce account right now and that's why it's important to get correct for consensus right now. But we should clean up all uses of the fee rate governor altogether at some point.

Btw, in the past we did have a separate mechanism for congestion driven fees but it was buggy and removed. Discussion here: https://discord.com/channels/428295358100013066/488758828330909706/1201405954256818176

bank.store_account(&nonce_pubkey, &nonce_account);

let nonce_account = bank.get_account(&nonce_pubkey).unwrap();
let current_lamports_per_signature = bank.get_lamports_per_signature();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this test should be using the lps value from last_blockhash_and_lamports_per_signature

@apfitzge
Copy link

apfitzge commented Sep 4, 2024

Haven't looked at the code yet. Have some quick thoughts based on PR description:

advance nonces at the same time as fees are assesed in RollbackAccounts. this means transaction post-processing can reuse the account saver methods without cloning loaded transactions because the account saver will no longer need mutable references to them

I think loading and advancing early only works because we force nonce ixs to be the first instruction, let's make sure we have some sort of assertion on relevant code.
If it's not ensured to be the first instruction, advancing early would mean its a consensus change because ixs before the nonce ix would see an already advanced state.

If we advance early, wrt SIMD83 we must be careful that we do not advance if we hit some sort of non-recordable error, i.e. tx gets dropped.
Right now that could be specific other accts failing to load, in that case we'd need to make sure the nonce state is not already advanced in whatever batch acct storage/map we use, since subsequent txs should see the original nonce account state.

@jstarry
Copy link

jstarry commented Sep 5, 2024

I think loading and advancing early only works because we force nonce ixs to be the first instruction, let's make sure we have some sort of assertion on relevant code.

I'm not sure I follow what you mean here. This change is for advancing nonces ahead of time in case the transaction fails. It has no effect on processing successful transactions (besides wasting a bit of work to serialize the rollback nonce account).

runtime/src/bank/check_transactions.rs Show resolved Hide resolved
Ok(CheckedTransactionDetails {
nonce: Some(nonce),
lamports_per_signature: nonce_data.get_lamports_per_signature(),
lamports_per_signature,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is technically wrong but currently harmless (because this lps value is only used for turning off fees inside validate_transaction_fee_payer). Let's play it safe and keep the old logic of using the previous nonce data lps value rather than changing it to the blockhash queue value.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok i believe we should be perfectly in line with master now. the nonce data is updated with the lps value from the tip of the blockhash queue. but the previous nonce data lps value is assigned in the check details, which is ultimately only compared to 0 in svm to see if fees are in play

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool that matches my understanding as well, thanks for making the change

@2501babe
Copy link
Member Author

2501babe commented Sep 5, 2024

I think loading and advancing early only works because we force nonce ixs to be the first instruction, let's make sure we have some sort of assertion on relevant code. If it's not ensured to be the first instruction, advancing early would mean its a consensus change because ixs before the nonce ix would see an already advanced state.

when it comes to "is this a valid nonce transaction" ie will check_age() allow the pseudo-blockhash to validate, the function get_durable_nonce() on SanitizedMessage takes the first instruction, checks if its a valid SystemInstruction::AdvanceNonceAccount, and if so returns the pubkey of the nonce account. no other instruction is consulted. the current version of this pr also stores the advanced nonce in RollbackAccounts here (a previous version advanced the rollback nonce during transaction loading, and the version in master advances the rollback nonce while preparing to commit)

i vaguely remember we might have discussed a mechanism to skip executing the advance_nonce_account() system instruction processor, if thats what you mean by "loading and advancing early", but i actually cant find any code that does this. checking with std::backtrace, my integration tests here have eight nonce transactions that succeed, and in every case they execute the real instruction via InvokeContext::process_executable_chain()

If we advance early, wrt SIMD83 we must be careful that we do not advance if we hit some sort of non-recordable error, i.e. tx gets dropped. Right now that could be specific other accts failing to load, in that case we'd need to make sure the nonce state is not already advanced in whatever batch acct storage/map we use, since subsequent txs should see the original nonce account state.

ill update the description to be clearer, this does not modify the accounts store in any way prior to commit_transactions(). the change is just: previously, RollbackAccounts held the old nonce, and the account saver would advance the nonce right before committing. this changes it so we hold an already advanced nonce in RollbackAccounts (similar to how we hold a debited fee-payer). the point of all this is just so collect_accounts_to_store in account saver no longer mutates the TransactionProcessingResult objects it takes, so we can reuse the collect function in svm in between transaction executions

@2501babe 2501babe force-pushed the 20240826_txbatchnonce branch 2 times, most recently from 3dac928 to 4131482 Compare September 6, 2024 08:12
jstarry
jstarry previously approved these changes Sep 9, 2024
these pass against master as-written
previously rollback accounts carried the fee payer that was to be committed, but an untouched nonce
now, the account saver can use the nonce and fee payer from rollback accounts identically, possibly merging them for a fee-paying nonce
@2501babe
Copy link
Member Author

@jstarry i had to fix merge conflicts again because these files keep changing in master, if you wouldnt mind approving again once ci passes

@2501babe 2501babe added the automerge automerge Merge this Pull Request automatically once CI passes label Sep 10, 2024
Copy link

mergify bot commented Sep 10, 2024

automerge label removed due to a CI failure

@mergify mergify bot removed the automerge automerge Merge this Pull Request automatically once CI passes label Sep 10, 2024
@2501babe
Copy link
Member Author

(noting that the failure is just the partitions job failing to curl something from github)

@2501babe 2501babe added the automerge automerge Merge this Pull Request automatically once CI passes label Sep 10, 2024
Copy link

mergify bot commented Sep 10, 2024

automerge label removed due to a CI failure

@mergify mergify bot removed the automerge automerge Merge this Pull Request automatically once CI passes label Sep 10, 2024
@2501babe 2501babe added the automerge automerge Merge this Pull Request automatically once CI passes label Sep 11, 2024
@mergify mergify bot merged commit f0a77e9 into anza-xyz:master Sep 11, 2024
45 checks passed
@2501babe 2501babe deleted the 20240826_txbatchnonce branch September 11, 2024 12:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
automerge automerge Merge this Pull Request automatically once CI passes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants