-
Notifications
You must be signed in to change notification settings - Fork 76
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(sequencer): rewrite check_tx to be more efficient and fix regression #1515
base: main
Are you sure you want to change the base?
Conversation
577d0c4
to
0de80ec
Compare
0a4f435
to
56ff638
Compare
parked.add( | ||
timemarked_tx, | ||
match parked.add( | ||
timemarked_tx.clone(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think you can do let id = timemarked_tx.id()
before this, then use that later on in the function and remove the clone()
if let Some(rsp) = insert_into_mempool(mempool, &state, signed_tx, metrics).await { | ||
return rsp; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this function returning an Option
that's only set if there's an error response is somewhat confusing, would prefer for this function to return a Result<(), response::CheckTx>
so the error variant is the error response
same for check_removed_comet_bft
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed it, feels much better that way
/// Returns a [`response::CheckTx`] if the transaction fails any of the checks. | ||
/// Otherwise, it returns the [`SignedTransaction`] to be inserted into the mempool. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can use a Result<SignedTransaction, response::CheckTx>
as this function's return
I think this PR looks good, just a few smaller comments!
how does this work? with cometbft recheck? |
Hmmm. I think I misheard why we wanted to allow rechecks to add to the appside mempool. I checked out the CometBFT code and didn't see it persisting the transactions anywhere that was accessible, (I wonder if I missed it). In the scenario where CometBFT doesn't persist, then the allowing recheck would only be useful if CometBFT doesn't crash if the sequencer app crashses. I'll do some more research |
345fcab
to
af24538
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will follow @Fraser999 approval, but I have made some comments here and there.
I want to especially call attention to the bit which is acquiring a write-lock in a loop: can you leave a comment there why that's acceptable?
} | ||
|
||
#[tokio::test] | ||
async fn receck_adds_non_tracked_tx() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
async fn receck_adds_non_tracked_tx() { | |
async fn recheck_adds_non_tracked_tx() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is still a spelling issue here.
}; | ||
|
||
#[tokio::test] | ||
async fn future_nonce_ok() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe?
async fn future_nonce_ok() { | |
async fn future_nonces_are_accepted() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use a more descriptive test description, like what I suggested or something else.
@@ -1,3 +1,6 @@ | |||
#[cfg(test)] | |||
mod tests; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very mild nit: modules are usually sorted after imports. It's odd that rustfmt does not do this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually move this below the imports please. We are following the official Rust style guide: https://doc.rust-lang.org/stable/style-guide/items.html
@@ -109,6 +109,10 @@ impl TimemarkedTransaction { | |||
pub(super) fn cost(&self) -> &HashMap<IbcPrefixed, u128> { | |||
&self.cost | |||
} | |||
|
|||
pub(super) fn id(&self) -> [u8; 32] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this using our new TransactionId
?
That's something for a followup, but we should use that newtype instead of the inner array.
Also (@Fraser999 effort because references are 8 bytes on the stack whereas this array is 32 bytes on the stack):
pub(super) fn id(&self) -> [u8; 32] { | |
pub(super) fn id(&self) -> &[u8; 32] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm blocking this so I can have a second pass over this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have gone over the PR and left some comments. Reading the PR description this patch attempts to tackle 3 things at the same time:
-
avoid rerunning expensive checks that have already been performed once. I believe this new functionality is covered by the changes in
crate::service::mempool
. It is enabled by the functionscheck_removed_comet_bft
andis_tracked
. Theis_tracked
makes use of the newcontained_txs
hashset. Question: is this an optimization to avoid usingpending
andparked
mempool fields? Clearly the mempool needed to track the transactions contained therein before - so what's the reason to have an extra index of transactions? It would be nice to see that explained in the patch description. It would furthermore be useful to understand how this new index and the previous caches will be kept in sync. -
fix a regression for
CheckTxKind::Recheck
. Grepping this PR I only findrecheck
mention in tests. Looking at thefn handle_check_tx
function therequest::CheckTx
is immediately deconstructed and theCheckTxKind::Recheck
never actively considered. It hence looks to me as if we don't discriminate between checks of new transactions and rechecked of transactions existing in the mempool. Given that this is a regression called out in both the patch title and text, it would be nice to have this explained. -
introduce a way to maintain state on sequencer restarts. This part seems to be completely orthogonal to the goals 1. and 2. All related code should be removed from this patch and moved into a new PR. Having said that, it seems that the index of checked transactions (the hashset mentioned in the PR description) was introduced to enable the main goal of the PR (namely skipping unnecessary checks summarized under 1.). Enabling restart logic thus seems to be a useful side effect? Either way, it is confusing to the reviewer in how far this should be made part of the patch, and why it wasn't moved to a new patch. The patch description should be amended accordingly and better explained.
The semantics of adding or removing transactions from the mempool-tracked txs is weird. Looking at the code that makes use of add_from_contained_txs
and remove_from_contained_txs
it looks like you would be better served by creating a custom guard like this (I have intentionally left out the metrics stuff to keep the example minimal; add them as necessary):
struct TxLock<'a> {
mempool: &'a Mempool,
txs: RwLockWriteGuard<'a, HashSet<[u8; 32]>>,
}
impl<'a> TxLock<'a> {
fn add(&mut self, id: [u8; 32]) {
self.tx.insert(id);
}
fn remove(&mut self, id: [u8; 32]) {
self.tx.remove(id);
}
}
impl Mempool {
async fn lock_txs(&self) -> TxLock<'_> {
TxLock {
mempool: self,
txs: mempool.contained_txs.write().await,
}
}
}
|
||
/// Removes a transaction from the mempool's tracked transactions. | ||
/// Will increment logic error metrics and log error if transaction is not present. | ||
fn remove_from_contained_txs(&self, tx_hash: [u8; 32], contained_txs: &mut HashSet<[u8; 32]>) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My arguments for the mempool bits above also apply here: this is not removing a transactions from the mempool tracked transactions. It's mutating an argument. This method seems to be a vehicle to access the metrics.
Change or remove please.
} | ||
|
||
#[tokio::test] | ||
async fn receck_adds_non_tracked_tx() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is still a spelling issue here.
@@ -1,3 +1,6 @@ | |||
#[cfg(test)] | |||
mod tests; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually move this below the imports please. We are following the official Rust style guide: https://doc.rust-lang.org/stable/style-guide/items.html
|
||
// tx is valid, push to mempool with current state | ||
// generate address for the signed transaction |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This address construction is not necessary I believe (but fixing it is the scope of this PR).
Please remove this comment and create a followup issue to just us the address bytes to fetch all relevant information.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tracked: #1620
} | ||
RemovalReason::LowerNonceInvalidated => { | ||
return Err(response::CheckTx { | ||
code: Code::Err(AbciErrorCode::LOWER_NONCE_INVALIDATED.value()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The intent behind info
was to give a human readable description of the abci code. So please use AbciErrorCode::LOWER_NONCE_INVALIDATED.to_string()
or, if you need a more granular reason, add a new abci code.
Alternatively, provide more background information to the log
line.
info: "transaction removed from app mempool due to lower nonce being \ | ||
invalidated" | ||
.into(), | ||
log: "Transaction removed from app mempool due to lower nonce being \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lower case. Also, log and info seem to provide the same information. That doesn't seem useful.
return Err(response::CheckTx { | ||
code: Code::Err(AbciErrorCode::INVALID_NONCE.value()), | ||
info: "transaction removed from app mempool due to stale nonce".into(), | ||
log: "Transaction from app mempool due to stale nonce".into(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lower case - also info and log seem to have the same text. Doesn't seem useful.
if let Err(error) = | ||
pending.add(ttx, current_account_nonce, ¤t_account_balances) | ||
{ | ||
// remove from tracked | ||
// note: this branch is not expected to be hit so grabbing the lock inside |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Convention is to use upper case XXX
, TODO
, FIXME
, NOTE
etc for comments like this.
// note: this branch is not expected to be hit so grabbing the lock inside | |
// NOTE: this branch is not expected to be hit so grabbing the lock inside |
if let Err(error) = | ||
pending.add(ttx, current_account_nonce, ¤t_account_balances) | ||
{ | ||
// remove from tracked |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If code doing something as simple as this needs to be commented, then its either not self evident (leave a comment tagged with XXX
or NOTE
, see below), or adjust the called methods to make it evident.
tx_hash: [u8; 32], | ||
mempool: &AppMempool, | ||
metrics: &Metrics, | ||
) -> Result<(), response::CheckTx> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could just return a RemovalReason
here and instead implement a new trait IntoCheckTxResponse
on it so that you don't need to do the conversion inline.
state: &S, | ||
signed_tx: SignedTransaction, | ||
metrics: &'static Metrics, | ||
) -> Result<(), response::CheckTx> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have noted it for RemovalReason
, but it also applies here: all the error-to-response conversions could be made a lot neater by adding a trait IntoCheckTxResponse
and moving the conversions there instead of having them inlined in the function body.
@@ -18,6 +21,7 @@ use astria_core::{ | |||
}, | |||
}; | |||
use astria_eyre::eyre::WrapErr as _; | |||
use bytes::Bytes; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this import used anywhere here?
Summary
Previously
handle_check_tx()
's processing ofCheckTx::Recheck
would always run all stateless checks and attempt insertion. This PR modified the function to only run the checks on transactions that have not been removed or are not contained inside of the app side mempool.This PR additionally adds a HashSet of transaction hashes that are contained in the app's Mempool to the state of the app Mempool. This is used in the scenario where the sequencer restarts-- CometBFT persists transactions but our app mempool does not. The cache is used to allow transactions to re-enter the mempool on restarts.
This PR also restores the original behavior of
CheckTX
to not kick out transaction needlessly on rechecks, this regression was added when upgrading the mempool in #1323.Background
A lot of the functionality from the original
handle_check_tx()
has moved into the app side mempool itself, including the balance and nonce checks. All of the remaining checks will not change between rechecks and only need to be ran when considering a transaction for insertion into the appside mempool.Testing
Ran locally and added unit tests to ensure future nonces can be added and rechecks do not remove transactions needlessly.
Metrics
check_tx_removed_stale_nonce
check_tx_duration_seconds_check_tracked
,mempool_tx_logic_error
check_tx_duration_seconds_check_nonce
tocheck_tx_duration_seconds_fetch_nonce
Related Issues
closes #1481