Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor MDL BF aborting through ha_abort_transaction() #307

Draft
wants to merge 1 commit into
base: 10.6
Choose a base branch
from

Conversation

temeo
Copy link

@temeo temeo commented Apr 5, 2023

BF aborting through ha_abort_transaction() has proven to be
problematic. The InnoDB implementation of wsrep_abort_transaction
is racy with respect to THD/trx object access because THD mutexes
cannot be held over whole BF abort process to avoid deadlocks when
wsrep_abort_transaction calls back to server.

Change the logic for BF aborts which originate from MDL in the
following way:

  • First check if the BF abort should progress by calling
    wsrep-lib bf_abort() which checks the wsrep transaction
    and Galera side transaction states.
  • If the BF abort should progress, call ha_abort_transaction()
    without releasing LOCK_thd_data in between.
  • Remove call back to server in InnoDB wsrep_abort_transaction().

This makes the wsrep transaction state check and InnoDB
transaction kill atomic, while simultaneously removing the
need to call back to server from InnoDB.

TODO: Is THD::wsrep_aborter needed anymore? Always calling
wsrep-lib bf_abort() guards against multiple BF aborts.
Maybe still needed for InnoDB.

Change for MDEV-25717: Introduce separate sync point in
wsrep_abort_thd() to control thread execution during MDL
induced BF abort.

Change for galera_create_table_as_select: CTAS may now fail
also with ER_QUERY_INTERRUPTED due to use of THD::awake().

@temeo temeo marked this pull request as draft April 5, 2023 09:52
{
victim_trx->lock.set_wsrep_victim();
}
victim_trx->mutex_unlock();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the point where:
Can victim transaction move from ACTIVE to COMMITTED_IN_MEMORY so that flag is still there ? or it is cleared during commit ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The transaction state is changed to TRX_STATE_COMMITTED_IN_MEMORY in trx_t::commit_state(). This is protected by TMTrxGuard and happens before the end of trx_t::commit_in_memory() where the lock.was_chose_as_deadlock_victim is set to false.

If the lock.was_chosen_as_deadlock_victim is set just before commit, the commit can still proceed as the flag is not checked in commit code path, and the flag is cleared at the end of trx_t::commit_in_memory(). With this fix the flag cannot be set to true after the transaction has passed trx_t::commit_in_memory().

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: Add comment in code which clarifies this.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After code reading I can see that you are correct.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is one more possible race when autocommit non-locking transaction changes its state to TRX_STATE_NOT_STARTED in commit_in_memory(). Not sure what to do about it...

Copy link

@janlindstrom janlindstrom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just have one question but overall this looks very promising and better what we have now.

@temeo
Copy link
Author

temeo commented Apr 5, 2023

--thread-handling=pool-of-threads must be tested.

BF aborting through ha_abort_transaction() has proven to be
problematic. The InnoDB implementation of wsrep_abort_transaction
is racy with respect to THD/trx object access because THD mutexes
cannot be held over whole BF abort process to avoid deadlocks when
wsrep_abort_transaction calls back to server.

Change the logic for BF aborts which originate from MDL in the
following way:
* First check if the BF abort should progress by calling
  wsrep-lib bf_abort() which checks the wsrep transaction
  and Galera side transaction states.
* If the BF abort should progress, call ha_abort_transaction()
  without releasing LOCK_thd_data in between.
* Remove call back to server in InnoDB wsrep_abort_transaction().

This makes the wsrep transaction state check and InnoDB
transaction kill atomic, while simultaneously removing the
need to call back to server from InnoDB.

MTR test changes:
- MDEV-25717: Introduce separate sync point in
  wsrep_abort_thd() to control thread execution during MDL
  induced BF abort.
- galera_create_table_as_select: CTAS may now fail
  also with ER_QUERY_INTERRUPTED due to use of THD::awake().
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants