Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(SpokePool): Introduce opt-in deterministic relay data hashes #583

Open
wants to merge 11 commits into
base: master
Choose a base branch
from

Conversation

nicholaspai
Copy link
Member

@nicholaspai nicholaspai commented Aug 20, 2024

Motivation

We should allow a depositor to pre-compute their relay data hash so that they can work with a filler to more quickly fill their deposit while avoiding re-org risk (which could change the ordering of V3FundsDeposited events and change a depositor's expected depositId) and race conditions with another innocent or malicious depositor who front-runs their depositId.

The advantage of using the global depositId as a nonce today is gas cost savings since we don't need to store a new value in a new slot on each deposit. We should not add gas cost to deposits.

Secondly, incrementing the global depositId after each deposit guarantees that each depositV3() produces a unique deposit. This means that a depositor does not have to worry about accidentally producing a relay hash collision that means their deposit would be unfillable. Fortunately, unfillable deposits are refunded on the origin chain to the depositor but this possibility is undesirable for many depositors.

Therefore, any implementation of deterministic relay hashes should ideally be opt-in.

Implementation

Introduces a new function: unsafeDepositV3() which gives the msg.sender 12 bytes with which to set a deposit nonce. It is marked unsafe because there is no protection given by the contract that the resultant relay data hash produced will be unique. The 12 byte deposit nonce is combined with the msg.sender's address, allowing each msg.sender to only track their own deposit nonces to avoid collisions. This also protects depositors from being griefed by malicious actors who might try to front-run their unsafe relay data hashes.

unsafeDepositV3() is slightly more expensive than depositV3 from a gas-cost perspective because it adds a keccak256 + abi.encodePacked operation but it removes a SSTORE

The existing depositV3() function's implementation is untouched.

Changes to V3RelayData struct

Changes the depositId parameter in the V3RelayData struct from uint32 to uint256. In storage, this parameter is stored as a uint256 so at the low-level, this changes nothing. Moreover, in practice until now, depositId's were always set as uint32 meaning that the first 24 bytes of the uint256 depositId slot are currently set to 0.

This change is made because this PR introduces a new function: unsafeDepositV3() which allows someone to set a depositId with non-zero bytes in the first 24 bytes.

Implications for clients

Any logic that assumes that depositId's will always be monotonically increasing now needs to be changed. For example, the queryHistoricalDepositForFill function performs a binary search on V3FundsDeposited events' depositId values to locate historical fills. This clearly will no longer work following this PR, so this corresponding change in the relayer client skips this function in all cases unless the deposit was sent via a legacy function where all deposit ID's are produced "safely": across-protocol/relayer#1769

## Motivation

We should allow a depositor to pre-compute their relay data hash so that they can work with a filler to more quickly fill their deposit while avoiding re-org risk (which could change the ordering of V3FundsDeposited events and change a depositor's expected `depositId`) and race conditions with another innocent or malicious depositor who front-runs their depositId.

The advantage of using the global `depositId` as a nonce today is gas cost savings since we don't need to store a new value in a new slot on each deposit. We should not add gas cost to deposits.

Secondly, incrementing the global `depositId` after each deposit guarantees that each `depositV3()` produces a unique deposit. This means that a depositor does not have to worry about accidentally producing a relay hash collision that means their deposit would be unfillable. Fortunately, unfillable deposits are refunded on the origin chain to the depositor but this possibility is undesirable for many depositors.

Therefore, any implementation of deterministic relay hashes should ideally be opt-in.

## Implementation

Introduces a new function: `unsafeDepositV3()` which gives the msg.sender 12 bytes with which to set a deposit nonce. It is marked unsafe because there is no protection given by the contract that the resultant relay data hash produced will be unique. The 12 byte deposit nonce is combined with the msg.sender's address, allowing each msg.sender to only track their own deposit nonces to avoid collisions. This also protects depositors from being griefed by malicious actors who might try to front-run their unsafe relay data hashes.

`unsafeDepositV3()` is no more expensive than `depositV3` from a gas-cost perspective.

The existing `depositV3()` function's implementation is untouched.

## Changes to V3RelayData struct

Changes the `depositId` parameter in the `V3RelayData` struct from uint32 to uint256. In storage, this parameter is stored as a uint256 so at the low-level, this changes nothing. Moreover, in practice until now, depositId's were always set as uint32 meaning that the first 24 bytes of the uint256 `depositId` slot are currently set to 0.

This change is made because this PR introduces a new function: `unsafeDepositV3()` which allows someone to set a `depositId` with non-zero bytes in the first 24 bytes.
@nicholaspai
Copy link
Member Author

Some off chain work related to this pr:

It will be necessary to change all offchain code that tries to query deposits using deposit Id since we can no longer assume that deposit id is monotonically increasing for each deposit

* all parameters to this function along with this chain's chainId(). Relayers are only refunded for filling
* deposits with deposit hashes that map exactly to the one emitted by this contract.
* @param depositNonce The nonce that uniquely identifies this deposit. This function will combine this parameter
* with the msg.sender address to create a unique uint256 depositNonce and ensure that the msg.sender cannot

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the deposit is being made via a 4337 bundler, then the msg.sender will be the bundler address?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not super familiar with 4337 but that sounds right. I think that might be a feature that allows the bundler to ensure that its own deposit hashes are all unique. The alternative instead of msg.sender is using the depositor address but that allows the msg.sender to manipulate the hashes and try to front-run honest msg.senders.

// this is unlikely and we also should consider not checking it and incurring the extra gas cost:
// if (msg.sender == address()) revert InvalidUnsafeDepositor();

uint256 depositHash = uint256((uint160(msg.sender) << 12) | depositNonce);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: not sure if "hash" is the correct word here. Probably "id" would be better, like in "speedUpV3Deposit"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point

uint256 outputAmount,
uint256 destinationChainId,
address exclusiveRelayer,
uint256 depositNonce,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "nonce" is overloaded as "uint96" provided by the user and also as "uint256" (msg.sender + value provided by the user). Perhaps "id" would be fitting here as well

@nicholaspai
Copy link
Member Author

I've updated the algorithm to derive the depositId in unsafeDepositV3 by removing the left shift and OR operations in favor of more simply packing together the msg.sender's address and the input depositNonce, keccak256 hashing the packed bytes, and converting that bytes32 into a uint256. I believe this preserves the properties we want (every depositor can generate their own unique deposit hashes and not get griefed by other EOA's)

@nicholaspai nicholaspai marked this pull request as ready for review September 24, 2024 15:55
// Increment count of deposits so that deposit ID for this spoke pool is unique.
// @dev Implicitly casts from uint32 to uint256 by padding the left-most bytes with zeros. Guarantees
Copy link
Member Author

@nicholaspai nicholaspai Sep 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something to verify for the reviewer: We want to guarantee that any caller of depositV3 will never produce a uint256 depositId that could collide with one produced by a caller of unsafeDepositV3. I think this is true because numberOfDeposits is a uint32 while the way unsafeDepositV3 produces a depositId is by packing the 20 byte address of the msg.sender with the uint96 depositId which guarantees that the resultant 32 byte value when casted into a uint256 will always be greater than the MAX_UINT32 if the address is non-0

UPDATED: packing the address with a depositId input by the caller and then hashing the resultant bytes. The chance that a resultant hash collides with a uint32 is less than the chance that the first 28 bytes of the hash are equal to 0 which is extremely small. I don't think this is a realistic possibility we need to guard against.

The keccak256 hashing of the packed 32 bytes removes the guarantee that unsafe and safe deposits can't collide. This is because the resultanthash might produce a number less than uint32.max
Copy link
Contributor

@mrice32 mrice32 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments! Looks awesome

// this is unlikely and we choose to not check it and incur the extra gas cost:
// e.g. if (msg.sender == address(0)) revert InvalidUnsafeDepositor();

uint256 depositId = uint256(bytes32(abi.encodePacked(msg.sender, depositNonce)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could flip the ordering and require that depositNonce > 0, making every custom nonce a very large number regardless of address. That would mean that any address would work, which would remove any concerns around precompiles that are at address 0x1, 0x2, 0x3 (not sure if anybody does this, but would prefer to avoid building our noncing around it).

Thoughts?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

require that depositNonce > 0,

Ideally I was trying to avoid an additional require statement here but I agree this would be a good algorithm. This is why I wanted to avoid require(msg.sender != address(0))

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this comment is irrelevant now that we hash the packed data: 19a3380

// we might want to consider explicitly checking that the msg.sender is not the zero address. However,
// this is unlikely and we choose to not check it and incur the extra gas cost:
// e.g. if (msg.sender == address(0)) revert InvalidUnsafeDepositor();

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OOC, what's the reasoning behind not wrapping all this in a hash? Simplicity for the integrator to predict the deposit hash?

Hash advantages:

  • We can use full 256 bit depositNonces (might improve cross compatibility with things like permit2 or other noncing systems that may be used concurrently).
  • No matter how we concat the data, there is no concern about creating collisions with normal depositIds.
  • We could add more parameters that get baked into the depositId. For instance, this could be called by a contract that passes its msg.sender as the depositor. Adding the depositor to the depositId hash would mean that different depositors wouldn't collide.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is no concern about creating collisions with normal depositIds.

Is this true? What if the hash produces some value where the first 28 bytes are 0 (miraculously)? Wouldn't that then translate into a valid uint32 value?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise I totally agree with all the reasons FOR hashing this

Copy link
Contributor

@mrice32 mrice32 Sep 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did the math on this.

The probability of finding 28 bytes of leading zeros is 1 - (1 - (1/2^224))^num_tries.

You get to about 0.3% after 1e65 tries (screenshot below). For reference, Bitcoin produces about 2e28 hashes per year, meaning it would take about 5e36 years for the entire bitcoin network to have a 0.3% chance of cracking this.

Screenshot 2024-09-26 at 10 26 23 AM

contracts/SpokePool.sol Outdated Show resolved Hide resolved
@nicholaspai
Copy link
Member Author

@mrice32 just one question: do you think we should add more params to the hashed struct?

We could add more parameters that get baked into the depositId. For instance, this could be called by a contract that passes its msg.sender as the depositor. Adding the depositor to the depositId hash would mean that different depositors wouldn't collide.

/**************************************
* INTERNAL FUNCTIONS *
**************************************/

function _depositV3(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be copied from the existing depositV3 logic

@mrice32
Copy link
Contributor

mrice32 commented Oct 1, 2024

@mrice32 just one question: do you think we should add more params to the hashed struct?

We could add more parameters that get baked into the depositId. For instance, this could be called by a contract that passes its msg.sender as the depositor. Adding the depositor to the depositId hash would mean that different depositors wouldn't collide.

Yeah, I think msg.sender and depositor should both be in there to handle interactions from both regular users and passthrough contracts.

Comment on lines 1104 to 1106
* @dev msg.sender and depositor are both used as inputs to allow passthrough depositors to create unique
* deposit hash spaces for unique depositors.
* @param sender The msg.sender address used as input to produce the deposit ID.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This wording is a bit counter-intuitive for me because it's relying on the implementation details of another function (unsafeDepositV3()). i.e. it's confusing that msg.sender is referenced in the comment, but sender is an input argument and can be any address.

Is relayer a candidate variable name? It implies that the relayer's address is needed to produce the deterministic deposit ID. In context of a relayer filling via a contract it would be the contract address, not the EOA, so idk whether that introduces a possible additional source of confusion.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think msgSender maybe is a better variable name

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because maybe an aggregator also wants to use the unsafeDeposit method but there would be no natural "relayer" to fill in, right? Maybe i'm over thinking it

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants