Skip to content

Conversation

@karlfloersch
Copy link
Contributor

@karlfloersch karlfloersch commented Jun 6, 2021

Description
We have observed that when a transaction is stuck in the mempool, the replacement transactions fail to be applied and would return UNPREDICTABLE_GAS_LIMIT with the error that the starting index in the batch is incorrect.

The reason the starting index was seemingly incorrect was because eth_estimateGas has blockTag=pending by default.
That meant that when we attempted to resubmit our transaction, while estimating gas, it would attempt to reapply the same batch on top of the pending transaction. We explicitly do not want this behavior.

This PR changes the batch submitter submission logic in such a way that before submitting the transaction it populates the transaction fields including the gasLimit. This way we explicitly specify a gasLimit, we don't call estimateGas, and the transaction resubmission can occur.

Fixes OP-834

@karlfloersch karlfloersch requested a review from annieke as a code owner June 6, 2021 01:50
@changeset-bot
Copy link

changeset-bot bot commented Jun 6, 2021

🦋 Changeset detected

Latest commit: e2985bf

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
@eth-optimism/batch-submitter Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@github-actions github-actions bot added 2-reviewers A-op-batcher Area: op-batcher labels Jun 6, 2021
@karlfloersch karlfloersch force-pushed the fix/broken-estimate-gas-when-pending-tx-in-mempool branch 2 times, most recently from 34cd37a to b979d38 Compare June 6, 2021 02:01
@codecov-commenter
Copy link

codecov-commenter commented Jun 7, 2021

Codecov Report

Merging #1024 (082fe1b) into develop (e615144) will not change coverage.
The diff coverage is n/a.

❗ Current head 082fe1b differs from pull request most recent head 3fc6772. Consider uploading reports for the commit 3fc6772 to get more accurate results
Impacted file tree graph

@@           Coverage Diff            @@
##           develop    #1024   +/-   ##
========================================
  Coverage    86.05%   86.05%           
========================================
  Files           49       49           
  Lines         1936     1936           
  Branches       307      307           
========================================
  Hits          1666     1666           
  Misses         270      270           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e615144...3fc6772. Read the comment docs.

Copy link
Contributor

@snario snario left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Questions:

  • Can you clarify (maybe in the code) how we are getting around the gas estimation error being thrown despite still making a call to this.signer.provider.estimateGas? Does an explicit call to this method not "rebase" on pending transactions?
  • Does the old appendSequencerBatch override now become unused? If so we should delete this code.
  • Does this now have the effect of making this code unable to submit multiple batches in series assuming each of them will succeed? Do we want this behaviour, even?

Requests:

  • We need to add a test case or two for these scenarios that would fail before this branch is merged

const nonce = await this.signer.getTransactionCount()
// Generate the transaction we will repeatedly submit
const tx = await this.chainContract.populateTransaction.appendQueueBatch(
99999999
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This 9999999 constant really needs a comment or some kind of explanation, it's unclear why this number exists here in the code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest, we could delete this code path completely since queue batch append isn't currently allowed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's worth keeping around for posterity sake until we split the CTC and can delete it completely. We don't want to forget that in our current design this codepath is absolutely required for the eventual proper functioning of the system.

l1tipHeight,
})

const nonce = await this.signer.getTransactionCount()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated to this PR, but why are we even doing this if we are using populateTransaction; shouldn't that automatically be handling the nonce selection for us?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it definitely should. Just removed it.

I left it in to avoid changing lines of code that I didn't have to, but I think removing the manual nonce setting is a good improvement.

data: '0x' + methodId + calldata,
data: getEncodedCalldata(batch),
...options,
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this method now unused?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wow yes it is! I'm thinking it could make sense to leave it in in case we ever tried to call appendSequencerBatch and it fails because it doesn't accept any parameters. But you can let me know if you want me to just delete it -- I'm happy to

l1tipHeight,
})

const nonce = await this.signer.getTransactionCount()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Getting the nonce both outside and inside of customPopulateTransaction.appendSequencerBatch seems like its prone to a race condition. Note that I do think its very unlikely in practice, but the function does take an optional second argument which may be able to take the nonce.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes agreed, I just deleted the nonce logic

public customPopulateTransaction = {
appendSequencerBatch: async (
batch: AppendSequencerBatchParams,
options?: TransactionRequest
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The options aren't really used at all here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah good catch, removing the options.

from: await this.signer.getAddress(),
data,
})
const value = 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

value looks unused?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I just set it to zero explicitly. Probably could leave it out entirely but was just making zero value explicit. Let me know if you want me to remove that line

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its not returned?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah you're right -- deleted it

@snario
Copy link
Contributor

snario commented Jun 14, 2021

I'd just re-assert that we should add a test case before merging.

@tynes
Copy link
Contributor

tynes commented Jun 23, 2021

How important is this? Its been open for awhile and is almost there

@snario
Copy link
Contributor

snario commented Jun 23, 2021

It has caused extended issues on mainnet production in the past. It is a priority

@karlfloersch
Copy link
Contributor Author

karlfloersch commented Jun 28, 2021

@snario how would you suggest testing this? The options I see are as follows:

  1. Refactor _submitAndLogTx to be a library which handles the contractFunction logic, then unit test that functionality independently.
  2. Add an integration test for the tx resubmission
  • This would be a pretty meaty project so I don't think that this is too feasible. It would require excessive mocking & we would be much better served doing a larger scale refactor that allows us to 1) easily mock, 2) easily unit test, 3) compose unit tested functions together.

This is another example of where the batch submitter's structure is extremely difficult to test. Does option (1) suffice you think? If so I'll go ahead and add it -- should be a pretty quick refactor

@snario
Copy link
Contributor

snario commented Jun 28, 2021

This is another example of where the batch submitter's structure is extremely difficult to test. Does option (1) suffice you think? If so I'll go ahead and add it -- should be a pretty quick refactor

👍

@tynes
Copy link
Contributor

tynes commented Jun 29, 2021

This is going to need to have its commits cleaned up now that we do a rebase/merge workflow

@karlfloersch karlfloersch force-pushed the fix/broken-estimate-gas-when-pending-tx-in-mempool branch from 3fc6772 to 41ea3ad Compare July 2, 2021 19:26
@karlfloersch
Copy link
Contributor Author

@snario / @tynes would love re-review on this!

[batch, startBlock]
)
const batchSizeInBytes = remove0x(tx).length / 2
const batchSizeInBytes = remove0x(calldata).length / 2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is utility in ethers called hexDataLength we could use here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally everything was Buffers instead of strings or even uint8 arrays so that we didn't have to deal with constant prefix slicing

Comment on lines +55 to +57
const APPEND_SEQUENCER_BATCH_METHOD_ID = keccak256(
Buffer.from('appendSequencerBatch()')
).slice(2, 10)
Copy link
Contributor

@snario snario Jul 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also can do

const { Interface, Fragment } = ethers.utils;
Interface.getSighash(Fragment.from('function appendSequencerBatch()'));

const getEncodedCalldata = (batch: AppendSequencerBatchParams): string => {
const methodId = APPEND_SEQUENCER_BATCH_METHOD_ID
const calldata = encodeAppendSequencerBatch(batch)
return '0x' + remove0x(methodId) + remove0x(calldata)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ethers.utils.hexConcat(methodId, calldata)


const receipt = await batchSubmitter.submitNextBatch()
expect(sequencer.getGasPrice).to.have.been.calledOnce
expect(sequencer.getGasPrice).to.have.been.calledTwice
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does it call twice now?

Comment on lines +81 to +84
expect(called.sendTransaction).to.be.true
expect(called.waitForTransaction).to.be.true
expect(called.beforeSendTransaction).to.be.true
expect(called.onTransactionResponse).to.be.true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

You could also do more checks here on the inputs; e.g., is the expected nonce present

@snario
Copy link
Contributor

snario commented Aug 2, 2021

We need to merge this

@tynes
Copy link
Contributor

tynes commented Aug 2, 2021

We need to merge this

What can we do to help get this over the line?

@karlfloersch karlfloersch force-pushed the fix/broken-estimate-gas-when-pending-tx-in-mempool branch from 41ea3ad to e2985bf Compare August 5, 2021 00:31
@karlfloersch karlfloersch merged commit a8c9cd6 into develop Aug 5, 2021
@karlfloersch karlfloersch deleted the fix/broken-estimate-gas-when-pending-tx-in-mempool branch August 5, 2021 15:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-op-batcher Area: op-batcher

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants