DO NOT MERGE: synthetic parallel execution test framework #4817

Draft
wants to merge 17 commits into
base: master
from

graydon
Contributor

@graydon graydon commented Jul 5, 2025

This adds a special mode that you can expose to live traffic from mainnet or testnet (online using run or, more commonly, offline using catchup) in order to test the new p23 parallel-execution code path for soroban phases against real transactions.

The way it works is that just before running a sequential soroban phase, it:

  • Synthesizes a fake parallel phase using the same phase-building path used in p23 txset nomination
  • Runs that phase on a captured throwaway copy of the pre-state of the phase
  • Captures the results of that (txresults and txmetas) into some buffers

It then proceeds to run the normal sequential phase as usual, and compares the captured parallel results with the sequential results, logging any differences as errors.
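
The flow above can be sketched roughly as follows. This is a minimal Python illustration of the shadow-run-and-compare pattern, not the stellar-core implementation: `apply_tx`, `run_sequential`, `run_parallel`, the dict-based state and the shape of the results are all hypothetical stand-ins.

```python
import copy
import logging

log = logging.getLogger("parallel-test")


def apply_tx(state, tx):
    """Hypothetical tx application: add `amount` to `account`, return a result."""
    state[tx["account"]] = state.get(tx["account"], 0) + tx["amount"]
    return {"tx": tx["id"], "balance": state[tx["account"]]}


def run_sequential(state, txs):
    """Stand-in for the normal sequential phase; mutates the real state."""
    return [apply_tx(state, tx) for tx in txs]


def run_parallel(state, txs, threads):
    """Stand-in for the synthesized parallel phase.

    A real implementation would partition txs into non-conflicting groups and
    run them on `threads` workers; this sketch only models the interface.
    """
    return [apply_tx(state, tx) for tx in txs]


def run_phase_with_shadow(state, txs, threads):
    # 1. Capture a throwaway copy of the pre-state of the phase.
    shadow_state = copy.deepcopy(state)
    # 2. Run the synthesized parallel phase on the copy and buffer its results.
    parallel_results = run_parallel(shadow_state, txs, threads)
    # 3. Run the normal sequential phase on the real state, as usual.
    sequential_results = run_sequential(state, txs)
    # 4. Compare the two result sets and log any differences as errors.
    diffs = [
        (seq, par)
        for seq, par in zip(sequential_results, parallel_results)
        if seq != par
    ]
    for seq, par in diffs:
        log.error("tx %s differs: sequential=%r parallel=%r", seq["tx"], seq, par)
    return sequential_results, diffs
```

The point of the copy in step 1 is that the parallel run can never affect the real ledger state: only the sequential results are kept, and the parallel results exist purely for comparison.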

Its behaviour is controlled by two environment variables:

  • STELLAR_TEST_PARALLEL_EXECUTION must be set to a nonzero number, which will be used as the parallelism factor for the synthesized parallel phase. So setting STELLAR_TEST_PARALLEL_EXECUTION=4 will make and run a 4-way parallel phase on 4 threads.
  • STELLAR_COMPARISON_TOLERANCE is an optional but recommended comma-separated list of difference types to tolerate and not report as errors. Currently I recommend running with STELLAR_COMPARISON_TOLERANCE=event_topics,fees though other options are possible (browse the code). This is necessary because there are some small observable differences between p22 and p23 executions, arising both from minor protocol changes and from the very fact of running in parallel (e.g. fees go way down).
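
To make the semantics of the two variables concrete, here is a small Python sketch of how they could be interpreted. It is an illustration only: the function name is hypothetical, and treating unparsable or negative values as "disabled" is an assumption, not something the PR description states.

```python
import os


def read_parallel_test_config(env=os.environ):
    """Return (threads, tolerated) or None when the test mode is disabled."""
    raw = env.get("STELLAR_TEST_PARALLEL_EXECUTION", "")
    try:
        threads = int(raw)
    except ValueError:
        threads = 0  # assumption: an unparsable value disables the mode
    if threads <= 0:
        return None
    # Optional comma-separated list of difference types to tolerate.
    tolerance = env.get("STELLAR_COMPARISON_TOLERANCE", "")
    tolerated = {name.strip() for name in tolerance.split(",") if name.strip()}
    return threads, tolerated
```

With `STELLAR_TEST_PARALLEL_EXECUTION=4` and `STELLAR_COMPARISON_TOLERANCE=event_topics,fees` this yields a parallelism factor of 4 and the tolerated set `{"event_topics", "fees"}`.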

So overall, you probably want to run something like:

$ STELLAR_COMPARISON_TOLERANCE=event_topics,fees STELLAR_TEST_PARALLEL_EXECUTION=4 \
  ./src/stellar-core --conf ~/stellar-mainnet.cfg --console catchup current/1000

To help diagnose differences, it also writes them to files under the directory parallel-tx-diffs, organized by ledger and transaction. For example, my version just wrote these files:

parallel-tx-diffs/ledger-58007058
parallel-tx-diffs/ledger-58007058/tx-81
parallel-tx-diffs/ledger-58007058/tx-81/tx-envelope.json
parallel-tx-diffs/ledger-58007058/tx-81/meta-parallel.json
parallel-tx-diffs/ledger-58007058/tx-81/summary.txt
parallel-tx-diffs/ledger-58007058/tx-81/meta-sequential.json
parallel-tx-diffs/ledger-58007058/tx-61
parallel-tx-diffs/ledger-58007058/tx-61/tx-envelope.json
parallel-tx-diffs/ledger-58007058/tx-61/meta-parallel.json
parallel-tx-diffs/ledger-58007058/tx-61/summary.txt
parallel-tx-diffs/ledger-58007058/tx-61/meta-sequential.json
...
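
For triage after a long catchup, a small helper script along these lines can list which ledgers and transactions produced diffs and print each summary. It is a sketch that assumes only the `ledger-*/tx-*/summary.txt` layout shown in the listing above; the function names are hypothetical and nothing is assumed about the contents of the files.

```python
from pathlib import Path


def list_diffs(root="parallel-tx-diffs"):
    """Map each ledger directory name to the sorted tx directory names under it."""
    diffs = {}
    for ledger_dir in sorted(Path(root).glob("ledger-*")):
        if not ledger_dir.is_dir():
            continue
        diffs[ledger_dir.name] = sorted(
            p.name for p in ledger_dir.glob("tx-*") if p.is_dir()
        )
    return diffs


def print_summaries(root="parallel-tx-diffs"):
    """Print summary.txt for every transaction directory that has one."""
    for ledger, txs in list_diffs(root).items():
        for tx in txs:
            summary = Path(root) / ledger / tx / "summary.txt"
            if summary.exists():
                print(f"== {ledger}/{tx} ==")
                print(summary.read_text())
```

Calling `print_summaries()` from the directory where stellar-core was run would walk the default parallel-tx-diffs tree.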

Contributor

@dmkozh dmkozh left a comment

This looks sensible overall; I think the main issue is the weird/incorrect fee bump handling.

@graydon

This comment was marked as outdated.

@graydon graydon force-pushed the re-exec branch 5 times, most recently from a866e06 to 7b2af24 on July 14, 2025 22:21
@graydon graydon changed the title from "DO NOT MERGE: sketch of parallel-tx re-execution for testing." to "DO NOT MERGE: synthetic parallel execution test framework" on Jul 15, 2025
@graydon graydon mentioned this pull request Jul 18, 2025
6 tasks
@graydon graydon mentioned this pull request Jul 29, 2025
2 participants