Skip to content

Conversation

@h4ck3rk3y
Copy link

@h4ck3rk3y h4ck3rk3y commented May 12, 2023

Why this should be merged

This replaces the current run script with Kurtosis which is dockerized & cleans up after itself.

How this works

Installs Kurtosis in the CI and then runs test using Kurtosis. This forces the node thats spinning up to expose ports on 9650, so that we don't have to edit the existing testsuite too much

How this was tested

This is a test; CI runs fine on my fork.

How is this documented

Not sure where we document such things

@h4ck3rk3y
Copy link
Author

@aaronbuchwald Hey! I took over Tedi's project to replace the script with Kurtosis as he stopped working part time a few days ago.

@h4ck3rk3y h4ck3rk3y requested a review from aaronbuchwald May 22, 2023 21:20
firstNodeId = "node-0"
)

func SpinupAvalancheNode() (string, func(), error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would it take to change this from SpinupAvalancheNode to spin up a network of N nodes and how would you specify configs of each individual node?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To run multiple nodes; we'll have to change the args to include node_count like

forceExposeOn9650        = `{"test_mode": true, "node_count": 5}`

This would run 5 different nodes. Note Kurtosis by default tries to expose ephemeral ports but we're using the "test_mode": true to force public ports to be a certain number(first node rpc gets 9650, second gets 9652 and so forth). Ideally we remove the test_mode and pass around the ephemeral port generated inside these tests.

Re config - At the moment all nodes start with the same config except for bootstrap information. The second bootstraps from the first one; the third from the first two. What other config would you like to tweak? It shouldn't be too hard to set it; but I wanted to get something minimal out

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool that makes sense. AvalancheGo requires uses two ports the HTTP port (https://docs.avax.network/nodes/maintain/avalanchego-config-flags#--http-port-int) and the Staking port (https://docs.avax.network/nodes/maintain/avalanchego-config-flags#--staking-port-int).

If it's easier for Kurtosis to use ephemeral ports, you may just want to assign those two ports via these flags.

As in the documentation, the staking port is where the node listens for incoming p2p connections and the http port is used to run an API server.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we have one container per Avalanche Node inside the package. Within the Docker network the ports are set to the defaults 9650 and 9651; Kurtosis then produces (by default) an ephemeral port on localhost. That is the port that the tests interact with; with test_mode true I am forcing Kurtosis to produce local port mappings that are expected

So 9650,9651 on the first node will map to 9650 and 9651 on localhost
For the second avalanche node they map to 9652 and 9653
.. so on and so forth!

I am using test_mode as the tests seem to hardcode

var (
	DefaultLocalNodeURI = "http://127.0.0.1:9650"
	NodeURIs            = []string{DefaultLocalNodeURI, "http://127.0.0.1:9652", "http://127.0.0.1:9654", "http://127.0.0.1:9656", "http://127.0.0.1:9658"}
)

and I didn't want to change your code base too much! There's not much extra effort for Kurtosis here!

Perhaps I can make the load test a multi node load test based on what I see here

tests/load/load_test.go
56:		rpcEndpoints := make([]string, 0, len(utils.NodeURIs))
57:		for _, uri := range []string{utils.DefaultLocalNodeURI} { // TODO: use NodeURIs instead, hack until fixing multi node in a network behavior

Copy link
Author

@h4ck3rk3y h4ck3rk3y May 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have changed the load test slightly and now I pass utils.NodeURIs & the test seems to pass.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have reverted the multi node load test as it flaked on one run - https://github.com/h4ck3rk3y/subnet-evm/actions/runs/5093958012/jobs/9157181604

victorcolombo added a commit to kurtosis-tech/kurtosis that referenced this pull request May 23, 2023
## Description:
Currently Starlark run remote package, run package and run script
blocking calls on the SDK have the following behavior:
1. If an error happens on the APIC side, we return `(*StarlarkRunResult
== nil, error != nil)`
2. If an error happens on the Starlark side, we return
`(*StarlarkRunResult != nil, error == nil)`, with either
`StarlarkRunResult.[InterpretationError, ValidationErrors,
ExecutionError] != nil`
3. If no errors happen on the Starlark side, we return
`(*StarlarkRunResult != nil, error == nil)` with all
`StarlarkRunResult.[InterpretationError, ValidationErrors,
ExecutionError] == nil`

This behavior is not very ergonomic, given that most people are only
interested if the Starlark succeeded or not, they should be able to do
this with a simple `err != nil` check, which is the Go idiomatic way of
dong it.

If the user wants to investigate further the interpreted instructions,
the breakdown of which phase failed, etc, this will still be accessible
via the `StarlarkRunResult`.

## Is this change user facing?
YES
<!-- If yes, please add the "user facing" label to the PR -->
<!-- If yes, don't forget to include docs changes where relevant -->

## References (if applicable):
<!-- Add relevant Github Issues, Discord threads, or other helpful
information. -->
ava-labs/subnet-evm#649
@h4ck3rk3y h4ck3rk3y requested a review from aaronbuchwald May 24, 2023 10:21
Comment on lines 58 to 73
fmt.Println(fmt.Printf("Destroying enclave with id '%v'", enclaveId))
if err = kurtosisCtx.StopEnclave(ctx, enclaveId); err != nil {
fmt.Printf("An error occurred while stopping the enclave with id '%v'\n", enclaveId)
}
if err = kurtosisCtx.DestroyEnclave(ctx, enclaveId); err != nil {
fmt.Printf("An error occurred while cleaning up the enclave with id '%v'\n", enclaveId)
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should these errors be propagated somewhere?

I think failure to teardown should be treated as unexpected behavior and cause the test to fail. If this causes tests to flake intermittently, then I'd consider it a bug.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense! I am now asserting that the teardown has no errors.

const (
isPartitioningEnabled = false
enclaveIdPrefix = "test"
avalancheStarlarkPackage = "github.com/kurtosis-tech/avalanche-package"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this running a static version of AvalancheGo/Subnet-EVM? For CI, we would want to build a Docker image of AvalancheGo that has the VM binary present in the plugin directory and launch a node running that otherwise it seems that this will not actually be testing the latest code.

Copy link
Author

@h4ck3rk3y h4ck3rk3y May 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added image building as a step in the CI test and I pass the built image in the avalanche package now using the :test tag via the avalanchego_image argument


# This script assumes that an AvalancheGo and Subnet-EVM binaries are available in the standard location
# within the $GOPATH
# The AvalancheGo and PluginDir paths can be specified via the environment variables used in ./scripts/run.sh.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this comes with new requirements (install kurtosis cli), would it be possible to add or link instructions that are necessary to run the e2e tests with this change?

Copy link
Author

@h4ck3rk3y h4ck3rk3y May 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added instructions about installing Kurtosis & restarting the engine. Let me know what you think and if there's a better spot to put this apart from this script.

@h4ck3rk3y h4ck3rk3y requested a review from aaronbuchwald May 26, 2023 21:33
Signed-off-by: Gyanendra Mishra <[email protected]>
@aaronbuchwald
Copy link
Collaborator

ceyonur added a commit that referenced this pull request Mar 4, 2025
* add regression test

* apply fix

* use cmp

* reorder equal params

* improve readability by removing extra var (#650)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

No open projects
Archived in project

Development

Successfully merging this pull request may close these issues.

2 participants