Skip to content

Conversation

@DracoLi
Copy link
Contributor

@DracoLi DracoLi commented Aug 20, 2025

Why this should be merged

Fix race for when a data file handler is evicted but another thread is still using that handler. The eviction tests caught this issue. See example, example 2

Resolves #4194

How this works

The eviction tests caught this since it simulated high evictions during concurrent calls. The simple solution here is to just retry opening the data file if we get an os.ErrClosed error during data file read/write, which should be rare.

How this was tested

Tests

Copilot AI review requested due to automatic review settings August 20, 2025 13:50
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes a race condition in the blockdb file eviction system where a data file handler could be evicted while another thread is still using it, causing errors during concurrent operations.

  • Implemented retry logic with os.ErrClosed detection for data file operations
  • Updated getDataFileAndOffset to return the file index for eviction management
  • Modified test cases to properly simulate file access errors

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
x/blockdb/database.go Added retry loops with os.ErrClosed handling for read/write operations and updated file access method signature
x/blockdb/database_test.go Removed unnecessary type parameters from LRU cache instantiation
x/blockdb/writeblock_test.go Updated test case to use file permission changes instead of file closure for error simulation

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

)
return nil, fmt.Errorf("calculating body offset would overflow for block at height %d: %w", height, err)
}
for {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the read body and header methods will be moved soon.

@DracoLi DracoLi requested a review from yacovm August 20, 2025 14:55
@DracoLi DracoLi force-pushed the dl/blockdb-file-race branch from 1348eba to 0920fcf Compare August 20, 2025 17:43
@DracoLi DracoLi requested review from a team and joshua-kim and removed request for a team August 20, 2025 18:12
@joshua-kim joshua-kim enabled auto-merge August 21, 2025 16:39
@joshua-kim joshua-kim added this pull request to the merge queue Aug 21, 2025
Merged via the queue into master with commit b939be4 Aug 21, 2025
55 of 57 checks passed
@joshua-kim joshua-kim deleted the dl/blockdb-file-race branch August 21, 2025 18:39
@github-project-automation github-project-automation bot moved this to Done 🎉 in avalanchego Aug 21, 2025
joshua-kim added a commit that referenced this pull request Sep 3, 2025
commit 66ca7dc
Author: rodrigo <[email protected]>
Date:   Tue Sep 2 16:49:55 2025 -0400

    feat(load): add firewood flag (#4235)

commit 274541b
Author: Suyan Qu <[email protected]>
Date:   Tue Sep 2 14:10:48 2025 -0500

    feat: add parameters to disable metrics (#4214)

commit e03af84
Author: aaronbuchwald <[email protected]>
Date:   Tue Sep 2 12:30:07 2025 -0400

    Add timeout parameter to C-Chain re-execution jobs (#4223)

commit 0e20485
Author: aaronbuchwald <[email protected]>
Date:   Tue Sep 2 11:58:29 2025 -0400

    Comment out schedule trigger for re-execution on w/container (#4234)

    Signed-off-by: aaronbuchwald <[email protected]>
    Co-authored-by: Copilot <[email protected]>

commit 847eba1
Author: aaronbuchwald <[email protected]>
Date:   Fri Aug 29 14:38:11 2025 -0400

    Add back empty schedule entry for reexecute w/ container job (#4230)

commit a958b8a
Author: aaronbuchwald <[email protected]>
Date:   Fri Aug 29 12:56:06 2025 -0400

    Add newline to workflow dispatch (#4229)

    Signed-off-by: aaronbuchwald <[email protected]>

commit 7ec2258
Author: aaronbuchwald <[email protected]>
Date:   Thu Aug 28 17:11:38 2025 -0400

    Push benchmark re-execute results on master workflow dispatch (#4224)

commit 34f983e
Author: aaronbuchwald <[email protected]>
Date:   Thu Aug 28 15:33:12 2025 -0400

    Disambiguate source vs exec variable names in reexecute tasks (#4200)

    Signed-off-by: aaronbuchwald <[email protected]>
    Co-authored-by: Copilot <[email protected]>

commit 99578a2
Author: aaronbuchwald <[email protected]>
Date:   Thu Aug 28 12:52:31 2025 -0400

    Write grafana link to logs and github step summary (#4219)

commit 814300c
Author: aaronbuchwald <[email protected]>
Date:   Thu Aug 28 12:37:05 2025 -0400

    Remove firewood entry from PR triggers due to flakes (#4227)

commit 40fbcd5
Author: rodrigo <[email protected]>
Date:   Thu Aug 28 00:24:54 2025 -0400

    refactor(load): simulator contract (#4181)

commit 6195e1f
Author: rodrigo <[email protected]>
Date:   Wed Aug 27 17:31:51 2025 -0400

    chore: remove unzip mention (#4226)

commit 59e88f3
Author: aaronbuchwald <[email protected]>
Date:   Wed Aug 27 11:17:27 2025 -0400

    Remove schedule trigger for w/ container job that evaluates to empty matrix (#4218)

commit c2563d1
Author: Stephen Buttolph <[email protected]>
Date:   Tue Aug 26 19:07:47 2025 -0400

    Update versions for v1.13.5 (#4217)

commit a0ccd66
Author: aaronbuchwald <[email protected]>
Date:   Tue Aug 26 12:34:54 2025 -0400

    Add support for passing config and predefined configs to VM re-execution tests (#4180)

commit cc3242f
Author: Joshua Kim <[email protected]>
Date:   Mon Aug 25 18:49:28 2025 -0400

    Dynamically update mempool gossip request rate limit (#4162)

    Signed-off-by: Joshua Kim <[email protected]>
    Co-authored-by: Stephen Buttolph <[email protected]>

commit f2e3273
Author: Draco <[email protected]>
Date:   Mon Aug 25 15:00:56 2025 -0400

    Add ability to create zstd compressor with compression level (#4203)

commit 441f441
Author: Joshua Kim <[email protected]>
Date:   Mon Aug 25 12:11:07 2025 -0400

    Remove buf lint action (#4189)

    Signed-off-by: Joshua Kim <[email protected]>

commit 4bcb221
Author: Stephen Buttolph <[email protected]>
Date:   Sat Aug 23 15:43:06 2025 -0400

    Update block + validator + pgo checkpoints to 2025-08-23 (#4205)

commit b18ffc1
Author: rodrigo <[email protected]>
Date:   Fri Aug 22 16:34:53 2025 -0400

    Add s5cmd progress bar (#4204)

commit 2100bee
Author: Sam Liokumovich <[email protected]>
Date:   Fri Aug 22 11:52:31 2025 -0400

    Rename Engine Types (#4193)

    Signed-off-by: Sam Liokumovich <[email protected]>
    Co-authored-by: Copilot <[email protected]>

commit 33727a8
Author: Joshua Kim <[email protected]>
Date:   Fri Aug 22 11:00:12 2025 -0400

    Count throttled requests as hits (#4199)

    Signed-off-by: Joshua Kim <[email protected]>

commit b939be4
Author: Draco <[email protected]>
Date:   Thu Aug 21 14:22:54 2025 -0400

    fix: blockdb file eviction race issue (#4186)

commit 778ccfe
Author: aaronbuchwald <[email protected]>
Date:   Thu Aug 21 11:40:03 2025 -0400

    Add config option for AWS S3 read only credential duration (#4192)

commit ae41355
Author: Stephen Buttolph <[email protected]>
Date:   Wed Aug 20 16:46:49 2025 -0400

    Add redundant import alias linting (#4191)

    Signed-off-by: Stephen Buttolph <[email protected]>
    Co-authored-by: Copilot <[email protected]>

commit a3b5c6a
Author: Stephen Buttolph <[email protected]>
Date:   Wed Aug 20 10:53:24 2025 -0400

    Make Draco the codeowner of the blockdb (#4187)

commit a24ac68
Author: queryfast <[email protected]>
Date:   Wed Aug 20 22:18:21 2025 +0800

    refactor: replace []byte(fmt.Sprintf) with fmt.Appendf (#4161)

    Signed-off-by: queryfast <[email protected]>

commit 7aa6a17
Author: Sam Liokumovich <[email protected]>
Date:   Tue Aug 19 14:39:40 2025 -0400

    Rename height field to numBlocks (#4184)

commit 7d7e1fe
Author: aaronbuchwald <[email protected]>
Date:   Tue Aug 19 13:24:59 2025 -0400

    Add optional step to archive post-reexecution state to S3 (#4172)

    Signed-off-by: aaronbuchwald <[email protected]>
    Co-authored-by: Copilot <[email protected]>

commit ebe0558
Author: aaronbuchwald <[email protected]>
Date:   Tue Aug 19 12:11:34 2025 -0400

    Change cache path to tmp included in gitignore (#4183)

commit e5593be
Author: Draco <[email protected]>
Date:   Tue Aug 19 12:01:43 2025 -0400

    Block Database (#4027)

commit 940b96f
Author: Sam Liokumovich <[email protected]>
Date:   Tue Aug 19 11:36:37 2025 -0400

    Storage Component For Simplex (#4122)

    Signed-off-by: Sam Liokumovich <[email protected]>

commit 6d7e2dc
Author: Nicolas Arnedo Villanueva <[email protected]>
Date:   Tue Aug 19 16:59:58 2025 +0200

    `config/config.md:` Added Env Variable representation of flags + improved UI design (#4110)

    Signed-off-by: Meaghan FitzGerald <[email protected]>
    Signed-off-by: Nicolas Arnedo Villanueva <[email protected]>
    Co-authored-by: Meaghan FitzGerald <[email protected]>
    Co-authored-by: Stephen Buttolph <[email protected]>

commit 81f13b2
Author: Draco <[email protected]>
Date:   Mon Aug 18 13:59:43 2025 -0400

    feat: add eviction callback in LRU cache (#4088)

commit 4f5acfc
Author: Jonathan Oppenheimer <[email protected]>
Date:   Mon Aug 18 13:16:44 2025 -0400

    Migrate predicate package from evm repos (#4147)

    Signed-off-by: Jonathan Oppenheimer <[email protected]>
    Co-authored-by: Copilot <[email protected]>
    Co-authored-by: Stephen Buttolph <[email protected]>
    Co-authored-by: Joshua Kim <[email protected]>

commit 335e79f
Author: Kendra Karol Sevilla <[email protected]>
Date:   Mon Aug 18 18:45:52 2025 +0200

    chore: fix typo (#4179)

    Signed-off-by: Kendra Karol Sevilla <[email protected]>

commit 7275b02
Author: yinwenyu6 <[email protected]>
Date:   Mon Aug 18 22:29:03 2025 +0800

    chore: fix function name (#4178)

    Signed-off-by: yinwenyu6 <[email protected]>

commit 3b0c595
Author: yacovm <[email protected]>
Date:   Mon Aug 18 16:28:29 2025 +0200

    Fix typo in comment - PChainHeight context (#4176)

    Signed-off-by: Yacov Manevich <[email protected]>

commit 96f30d1
Author: rodrigo <[email protected]>
Date:   Fri Aug 15 02:15:44 2025 -0400

    feat(load): add token test (#4171)

commit e285ce0
Author: Sam Liokumovich <[email protected]>
Date:   Thu Aug 14 13:52:41 2025 -0400

    Use EmptyVoteMetadata in Simplex Proto Messages (#4174)

commit 5c72544
Author: aaronbuchwald <[email protected]>
Date:   Wed Aug 13 10:34:58 2025 -0400

    Move C-Chain benchmark to custom action and add ARC + GH runner triggers (#4165)

commit 3b0f8d4
Author: rodrigo <[email protected]>
Date:   Tue Aug 5 20:14:38 2025 -0400

    refactor(load): remove context from test interface (#4157)

commit a893a61
Author: Juan Leon <[email protected]>
Date:   Tue Aug 5 14:36:59 2025 -0400

    Add @joshua-kim as CODEOWNER to testing-related packages (#4118)

    Signed-off-by: Juan Leon <[email protected]>

commit be28a8b
Author: Galoretka <[email protected]>
Date:   Mon Aug 4 22:39:41 2025 +0300

    chore: fix a typo in gossip,go (#4154)

    Signed-off-by: Galoretka <[email protected]>

commit b876d78
Author: aaronbuchwald <[email protected]>
Date:   Mon Aug 4 11:58:22 2025 -0400

    Separate re-execution job params for PR from schedule (#4151)

commit 752e12f
Author: Stephen Buttolph <[email protected]>
Date:   Fri Aug 1 16:23:01 2025 -0400

    Update coreth to v0.15.3-rc.5 (#4153)

commit 3ba5246
Author: Joshua Kim <[email protected]>
Date:   Fri Aug 1 14:59:24 2025 -0400

    fix metrics tests (#4146)

    Signed-off-by: Joshua Kim <[email protected]>

commit 0cb887b
Author: Afounso Souza <[email protected]>
Date:   Fri Aug 1 16:37:53 2025 +0200

    Typo fix (#4150)

    Signed-off-by: Afounso Souza <[email protected]>

commit 110807a
Author: rodrigo <[email protected]>
Date:   Thu Jul 31 22:06:40 2025 -0400

    docs: load (#4132)

commit 24a051a
Author: Jonathan Oppenheimer <[email protected]>
Date:   Thu Jul 31 19:06:15 2025 -0400

    uplift: Add combined metrics package from evm repositories (#4135)

    Signed-off-by: Jonathan Oppenheimer <[email protected]>
    Co-authored-by: Stephen Buttolph <[email protected]>

commit d9b512e
Author: rodrigo <[email protected]>
Date:   Thu Jul 31 11:52:39 2025 -0400

    Parameterize values in transfer tests (#4144)

commit 6947e4c
Author: rodrigo <[email protected]>
Date:   Wed Jul 30 12:27:45 2025 -0400

    feat(load): add trie stress test (#4137)

Signed-off-by: Joshua Kim <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

Flaky Test: BlockDB TestFileCache_Eviction

5 participants