-
Notifications
You must be signed in to change notification settings - Fork 7
refactor(sealing): separation of the sealing logic from the implementation of the active faction #6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
fc03520
to
d5ca37c
Compare
42acd6e
to
008f0f9
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #6 +/- ##
==========================================
- Coverage 71.42% 71.13% -0.29%
==========================================
Files 200 198 -2
Lines 18143 18162 +19
==========================================
- Hits 12958 12920 -38
- Misses 4464 4523 +59
+ Partials 721 719 -2 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
📝 WalkthroughWalkthroughLarge refactor that extracts sealing and index-building logic into new packages Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Pre-merge checks❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Tip 👮 Agentic pre-merge checks are now available in preview!Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.
Please see the documentation for more information. Example: reviews:
pre_merge_checks:
custom_checks:
- name: "Undocumented Breaking Changes"
mode: "warning"
instructions: |
Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal). Please share your feedback with us on this Discord post. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
📝 WalkthroughWalkthroughThis change overhauls the sealing and block management logic for fractions. It removes legacy disk block writing and sealing code, introduces new modular sealing abstractions, updates type usage to a shared Changes
Estimated code review effort🎯 5 (Critical) | ⏱️ ~70+ minutes
Possibly related PRs
Suggested reviewers
Note ⚡️ Unit Test Generation is now available in beta!Learn more here, or try it out under "Finishing Touches" below. ✨ Finishing Touches
🧪 Generate unit tests
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🧹 Nitpick comments (5)
frac/sealed/token/table_loader.go (1)
108-112
: Use Go-doc style for exported struct commentThe added comment won’t be picked up by
go doc
because it doesn’t start with the identifier’s name.
Consider re-phrasing.-// `token.TableBlock` represents how `token.Table` is stored on disk +// TableBlock represents how token.Table is stored on diskfrac/sealed/block_offsets.go (1)
8-11
: Consider relocating IDsTotal field.The TODO comment suggests
IDsTotal
might be better placed in the Info block. This could improve data organization and reduce coupling.frac/sealed/sealing/sealer.go (1)
44-46
: Use short variable declaration for cleaner codeThe static analysis tool correctly identifies that this can be simplified.
- indexSealer := NewIndexSealer(params) - if err = indexSealer.WriteIndex(indexFile, src); err != nil { - return nil, err - } + indexSealer := NewIndexSealer(params) + if err := indexSealer.WriteIndex(indexFile, src); err != nil { + return nil, err + }frac/sealed/sealing/blocks_builder.go (1)
66-67
: Improve error messages with contextThe error messages could be more specific to aid debugging.
if !has { - bb.lastErr = errors.New("empty token blocks") + bb.lastErr = errors.New("sealing: empty token blocks provided") return }if needMoreEntries { - bb.lastErr = errors.New("fields and tokens not consistent") + bb.lastErr = errors.New("sealing: fields and tokens not consistent - unexpected end of token blocks") return }if _, has = nextBlock(); has { - bb.lastErr = errors.New("fields and tokens not consistent") + bb.lastErr = errors.New("sealing: fields and tokens not consistent - excess token blocks after processing all fields") }Also applies to: 91-92, 109-110
frac/active_sealing_source.go (1)
278-280
: Simplify error assignmentUse direct assignment as suggested by static analysis.
-if err = util.CollapseErrors([]error{src.lastErr, err}); err != nil { +if err := util.CollapseErrors([]error{src.lastErr, err}); err != nil { return err }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (43)
cmd/distribution/main.go
(4 hunks)cmd/index_analyzer/main.go
(2 hunks)cmd/seq-db/seq-db.go
(2 hunks)consts/consts.go
(1 hunks)disk/block_former.go
(0 hunks)disk/blocks_stats.go
(0 hunks)disk/blocks_writer.go
(0 hunks)disk/index_block_header.go
(1 hunks)frac/active.go
(4 hunks)frac/active_docs_positions.go
(2 hunks)frac/active_index.go
(2 hunks)frac/active_sealer.go
(0 hunks)frac/active_sealer_test.go
(0 hunks)frac/active_sealing_source.go
(1 hunks)frac/common/info.go
(2 hunks)frac/common/seal_params.go
(1 hunks)frac/disk_blocks.go
(0 hunks)frac/disk_blocks_producer.go
(0 hunks)frac/disk_blocks_writer.go
(0 hunks)frac/fraction.go
(2 hunks)frac/seal_stats.go
(0 hunks)frac/sealed.go
(7 hunks)frac/sealed/block_info.go
(2 hunks)frac/sealed/block_offsets.go
(1 hunks)frac/sealed/preloaded_data.go
(1 hunks)frac/sealed/sealing/blocks_builder.go
(1 hunks)frac/sealed/sealing/blocks_builder_test.go
(1 hunks)frac/sealed/sealing/index.go
(1 hunks)frac/sealed/sealing/sealer.go
(1 hunks)frac/sealed/sealing/stats.go
(1 hunks)frac/sealed/token/table.go
(1 hunks)frac/sealed/token/table_entry.go
(1 hunks)frac/sealed/token/table_loader.go
(1 hunks)frac/sealed_index.go
(2 hunks)frac/sealed_loader.go
(2 hunks)fracmanager/config.go
(2 hunks)fracmanager/fraction_provider.go
(4 hunks)fracmanager/proxy_frac.go
(6 hunks)fracmanager/sealed_frac_cache.go
(5 hunks)fracmanager/sealed_frac_cache_test.go
(14 hunks)fracmanager/sealer_test.go
(5 hunks)fracmanager/searcher_test.go
(2 hunks)tests/setup/env.go
(5 hunks)
💤 Files with no reviewable changes (9)
- frac/disk_blocks_producer.go
- frac/disk_blocks_writer.go
- disk/blocks_writer.go
- frac/active_sealer_test.go
- frac/seal_stats.go
- frac/disk_blocks.go
- disk/block_former.go
- frac/active_sealer.go
- disk/blocks_stats.go
🧰 Additional context used
🧬 Code Graph Analysis (19)
cmd/index_analyzer/main.go (1)
frac/sealed/block_info.go (1)
BlockInfo
(15-17)
cmd/distribution/main.go (2)
frac/common/info.go (1)
Info
(22-41)frac/sealed/block_info.go (1)
BlockInfo
(15-17)
fracmanager/fraction_provider.go (3)
frac/sealed.go (3)
NewSealed
(64-95)Sealed
(25-54)NewSealedPreloaded
(126-172)frac/common/info.go (1)
Info
(22-41)frac/sealed/preloaded_data.go (1)
PreloadedData
(10-16)
cmd/seq-db/seq-db.go (1)
frac/common/seal_params.go (1)
SealParams
(3-12)
frac/sealed/block_info.go (1)
frac/common/info.go (1)
Info
(22-41)
fracmanager/sealed_frac_cache_test.go (4)
frac/common/info.go (1)
Info
(22-41)tests/common/util.go (3)
GetTestTmpDir
(69-71)RecreateDir
(49-52)RemoveDir
(42-47)consts/consts.go (1)
FracCacheFileSuffix
(63-63)fracmanager/sealed_frac_cache.go (1)
NewSealedFracCache
(33-44)
frac/common/info.go (1)
seq/seq.go (1)
ID
(12-15)
frac/sealed/sealing/blocks_builder.go (5)
frac/sealed/seqids/blocks.go (3)
BlockMIDs
(10-12)BlockRIDs
(32-35)BlockParams
(57-59)frac/sealed/token/table_loader.go (2)
TableBlock
(109-111)FieldTable
(113-116)frac/sealed/token/table_entry.go (1)
TableEntry
(4-12)seq/seq.go (3)
ID
(12-15)MID
(17-17)RID
(18-18)seq/doc_pos.go (1)
DocPos
(19-19)
frac/active_docs_positions.go (2)
seq/seq.go (1)
ID
(12-15)seq/doc_pos.go (1)
DocPos
(19-19)
frac/sealed_index.go (1)
frac/common/info.go (1)
Info
(22-41)
frac/fraction.go (1)
frac/common/info.go (1)
Info
(22-41)
frac/sealed/sealing/stats.go (1)
frac/common/info.go (1)
Info
(22-41)
frac/active_index.go (1)
frac/common/info.go (1)
Info
(22-41)
frac/active_sealing_source.go (8)
frac/common/seal_params.go (1)
SealParams
(3-12)frac/common/info.go (1)
Info
(22-41)seq/doc_pos.go (2)
DocPos
(19-19)PackDocPos
(21-34)frac/active.go (1)
Active
(28-60)seq/seq.go (3)
ID
(12-15)MID
(17-17)RID
(18-18)consts/consts.go (2)
IDsPerBlock
(14-14)SdocsFileSuffix
(55-55)bytespool/writer.go (3)
AcquireWriterSize
(18-30)FlushReleaseWriter
(32-42)Writer
(13-16)util/errors_collapser.go (1)
CollapseErrors
(25-27)
fracmanager/sealed_frac_cache.go (1)
frac/common/info.go (1)
Info
(22-41)
frac/sealed_loader.go (1)
frac/sealed/block_offsets.go (1)
BlockOffsets
(8-11)
fracmanager/searcher_test.go (1)
frac/common/info.go (2)
Info
(22-41)NewInfo
(43-57)
frac/sealed/sealing/sealer.go (6)
frac/common/info.go (1)
Info
(22-41)frac/common/seal_params.go (1)
SealParams
(3-12)frac/sealed/preloaded_data.go (1)
PreloadedData
(10-16)consts/consts.go (2)
IndexTmpFileSuffix
(60-60)IndexFileSuffix
(59-59)frac/sealed/sealing/index.go (1)
NewIndexSealer
(37-41)util/fs.go (1)
MustSyncPath
(14-27)
fracmanager/config.go (1)
frac/common/seal_params.go (1)
SealParams
(3-12)
🪛 GitHub Check: lint
frac/sealed/sealing/index.go
[failure] 79-79:
Error return value of bw.Write
is not checked (errcheck)
[failure] 64-64:
Error return value of bytespool.FlushReleaseWriter
is not checked (errcheck)
frac/active_sealing_source.go
[failure] 278-278:
sloppyReassign: re-assignment to err
can be replaced with err := util.CollapseErrors([]error{src.lastErr, err})
(gocritic)
[failure] 273-273:
Error return value of bytespool.FlushReleaseWriter
is not checked (errcheck)
frac/sealed/sealing/sealer.go
[failure] 44-44:
sloppyReassign: re-assignment to err
can be replaced with err := indexSealer.WriteIndex(indexFile, src)
(gocritic)
🪛 GitHub Actions: CI
frac/active_sealing_source.go
[error] 273-273: golangci-lint: Error return value of bytespool.FlushReleaseWriter
is not checked (errcheck)
🔇 Additional comments (39)
disk/index_block_header.go (1)
25-25
: LGTM - Clean parameter simplificationThe shift from buffer parameters to explicit size integers aligns well with the architectural refactoring. Implementation correctly maintains the same behavior.
Also applies to: 29-30
cmd/seq-db/seq-db.go (1)
257-265
: Guard against possibleint
overflow when castingDocBlockSize
cfg.DocsSorting.DocBlockSize
looks like a configurable (and likely 64-bit) size value.
Blindly casting it toint
can overflow on 32-bit builds.Please verify the value always fits into
int
, or use an explicit, safe conversion (e.g.int64
) and add a bounds check.consts/consts.go (1)
16-16
: LGTM - Constant repositioning aligns with refactoring.The repositioning of
RegularBlockSize
and removal ofIDsBlockSize
is consistent with the migration away from legacy disk block handling to the new sealing architecture.frac/sealed_index.go (1)
13-13
: LGTM - Clean type migration to common.Info.The import addition and field type change from local
*Info
to*common.Info
is consistent with the broader refactoring. Usage throughout the file remains compatible.Also applies to: 58-58
frac/fraction.go (1)
9-9
: LGTM - Interface updated for common type unification.The interface method signature change to return
*common.Info
aligns with the codebase-wide migration to shared types. Usage infracToString
remains compatible.Also applies to: 20-20
fracmanager/searcher_test.go (1)
13-13
: LGTM - Test updated for common.Info migration.The test mock correctly implements the updated interface signature and uses
common.NewInfo
instead offrac.NewInfo
. Implementation is consistent with the refactoring.Also applies to: 31-32
frac/sealed/block_info.go (1)
1-1
: LGTM - Package rename and type migration executed correctly.The package rename to
sealed
and migration to*common.Info
is consistent throughout. TheUnpack
method correctly instantiates the new type while maintaining the same functionality.Also applies to: 9-9, 16-16, 36-36
tests/setup/env.go (2)
20-20
: Import path updates look good.Clean package restructuring with appropriate aliasing for the test utilities.
Also applies to: 31-31
86-86
: Type and function references updated correctly.All references properly updated to use the new package paths and aliases.
Also applies to: 214-214, 379-379
frac/common/info.go (2)
1-1
: Package restructuring looks good.Clean move to
common
package with appropriate iterator import.Also applies to: 5-5
74-81
: Good API modernization with iterators.Switching from
[]seq.ID
toiter.Seq[seq.ID]
improves flexibility - callers no longer need to materialize slices. The core logic remains unchanged.frac/sealed/token/table.go (1)
12-28
: Excellent documentation addition.The detailed comment with ASCII diagram clearly explains the complex relationships between fields, token blocks, and table entries. This will be valuable for future maintainers.
frac/sealed/preloaded_data.go (1)
10-16
: Clean data structure design.The
PreloadedData
struct appropriately aggregates related components with proper pointer usage for larger structs.frac/sealed/token/table_entry.go (1)
3-5
: Clear documentation improvements.The updated comments better explain the TableEntry's role within the token.Table structure and clarify the StartIndex field purpose.
frac/common/seal_params.go (1)
3-12
: Clean centralization of sealing parameters.The struct consolidates compression settings that were previously scattered across the codebase. Good documentation on the document block fields.
fracmanager/sealed_frac_cache.go (1)
13-13
: Consistent type migration tocommon.Info
.Clean refactoring from
frac.Info
tocommon.Info
with consistent updates across all method signatures and data structures.Also applies to: 26-26, 35-35, 79-79, 99-99
fracmanager/fraction_provider.go (1)
9-11
: Proper migration to specialized subpackages.Method signatures correctly updated to use
common.Info
andsealed.PreloadedData
. The constructor change tonewProxyFrac
follows good encapsulation practices.Also applies to: 49-49, 60-60, 76-76
cmd/distribution/main.go (1)
15-16
: Correct migration to subpackage types.Clean updates to use
common.Info
andsealed.BlockInfo
consistent with the package restructuring. No logic changes required.Also applies to: 62-62, 73-73, 88-88
fracmanager/sealed_frac_cache_test.go (1)
16-16
: Clean test migration with proper import aliasing.Correctly resolves naming conflicts with
tests_common
alias and consistently updates all type references tocommon.Info
. Test logic preserved.Also applies to: 18-18, 29-29, 35-35, 125-125, 146-146, 170-170, 237-237, 287-287, 372-372
fracmanager/sealer_test.go (3)
25-25
: Import alias follows good practice.The alias
tests_common
clearly distinguishes the test utilities from the main common package.
87-98
: Return type change aligns with architecture.The migration from
frac.SealParams
tocommon.SealParams
is consistent with the broader refactoring to shared types.
120-124
: Clean abstraction of new sealing API.The local
seal
helper function properly encapsulates the new sealing workflow usingActiveSealingSource
andsealing.Seal
.frac/sealed_loader.go (1)
77-88
: Improved abstraction with BlockOffsets.Unpack.Replacing manual parsing with the structured
BlockOffsets.Unpack
method centralizes serialization logic and improves maintainability.frac/active.go (2)
38-38
: Type unification with common.Info.The migration from local
Info
tocommon.Info
promotes consistency across the codebase.
307-313
: Maintains copy semantics correctly.The
Info()
method properly returns a copy of thecommon.Info
struct, preserving thread safety.frac/sealed/block_offsets.go (2)
13-23
: Efficient delta encoding implementation.The
Pack
method uses delta encoding for offsets, which is space-efficient for sequential data. The implementation correctly tracks previous values.
34-50
: Comprehensive error handling in Unpack.The method properly handles varint decoding errors and validates the final offset count against the expected value.
frac/sealed/sealing/stats.go (1)
29-39
: Handle potential division by zero.The compression ratio calculation
float64(s.rawLen) / float64(s.len)
could panic ifs.len
is zero.func (s *blocksStats) log(name string, endTime time.Time) { - ratio := float64(s.rawLen) / float64(s.len) + var ratio float64 + if s.len > 0 { + ratio = float64(s.rawLen) / float64(s.len) + } logger.Info("seal block stats",Likely an incorrect or invalid review comment.
frac/sealed/sealing/sealer.go (1)
30-81
: Well-implemented sealing logic with proper error handlingThe implementation correctly handles atomic file operations with temporary files, proper syncing, and comprehensive error handling. The directory sync after rename ensures durability.
fracmanager/proxy_frac.go (2)
46-58
: Good defensive programming with cached name fieldCaching the fraction name prevents potential nil pointer dereferences during logging after the active fraction is released.
136-143
: Clean integration with new sealing abstractionThe refactoring properly separates concerns by using ActiveSealingSource and the new sealing.Seal function.
frac/active_docs_positions.go (2)
10-22
: Well-designed dual data structure for efficient lookupsThe separation of ID-to-position and LID-to-position mappings supports both lookup patterns efficiently while maintaining consistency.
43-61
: Robust duplicate handling in SetMultipleThe implementation correctly handles duplicate IDs by skipping entries with mismatched positions, preventing data inconsistency. The preallocation of the result slice is good for performance.
frac/sealed/sealing/blocks_builder_test.go (1)
16-392
: Comprehensive test coverage for block building logicExcellent test implementation with:
- Well-structured mock source
- Detailed assertions for all block types
- Table-driven test data for maintainability
- Coverage of edge cases like block boundaries
frac/sealed.go (2)
112-121
: Verify the reversed file opening orderThe file opening order has been reversed - now trying
.sdocs
before.docs
. Please confirm this is intentional and won't break compatibility with existing deployments.
126-172
: Proper initialization of file handles in preloaded constructorGood addition of
openDocs()
andopenIndex()
calls to ensure file handles are available even for preloaded fractions.frac/active_sealing_source.go (1)
346-354
: Well-implemented duplicate ID handlingThe logic for handling duplicate IDs in nested indexes is efficient and well-documented.
frac/active_index.go (1)
49-53
: common.Info.Name() VerifiedConfirmed that
*common.Info
implementsName()
infrac/common/info.go:67–72
. No action needed.fracmanager/config.go (1)
27-30
: LGTM – type switch is consistentSwitching to
common.SealParams
aligns with the refactor; field accesses below match the new struct. No further action needed.
135fab8
to
7c4fee0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 7
♻️ Duplicate comments (1)
frac/active_sealing_source.go (1)
273-273
: Handle error from deferred writer releaseWhile there's an explicit flush at line 281, the writer should still be properly released with error handling.
-defer bytespool.ReleaseWriter(bw) +defer func() { + if err := bytespool.FlushReleaseWriter(bw); err != nil { + logger.Error("failed to flush and release writer", zap.Error(err)) + } +}()
🧹 Nitpick comments (2)
frac/active_docs_positions.go (1)
57-57
: Monitor memory usage of lidToPosThe
lidToPos
slice grows unbounded as documents are added. Consider monitoring its memory usage in production environments with high document throughput.frac/sealed/sealing/blocks_builder.go (1)
150-151
: Address TODO comments for technical debtThe code contains TODOs indicating
IsLastLID
andisContinued
should be removed from the payload structure. This suggests incomplete refactoring.Would you like me to help refactor this to remove these deprecated fields?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (44)
bytespool/writer.go
(1 hunks)cmd/distribution/main.go
(4 hunks)cmd/index_analyzer/main.go
(2 hunks)cmd/seq-db/seq-db.go
(2 hunks)consts/consts.go
(1 hunks)disk/block_former.go
(0 hunks)disk/blocks_stats.go
(0 hunks)disk/blocks_writer.go
(0 hunks)disk/index_block_header.go
(1 hunks)frac/active.go
(4 hunks)frac/active_docs_positions.go
(2 hunks)frac/active_index.go
(2 hunks)frac/active_sealer.go
(0 hunks)frac/active_sealer_test.go
(0 hunks)frac/active_sealing_source.go
(1 hunks)frac/common/info.go
(2 hunks)frac/common/seal_params.go
(1 hunks)frac/disk_blocks.go
(0 hunks)frac/disk_blocks_producer.go
(0 hunks)frac/disk_blocks_writer.go
(0 hunks)frac/fraction.go
(2 hunks)frac/seal_stats.go
(0 hunks)frac/sealed.go
(7 hunks)frac/sealed/block_info.go
(2 hunks)frac/sealed/block_offsets.go
(1 hunks)frac/sealed/preloaded_data.go
(1 hunks)frac/sealed/sealing/blocks_builder.go
(1 hunks)frac/sealed/sealing/blocks_builder_test.go
(1 hunks)frac/sealed/sealing/index.go
(1 hunks)frac/sealed/sealing/sealer.go
(1 hunks)frac/sealed/sealing/stats.go
(1 hunks)frac/sealed/token/table.go
(1 hunks)frac/sealed/token/table_entry.go
(1 hunks)frac/sealed/token/table_loader.go
(1 hunks)frac/sealed_index.go
(2 hunks)frac/sealed_loader.go
(2 hunks)fracmanager/config.go
(2 hunks)fracmanager/fraction_provider.go
(4 hunks)fracmanager/proxy_frac.go
(6 hunks)fracmanager/sealed_frac_cache.go
(5 hunks)fracmanager/sealed_frac_cache_test.go
(14 hunks)fracmanager/sealer_test.go
(5 hunks)fracmanager/searcher_test.go
(2 hunks)tests/setup/env.go
(5 hunks)
💤 Files with no reviewable changes (9)
- frac/seal_stats.go
- disk/blocks_stats.go
- frac/active_sealer_test.go
- frac/disk_blocks.go
- disk/blocks_writer.go
- disk/block_former.go
- frac/active_sealer.go
- frac/disk_blocks_writer.go
- frac/disk_blocks_producer.go
🧰 Additional context used
🧬 Code Graph Analysis (20)
cmd/seq-db/seq-db.go (1)
frac/common/seal_params.go (1)
SealParams
(3-12)
bytespool/writer.go (1)
bytespool/bytespool.go (1)
Release
(28-30)
tests/setup/env.go (2)
frac/common/seal_params.go (1)
SealParams
(3-12)tests/common/util.go (1)
CreateDir
(35-40)
frac/sealed_index.go (1)
frac/common/info.go (1)
Info
(22-41)
frac/common/info.go (1)
seq/seq.go (1)
ID
(12-15)
cmd/index_analyzer/main.go (2)
frac/sealed/block_info.go (1)
BlockInfo
(15-17)logger/logger.go (1)
Fatal
(82-84)
frac/sealed/block_info.go (1)
frac/common/info.go (1)
Info
(22-41)
frac/active_index.go (1)
frac/common/info.go (1)
Info
(22-41)
fracmanager/sealed_frac_cache.go (1)
frac/common/info.go (1)
Info
(22-41)
fracmanager/config.go (1)
frac/common/seal_params.go (1)
SealParams
(3-12)
frac/fraction.go (1)
frac/common/info.go (1)
Info
(22-41)
frac/active_docs_positions.go (2)
seq/seq.go (1)
ID
(12-15)seq/doc_pos.go (1)
DocPos
(19-19)
cmd/distribution/main.go (2)
frac/common/info.go (1)
Info
(22-41)frac/sealed/block_info.go (1)
BlockInfo
(15-17)
fracmanager/searcher_test.go (1)
frac/common/info.go (2)
Info
(22-41)NewInfo
(43-57)
frac/active.go (1)
frac/common/info.go (2)
Info
(22-41)NewInfo
(43-57)
frac/sealed/sealing/blocks_builder.go (5)
frac/sealed/seqids/blocks.go (3)
BlockMIDs
(10-12)BlockRIDs
(32-35)BlockParams
(57-59)frac/sealed/token/table_loader.go (2)
TableBlock
(109-111)FieldTable
(113-116)frac/sealed/token/table_entry.go (1)
TableEntry
(4-12)seq/seq.go (3)
ID
(12-15)MID
(17-17)RID
(18-18)seq/doc_pos.go (1)
DocPos
(19-19)
frac/sealed/sealing/sealer.go (6)
frac/common/info.go (1)
Info
(22-41)frac/common/seal_params.go (1)
SealParams
(3-12)frac/sealed/preloaded_data.go (1)
PreloadedData
(10-16)consts/consts.go (2)
IndexTmpFileSuffix
(60-60)IndexFileSuffix
(59-59)frac/sealed/sealing/index.go (1)
NewIndexSealer
(37-41)util/fs.go (1)
MustSyncPath
(14-27)
frac/sealed_loader.go (1)
frac/sealed/block_offsets.go (1)
BlockOffsets
(8-11)
frac/sealed.go (7)
frac/common/info.go (1)
Info
(22-41)consts/consts.go (2)
SdocsFileSuffix
(55-55)DocsFileSuffix
(52-52)frac/sealed/preloaded_data.go (1)
PreloadedData
(10-16)frac/index_cache.go (1)
IndexCache
(10-18)frac/sealed/token/table_loader.go (1)
CacheKeyTable
(14-14)frac/sealed/token/table.go (1)
Table
(34-34)frac/sealed/block_info.go (1)
BlockInfo
(15-17)
frac/sealed/sealing/blocks_builder_test.go (5)
frac/common/info.go (1)
Info
(22-41)seq/seq.go (3)
ID
(12-15)MID
(17-17)RID
(18-18)seq/doc_pos.go (2)
DocPos
(19-19)PackDocPos
(21-34)frac/sealed/token/table_loader.go (2)
TableBlock
(109-111)FieldTable
(113-116)frac/sealed/token/table_entry.go (1)
TableEntry
(4-12)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test
🔇 Additional comments (40)
consts/consts.go (1)
16-16
: Confirm removal ofIDsBlockSize
is safeRipgrep search across the codebase found no occurrences of
IDsBlockSize
, so its removal won’t break any references. Repositioning ofRegularBlockSize
is purely cosmetic.bytespool/writer.go (2)
32-37
: Good refactoring for modularity.Extracting
ReleaseWriter
improves reusability and separation of concerns.
39-45
: Error handling preserved correctly.The logic maintains proper error handling - writer is only released after successful flush.
frac/sealed/token/table_loader.go (1)
108-108
: Good documentation addition.The comment clarifies the on-disk representation purpose of
TableBlock
.cmd/seq-db/seq-db.go (1)
25-25
: Type migration aligns with refactoring.Moving
SealParams
tocommon
package is consistent with the broader consolidation effort.Also applies to: 257-257
frac/active_index.go (1)
10-10
: Consistent type centralization.Moving
Info
tocommon
package follows the established refactoring pattern.Also applies to: 51-51
fracmanager/config.go (1)
10-10
: LGTM - Clean type migration to common packageThe migration from
frac.SealParams
tocommon.SealParams
is consistent with the broader refactoring effort.Also applies to: 27-27
frac/sealed_index.go (1)
13-13
: LGTM - Consistent with Info type migrationThe change from local
*Info
to*common.Info
maintains compatibility while centralizing the type definition.Also applies to: 58-58
frac/fraction.go (1)
9-9
: Interface signature change looks well-coordinatedThe migration of
Info()
method to return*common.Info
is consistent with the broader type centralization effort.Also applies to: 20-20
cmd/index_analyzer/main.go (1)
95-98
: Good fix - Error handling improvement addresses previous concernThe explicit error handling for
BlockInfo.Unpack
resolves the issue where errors were previously discarded, preventing corrupt state usage.fracmanager/searcher_test.go (1)
13-13
: Test code properly updated for interface changesThe test implementation correctly updates to use
*common.Info
andcommon.NewInfo
to align with the refactored interface.Also applies to: 31-32
tests/setup/env.go (1)
20-20
: Clean package restructuring.The import changes and type reference updates correctly handle the refactoring of
frac
types into thecommon
package while avoiding naming conflicts through appropriate aliasing.Also applies to: 31-31, 86-86, 214-214, 379-379
frac/sealed/block_info.go (1)
1-1
: Consistent package restructuring.The changes correctly update the package structure and type references to use the centralized
common.Info
type while maintaining the same functionality.Also applies to: 9-9, 16-16, 36-36
frac/sealed/token/table.go (1)
12-28
: Excellent documentation addition.The detailed comment and ASCII diagram significantly improve understanding of the token table structure and its relationships with blocks and entries. This type of conceptual documentation is valuable for maintainability.
frac/common/info.go (1)
1-1
: Good API modernization with iterators.The change from slice to
iter.Seq[seq.ID]
improves memory efficiency by avoiding loading all IDs into memory at once. The iterator pattern is more suitable for potentially large ID sequences.Also applies to: 5-5, 74-74, 78-78
frac/sealed/token/table_entry.go (1)
3-3
: Improved documentation clarity.The refined comments better explain the
TableEntry
's role within the token table structure and clarify theStartIndex
field semantics.Also applies to: 5-5
disk/index_block_header.go (1)
25-30
: LGTM: Clean refactor to size-based parameters.The change from byte slice parameters to explicit size integers is more efficient and cleaner, eliminating unnecessary buffer passing when only lengths are needed.
frac/sealed/preloaded_data.go (1)
10-16
: LGTM: Well-structured data container.The
PreloadedData
struct properly encapsulates the preloaded components with appropriate field types and follows Go naming conventions.cmd/distribution/main.go (1)
15-16
: LGTM: Consistent package refactoring.The import and type changes correctly align with the new package structure, moving from
frac
to the more specificfrac/common
andfrac/sealed
packages.Also applies to: 62-62, 73-73, 88-88
fracmanager/sealed_frac_cache.go (1)
13-13
: LGTM: Consistent type migration.All references to
frac.Info
have been properly updated to usecommon.Info
, maintaining consistency with the package restructuring.Also applies to: 26-26, 35-35, 79-79, 99-99
frac/common/seal_params.go (1)
3-12
: LGTM: Well-designed configuration struct.The
SealParams
struct effectively centralizes sealing parameters with clear field names and appropriate documentation for the document block fields.fracmanager/fraction_provider.go (4)
9-10
: LGTM!Clean import additions supporting the package refactoring.
49-49
: LGTM!Method signature correctly updated to use the new
common.Info
type.
60-60
: LGTM!Method signature correctly updated to use the new
sealed.PreloadedData
type.
76-76
: LGTM!Good refactor to use a proper constructor instead of direct struct instantiation.
frac/active.go (4)
20-20
: LGTM!Import addition supports the Info type refactoring.
38-38
: LGTM!Field type correctly updated to use
common.Info
.
103-103
: LGTM!Constructor call correctly updated to use the common package.
307-307
: LGTM!Method return type correctly updated to use
common.Info
.fracmanager/sealed_frac_cache_test.go (3)
16-18
: LGTM!Good practice aliasing the test common package to avoid naming conflicts with the new
frac/common
package.
29-29
: LGTM!Type references consistently updated to use
common.Info
throughout the test file.Also applies to: 35-35, 125-125, 146-146, 170-170, 237-237, 287-287, 372-372
51-54
: LGTM!Test helper function calls correctly updated to use the aliased package name.
Also applies to: 75-78, 105-107, 159-162, 229-232, 268-271, 349-352, 411-414
frac/sealed_loader.go (2)
9-9
: LGTM!Import addition supports the BlockOffsets refactoring.
77-88
: Excellent refactor!Replacing manual binary parsing with
BlockOffsets.Unpack
significantly improves code clarity and centralizes the parsing logic. Error handling is properly maintained.frac/sealed/block_offsets.go (3)
8-11
: LGTM!Clean struct definition. The TODO comment appropriately flags a potential future improvement.
13-23
: LGTM!Well-implemented packing with efficient delta encoding and proper buffer reuse.
25-50
: Excellent implementation!Comprehensive error handling covers all failure modes including varint overflow and count validation. The delta reconstruction logic is correct and the error messages are clear.
fracmanager/sealer_test.go (1)
118-118
: LGTM!Moving
fp.Stop()
from deferred to immediate execution ensures the fraction provider is stopped right after use, which is cleaner for benchmarking.frac/sealed/sealing/blocks_builder_test.go (1)
91-392
: Well-structured and comprehensive test coverageThe tests provide excellent coverage of the block building functionality with clear test data and assertions.
frac/sealed/sealing/blocks_builder.go (1)
201-225
: Well-structured token block builderClean implementation with proper length-prefixed encoding and efficient memory reuse.
7c4fee0
to
e1c78d6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
fracmanager/sealed_frac_cache.go (1)
141-179
: File descriptor leak and missing fsync on frac cache saveThe temp file is never closed and the containing dir isn’t fsync’d. This can leak FDs and risk data loss on crash.
Apply:
func (fc *sealedFracCache) SaveCacheToDisk(version uint64, content []byte) error { fc.saveMu.Lock() defer fc.saveMu.Unlock() @@ - tmp, err := os.CreateTemp(fc.dataDir, fc.fileName+".") + tmp, err := os.CreateTemp(fc.dataDir, fc.fileName+".") if err != nil { return fmt.Errorf("can't save frac cache: %w", err) } err = tmp.Chmod(defaultFilePermission) if err != nil { return fmt.Errorf("can't change frac cache file permission: %w", err) } - if _, err = tmp.Write(content); err != nil { + if _, err = tmp.Write(content); err != nil { return fmt.Errorf("can't save frac cache: %w", err) } - if err = os.Rename(tmp.Name(), fc.fullPath); err != nil { + // Ensure data is on disk before rename + if err := tmp.Sync(); err != nil { + return fmt.Errorf("can't fsync frac cache tmp file: %w", err) + } + if err := tmp.Close(); err != nil { + return fmt.Errorf("can't close frac cache tmp file: %w", err) + } + if err = os.Rename(tmp.Name(), fc.fullPath); err != nil { return fmt.Errorf("can't rename tmp to actual frac cache: %w", err) } + // fsync parent dir for durability + util.MustSyncPath(fc.dataDir) + fc.savedVersion.Store(version) logger.Info("frac cache saved to disk", zap.String("filepath", fc.fullPath), zap.Uint64("version", version)) return nil }Add missing import at top:
import "github.com/ozontech/seq-db/util"fracmanager/proxy_frac.go (1)
119-143
: Seal error paths leak waitgroup and wedge fraction in readonly state
- f.sealWg.Add(1) is not balanced on early returns.
- readonly is set to true and never reset on error, leaving the active fraction unwritable and not sealed.
Proposed fix:
-func (f *proxyFrac) Seal(params common.SealParams) (*frac.Sealed, error) { +func (f *proxyFrac) Seal(params common.SealParams) (*frac.Sealed, error) { f.useMu.Lock() if f.isSuicidedState() { f.useMu.Unlock() return nil, ErrSealingFractionSuicided } if !f.isActiveState() { f.useMu.Unlock() return nil, errors.New("sealing fraction is not active") } f.readonly = true active := f.active f.sealWg.Add(1) // It's important to put wg.Add() inside a lock, otherwise we might call wg.Wait() before it f.useMu.Unlock() + // Always release the waiter; revert readonly on failure. + var sealed *frac.Sealed + var err error + defer func() { + if err != nil { + f.useMu.Lock() + f.readonly = false + f.useMu.Unlock() + } + f.sealWg.Done() + }() + f.WaitWriteIdle() - src, err := frac.NewActiveSealingSource(active, params) - if err != nil { - return nil, err - } - preloaded, err := sealing.Seal(src, params) - if err != nil { - return nil, err - } + src, err := frac.NewActiveSealingSource(active, params) + if err != nil { + return nil, err + } + preloaded, err := sealing.Seal(src, params) + if err != nil { + return nil, err + } - sealed := f.fp.NewSealedPreloaded(active.BaseFileName, preloaded) + sealed = f.fp.NewSealedPreloaded(active.BaseFileName, preloaded) f.useMu.Lock() f.sealed = sealed f.active = nil f.useMu.Unlock() - f.sealWg.Done() - active.Release() return sealed, nil }Add tests to cover:
- Seal failure before creating src.
- Seal failure after src creation.
- Success path (ensures Done is called, readonly transitions, and active.Release).
I can provide the unit tests scaffolding if helpful.
♻️ Duplicate comments (2)
fracmanager/proxy_frac.go (1)
136-140
: Clean up sealing source on failure (follow-up to prior review)If sealing.Seal fails, release resources held by src. Consider adding Close() to ActiveSealingSource and invoking it on error.
Would you like me to sketch a Close() implementation in frac/active_sealing_source.go and wire it here?
frac/sealed/sealing/blocks_builder.go (1)
46-48
: LGTM - Error handling implemented correctly.Good addition of the error getter method addressing the previous feedback.
🧹 Nitpick comments (6)
consts/consts.go (1)
14-16
: No staleIDsBlockSize
references; clarify byte units for block-size constantsAll searches returned zero hits for the removed
IDsBlockSize
. To prevent future confusion between counts vs. byte sizes, please add a concise comment above the block constants:const ( + // Block sizes (in bytes): IDsPerBlock = int(4 * units.KiB) LIDBlockCap = int(64 * units.KiB) RegularBlockSize = int(16 * units.KiB) )
frac/sealed.go (1)
128-150
: Eagerly opening files after preloaded initProactively opening docs and index ensures readers are ready; acceptable trade-off. Keep an eye on startup latency if many fractions are created at once.
Also applies to: 154-159
frac/sealed/sealing/blocks_builder.go (4)
69-71
: Improve error specificity.The error message could be more descriptive to aid debugging.
- bb.lastErr = errors.New("empty token blocks") + bb.lastErr = errors.New("empty token blocks sequence provided")
94-96
: Improve error specificity.The error message should indicate which condition failed for better debugging.
- bb.lastErr = errors.New("fields and tokens not consistent") + bb.lastErr = errors.New("fields and tokens not consistent: field requires more tokens than available")
112-114
: Improve error specificity.Similar to above, this error should be more specific about the inconsistency.
- bb.lastErr = errors.New("fields and tokens not consistent") + bb.lastErr = errors.New("fields and tokens not consistent: unused token blocks remaining")
154-155
: Address TODO comments.The TODOs indicate deprecated fields that should be removed.
These TODO comments suggest
IsLastLID
andisContinued
fields are deprecated. Would you like me to help identify usage patterns and create a refactoring plan to remove them?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (44)
bytespool/writer.go
(1 hunks)cmd/distribution/main.go
(4 hunks)cmd/index_analyzer/main.go
(2 hunks)cmd/seq-db/seq-db.go
(2 hunks)consts/consts.go
(1 hunks)disk/block_former.go
(0 hunks)disk/blocks_stats.go
(0 hunks)disk/blocks_writer.go
(0 hunks)disk/index_block_header.go
(1 hunks)frac/active.go
(4 hunks)frac/active_docs_positions.go
(2 hunks)frac/active_index.go
(2 hunks)frac/active_sealer.go
(0 hunks)frac/active_sealer_test.go
(0 hunks)frac/active_sealing_source.go
(1 hunks)frac/common/info.go
(2 hunks)frac/common/seal_params.go
(1 hunks)frac/disk_blocks.go
(0 hunks)frac/disk_blocks_producer.go
(0 hunks)frac/disk_blocks_writer.go
(0 hunks)frac/fraction.go
(2 hunks)frac/seal_stats.go
(0 hunks)frac/sealed.go
(6 hunks)frac/sealed/block_info.go
(2 hunks)frac/sealed/block_offsets.go
(1 hunks)frac/sealed/preloaded_data.go
(1 hunks)frac/sealed/sealing/blocks_builder.go
(1 hunks)frac/sealed/sealing/blocks_builder_test.go
(1 hunks)frac/sealed/sealing/index.go
(1 hunks)frac/sealed/sealing/sealer.go
(1 hunks)frac/sealed/sealing/stats.go
(1 hunks)frac/sealed/token/table.go
(1 hunks)frac/sealed/token/table_entry.go
(1 hunks)frac/sealed/token/table_loader.go
(1 hunks)frac/sealed_index.go
(2 hunks)frac/sealed_loader.go
(2 hunks)fracmanager/config.go
(2 hunks)fracmanager/fraction_provider.go
(4 hunks)fracmanager/proxy_frac.go
(6 hunks)fracmanager/sealed_frac_cache.go
(5 hunks)fracmanager/sealed_frac_cache_test.go
(14 hunks)fracmanager/sealer_test.go
(5 hunks)fracmanager/searcher_test.go
(2 hunks)tests/setup/env.go
(5 hunks)
💤 Files with no reviewable changes (9)
- disk/blocks_stats.go
- frac/active_sealer.go
- disk/blocks_writer.go
- frac/seal_stats.go
- frac/disk_blocks_writer.go
- frac/active_sealer_test.go
- frac/disk_blocks_producer.go
- frac/disk_blocks.go
- disk/block_former.go
✅ Files skipped from review due to trivial changes (2)
- frac/sealed/token/table_loader.go
- frac/sealed/token/table.go
🚧 Files skipped from review as they are similar to previous changes (26)
- frac/sealed/token/table_entry.go
- tests/setup/env.go
- fracmanager/searcher_test.go
- fracmanager/config.go
- frac/sealed/preloaded_data.go
- fracmanager/sealed_frac_cache_test.go
- frac/sealed_index.go
- frac/sealed/sealing/index.go
- fracmanager/sealer_test.go
- bytespool/writer.go
- frac/sealed/block_info.go
- frac/active.go
- fracmanager/fraction_provider.go
- frac/common/info.go
- frac/sealed_loader.go
- frac/common/seal_params.go
- disk/index_block_header.go
- frac/sealed/sealing/sealer.go
- frac/sealed/block_offsets.go
- cmd/distribution/main.go
- frac/sealed/sealing/blocks_builder_test.go
- cmd/index_analyzer/main.go
- cmd/seq-db/seq-db.go
- frac/active_docs_positions.go
- frac/sealed/sealing/stats.go
- frac/active_sealing_source.go
🧰 Additional context used
🧬 Code Graph Analysis (6)
frac/active_index.go (1)
frac/common/info.go (1)
Info
(22-41)
fracmanager/sealed_frac_cache.go (1)
frac/common/info.go (1)
Info
(22-41)
frac/fraction.go (1)
frac/common/info.go (1)
Info
(22-41)
frac/sealed/sealing/blocks_builder.go (5)
frac/sealed/seqids/blocks.go (3)
BlockMIDs
(10-12)BlockRIDs
(32-35)BlockParams
(57-59)frac/sealed/token/table_loader.go (2)
TableBlock
(109-111)FieldTable
(113-116)frac/sealed/token/table_entry.go (1)
TableEntry
(4-12)seq/seq.go (3)
ID
(12-15)MID
(17-17)RID
(18-18)seq/doc_pos.go (1)
DocPos
(19-19)
fracmanager/proxy_frac.go (5)
frac/active.go (1)
Active
(28-60)frac/common/info.go (1)
Info
(22-41)frac/sealed/sealing/sealer.go (1)
Seal
(30-82)frac/common/seal_params.go (1)
SealParams
(3-12)frac/active_sealing_source.go (1)
NewActiveSealingSource
(45-78)
frac/sealed.go (5)
frac/common/info.go (1)
Info
(22-41)consts/consts.go (2)
SdocsFileSuffix
(55-55)DocsFileSuffix
(52-52)disk/docs_reader.go (1)
NewDocsReader
(16-21)frac/sealed/preloaded_data.go (1)
PreloadedData
(10-16)frac/sealed/block_info.go (1)
BlockInfo
(15-17)
🔇 Additional comments (10)
consts/consts.go (1)
16-16
: Move-only for RegularBlockSize — no functional change noted.Value remains 16 KiB; looks fine.
frac/fraction.go (1)
9-12
: AllInfo()
implementations now return*common.Info
— no mismatches found
- No occurrences of
func (…) Info() *frac.Info
remain- Interfaces in frac/fraction.go and frac/sealed/sealing/sealer.go both declare
Info() *common.Info
fracmanager/sealed_frac_cache.go (1)
13-15
: Manual Verification Needed: No legacy cache files found
Thefd -a 'frac-cache*.json'
scan returned no results in the repo. If you keep persisted cache JSON elsewhere, please rerun this check in the appropriate directory and confirm they still load correctly.fracmanager/proxy_frac.go (3)
46-58
: Nice: cached name for logging and constructorThe new name field and constructor simplify log calls and avoid repeated BaseFileName lookups.
111-117
: Log improvements are fineUsing the cached name in WaitWriteIdle logs is clean and consistent.
167-175
: Variable rename to isSealing is fine; logic unchangedReadability improvement without behavior change.
Also applies to: 178-184
frac/sealed.go (3)
14-16
: Migration to common.Info and sealed.PreloadedData is consistentThe type unification to *common.Info and use of sealed.PreloadedData across constructors and Info() accessor look correct.
Also applies to: 30-31, 69-71, 365-367
112-121
: Docs open fallback and messages look correctOpen .sdocs first, fallback to .docs on not-exist. Error messages now match the attempted file type.
174-186
: *loadHeader now returns common.Info using sealed.BlockInfoThe switch to sealed.BlockInfo and returning *common.Info is consistent with the refactor.
Also applies to: 181-181
frac/sealed/sealing/blocks_builder.go (1)
220-221
: Token encoding/decoding consistency verifiedThe builder appends the token length via binary.LittleEndian.AppendUint32 (four-byte LE) and then the token bytes. GetToken reads the length with binary.LittleEndian.Uint32 and slices offset+4 for the payload—matching exactly. No changes needed.
1d040f9
to
94e00f8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
fracmanager/sealed_frac_cache_test.go (1)
29-41
: Implement missing Name() and FullSize() on common.InfoThe
Info
struct infrac/common/info.go
currently has noName()
orFullSize()
methods, yet both are used throughout the cache and manager tests. Add them immediately to prevent compilation breaks and ensure correct basename extraction and size accounting:• In
frac/common/info.go
, after thetype Info struct {…}
block, add:import "path/filepath" func (i *Info) Name() string { return filepath.Base(i.Path) } func (i *Info) FullSize() int64 { // TODO: sum up the relevant size fields on Info, // e.g., return i.DocsTotal + i.OtherSizeField + … }frac/active_docs_positions.go (2)
24-36
: Get is unsynchronized and can race with SetMultiple; make Get thread-safe and deprecate GetSync.As written, calling Get concurrently with SetMultiple risks “concurrent map read and map write” panics. Prefer a single thread-safe Get and make GetSync a thin wrapper.
Apply this change:
func (dp *DocsPositions) Get(id seq.ID) seq.DocPos { - if val, ok := dp.idToPos[id]; ok { - return val - } - return seq.DocPosNotFound + dp.mu.RLock() + defer dp.mu.RUnlock() + if val, ok := dp.idToPos[id]; ok { + return val + } + return seq.DocPosNotFound } func (dp *DocsPositions) GetSync(id seq.ID) seq.DocPos { - dp.mu.RLock() - defer dp.mu.RUnlock() - - return dp.Get(id) + // Deprecated: Get is already thread-safe. + return dp.Get(id) }If you prefer keeping a lockless fast-path, please rename it to GetUnsafe, document the contract, and ensure all concurrent call sites use the synchronized variant. I can help generate the refactor.
38-61
: Guard against ids/pos length mismatch to avoid panic; clarify return semantics; handle zero-value initialization.
- Indexing pos[i] assumes equal lengths; if not, it panics. Add a defensive bound.
- Comment says “added ids,” but duplicates with the same position are appended as well. Clarify intent.
- If someone constructs DocsPositions as a zero value (bypassing the ctor), idToPos/lidToPos are nil; writes will panic. Initialize defensively.
Apply this diff:
-// SetMultiple returns a slice of added ids +// SetMultiple returns the IDs for which positions were appended to LID order. +// Duplicate IDs with the same position are allowed (appended); conflicting duplicates are skipped. func (dp *DocsPositions) SetMultiple(ids []seq.ID, pos []seq.DocPos) []seq.ID { dp.mu.Lock() defer dp.mu.Unlock() - appended := make([]seq.ID, 0, len(ids)) - for i, id := range ids { + // defensive init in case the zero-value struct is used + if dp.idToPos == nil { + dp.idToPos = make(map[seq.ID]seq.DocPos, len(ids)) + } + if dp.lidToPos == nil { + // index 0 reserved for systemID + dp.lidToPos = make([]seq.DocPos, 1) + } + + // zip lengths to avoid out-of-bounds if callers pass mismatched slices + n := len(ids) + if len(pos) < n { + n = len(pos) + } + appended := make([]seq.ID, 0, n) + for i := 0; i < n; i++ { + id := ids[i] p, ok := dp.idToPos[id] if ok { if p != pos[i] { // same ID but different position // this is a duplicate ID, we can't append it continue } } else { dp.idToPos[id] = pos[i] } dp.lidToPos = append(dp.lidToPos, pos[i]) appended = append(appended, id) } return appended }If mismatched lengths should be treated as programmer error instead, replace the zip logic with an explicit check that returns early or panics. Say the word and I’ll adjust accordingly.
♻️ Duplicate comments (1)
frac/sealed/sealing/index.go (1)
56-65
: Propagate FlushReleaseWriter errors (avoid silent data loss) and make the return namedRight now ReleaseWriter ignores potential buffered data and any flush error on early returns. Use FlushReleaseWriter in a defer and surface its error via a named return. This also resolves prior errcheck lint feedback.
-func (s *IndexSealer) WriteIndex(ws io.WriteSeeker, src Source) error { +func (s *IndexSealer) WriteIndex(ws io.WriteSeeker, src Source) (retErr error) { const prefixSize = 16 if _, err := ws.Seek(prefixSize, io.SeekStart); err != nil { // skip `prefixSize` bytes for pos and length of registry return err } hw := bytes.NewBuffer(nil) bw := bytespool.AcquireWriterSize(ws, int(units.MiB)) - defer bytespool.ReleaseWriter(bw) + defer func() { + if err := bytespool.FlushReleaseWriter(bw); err != nil && retErr == nil { + retErr = err + } + }()
🧹 Nitpick comments (15)
fracmanager/sealed_frac_cache_test.go (3)
35-38
: Minor: avoid preallocating the map before json.Unmarshal.json.Unmarshal will allocate the map when given a nil map. This is a tiny simplification.
Apply this diff:
- fracCache := make(map[string]*common.Info) + var fracCache map[string]*common.Info err = json.Unmarshal(content, &fracCache)
125-139
: DRY up duplicated newInfo literal.The same common.Info value is constructed twice. Use a small helper to centralize it and reduce noise.
Apply this diff to both occurrences:
- newInfo := &common.Info{ - Path: "/data/c", - Ver: "1.3", - DocsTotal: 0, - DocsOnDisk: 0, - DocsRaw: 0, - MetaOnDisk: 0, - IndexOnDisk: 0, - ConstRegularBlockSize: 0, - ConstIDsPerBlock: 0, - ConstLIDBlockCap: 100500, - From: 0, - To: 0, - CreationTime: 0, - } + newInfo := testInfoC()Add this helper (outside the selected ranges):
func testInfoC() *common.Info { return &common.Info{ Path: "/data/c", Ver: "1.3", ConstLIDBlockCap: 100500, } }Also applies to: 170-184
372-384
: Remove unused infos map in TestExtraFractionsRemoved.The map is written to but never read; trimming it clarifies intent.
Apply this diff:
- infos := map[string]*common.Info{} @@ - infos[info.Name()] = infofrac/active_docs_positions.go (1)
16-21
: Reserve index 0 without an extra append
DocPosNotFound
is defined asmath.MaxUint64
, so using 0 here is safe. Please apply this cleanup:- dp := DocsPositions{ - lidToPos: make([]seq.DocPos, 0), - idToPos: make(map[seq.ID]seq.DocPos), - } - dp.lidToPos = append(dp.lidToPos, 0) // systemID + dp := DocsPositions{ + // index 0 reserved for systemID + lidToPos: make([]seq.DocPos, 1), + idToPos: make(map[seq.ID]seq.DocPos), + }frac/common/seal_params.go (1)
1-12
: Enhance SealParams documentation
- Add a struct-level doc comment for
SealParams
.- Refine inline comments for consistency (“compression level” and “decompressed payload size”).
- TokenListZstdLevel is in active use (index sealer, config defaults, CLI, tests)—do not remove.
package common +// SealParams configures compression levels and document block layout used during sealing. type SealParams struct { IDsZstdLevel int LIDsZstdLevel int - TokenListZstdLevel int + TokenListZstdLevel int // TokenListZstdLevel is the zstd compression level for token lists. DocsPositionsZstdLevel int TokenTableZstdLevel int - DocBlocksZstdLevel int // DocBlocksZstdLevel is the zstd compress level of each document block. - DocBlockSize int // DocBlockSize is decompressed payload size of document block. + DocBlocksZstdLevel int // DocBlocksZstdLevel is the zstd compression level of each document block. + DocBlockSize int // DocBlockSize is the decompressed payload size of a document block. }frac/sealed/sealing/index.go (3)
66-70
: Fix typo: writeBocks → writeBlocksMinor readability nit. Rename the method and its call site.
- // write index blocks - if err := s.writeBocks(prefixSize, bw, hw, src); err != nil { + // write index blocks + if err := s.writeBlocks(prefixSize, bw, hw, src); err != nil { return err }-func (s *IndexSealer) writeBocks(pos int, payloadWriter, headersWriter io.Writer, src Source) error { +func (s *IndexSealer) writeBlocks(pos int, payloadWriter, headersWriter io.Writer, src Source) error { for block := range s.indexBlocks(src) { header, payload := block.Bin(int64(pos)) if _, err := payloadWriter.Write(payload); err != nil { return err } if _, err := headersWriter.Write(header); err != nil { return err } pos += len(payload) } if s.lastErr != nil { return s.lastErr } return nil }Also applies to: 102-117
300-303
: Use DocsPositionsZstdLevel for positions, not IDsZstdLevelPositions are configured via DocsPositionsZstdLevel in SealParams. Using IDsZstdLevel here is inconsistent with the other “positions” payload (offsets section) and with the config intent.
func (s *IndexSealer) packPosBlock(block idsSealBlock) indexBlock { s.buf1 = block.params.Pack(s.buf1[:0]) - b := s.newIndexBlockZSTD(s.buf1, s.params.IDsZstdLevel) + b := s.newIndexBlockZSTD(s.buf1, s.params.DocsPositionsZstdLevel) return b }
180-201
: If IDBlocksTotal should reflect IDs blocks, set it after building IDs blocksAssuming the intent is number of IDs blocks, set IDBlocksTotal here based on the number of MinBlockIDs collected, and drop the earlier assignment tied to doc Offsets.
// IDs section s.idsTable.StartBlockIndex = blocksCounter statsMIDs, statsRIDs, statsParams := startStats(), startStats(), startStats() for block := range makeIDsSealBlocks(src.IDsBlocks(consts.IDsPerBlock)) { if !push(s.packMIDsBlock(block), &statsMIDs) { return } if !push(s.packRIDsBlock(block), &statsRIDs) { return } if !push(s.packPosBlock(block), &statsParams) { return } } + // after the loop, the number of IDs blocks equals the number of MinBlockIDs entries + s.idsTable.IDBlocksTotal = uint32(len(s.idsTable.MinBlockIDs)) if s.lastErr = util.CollapseErrors([]error{src.LastError(), bb.Err()}); s.lastErr != nil { return }frac/sealed.go (1)
152-159
: Eager open of docs and index in preloaded path—double-check startup I/O budgetOpening both files immediately improves readiness but removes previous lazy behavior. If fractions are created in bulk, consider keeping it lazy to reduce startup I/O and FD usage.
frac/sealed/sealing/blocks_builder_test.go (6)
63-73
: Nit/Robustness: avoid overshadowing imported packagetoken
; enforce size boundary before append
- Local variable
token
shadows the imported package nametoken
(readability).- Current check allows a block to exceed
size
if a single token is larger than remaining space. Safer to check before append.- for _, token := range m.tokens { - if blockSize >= size { + for _, tok := range m.tokens { + // Emit current block if the next token won't fit. + if blockSize > 0 && blockSize+len(tok)+4 > size { if !yield(block) { return } blockSize = 0 block = block[:0] } - block = append(block, token) - blockSize += len(token) + 4 + block = append(block, tok) + blockSize += len(tok) + 4 }
74-75
: Avoid yielding empty trailing blockGuard the final yield to skip emitting an empty block (helps edge cases and keeps iterator semantics clean).
- yield(block) + if len(block) > 0 { + yield(block) + }
55-56
: Avoid yielding empty trailing IDs/POS blockIf the source is empty or a multiple of
size
, this may still emit an empty block. Guard the final yield.- yield(ids, pos) + if len(ids) > 0 { + yield(ids, pos) + }
42-56
: Optional: avoid slice aliasing across yieldsYou reuse the backing arrays via
ids = ids[:0]
andpos = pos[:0]
. If a consumer retains the slices beyond the callback, data races/aliasing can occur. To make the mock robust, yield clones.for i, id := range m.ids { if len(ids) == size { - if !yield(ids, pos) { + idsCopy := append([]seq.ID(nil), ids...) + posCopy := append([]seq.DocPos(nil), pos...) + if !yield(idsCopy, posCopy) { return } ids = ids[:0] pos = pos[:0] } ids = append(ids, id) pos = append(pos, m.pos[i]) } - yield(ids, pos) + if len(ids) > 0 { + idsCopy := append([]seq.ID(nil), ids...) + posCopy := append([]seq.DocPos(nil), pos...) + yield(idsCopy, posCopy) + }
138-138
: Nit: assert.Equal argument orderUse expected, actual for clearer diffs.
- assert.Equal(t, tid, len(src.tokens)) + assert.Equal(t, len(src.tokens), tid)
232-232
: Nit: assert.Equal argument orderSwap to expected, actual for consistency and clearer failure output.
- assert.Equal(t, actualTokenTable.FieldsTables, expectedTokenTable.FieldsTables) + assert.Equal(t, expectedTokenTable.FieldsTables, actualTokenTable.FieldsTables)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (46)
bytespool/writer.go
(1 hunks)cmd/distribution/main.go
(4 hunks)cmd/index_analyzer/main.go
(2 hunks)cmd/seq-db/seq-db.go
(2 hunks)consts/consts.go
(1 hunks)disk/block_former.go
(0 hunks)disk/blocks_stats.go
(0 hunks)disk/blocks_writer.go
(0 hunks)disk/index_block_header.go
(1 hunks)frac/active.go
(4 hunks)frac/active_docs_positions.go
(2 hunks)frac/active_index.go
(2 hunks)frac/active_sealer.go
(0 hunks)frac/active_sealer_test.go
(0 hunks)frac/active_sealing_source.go
(1 hunks)frac/common/info.go
(2 hunks)frac/common/seal_params.go
(1 hunks)frac/disk_blocks.go
(0 hunks)frac/disk_blocks_producer.go
(0 hunks)frac/disk_blocks_writer.go
(0 hunks)frac/fraction.go
(2 hunks)frac/seal_stats.go
(0 hunks)frac/sealed.go
(6 hunks)frac/sealed/block_info.go
(2 hunks)frac/sealed/block_offsets.go
(1 hunks)frac/sealed/preloaded_data.go
(1 hunks)frac/sealed/sealing/blocks_builder.go
(1 hunks)frac/sealed/sealing/blocks_builder_test.go
(1 hunks)frac/sealed/sealing/index.go
(1 hunks)frac/sealed/sealing/sealer.go
(1 hunks)frac/sealed/sealing/stats.go
(1 hunks)frac/sealed/token/table.go
(1 hunks)frac/sealed/token/table_entry.go
(1 hunks)frac/sealed/token/table_loader.go
(1 hunks)frac/sealed_index.go
(2 hunks)frac/sealed_loader.go
(2 hunks)fracmanager/async_searcher_test.go
(2 hunks)fracmanager/config.go
(2 hunks)fracmanager/fracmanager.go
(1 hunks)fracmanager/fraction_provider.go
(3 hunks)fracmanager/proxy_frac.go
(6 hunks)fracmanager/sealed_frac_cache.go
(5 hunks)fracmanager/sealed_frac_cache_test.go
(14 hunks)fracmanager/sealer_test.go
(5 hunks)fracmanager/searcher_test.go
(2 hunks)tests/setup/env.go
(5 hunks)
💤 Files with no reviewable changes (9)
- frac/seal_stats.go
- frac/active_sealer.go
- frac/disk_blocks_producer.go
- disk/blocks_stats.go
- disk/blocks_writer.go
- disk/block_former.go
- frac/active_sealer_test.go
- frac/disk_blocks_writer.go
- frac/disk_blocks.go
✅ Files skipped from review due to trivial changes (1)
- frac/sealed/token/table_loader.go
🚧 Files skipped from review as they are similar to previous changes (25)
- frac/active_index.go
- frac/sealed/block_info.go
- disk/index_block_header.go
- fracmanager/searcher_test.go
- bytespool/writer.go
- frac/sealed/token/table.go
- cmd/index_analyzer/main.go
- frac/sealed/sealing/sealer.go
- frac/active.go
- frac/sealed_index.go
- fracmanager/fraction_provider.go
- frac/sealed/preloaded_data.go
- fracmanager/config.go
- frac/sealed_loader.go
- frac/fraction.go
- frac/sealed/sealing/blocks_builder.go
- frac/sealed/sealing/stats.go
- frac/active_sealing_source.go
- frac/sealed/token/table_entry.go
- frac/sealed/block_offsets.go
- tests/setup/env.go
- fracmanager/proxy_frac.go
- fracmanager/sealed_frac_cache.go
- frac/common/info.go
- cmd/distribution/main.go
🧰 Additional context used
🧬 Code Graph Analysis (8)
cmd/seq-db/seq-db.go (1)
frac/common/seal_params.go (1)
SealParams
(3-12)
fracmanager/sealed_frac_cache_test.go (4)
frac/common/info.go (1)
Info
(22-41)tests/common/util.go (3)
GetTestTmpDir
(69-71)RecreateDir
(49-52)RemoveDir
(42-47)consts/consts.go (1)
FracCacheFileSuffix
(63-63)fracmanager/sealed_frac_cache.go (1)
NewSealedFracCache
(33-44)
fracmanager/async_searcher_test.go (2)
frac/fraction.go (1)
Fraction
(19-25)frac/common/info.go (1)
Info
(22-41)
frac/active_docs_positions.go (2)
seq/seq.go (1)
ID
(12-15)seq/doc_pos.go (1)
DocPos
(19-19)
frac/sealed/sealing/blocks_builder_test.go (5)
frac/common/info.go (1)
Info
(22-41)seq/seq.go (3)
ID
(12-15)MID
(17-17)RID
(18-18)seq/doc_pos.go (2)
DocPos
(19-19)PackDocPos
(21-34)frac/sealed/token/table_loader.go (2)
TableBlock
(109-111)FieldTable
(113-116)frac/sealed/token/table_entry.go (1)
TableEntry
(4-12)
frac/sealed/sealing/index.go (12)
frac/common/seal_params.go (1)
SealParams
(3-12)frac/sealed/token/table.go (1)
Table
(34-34)disk/index_block_header.go (2)
IndexBlockHeader
(19-19)NewIndexBlockHeader
(25-34)frac/sealed/sealing/sealer.go (1)
Source
(20-28)bytespool/writer.go (3)
AcquireWriterSize
(18-30)ReleaseWriter
(32-37)Writer
(13-16)frac/common/info.go (1)
Info
(22-41)frac/sealed/block_info.go (1)
BlockInfo
(15-17)consts/consts.go (3)
RegularBlockSize
(16-16)IDsPerBlock
(14-14)LIDBlockCap
(15-15)util/errors_collapser.go (1)
CollapseErrors
(25-27)frac/sealed/block_offsets.go (1)
BlockOffsets
(8-11)frac/sealed/token/table_loader.go (2)
TableBlock
(109-111)TableFromBlocks
(49-71)seq/seq.go (3)
ID
(12-15)MID
(17-17)RID
(18-18)
frac/sealed.go (5)
frac/common/info.go (1)
Info
(22-41)consts/consts.go (2)
SdocsFileSuffix
(55-55)DocsFileSuffix
(52-52)disk/docs_reader.go (1)
NewDocsReader
(16-21)frac/sealed/preloaded_data.go (1)
PreloadedData
(10-16)frac/sealed/block_info.go (1)
BlockInfo
(15-17)
fracmanager/sealer_test.go (5)
frac/common/seal_params.go (1)
SealParams
(3-12)tests/common/util.go (1)
RecreateDir
(49-52)frac/sealed/preloaded_data.go (1)
PreloadedData
(10-16)frac/active_sealing_source.go (1)
NewActiveSealingSource
(45-78)frac/sealed/sealing/sealer.go (1)
Seal
(30-82)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: test
- GitHub Check: lint
🔇 Additional comments (22)
consts/consts.go (1)
16-16
: Document RegularBlockSize & add deprecated alias for IDsBlockSizeI confirmed there are no in-repo references to
IDsBlockSize
, but since it’s an exported constant it may still be used downstream. Meanwhile,RegularBlockSize
is referenced infrac/common/info.go
andfrac/sealed/sealing/index.go
. To smooth the transition:• In
consts/consts.go
, add a doc comment aboveRegularBlockSize
.
• In the same file, introduce a one-release deprecated alias forIDsBlockSize
.
• Manually verify with downstream consumers that removingIDsBlockSize
won’t break their builds.Apply:
--- a/consts/consts.go +++ b/consts/consts.go @@ const ( - RegularBlockSize = int(16 * units.KiB) + // RegularBlockSize is the default byte size for regular (non-LID) blocks. + RegularBlockSize = int(16 * units.KiB) ) +// Deprecated: prefer using common.SealParams.BlockSize or consts.RegularBlockSize. +const IDsBlockSize = RegularBlockSizefracmanager/sealed_frac_cache_test.go (3)
16-19
: Migration to common.Info and tests_common helpers looks correct.Imports align with the broader refactor. Type usage in assertions and helper calls is consistent.
29-41
: *loadFracCache now returns map[string]common.Info — good alignment.The helper mirrors the production cache type; JSON tags in common.Info will parse the fixture correctly.
51-54
: LGTM: switched to tests_common for tmpdir lifecycle.Consistent temp dir creation/cleanup improves isolation across tests.
Also applies to: 75-79, 105-107, 159-162, 229-233, 268-272, 349-353, 411-415
frac/active_docs_positions.go (1)
10-12
: Dual-container layout (idToPos + lidToPos) makes sense for GID↔LID flows.This aligns with sealing’s LID mapping and looks appropriate.
fracmanager/fracmanager.go (1)
66-72
: Constructor use improves encapsulation and future-proofing.Switching to newProxyFrac(...) avoids leaking struct internals and centralizes invariants. Looks good.
fracmanager/async_searcher_test.go (2)
11-26
: Migration to common.Info is consistent with Fraction interface changes.Type and method signature updates line up with frac.Fraction.Info() returning *common.Info. No issues spotted.
58-60
: Test data init updated correctly.Initializing fakeFrac with common.Info is correct and keeps the test behavior intact.
cmd/seq-db/seq-db.go (2)
25-26
: Import change aligns with type migration.Switching to frac/common is expected after moving SealParams/Info. No concerns.
257-265
: SealParams wiring looks correct.Levels come from the intended compression knobs; DocBlockSize sourced from DocsSorting. This matches the new sealing API.
fracmanager/sealer_test.go (5)
87-98
: Defaults updated to new common.SealParams — sane baseline.Min zstd levels and 128 KiB DocBlockSize are reasonable for benchmarks.
113-114
: Test util swap is fine.Using tests_common.RecreateDir keeps tests consistent with shared helpers.
118-119
: Double-check: Stopping the provider before sealing.You now stop fp immediately after filling. Ensure NewActiveSealingSource/Seal don’t rely on provider-owned resources (e.g., background workers, pools). If there’s any coupling, consider deferring Stop until after sealing to avoid subtle flakiness in benchmarks.
Would you like me to scan usages of Active.sortReader and provider links to confirm no runtime dependency?
128-130
: Warm-up sealing before benchmarking is a good call.Pre-sorting/warm-up outside the timed loop isolates the steady-state cost.
147-149
: Benchmark loop body updated correctly.Sealing inside the loop while checking errors keeps the measurement focused.
frac/sealed/sealing/index.go (4)
80-86
: Good: write error is now checked before flushingChecking the bw.Write error before Flush fixes the earlier errcheck finding and prevents hiding write failures.
271-273
: Verify: IDBlocksTotal derived from doc Offsets length looks offIDBlocksTotal is set from len(block.Offsets) (doc blocks). Typically, the number of IDs blocks differs from doc blocks (IDsPerBlock vs DocBlockSize). If the provider expects the count of IDs blocks, this is likely wrong.
If it should reflect IDs blocks, consider removing this assignment here and set it after the IDs loop to the number of MIDs/RIDs/Params blocks produced (e.g., len(s.idsTable.MinBlockIDs)). I can prepare a patch once you confirm the intended meaning.
280-286
: Confirm ordering: “minID” taken from the last elementYou append MinBlockIDs but derive it from the last element of mids/rids. Is the block order descending so last == min? If not, this should use index 0. Please confirm the invariant of idsSealBlock.mids.Values ordering.
306-323
: LIDs block: minTID is mutated when continued; confirm PreloadedData expectationsYou increment minTID when isContinued and then persist that adjusted value in LIDsTable.MinTIDs and in ext2. This matches the comment “get rid of this confusion,” but it can be a subtle off-by-one if consumers expect the physical min TID in the block vs the logical first new TID. Please confirm downstream consumers expect the incremented value.
frac/sealed.go (2)
109-120
: LGTM: safer .sdocs-first open with clear error messages and .docs fallbackThe revised openDocs flow looks correct: try .sdocs first, only fall back on not-exist, and fatal on other errors with accurate messages.
174-194
: *LGTM: header load now returns common.InfoThe switch to sealed.BlockInfo and returning *common.Info aligns with the new common.Info unification.
frac/sealed/sealing/blocks_builder_test.go (1)
387-389
: Good defensive copy to avoid aliasingCloning LIDs and Offsets before comparing prevents subtle iterator aliasing. Nice touch.
94e00f8
to
dc7368c
Compare
❌ PR Title Validation Failed |
❌ PR Title Validation Failed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
frac/remote.go (2)
100-109
: Avoid nil deref after openIndex failure; don’t call loadHeader on errorIf openIndex fails, the code logs an error and continues to loadHeader, likely with nil readers, risking a panic and leaving f.info nil (later used in Contains/IsIntersecting).
if err := f.openIndex(); err != nil { logger.Error( "cannot open index file: any subsequent operation will fail", zap.String("fraction", filepath.Base(f.BaseFileName)), zap.Error(err), ) - } - - f.info = loadHeader(f.indexFile, f.indexReader) + // Best-effort: reuse provided info if any and return early. + f.info = info + return f + } + + f.info = loadHeader(f.indexFile, f.indexReader)Follow-up: Consider making NewRemote return (*Remote, error) to surface initialization failures explicitly.
132-140
: Guard logging against nil Info in error pathWhen load() fails, logging uses f.Info().Name(), which will panic if f.info is nil (e.g., after constructor’s early error). Log the base filename instead.
- zap.String("fraction", f.Info().Name()), + zap.String("fraction", filepath.Base(f.BaseFileName)),fracmanager/proxy_frac.go (1)
121-159
: Seal error path leaves readonly set and leaks sealWg; fix with defer rollbackIf NewActiveSealingSource or sealing.Seal fails, method returns without calling sealWg.Done() and without resetting readonly, leaving the fraction stuck and Waiters blocked.
-func (f *proxyFrac) Seal(params common.SealParams) (*frac.Sealed, error) { +func (f *proxyFrac) Seal(params common.SealParams) (_ *frac.Sealed, err error) { f.useMu.Lock() if f.isSuicidedState() { f.useMu.Unlock() return nil, ErrSealingFractionSuicided } if !f.isActiveState() { f.useMu.Unlock() return nil, errors.New("sealing fraction is not active") } f.readonly = true active := f.active f.sealWg.Add(1) // It's important to put wg.Add() inside a lock, otherwise we might call wg.Wait() before it f.useMu.Unlock() + // Ensure we rollback readonly and release the waiter on any error path. + defer func() { + if err != nil { + f.useMu.Lock() + f.readonly = false + f.useMu.Unlock() + f.sealWg.Done() + } + }() + f.WaitWriteIdle() - src, err := frac.NewActiveSealingSource(active, params) + src, err := frac.NewActiveSealingSource(active, params) if err != nil { return nil, err } preloaded, err := sealing.Seal(src, params) if err != nil { return nil, err } sealed := f.fp.NewSealedPreloaded(active.BaseFileName, preloaded) f.useMu.Lock() f.sealed = sealed f.active = nil f.useMu.Unlock() - f.sealWg.Done() + f.sealWg.Done() active.Release() return sealed, nil }
♻️ Duplicate comments (6)
fracmanager/sealer_test.go (1)
121-125
: Return on error instead of asserting; avoid nil src panic.
If NewActiveSealingSource fails, assert.NoError won’t stop execution; passing a nil src into sealing.Seal may panic.Apply:
- seal := func(active *frac.Active, params common.SealParams) (*sealed.PreloadedData, error) { - src, err := frac.NewActiveSealingSource(active, params) - assert.NoError(b, err) - return sealing.Seal(src, params) - } + seal := func(active *frac.Active, params common.SealParams) (*sealed.PreloadedData, error) { + src, err := frac.NewActiveSealingSource(active, params) + if err != nil { + return nil, err + } + return sealing.Seal(src, params) + }fracmanager/proxy_frac.go (1)
138-145
: Clean up ActiveSealingSource on Seal failurePrevious review flagged missing cleanup; it’s still unaddressed. Ensure resources held by src are released if sealing fails.
src, err := frac.NewActiveSealingSource(active, params) if err != nil { return nil, err } + // Release src on failure paths. + defer func() { + if err != nil { + if c, ok := interface{ Close() error }(src); ok { + _ = c.Close() + } + } + }() preloaded, err := sealing.Seal(src, params) if err != nil { return nil, err }Outside this file, add Close() to ActiveSealingSource (no-op is fine initially, expand as needed):
// in frac/active_sealing_source.go func (s *ActiveSealingSource) Close() error { // TODO: free buffers, release readers if any return nil }frac/sealed/sealing/index.go (1)
64-64
: Handle error from deferred writer releaseThe deferred
ReleaseWriter
doesn't flush buffered data. UseFlushReleaseWriter
instead and handle any error.-defer bytespool.ReleaseWriter(bw) +defer func() { + if err := bytespool.FlushReleaseWriter(bw); err != nil { + // Log error or handle appropriately + } +}()frac/sealed/sealing/blocks_builder_test.go (2)
30-38
: Fix invalid range-over-int syntaxThe
for i := range len(m.fields)
construct requires Go 1.22+. Use a standard loop for compatibility.func (m *mockSource) Fields() iter.Seq2[string, uint32] { return func(yield func(string, uint32) bool) { - for i := range len(m.fields) { + for i := 0; i < len(m.fields); i++ { if !yield(m.fields[i], m.fieldMaxTIDs[i]) { return } } } }
131-134
: Fix invalid range-over-int in test loopSame compatibility issue here with
range block.payload.Len()
.- for i := range block.payload.Len() { + for i := 0; i < block.payload.Len(); i++ { tid++ assert.Equal(t, src.tokens[tid-1], block.payload.GetToken(i)) }frac/active_sealing_source.go (1)
272-273
: Handle error from deferred FlushReleaseWriterThe error from
FlushReleaseWriter
is not being checked, which could silently fail to flush buffered data.bw := bytespool.AcquireWriterSize(sdocsFile, int(units.MiB)) -defer bytespool.ReleaseWriter(bw) +defer func() { + if err := bytespool.FlushReleaseWriter(bw); err != nil { + logger.Error("failed to flush and release writer", zap.Error(err)) + } +}()
🧹 Nitpick comments (13)
fracmanager/async_searcher_test.go (2)
56-57
: Use a shorter retention in tests to speed up and reduce flake surface.- Retention: time.Hour, + Retention: 2 * time.Second,
63-64
: Guard the test’s WaitGroup with a timeout to prevent CI hangs
Wrap theas.processWg.Wait()
call in a goroutine and select on atime.After
, for example:done := make(chan struct{}) go func() { as.processWg.Wait() close(done) }() select { case <-done: // all tasks finished case <-time.After(5 * time.Second): t.Fatal("timeout waiting for AsyncSearcher to finish") }AsyncSearcher has no
Stop
/Close
/Shutdown
method yet; if you introduce one, defer it immediately afterMustStartAsync
.fracmanager/sealer_test.go (3)
87-98
: Consider centralizing default SealParams for tests.
To avoid drift across benchmarks/tests, move these defaults into tests/common (e.g., tests_common.DefaultSealParams()) and reuse here.
111-111
: Double Stop() — choose one to avoid double-close.
You both defer fp.Stop() and call fp.Stop() immediately. If Stop() isn’t idempotent, this can race or error.Prefer keeping the immediate Stop() (to remove background noise from the benchmark) and drop the deferred one:
- defer fp.Stop()
Alternatively, if Stop() must always run at function end, drop the immediate call:
- fp.Stop()
Also applies to: 119-119
148-151
: Optional: separate source-build cost from sealing cost.
If you want a pure “sealing” metric, build src once per iteration outside the timed section or add a second benchmark variant that reuses a prebuilt src in the loop.fracmanager/sealed_frac_cache_test.go (4)
61-61
: Prefer JSONEq for cache-content assertions to avoid brittle byte comparisons.Using JSON-aware equality makes tests resilient to whitespace/ordering.
Apply this diff:
- assert.Equal(t, []byte("{}"), content) + assert.JSONEq(t, "{}", string(content)) - assert.Equal(t, []byte("{}"), content) + assert.JSONEq(t, "{}", string(content)) - assert.Equal(t, contents, []byte("{}")) + assert.JSONEq(t, "{}", string(contents)) - assert.Equal(t, contents, []byte("{}")) + assert.JSONEq(t, "{}", string(contents)) - assert.Equal(t, []byte("{}"), cacheStr) + assert.JSONEq(t, "{}", string(cacheStr)) - assert.Equal(t, fracCacheFromDisk, []byte("{}")) + assert.JSONEq(t, "{}", string(fracCacheFromDisk))Also applies to: 71-71, 123-123, 155-155, 259-259, 480-480
125-139
: Avoid path-like values in Info.Path (json:"name"); store the canonical fraction name.Elsewhere “name” is just “a”/“b”. Keeping it consistent reduces ambiguity and simplifies comparisons.
newInfo := &common.Info{ - Path: "/data/c", + Path: "c", Ver: "1.3", DocsTotal: 0, DocsOnDisk: 0, DocsRaw: 0, MetaOnDisk: 0, IndexOnDisk: 0, ConstRegularBlockSize: 0, ConstIDsPerBlock: 0, ConstLIDBlockCap: 100500, From: 0, To: 0, CreationTime: 0, }
170-184
: Same here: set Info.Path to the fraction name, not a filesystem path.Keeps JSON field “name” semantics uniform across tests.
newInfo := &common.Info{ - Path: "/data/c", + Path: "c", Ver: "1.3", DocsTotal: 0, DocsOnDisk: 0, DocsRaw: 0, MetaOnDisk: 0, IndexOnDisk: 0, ConstRegularBlockSize: 0, ConstIDsPerBlock: 0, ConstLIDBlockCap: 100500, From: 0, To: 0, CreationTime: 0, }
188-191
: Use the domain method for lookups instead of re-deriving the key.Rely on Info.Name() rather than filepath.Base to avoid drift if Name() logic changes.
- fracFromDisk, has := f.GetFracInfo(filepath.Base(newInfo.Path)) + fracFromDisk, has := f.GetFracInfo(newInfo.Name())frac/sealed_loader.go (1)
91-99
: Preallocate MinBlockIDs capacity to reduce allocationsYou know the block count from b.Offsets. Preallocate to avoid repeated reslices during append.
idsTable.IDsTotal = b.IDsTotal idsTable.IDBlocksTotal = uint32(len(b.Offsets)) idsTable.StartBlockIndex = l.blockIndex + // Preallocate to the exact number of ID blocks. + idsTable.MinBlockIDs = make([]seq.ID, 0, idsTable.IDBlocksTotal) for {frac/sealed/sealing/index.go (1)
119-229
: Consider splitting the indexBlocks method for better maintainabilityThis 110-line generator function handles multiple responsibilities and nested closures. Consider extracting section-specific logic into separate methods.
For example, extract token section handling:
func (s *IndexSealer) writeTokenSection(src Source, bb *blocksBuilder, push func(indexBlock, *blocksStats) bool) error { statsTokens := startStats() tokensBlocks, tokenTable := bb.TokensBlocksWithTokenTable(src.TokenBlocks(consts.RegularBlockSize), src.Fields()) for block := range tokensBlocks { if !push(s.packTokenBlock(block), &statsTokens) { return nil } } return util.CollapseErrors([]error{src.LastError(), bb.Err()}) }frac/sealed.go (1)
121-130
: Consider extracting the file opening logic to reduce duplicationThe pattern of trying
.sdocs
first then falling back to.docs
appears both here and inNewSealedPreloaded
. Consider extracting this into a helper function.+func openDocsFile(baseFileName string) (*os.File, error) { + // Try opening *.sdocs file first + docsFile, err := os.Open(baseFileName + consts.SdocsFileSuffix) + if err != nil { + if !errors.Is(err, os.ErrNotExist) { + return nil, err + } + // Fallback to *.docs file + docsFile, err = os.Open(baseFileName + consts.DocsFileSuffix) + if err != nil { + return nil, err + } + } + return docsFile, nil +} func (f *Sealed) openDocs() { if f.docsFile == nil { var err error - f.docsFile, err = os.Open(f.BaseFileName + consts.SdocsFileSuffix) // try first open *.sdocs file + f.docsFile, err = openDocsFile(f.BaseFileName) if err != nil { - if !errors.Is(err, os.ErrNotExist) { - logger.Fatal("can't open sdocs file", zap.String("frac", f.BaseFileName), zap.Error(err)) - } - f.docsFile, err = os.Open(f.BaseFileName + consts.DocsFileSuffix) // fallback to *.docs file - if err != nil { - logger.Fatal("can't open docs file", zap.String("frac", f.BaseFileName), zap.Error(err)) - } + logger.Fatal("can't open docs file", zap.String("frac", f.BaseFileName), zap.Error(err)) } f.docsReader = storage.NewDocsReader(f.readLimiter, f.docsFile, f.docsCache) } }frac/active_sealing_source.go (1)
233-234
: Document the systemSeqID behaviorThe special handling of
systemSeqID
that results in nil documents isn't immediately clear. Consider adding a comment.for i, id := range ids { + // systemSeqID is a sentinel value that indicates no actual document if id == systemSeqID { curDoc = nil
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (44)
bytespool/writer.go
(1 hunks)cmd/distribution/main.go
(4 hunks)cmd/index_analyzer/main.go
(2 hunks)cmd/seq-db/seq-db.go
(2 hunks)consts/consts.go
(1 hunks)frac/active.go
(4 hunks)frac/active_docs_positions.go
(2 hunks)frac/active_index.go
(2 hunks)frac/active_sealer.go
(0 hunks)frac/active_sealer_test.go
(0 hunks)frac/active_sealing_source.go
(1 hunks)frac/common/info.go
(2 hunks)frac/common/seal_params.go
(1 hunks)frac/disk_blocks.go
(0 hunks)frac/disk_blocks_producer.go
(0 hunks)frac/disk_blocks_writer.go
(0 hunks)frac/fraction.go
(2 hunks)frac/remote.go
(4 hunks)frac/seal_stats.go
(0 hunks)frac/sealed.go
(7 hunks)frac/sealed/block_info.go
(2 hunks)frac/sealed/block_offsets.go
(1 hunks)frac/sealed/preloaded_data.go
(1 hunks)frac/sealed/sealing/blocks_builder.go
(1 hunks)frac/sealed/sealing/blocks_builder_test.go
(1 hunks)frac/sealed/sealing/index.go
(1 hunks)frac/sealed/sealing/sealer.go
(1 hunks)frac/sealed/sealing/stats.go
(1 hunks)frac/sealed/token/table.go
(1 hunks)frac/sealed/token/table_entry.go
(1 hunks)frac/sealed/token/table_loader.go
(1 hunks)frac/sealed_index.go
(2 hunks)frac/sealed_loader.go
(3 hunks)fracmanager/async_searcher_test.go
(2 hunks)fracmanager/config.go
(2 hunks)fracmanager/fracmanager.go
(1 hunks)fracmanager/fraction_provider.go
(4 hunks)fracmanager/proxy_frac.go
(7 hunks)fracmanager/sealed_frac_cache.go
(5 hunks)fracmanager/sealed_frac_cache_test.go
(14 hunks)fracmanager/sealer_test.go
(5 hunks)fracmanager/searcher_test.go
(2 hunks)storage/index_block_header.go
(1 hunks)tests/setup/env.go
(5 hunks)
💤 Files with no reviewable changes (6)
- frac/seal_stats.go
- frac/active_sealer_test.go
- frac/active_sealer.go
- frac/disk_blocks_producer.go
- frac/disk_blocks.go
- frac/disk_blocks_writer.go
✅ Files skipped from review due to trivial changes (3)
- fracmanager/fracmanager.go
- frac/sealed/block_offsets.go
- frac/sealed/token/table.go
🚧 Files skipped from review as they are similar to previous changes (22)
- frac/sealed/token/table_loader.go
- frac/active_index.go
- frac/active.go
- fracmanager/searcher_test.go
- frac/sealed/sealing/stats.go
- frac/sealed/preloaded_data.go
- frac/sealed/token/table_entry.go
- consts/consts.go
- cmd/index_analyzer/main.go
- cmd/seq-db/seq-db.go
- frac/sealed_index.go
- bytespool/writer.go
- tests/setup/env.go
- frac/common/info.go
- frac/sealed/sealing/sealer.go
- fracmanager/fraction_provider.go
- fracmanager/sealed_frac_cache.go
- cmd/distribution/main.go
- frac/common/seal_params.go
- frac/active_docs_positions.go
- frac/sealed/block_info.go
- fracmanager/config.go
🧰 Additional context used
🧬 Code graph analysis (13)
frac/fraction.go (1)
frac/common/info.go (1)
Info
(22-41)
storage/index_block_header.go (1)
storage/codec.go (1)
Codec
(18-18)
fracmanager/sealed_frac_cache_test.go (4)
frac/common/info.go (1)
Info
(22-41)tests/common/util.go (3)
GetTestTmpDir
(69-71)RecreateDir
(49-52)RemoveDir
(42-47)consts/consts.go (1)
FracCacheFileSuffix
(65-65)fracmanager/sealed_frac_cache.go (1)
NewSealedFracCache
(32-43)
fracmanager/async_searcher_test.go (2)
frac/fraction.go (1)
Fraction
(20-27)frac/common/info.go (1)
Info
(22-41)
frac/sealed/sealing/blocks_builder_test.go (5)
frac/common/info.go (1)
Info
(22-41)seq/seq.go (3)
ID
(12-15)MID
(17-17)RID
(18-18)seq/doc_pos.go (2)
DocPos
(19-19)PackDocPos
(21-34)frac/sealed/token/table_loader.go (2)
TableBlock
(109-111)FieldTable
(113-116)frac/sealed/token/table_entry.go (1)
TableEntry
(4-12)
fracmanager/proxy_frac.go (6)
frac/active.go (1)
Active
(32-64)frac/common/info.go (1)
Info
(22-41)frac/sealed/sealing/sealer.go (1)
Seal
(30-82)frac/common/seal_params.go (1)
SealParams
(3-12)frac/sealed.go (1)
Sealed
(31-57)frac/active_sealing_source.go (1)
NewActiveSealingSource
(45-78)
fracmanager/sealer_test.go (5)
frac/common/seal_params.go (1)
SealParams
(3-12)tests/common/util.go (1)
RecreateDir
(49-52)frac/sealed/preloaded_data.go (1)
PreloadedData
(10-16)frac/active_sealing_source.go (1)
NewActiveSealingSource
(45-78)frac/sealed/sealing/sealer.go (1)
Seal
(30-82)
frac/sealed/sealing/index.go (10)
frac/common/seal_params.go (1)
SealParams
(3-12)frac/sealed/token/table.go (1)
Table
(34-34)storage/codec.go (3)
Codec
(18-18)CodecNo
(13-13)CodecZSTD
(15-15)storage/index_block_header.go (2)
IndexBlockHeader
(19-19)NewIndexBlockHeader
(25-34)frac/sealed/sealing/sealer.go (1)
Source
(20-28)bytespool/writer.go (3)
AcquireWriterSize
(18-30)ReleaseWriter
(32-37)Writer
(13-16)consts/consts.go (3)
RegularBlockSize
(16-16)IDsPerBlock
(14-14)LIDBlockCap
(15-15)frac/sealed/block_offsets.go (1)
BlockOffsets
(8-11)frac/sealed/token/table_loader.go (2)
TableBlock
(109-111)TableFromBlocks
(49-71)seq/seq.go (3)
ID
(12-15)MID
(17-17)RID
(18-18)
frac/sealed/sealing/blocks_builder.go (5)
frac/sealed/seqids/blocks.go (3)
BlockMIDs
(10-12)BlockRIDs
(32-35)BlockParams
(57-59)frac/sealed/token/table_loader.go (2)
TableBlock
(109-111)FieldTable
(113-116)frac/sealed/token/table_entry.go (1)
TableEntry
(4-12)seq/seq.go (3)
ID
(12-15)MID
(17-17)RID
(18-18)seq/doc_pos.go (1)
DocPos
(19-19)
frac/active_sealing_source.go (13)
frac/common/seal_params.go (1)
SealParams
(3-12)frac/common/info.go (1)
Info
(22-41)frac/active_lids.go (1)
TokenLIDs
(27-33)seq/doc_pos.go (2)
DocPos
(19-19)PackDocPos
(21-34)storage/docs_reader.go (1)
DocsReader
(11-14)frac/active.go (1)
Active
(32-64)frac/active_token_list.go (1)
TokenList
(60-74)frac/active_docs_positions.go (1)
DocsPositions
(9-13)seq/seq.go (3)
ID
(12-15)MID
(17-17)RID
(18-18)consts/consts.go (3)
IDsPerBlock
(14-14)SdocsTmpFileSuffix
(56-56)SdocsFileSuffix
(55-55)bytespool/writer.go (3)
AcquireWriterSize
(18-30)ReleaseWriter
(32-37)Writer
(13-16)util/errors_collapser.go (1)
CollapseErrors
(25-27)storage/doc_block.go (1)
CompressDocBlock
(79-88)
frac/sealed_loader.go (2)
frac/common/info.go (1)
Info
(22-41)frac/sealed/block_offsets.go (1)
BlockOffsets
(8-11)
frac/sealed.go (6)
frac/common/info.go (1)
Info
(22-41)consts/consts.go (2)
SdocsFileSuffix
(55-55)DocsFileSuffix
(52-52)storage/docs_reader.go (1)
NewDocsReader
(16-21)frac/sealed/preloaded_data.go (1)
PreloadedData
(10-16)frac/index_cache.go (1)
IndexCache
(10-18)frac/sealed/block_info.go (1)
BlockInfo
(15-17)
frac/remote.go (1)
frac/common/info.go (1)
Info
(22-41)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test
🔇 Additional comments (15)
fracmanager/async_searcher_test.go (4)
20-20
: Field type updated to common.Info — aligned with new Fraction API.
24-26
: *Method signature matches interface (Info() common.Info).
59-60
: Init with common.Info is correct.
37-39
: Ignore the nil-channel warning on QPR –seq.QPR
has nochan
fields, and its nil maps/slices are only read (iterated) by the merge/compress logic, so returning a zero-value QPR here is safe.Likely an incorrect or invalid review comment.
fracmanager/sealer_test.go (4)
21-26
: Imports aligned with new sealing pipeline — looks good.
The new deps (common, sealed, sealing, tests_common) are appropriate for the refactor.
44-45
: Fixture path switch to tests_common.TestDataDir — OK.
Improves consistency across tests.
114-115
: Directory recreation change — OK.
Using tests_common.RecreateDir keeps setup consistent across suites.
127-131
: Warmup may not fully exclude sorting; verify behavior.
NewActiveSealingSource calls SortDocs when active.Config.SkipSortDocs == false. After the warmup seal, subsequent iterations may still call SortDocs (even if no-op), skewing results. If the goal is to exclude sorting cost, flip the flag after warmup:_, err = seal(active, params) assert.NoError(b, err) + // Exclude doc sorting from timed loop + active.Config.SkipSortDocs = trueIf you want a benchmark “with sort,” keep a separate sub-benchmark that doesn’t flip the flag.
fracmanager/sealed_frac_cache_test.go (4)
16-19
: Migration to common.Info and tests_common helpers looks correct.Imports and aliases align with the broader refactor; no issues spotted.
29-39
: *Type switch of frac cache to map[string]common.Info is consistent.Helper mirrors production type in NewSealedFracCache; unmarshalling into pointers is fine for equality checks below.
51-55
: Tmp dir/test util migration is fine.Using tests_common helpers for setup/teardown is consistent across the file.
Also applies to: 75-79, 105-108, 159-163, 229-233, 268-272, 349-353, 412-416
80-89
: Confirm LoadFromDisk’s error semantics and assert accordingly.If LoadFromDisk returns an error, assert it here to catch format/IO issues early; otherwise ignore this.
Would you confirm the signature? If it returns error, update to:
- f.LoadFromDisk(filepath.Join(dataDir, consts.FracCacheFileSuffix)) + err := f.LoadFromDisk(filepath.Join(dataDir, consts.FracCacheFileSuffix)) + assert.NoError(t, err)frac/fraction.go (1)
21-21
: All Info() implementations updated Verified allFraction.Info()
methods return*common.Info
; no outdated signatures remain.frac/sealed/sealing/blocks_builder.go (1)
53-117
: LGTM! Well-structured token block builder with proper error handlingThe implementation correctly handles field/token consistency validation and builds the token table entries efficiently.
frac/sealed.go (1)
168-170
: Verify the intended behavior of opening files during preloaded constructionOpening both docs and index files immediately in
NewSealedPreloaded
differs from the lazy loading pattern used elsewhere. This could impact memory usage and startup time for preloaded fractions.Is this intentional? The previous implementation with
preloaded.docsFile
andpreloaded.indexFile
suggests these were already opened during sealing.
dc7368c
to
44d9328
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
cmd/distribution/main.go (3)
62-86
: Close opened file to avoid FD leaks in loadInfogetReader opens a file that isn’t closed. Add a defer to close f.
func loadInfo(path string) *common.Info { indexReader, f := getReader(path) + defer f.Close() result, err := readBlock(indexReader, 0) if err != nil { logger.Fatal("error reading block", zap.String("file", path), zap.Error(err)) } @@ - err = b.Unpack(result) + err = b.Unpack(result) if err != nil { - logger.Fatal("can't unpack info bloc of index file", zap.String("file", path), zap.Error(err)) + logger.Fatal("can't unpack info block of index file", zap.String("file", path), zap.Error(err)) }
73-77
: Fix typo in log: “bloc” → “block”User-facing log message.
- logger.Fatal("can't unpack info bloc of index file", zap.String("file", path), zap.Error(err)) + logger.Fatal("can't unpack info block of index file", zap.String("file", path), zap.Error(err))
88-116
: Also close file in buildDist to prevent descriptor leaksYou discard the returned *os.File. Capture and close it.
-func buildDist(dist *seq.MIDsDistribution, path string, _ *common.Info) { - blocksReader, _ := getReader(path) +func buildDist(dist *seq.MIDsDistribution, path string, _ *common.Info) { + blocksReader, f := getReader(path) + defer f.Close()Also applies to: 118-144
fracmanager/proxy_frac.go (1)
121-159
: Seal error paths leak sealWg and leave readonly=trueIf NewActiveSealingSource or sealing.Seal fails, you return before f.sealWg.Done() and don’t reset readonly, causing waits to hang and the fraction to remain unwritable.
Apply:
func (f *proxyFrac) Seal(params common.SealParams) (*frac.Sealed, error) { @@ - f.sealWg.Add(1) // It's important to put wg.Add() inside a lock, otherwise we might call wg.Wait() before it + f.sealWg.Add(1) // It's important to put wg.Add() inside a lock, otherwise we might call wg.Wait() before it f.useMu.Unlock() f.WaitWriteIdle() - src, err := frac.NewActiveSealingSource(active, params) - if err != nil { - return nil, err - } - preloaded, err := sealing.Seal(src, params) - if err != nil { - return nil, err - } + // Ensure sealWg.Done and readonly rollback on any failure + defer f.sealWg.Done() + + src, err := frac.NewActiveSealingSource(active, params) + if err != nil { + f.useMu.Lock() + f.readonly = false + f.useMu.Unlock() + return nil, err + } + preloaded, err := sealing.Seal(src, params) + if err != nil { + f.useMu.Lock() + f.readonly = false + f.useMu.Unlock() + return nil, err + } @@ - f.sealWg.Done() - active.Release() return sealed, nil }
♻️ Duplicate comments (5)
fracmanager/sealer_test.go (1)
121-125
: Don’t assert then continue; return on error from NewActiveSealingSourceIf src creation fails, sealing.Seal will be called with nil and may panic. Return the error immediately.
Apply:
- seal := func(active *frac.Active, params common.SealParams) (*sealed.PreloadedData, error) { - src, err := frac.NewActiveSealingSource(active, params) - assert.NoError(b, err) - return sealing.Seal(src, params) - } + seal := func(active *frac.Active, params common.SealParams) (*sealed.PreloadedData, error) { + src, err := frac.NewActiveSealingSource(active, params) + if err != nil { + return nil, err + } + return sealing.Seal(src, params) + }fracmanager/proxy_frac.go (1)
138-145
: Consider cleaning up the sealing source on failureActiveSealingSource lacks a Close/cleanup; if resources are added later, failures will leak. Add Close() and defer it here (no-op if unnecessary).
Example:
- src, err := frac.NewActiveSealingSource(active, params) + src, err := frac.NewActiveSealingSource(active, params) if err != nil { f.useMu.Lock() f.readonly = false f.useMu.Unlock() return nil, err } + // defer src.Close() // once implementedAnd implement:
// in frac/active_sealing_source.go func (s *ActiveSealingSource) Close() error { return nil } // extend when resources appearfrac/sealed/sealing/blocks_builder_test.go (2)
30-38
: Fix invalid range-over-int in mockSource.Fields().
for i := range len(m.fields)
won’t compile on standard Go toolchains. Use an indexed loop.func (m *mockSource) Fields() iter.Seq2[string, uint32] { return func(yield func(string, uint32) bool) { - for i := range len(m.fields) { + for i := 0; i < len(m.fields); i++ { if !yield(m.fields[i], m.fieldMaxTIDs[i]) { return } } } }
129-136
: Fix invalid range-over-int in payload loop.
for i := range block.payload.Len()
is invalid; iterate by index.- for block := range tokenBlocks { + for block := range tokenBlocks { assert.Equal(t, expectedSizes[blockIndex], block.payload.Len()) - for i := range block.payload.Len() { + for i := 0; i < block.payload.Len(); i++ { tid++ assert.Equal(t, src.tokens[tid-1], block.payload.GetToken(i)) } blockIndex++ }frac/sealed.go (1)
121-129
: Previous logging mismatch resolvedThe earlier ".docs vs .sdocs" log confusion is fixed here by separating both error paths. Thanks.
🧹 Nitpick comments (12)
fracmanager/config.go (1)
27-27
: Set sane defaults for new DocBlocksZstdLevel and avoid magic 0.8
- DocBlocksZstdLevel in common.SealParams isn’t defaulted here; if 0 isn’t treated specially downstream, compression may be inconsistent. Consider defaulting it to zstdDefaultLevel like others.
- You define SdocsCacheSizeMaxRatio but use 0.8 directly.
Proposed diff:
const zstdDefaultLevel = 3 if config.SealParams.IDsZstdLevel == 0 { config.SealParams.IDsZstdLevel = zstdDefaultLevel } + if config.SealParams.DocBlocksZstdLevel == 0 { + config.SealParams.DocBlocksZstdLevel = zstdDefaultLevel + } if config.SealParams.LIDsZstdLevel == 0 { config.SealParams.LIDsZstdLevel = zstdDefaultLevel } @@ - if config.SortCacheSize > config.CacheSize { - config.SortCacheSize = uint64(float64(config.CacheSize) * 0.8) + if config.SortCacheSize > config.CacheSize { + config.SortCacheSize = uint64(float64(config.CacheSize) * SdocsCacheSizeMaxRatio) }Also confirm whether DocBlockSize should be validated (non-zero) here or elsewhere.
Also applies to: 48-65, 66-79
frac/sealed/preloaded_data.go (1)
10-16
: Add lightweight validation and ownership docs to prevent nil derefsA tiny Validate() helps catch misuse early; a brief doc clarifies ownership.
package sealed import ( + "fmt" "github.com/ozontech/seq-db/frac/common" "github.com/ozontech/seq-db/frac/sealed/lids" "github.com/ozontech/seq-db/frac/sealed/seqids" "github.com/ozontech/seq-db/frac/sealed/token" ) - type PreloadedData struct { + // PreloadedData bundles precomputed tables and metadata for constructing a sealed fraction. + // Ownership: do not mutate tables after passing to constructors; they may be used read-only and concurrently. + type PreloadedData struct { Info *common.Info IDsTable seqids.Table LIDsTable *lids.Table TokenTable token.Table BlocksOffsets []uint64 } + +// Validate performs basic sanity checks. +func (p *PreloadedData) Validate() error { + if p == nil { + return fmt.Errorf("preloaded data is nil") + } + if p.Info == nil { + return fmt.Errorf("info is nil") + } + if p.LIDsTable == nil { + return fmt.Errorf("LIDsTable is nil") + } + if p.TokenTable == nil { + return fmt.Errorf("TokenTable is nil") + } + return nil +}fracmanager/fraction_provider.go (1)
68-77
: Defensive check for nil preloadedDataIf preloadedData is ever nil, frac.NewSealedPreloaded will dereference it and panic. Add a guard or document the non-nil contract.
Apply:
func (fp *fractionProvider) NewSealedPreloaded(name string, preloadedData *sealed.PreloadedData) *frac.Sealed { + if preloadedData == nil { + panic("NewSealedPreloaded: preloadedData is nil") + } return frac.NewSealedPreloaded( name, preloadedData, fp.readLimiter, fp.cacheProvider.CreateIndexCache(), fp.cacheProvider.CreateDocBlockCache(), fp.config, ) }fracmanager/sealer_test.go (3)
111-120
: Avoid double Stop on fractionProviderYou both defer fp.Stop() and call fp.Stop() explicitly; depending on ActiveIndexer.Stop() idempotency this can race or panic.
Apply:
- defer fp.Stop() + // Stop indexer before sealing to avoid interference with benchmarks + // (avoid double Stop calls) + // no defer here
129-131
: Fail fast on warm-up sealing errorWarm-up errors should stop the benchmark immediately.
Apply:
- _, err = seal(active, params) - assert.NoError(b, err) + _, err = seal(active, params) + if err != nil { + b.Fatal(err) + }
39-54
: Typo: muliplier → multiplierMinor readability nit.
Apply:
-const muliplier = 10 +const multiplier = 10 @@ - for i := 0; i < muliplier; i++ { + for i := 0; i < multiplier; i++ {frac/sealed/sealing/index.go (1)
67-67
: Typo: writeBocks → writeBlocks.Rename for clarity.
- if err := s.writeBocks(prefixSize, bw, hw, src); err != nil { + if err := s.writeBlocks(prefixSize, bw, hw, src); err != nil { return err } @@ -func (s *IndexSealer) writeBocks(pos int, payloadWriter, headersWriter io.Writer, src Source) error { +func (s *IndexSealer) writeBlocks(pos int, payloadWriter, headersWriter io.Writer, src Source) error {Also applies to: 102-102
frac/sealed.go (5)
118-132
: Log full file paths when opening .sdocs/.docsUse the exact file path (like openIndex does) to simplify ops/debugging.
func (f *Sealed) openDocs() { if f.docsFile == nil { var err error - f.docsFile, err = os.Open(f.BaseFileName + consts.SdocsFileSuffix) // try first open *.sdocs file + sdocsName := f.BaseFileName + consts.SdocsFileSuffix // try opening *.sdocs first + f.docsFile, err = os.Open(sdocsName) if err != nil { if !errors.Is(err, os.ErrNotExist) { - logger.Fatal("can't open sdocs file", zap.String("frac", f.BaseFileName), zap.Error(err)) + logger.Fatal("can't open sdocs file", zap.String("file", sdocsName), zap.Error(err)) } - f.docsFile, err = os.Open(f.BaseFileName + consts.DocsFileSuffix) // fallback to *.docs file + docsName := f.BaseFileName + consts.DocsFileSuffix // fallback to *.docs + f.docsFile, err = os.Open(docsName) if err != nil { - logger.Fatal("can't open docs file", zap.String("frac", f.BaseFileName), zap.Error(err)) + logger.Fatal("can't open docs file", zap.String("file", docsName), zap.Error(err)) } } f.docsReader = storage.NewDocsReader(f.readLimiter, f.docsFile, f.docsCache) } }
168-171
: Populate Info.IndexOnDisk in the preloaded pathMirror NewSealed behavior; avoids leaving IndexOnDisk at zero when constructing from preloaded data.
f.openDocs() f.openIndex() + // ensure Info.IndexOnDisk is populated in the preloaded path as well + if stat, err := f.indexFile.Stat(); err != nil { + logger.Fatal("can't stat index file", zap.String("file", f.indexFile.Name()), zap.Error(err)) + } else { + f.info.IndexOnDisk = uint64(stat.Size()) + }
223-229
: Durability: fsync the marker file before syncing the directoryNot strictly required, but helps ensure the marker’s inode metadata hits disk on platforms/filesystems with quirky sync semantics.
file, err := os.Create(remoteFracName) if err != nil { return true, err } - defer file.Close() + if err := file.Sync(); err != nil { + return true, err + } + defer file.Close()
55-55
: Tone: replace informal commentRename to keep codebase professional.
- // shit for testing + // testing helper: emulates hard shutdown stages for deletion
164-166
: Coverage gaps: add focused tests
- openDocs() fallback from .sdocs→.docs (both success and error paths).
- NewSealedPreloaded sets IndexOnDisk (after stat) and uses preloaded TokenTable in cache.
I can sketch table-driven tests if helpful.Also applies to: 194-197
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (47)
bytespool/writer.go
(1 hunks)cmd/distribution/main.go
(4 hunks)cmd/index_analyzer/main.go
(2 hunks)cmd/seq-db/seq-db.go
(2 hunks)consts/consts.go
(1 hunks)frac/active.go
(4 hunks)frac/active_docs_positions.go
(2 hunks)frac/active_index.go
(2 hunks)frac/active_sealer.go
(0 hunks)frac/active_sealer_test.go
(0 hunks)frac/active_sealing_source.go
(1 hunks)frac/common/info.go
(2 hunks)frac/common/seal_params.go
(1 hunks)frac/disk_blocks.go
(0 hunks)frac/disk_blocks_producer.go
(0 hunks)frac/disk_blocks_writer.go
(0 hunks)frac/fraction.go
(2 hunks)frac/remote.go
(4 hunks)frac/seal_stats.go
(0 hunks)frac/sealed.go
(7 hunks)frac/sealed/block_info.go
(2 hunks)frac/sealed/block_offsets.go
(1 hunks)frac/sealed/preloaded_data.go
(1 hunks)frac/sealed/sealing/blocks_builder.go
(1 hunks)frac/sealed/sealing/blocks_builder_test.go
(1 hunks)frac/sealed/sealing/index.go
(1 hunks)frac/sealed/sealing/sealer.go
(1 hunks)frac/sealed/sealing/stats.go
(1 hunks)frac/sealed/token/table.go
(1 hunks)frac/sealed/token/table_entry.go
(1 hunks)frac/sealed/token/table_loader.go
(1 hunks)frac/sealed_index.go
(2 hunks)frac/sealed_loader.go
(3 hunks)fracmanager/async_searcher_test.go
(2 hunks)fracmanager/config.go
(2 hunks)fracmanager/fracmanager.go
(1 hunks)fracmanager/fraction_provider.go
(4 hunks)fracmanager/proxy_frac.go
(7 hunks)fracmanager/sealed_frac_cache.go
(5 hunks)fracmanager/sealed_frac_cache_test.go
(14 hunks)fracmanager/sealer_test.go
(5 hunks)fracmanager/searcher_test.go
(2 hunks)storage/block_former.go
(0 hunks)storage/blocks_stats.go
(0 hunks)storage/blocks_writer.go
(0 hunks)storage/index_block_header.go
(1 hunks)tests/setup/env.go
(5 hunks)
💤 Files with no reviewable changes (9)
- frac/seal_stats.go
- storage/blocks_writer.go
- frac/disk_blocks.go
- frac/active_sealer_test.go
- frac/active_sealer.go
- storage/blocks_stats.go
- frac/disk_blocks_writer.go
- storage/block_former.go
- frac/disk_blocks_producer.go
✅ Files skipped from review due to trivial changes (2)
- frac/sealed/token/table_entry.go
- frac/sealed/token/table_loader.go
🚧 Files skipped from review as they are similar to previous changes (26)
- tests/setup/env.go
- bytespool/writer.go
- frac/sealed/block_offsets.go
- frac/sealed/token/table.go
- frac/sealed/block_info.go
- consts/consts.go
- frac/sealed/sealing/stats.go
- frac/common/seal_params.go
- frac/active_index.go
- fracmanager/searcher_test.go
- frac/sealed_index.go
- frac/sealed/sealing/sealer.go
- fracmanager/sealed_frac_cache.go
- frac/active.go
- fracmanager/fracmanager.go
- fracmanager/sealed_frac_cache_test.go
- cmd/seq-db/seq-db.go
- cmd/index_analyzer/main.go
- frac/sealed_loader.go
- storage/index_block_header.go
- frac/common/info.go
- frac/active_sealing_source.go
- frac/remote.go
- fracmanager/async_searcher_test.go
- frac/fraction.go
- frac/active_docs_positions.go
🧰 Additional context used
🧬 Code graph analysis (10)
frac/sealed/preloaded_data.go (2)
frac/common/info.go (1)
Info
(22-41)frac/sealed/token/table.go (1)
Table
(34-34)
frac/sealed/sealing/blocks_builder_test.go (5)
frac/common/info.go (1)
Info
(22-41)seq/seq.go (3)
ID
(12-15)MID
(17-17)RID
(18-18)seq/doc_pos.go (2)
DocPos
(19-19)PackDocPos
(21-34)frac/sealed/token/table_loader.go (2)
TableBlock
(109-111)FieldTable
(113-116)frac/sealed/token/table_entry.go (1)
TableEntry
(4-12)
fracmanager/config.go (1)
frac/common/seal_params.go (1)
SealParams
(3-12)
fracmanager/proxy_frac.go (5)
frac/active.go (1)
Active
(32-64)frac/common/info.go (1)
Info
(22-41)frac/sealed/sealing/sealer.go (1)
Seal
(30-82)frac/common/seal_params.go (1)
SealParams
(3-12)frac/active_sealing_source.go (1)
NewActiveSealingSource
(45-78)
fracmanager/fraction_provider.go (3)
frac/sealed.go (3)
NewSealed
(73-104)Sealed
(31-57)NewSealedPreloaded
(135-183)frac/common/info.go (1)
Info
(22-41)frac/sealed/preloaded_data.go (1)
PreloadedData
(10-16)
frac/sealed/sealing/blocks_builder.go (5)
frac/sealed/seqids/blocks.go (3)
BlockMIDs
(10-12)BlockRIDs
(32-35)BlockParams
(57-59)frac/sealed/token/table_loader.go (2)
TableBlock
(109-111)FieldTable
(113-116)frac/sealed/token/table_entry.go (1)
TableEntry
(4-12)seq/seq.go (3)
ID
(12-15)MID
(17-17)RID
(18-18)seq/doc_pos.go (1)
DocPos
(19-19)
fracmanager/sealer_test.go (5)
frac/common/seal_params.go (1)
SealParams
(3-12)tests/common/util.go (1)
RecreateDir
(49-52)frac/sealed/preloaded_data.go (1)
PreloadedData
(10-16)frac/active_sealing_source.go (1)
NewActiveSealingSource
(45-78)frac/sealed/sealing/sealer.go (1)
Seal
(30-82)
frac/sealed/sealing/index.go (10)
frac/common/seal_params.go (1)
SealParams
(3-12)frac/sealed/token/table.go (1)
Table
(34-34)storage/index_block_header.go (2)
IndexBlockHeader
(19-19)NewIndexBlockHeader
(25-34)frac/sealed/sealing/sealer.go (1)
Source
(20-28)bytespool/writer.go (3)
AcquireWriterSize
(18-30)ReleaseWriter
(32-37)Writer
(13-16)frac/common/info.go (1)
Info
(22-41)consts/consts.go (3)
RegularBlockSize
(16-16)IDsPerBlock
(14-14)LIDBlockCap
(15-15)frac/sealed/block_offsets.go (1)
BlockOffsets
(8-11)frac/sealed/token/table_loader.go (2)
TableBlock
(109-111)TableFromBlocks
(49-71)seq/seq.go (3)
ID
(12-15)MID
(17-17)RID
(18-18)
cmd/distribution/main.go (2)
frac/common/info.go (1)
Info
(22-41)frac/sealed/block_info.go (1)
BlockInfo
(15-17)
frac/sealed.go (6)
frac/common/info.go (1)
Info
(22-41)consts/consts.go (2)
SdocsFileSuffix
(55-55)DocsFileSuffix
(52-52)storage/docs_reader.go (1)
NewDocsReader
(16-21)frac/sealed/preloaded_data.go (1)
PreloadedData
(10-16)frac/index_cache.go (1)
IndexCache
(10-18)frac/sealed/block_info.go (1)
BlockInfo
(15-17)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test
🔇 Additional comments (9)
fracmanager/config.go (1)
10-10
: LGTM: migrate to common.SealParamsImport + type change aligns with the refactor.
cmd/distribution/main.go (1)
14-16
: LGTM: ref targets moved to common.Info and sealed.BlockInfoImport refactor is consistent with the new package layout.
fracmanager/fraction_provider.go (1)
80-81
: Verify allNewRemote
calls use*common.Info
– ensure both bare and qualified invocations (NewRemote(
/frac.NewRemote(
) pass the updated*common.Info
as the cachedInfo argument.fracmanager/proxy_frac.go (2)
54-60
: Constructor LGTMCaching name for logs avoids data races on active.BaseFileName.
80-82
: No action required: Info() signature and callers already updated
Fraction.Info()
infrac/fraction.go
returns*common.Info
, and all downstream interfaces and usages (incl.Source.Info()
) align with the pointer return type.frac/sealed/sealing/index.go (1)
306-316
: Double-check LIDs minTID mutation on continuation.Incrementing minTID when isContinued affects ext2 and PreloadedData.MinTIDs. Verify readers expect this adjusted minTID; otherwise, store original minTID in tables and only adjust ext fields.
Would you like a quick grep script to trace all consumers of LIDsTable.MinTIDs/MaxTIDs and ext parsing to confirm invariants?
frac/sealed/sealing/blocks_builder.go (1)
184-203
: IDs/positions length mismatch can panic.If a producer yields uneven ids/pos lengths,
pos[i]
panics. If the contract doesn’t guarantee equality, add a check; otherwise, document it.I can add a defensive check and propagate bb.lastErr if desired—confirm the intended contract of Source.IDsBlocks().
frac/sealed.go (2)
15-16
: LGTM: migrated to common.Info and sealed.BlockInfoType migration and unpacking via sealed.BlockInfo look consistent across the constructor, accessor, and header loader.
Also applies to: 36-36, 388-390, 403-403, 413-413
73-91
: Guard against nil IndexCache to avoid panics in openIndexopenIndex dereferences f.indexCache.Registry. Please ensure indexCache is never nil at construction, or add a defensive check.
44d9328
to
489a401
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (5)
cmd/seq-db/seq-db.go (1)
258-266
: Validate DocBlockSize before casting; zstd levels are already mapped
- Add a guard before building sconfig: the code casts cfg.DocsSorting.DocBlockSize -> int (cmd/seq-db/seq-db.go, SealParams around lines ~259–266). Ensure dbs > 0 and the cast doesn't overflow (round‑trip check). Example:
dbs := int(cfg.DocsSorting.DocBlockSize)
if dbs <= 0 || int64(dbs) != cfg.DocsSorting.DocBlockSize {
logger.Fatal("invalid DocBlockSize", zap.Int("doc_block_size", dbs))
}- Compression levels: config/config.go provides defaults (DocBlockZstdCompressionLevel default: 3) and the klauspost zstd lib maps integer levels via EncoderLevelFromZstd, so explicit range-clamping is optional — you may add it for clearer user errors.
- Optional: use the import alias (e.g. fraccommon.SealParams) if you’ve adopted it.
frac/remote.go (2)
101-111
: Nil-deref risk: loadHeader() called even when openIndex() failsIf openIndex() returns an error, f.indexFile is nil and loadHeader(f.indexFile, …) can panic.
Apply this guard:
- if err := f.openIndex(); err != nil { + if err := f.openIndex(); err != nil { logger.Error( "cannot open index file: any subsequent operation will fail", zap.String("fraction", filepath.Base(f.BaseFileName)), zap.Error(err), ) - } - - f.info = loadHeader(f.indexFile, f.indexReader) - return f + // keep minimal non-nil info to avoid crashes elsewhere + f.info = &common.Info{Path: baseFile} + return f + } + f.info = loadHeader(f.indexFile, f.indexReader) + return f
133-141
: Avoid calling f.Info().Name() — it can be nil and panicInfo() may be nil (e.g., after index-open failure); use BaseFileName instead to avoid a logging-time panic.
File: frac/remote.go:136
- zap.String("fraction", f.Info().Name()), + zap.String("fraction", filepath.Base(f.BaseFileName)),Add import "path/filepath" if missing.
fracmanager/proxy_frac.go (1)
121-159
: Seal(): missing Done() on error and stale readonly on failure; add cleanup + rollback
- f.sealWg.Add(1) isn’t paired with Done() on all error paths → potential deadlock.
- readonly is set to true and never reset if sealing fails → fraction stuck in “sealing” state.
Apply this refactor to guarantee Done() and rollback:
func (f *proxyFrac) Seal(params common.SealParams) (*frac.Sealed, error) { f.useMu.Lock() if f.isSuicidedState() { f.useMu.Unlock() return nil, ErrSealingFractionSuicided } if !f.isActiveState() { f.useMu.Unlock() return nil, errors.New("sealing fraction is not active") } f.readonly = true active := f.active f.sealWg.Add(1) // It's important to put wg.Add() inside a lock, otherwise we might call wg.Wait() before it f.useMu.Unlock() + defer f.sealWg.Done() + // rollback readonly on any failure + sealedOK := false + defer func() { + if !sealedOK { + f.useMu.Lock() + f.readonly = false + f.useMu.Unlock() + } + }() f.WaitWriteIdle() - src, err := frac.NewActiveSealingSource(active, params) + src, err := frac.NewActiveSealingSource(active, params) if err != nil { return nil, err } - preloaded, err := sealing.Seal(src, params) + preloaded, err := sealing.Seal(src, params) if err != nil { return nil, err } sealed := f.fp.NewSealedPreloaded(active.BaseFileName, preloaded) f.useMu.Lock() f.sealed = sealed f.active = nil f.useMu.Unlock() - f.sealWg.Done() + sealedOK = true active.Release() return sealed, nil }frac/sealed.go (1)
296-321
: Always close open files; current early return can leak handles.
close()
returns early whenisLoaded
is false, leavingindexFile
(and possiblydocsFile
) open. This can block renames/deletes and leak FDs.func (f *Sealed) close(hint string) { f.loadMu.Lock() defer f.loadMu.Unlock() - if !f.isLoaded { - return - } - - if f.docsFile != nil { // docs file may not be opened since it's loaded lazily + if f.docsFile != nil { // docs file may not be opened since it's loaded lazily if err := f.docsFile.Close(); err != nil { logger.Error("can't close docs file", zap.String("frac", f.BaseFileName), zap.String("type", "sealed"), zap.String("hint", hint), zap.Error(err)) } } - if err := f.indexFile.Close(); err != nil { - logger.Error("can't close index file", - zap.String("frac", f.BaseFileName), - zap.String("type", "sealed"), - zap.String("hint", hint), - zap.Error(err)) - } + if f.indexFile != nil { + if err := f.indexFile.Close(); err != nil { + logger.Error("can't close index file", + zap.String("frac", f.BaseFileName), + zap.String("type", "sealed"), + zap.String("hint", hint), + zap.Error(err)) + } + } }
♻️ Duplicate comments (12)
fracmanager/sealer_test.go (2)
121-125
: Return on error in seal() helper to avoid nil deref/panicIf NewActiveSealingSource fails, assert.NoError doesn’t stop execution; sealing.Seal may be called with a nil src. Return the error immediately.
- seal := func(active *frac.Active, params common.SealParams) (*sealed.PreloadedData, error) { - src, err := frac.NewActiveSealingSource(active, params) - assert.NoError(b, err) - return sealing.Seal(src, params) - } + seal := func(active *frac.Active, params common.SealParams) (*sealed.PreloadedData, error) { + src, err := frac.NewActiveSealingSource(active, params) + if err != nil { + return nil, err + } + return sealing.Seal(src, params) + }
148-151
: Replace b.Loop() with the standard testing.B looptesting.B has no Loop(); this won’t compile. Use the canonical for i := 0; i < b.N; i++ pattern and fail fast on error.
- for b.Loop() { - _, err = seal(active, params) - assert.NoError(b, err) - } + b.ResetTimer() + for i := 0; i < b.N; i++ { + if _, err = seal(active, params); err != nil { + b.Fatal(err) + } + }frac/active_sealing_source.go (1)
272-273
: Handle error from deferredFlushReleaseWriter
The deferred
FlushReleaseWriter
can return an error that should be handled. This issue was also flagged in a previous review.Apply this diff to handle the error:
bw := bytespool.AcquireWriterSize(sdocsFile, int(units.MiB)) -defer bytespool.ReleaseWriter(bw) +defer func() { + if err := bytespool.FlushReleaseWriter(bw); err != nil { + logger.Error("failed to flush and release writer", zap.Error(err)) + } +}()Note: The previous comment suggests
FlushReleaseWriter
, but the current code usesReleaseWriter
. Verify which is the correct function to call, and ensure error handling is added if the function returns an error.fracmanager/proxy_frac.go (1)
138-144
: Release resources of ActiveSealingSource on errorsIf sealing fails, src (buffers/readers) may leak. Add Close()/Release() on ActiveSealingSource and call it via defer.
Proposed pattern:
src, err := frac.NewActiveSealingSource(active, params) if err != nil { return nil, err } +defer func() { _ = src.Close() }()
I can add Close() to ActiveSealingSource and wire it here—say the word.
frac/sealed/sealing/blocks_builder_test.go (2)
127-134
: Fix invalid range over int (again)Same issue on block.payload.Len().
- for i := range block.payload.Len() { + for i := 0; i < block.payload.Len(); i++ { tid++ assert.Equal(t, src.tokens[tid-1], block.payload.GetToken(i)) }
30-38
: Fix invalidrange len(...)
loops — replace with index looprange over len(...) is invalid; use for i := 0; i < len(...); i++.
Affected locations:
- frac/sealed/sealing/blocks_builder_test.go:32
- proxy/search/async.go:62
- for i := range len(m.fields) { + for i := 0; i < len(m.fields); i++ { if !yield(m.fields[i], m.fieldMaxTIDs[i]) { return } }frac/sealed/sealing/sealer.go (1)
55-90
: Temp index file not cleaned up on failuresIf WriteIndex/Sync/Stat/Close/Rename fails, the tmp file persists. Add a deferred cleanup that cancels upon successful rename.
// Create temporary index file (will be renamed on success) indexFile, err := os.Create(info.Path + consts.IndexTmpFileSuffix) if err != nil { return nil, err } + // Ensure tmp file is removed on any failure before successful rename + renamed := false + defer func() { + if !renamed { + _ = indexFile.Close() + _ = os.Remove(indexFile.Name()) + } + }() // Create index sealer and write the index structure indexSealer := NewIndexSealer(params) if err := indexSealer.WriteIndex(indexFile, src); err != nil { - return nil, err + return nil, err } // Ensure data is flushed to disk if err := indexFile.Sync(); err != nil { - return nil, err + return nil, err } // Get final file size for metadata stat, err := indexFile.Stat() if err != nil { return nil, err } info.IndexOnDisk = uint64(stat.Size()) // Close file before renaming if err := indexFile.Close(); err != nil { return nil, err } // Atomically rename temporary file to final name if err := os.Rename(indexFile.Name(), info.Path+consts.IndexFileSuffix); err != nil { return nil, err } + renamed = truefrac/sealed/sealing/index.go (4)
383-387
: Use DocsPositionsZstdLevel for positions block.Positions should use the docs/positions compression level, not IDs.
- b := s.newIndexBlockZSTD(s.buf1, s.params.IDsZstdLevel) + b := s.newIndexBlockZSTD(s.buf1, s.params.DocsPositionsZstdLevel)
403-408
: Use LIDsZstdLevel for LIDs block.Apply the LIDs-specific compression level instead of IDs.
- b := s.newIndexBlockZSTD(s.buf1, s.params.IDsZstdLevel) + b := s.newIndexBlockZSTD(s.buf1, s.params.LIDsZstdLevel)
90-90
: Propagate deferred flush/release errors: use named return.Make the method use a named
err
to allow the deferred writer cleanup to surface errors.-func (s *IndexSealer) WriteIndex(ws io.WriteSeeker, src Source) error { +func (s *IndexSealer) WriteIndex(ws io.WriteSeeker, src Source) (err error) {
100-107
: Defer FlushReleaseWriter to avoid silent loss on early returns.If
writeBocks
fails, buffered data can be lost becauseReleaseWriter
doesn’t flush.- bw := bytespool.AcquireWriterSize(ws, int(units.MiB)) // Buffered writer for payload - defer bytespool.ReleaseWriter(bw) + bw := bytespool.AcquireWriterSize(ws, int(units.MiB)) // Buffered writer for payload + defer func() { + if e := bytespool.FlushReleaseWriter(bw); err == nil && e != nil { + err = e + } + }()frac/sealed/sealing/blocks_builder.go (1)
307-313
: Skip empty token batches to avoid bad ranges/underflows.Empty batches produce blocks with minTID > maxTID and can break table entry math.
// Process each batch of tokens for tokens := range tokenBatches { + if len(tokens) == 0 { + continue + } // Initialize new block block.ext.minTID = currentTID + 1
🧹 Nitpick comments (5)
fracmanager/sealer_test.go (1)
109-112
: Double Stop of fraction provider; pick one placeYou Stop via defer and again explicitly. If Stop isn’t idempotent this can misbehave; even if it is, it’s noise. Choose one (early Stop seems intentional).
Option A (prefer early Stop, remove defer):
- defer fp.Stop() + // Stopped explicitly below to keep provider out of the measurement window.Option B (keep defer, remove explicit Stop):
- fp.Stop() + // Provider will be stopped by defer.Please confirm Active remains valid after provider Stop in your implementation.
Also applies to: 119-120
cmd/seq-db/seq-db.go (1)
25-25
: Alias the frac/common import to avoid genericcommon
identifiersMinor readability win and prevents future name clashes if another
common
appears.- "github.com/ozontech/seq-db/frac/common" + fraccommon "github.com/ozontech/seq-db/frac/common"frac/active_sealing_source.go (1)
278-280
: Consider simplifying error handling logicThe error collapse pattern here could be cleaner. If
src.lastErr
is set, the iterator likely stopped early.-if err := util.CollapseErrors([]error{src.lastErr, err}); err != nil { - return err -} +if src.lastErr != nil { + return src.lastErr +} +if err != nil { + return err +}frac/sealed/preloaded_data.go (1)
10-20
: Consider pointer for BlocksData to avoid accidental copiesPreloadedData embeds BlocksData by value. While slices/maps are headers, copying this struct around still risks accidental divergence and makes intent (read‑only) unclear.
Apply this minimal change:
-type PreloadedData struct { - Info *common.Info - BlocksData BlocksData - TokenTable token.Table -} +type PreloadedData struct { + Info *common.Info + BlocksData *BlocksData + TokenTable token.Table +}Update call sites that construct PreloadedData accordingly (pass &BlocksData{...}).
frac/remote.go (1)
55-57
: Unnecessary pointer to sync.RWMutexStoring the lock as a pointer adds indirection and invites nil mistakes. Idiomatic Go embeds the value type.
- loadMu *sync.RWMutex + loadMu sync.RWMutexInitialize usage accordingly: f.loadMu.Lock() (no address-of).
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (47)
bytespool/writer.go
(1 hunks)cmd/distribution/main.go
(4 hunks)cmd/index_analyzer/main.go
(2 hunks)cmd/seq-db/seq-db.go
(2 hunks)consts/consts.go
(1 hunks)frac/active.go
(4 hunks)frac/active_docs_positions.go
(2 hunks)frac/active_index.go
(2 hunks)frac/active_sealer.go
(0 hunks)frac/active_sealer_test.go
(0 hunks)frac/active_sealing_source.go
(1 hunks)frac/common/info.go
(2 hunks)frac/common/seal_params.go
(1 hunks)frac/disk_blocks.go
(0 hunks)frac/disk_blocks_producer.go
(0 hunks)frac/disk_blocks_writer.go
(0 hunks)frac/fraction.go
(2 hunks)frac/remote.go
(7 hunks)frac/seal_stats.go
(0 hunks)frac/sealed.go
(9 hunks)frac/sealed/block_info.go
(2 hunks)frac/sealed/block_offsets.go
(1 hunks)frac/sealed/preloaded_data.go
(1 hunks)frac/sealed/sealing/blocks_builder.go
(1 hunks)frac/sealed/sealing/blocks_builder_test.go
(1 hunks)frac/sealed/sealing/index.go
(1 hunks)frac/sealed/sealing/sealer.go
(1 hunks)frac/sealed/sealing/stats.go
(1 hunks)frac/sealed/token/table.go
(1 hunks)frac/sealed/token/table_entry.go
(1 hunks)frac/sealed/token/table_loader.go
(1 hunks)frac/sealed_index.go
(2 hunks)frac/sealed_loader.go
(4 hunks)fracmanager/async_searcher_test.go
(2 hunks)fracmanager/config.go
(2 hunks)fracmanager/fracmanager.go
(1 hunks)fracmanager/fraction_provider.go
(4 hunks)fracmanager/proxy_frac.go
(7 hunks)fracmanager/sealed_frac_cache.go
(5 hunks)fracmanager/sealed_frac_cache_test.go
(14 hunks)fracmanager/sealer_test.go
(5 hunks)fracmanager/searcher_test.go
(2 hunks)storage/block_former.go
(0 hunks)storage/blocks_stats.go
(0 hunks)storage/blocks_writer.go
(0 hunks)storage/index_block_header.go
(1 hunks)tests/setup/env.go
(5 hunks)
💤 Files with no reviewable changes (9)
- storage/blocks_stats.go
- frac/disk_blocks.go
- frac/active_sealer.go
- frac/seal_stats.go
- storage/block_former.go
- frac/disk_blocks_writer.go
- frac/disk_blocks_producer.go
- frac/active_sealer_test.go
- storage/blocks_writer.go
🚧 Files skipped from review as they are similar to previous changes (23)
- frac/sealed/token/table_loader.go
- frac/sealed/sealing/stats.go
- frac/fraction.go
- frac/sealed/block_offsets.go
- frac/sealed/token/table.go
- fracmanager/fraction_provider.go
- frac/active_index.go
- bytespool/writer.go
- storage/index_block_header.go
- fracmanager/fracmanager.go
- fracmanager/searcher_test.go
- frac/common/seal_params.go
- tests/setup/env.go
- frac/sealed/token/table_entry.go
- frac/sealed_index.go
- consts/consts.go
- frac/active.go
- cmd/index_analyzer/main.go
- frac/sealed/block_info.go
- fracmanager/async_searcher_test.go
- frac/common/info.go
- frac/active_docs_positions.go
- fracmanager/config.go
🧰 Additional context used
🧬 Code graph analysis (15)
frac/sealed/preloaded_data.go (2)
frac/common/info.go (1)
Info
(22-41)frac/sealed/token/table.go (1)
Table
(34-34)
frac/sealed/sealing/blocks_builder_test.go (5)
frac/common/info.go (1)
Info
(22-41)seq/seq.go (3)
ID
(12-15)MID
(17-17)RID
(18-18)seq/doc_pos.go (2)
DocPos
(19-19)PackDocPos
(21-34)frac/sealed/token/table_loader.go (2)
TableBlock
(109-111)FieldTable
(113-116)frac/sealed/token/table_entry.go (1)
TableEntry
(4-12)
fracmanager/sealed_frac_cache.go (1)
frac/common/info.go (1)
Info
(22-41)
frac/active_sealing_source.go (11)
frac/common/seal_params.go (1)
SealParams
(3-12)frac/common/info.go (1)
Info
(22-41)frac/active_lids.go (1)
TokenLIDs
(27-33)seq/doc_pos.go (2)
DocPos
(19-19)PackDocPos
(21-34)storage/docs_reader.go (1)
DocsReader
(11-14)frac/active.go (1)
Active
(32-64)frac/active_token_list.go (1)
TokenList
(60-74)frac/active_docs_positions.go (1)
DocsPositions
(9-13)consts/consts.go (3)
IDsPerBlock
(14-14)SdocsTmpFileSuffix
(56-56)SdocsFileSuffix
(55-55)bytespool/writer.go (3)
AcquireWriterSize
(18-30)ReleaseWriter
(32-37)Writer
(13-16)storage/doc_block.go (1)
CompressDocBlock
(79-88)
frac/sealed_loader.go (3)
frac/sealed/preloaded_data.go (1)
BlocksData
(16-20)frac/common/info.go (1)
Info
(22-41)frac/sealed/block_offsets.go (1)
BlockOffsets
(8-11)
frac/sealed/sealing/blocks_builder.go (5)
frac/sealed/seqids/blocks.go (3)
BlockMIDs
(10-12)BlockRIDs
(32-35)BlockParams
(57-59)frac/sealed/token/table_loader.go (2)
TableBlock
(109-111)FieldTable
(113-116)frac/sealed/token/table_entry.go (1)
TableEntry
(4-12)seq/seq.go (3)
ID
(12-15)MID
(17-17)RID
(18-18)seq/doc_pos.go (1)
DocPos
(19-19)
cmd/distribution/main.go (2)
frac/common/info.go (1)
Info
(22-41)frac/sealed/block_info.go (1)
BlockInfo
(15-17)
frac/sealed/sealing/index.go (10)
frac/common/seal_params.go (1)
SealParams
(3-12)storage/codec.go (3)
Codec
(18-18)CodecNo
(13-13)CodecZSTD
(15-15)storage/index_block_header.go (2)
IndexBlockHeader
(19-19)NewIndexBlockHeader
(25-34)frac/sealed/sealing/sealer.go (1)
Source
(22-30)bytespool/writer.go (3)
AcquireWriterSize
(18-30)ReleaseWriter
(32-37)Writer
(13-16)frac/common/info.go (1)
Info
(22-41)consts/consts.go (3)
RegularBlockSize
(16-16)IDsPerBlock
(14-14)LIDBlockCap
(15-15)frac/sealed/block_offsets.go (1)
BlockOffsets
(8-11)frac/sealed/token/table_loader.go (2)
TableBlock
(109-111)TableFromBlocks
(49-71)seq/seq.go (3)
ID
(12-15)MID
(17-17)RID
(18-18)
frac/remote.go (2)
frac/common/info.go (1)
Info
(22-41)frac/sealed/preloaded_data.go (1)
BlocksData
(16-20)
fracmanager/sealed_frac_cache_test.go (4)
frac/common/info.go (1)
Info
(22-41)tests/common/util.go (3)
GetTestTmpDir
(69-71)RecreateDir
(49-52)RemoveDir
(42-47)consts/consts.go (1)
FracCacheFileSuffix
(65-65)fracmanager/sealed_frac_cache.go (1)
NewSealedFracCache
(32-43)
cmd/seq-db/seq-db.go (1)
frac/common/seal_params.go (1)
SealParams
(3-12)
frac/sealed/sealing/sealer.go (6)
frac/common/info.go (1)
Info
(22-41)frac/common/seal_params.go (1)
SealParams
(3-12)frac/sealed/preloaded_data.go (2)
PreloadedData
(10-14)BlocksData
(16-20)consts/consts.go (2)
IndexTmpFileSuffix
(60-60)IndexFileSuffix
(59-59)frac/sealed/sealing/index.go (1)
NewIndexSealer
(44-48)util/fs.go (1)
MustSyncPath
(15-28)
fracmanager/sealer_test.go (5)
frac/common/seal_params.go (1)
SealParams
(3-12)tests/common/util.go (1)
RecreateDir
(49-52)frac/sealed/preloaded_data.go (1)
PreloadedData
(10-14)frac/active_sealing_source.go (1)
NewActiveSealingSource
(45-78)frac/sealed/sealing/sealer.go (1)
Seal
(46-110)
frac/sealed.go (6)
frac/common/info.go (1)
Info
(22-41)frac/sealed/preloaded_data.go (2)
BlocksData
(16-20)PreloadedData
(10-14)consts/consts.go (2)
SdocsFileSuffix
(55-55)DocsFileSuffix
(52-52)storage/docs_reader.go (1)
NewDocsReader
(16-21)frac/index_cache.go (1)
IndexCache
(10-18)frac/sealed/block_info.go (1)
BlockInfo
(15-17)
fracmanager/proxy_frac.go (5)
frac/active.go (1)
Active
(32-64)frac/common/info.go (1)
Info
(22-41)frac/sealed/sealing/sealer.go (1)
Seal
(46-110)frac/common/seal_params.go (1)
SealParams
(3-12)frac/active_sealing_source.go (1)
NewActiveSealingSource
(45-78)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test
🔇 Additional comments (11)
fracmanager/sealed_frac_cache_test.go (2)
29-29
: Function signature return type changed from*frac.Info
to*common.Info
The function signature has been updated to align with the package refactoring. This is a correct change that reflects the migration of the
Info
type to thecommon
package.
125-139
: Test data structure correctly updated to*common.Info
The test data initialization has been properly migrated to use
common.Info
. All fields are correctly preserved.fracmanager/sealed_frac_cache.go (2)
25-25
: Cache type correctly migrated to*common.Info
The in-memory cache type has been properly updated to use the new
common.Info
type, maintaining consistency with the broader refactoring.
77-77
: Method signatures updated for type migrationBoth
AddFraction
andGetFracInfo
method signatures have been correctly updated to use*common.Info
instead of*frac.Info
, maintaining API consistency.Also applies to: 97-97
frac/active_sealing_source.go (1)
339-371
: Well-structured document block packing implementationThe
docBlocks
function correctly handles document grouping with proper handling of duplicate IDs in nested indices. The position tracking and block size management are implemented correctly.cmd/distribution/main.go (2)
69-71
: Good addition of header validationThe header length check prevents potential panics from corrupted index files.
73-74
: Type migration tosealed.BlockInfo
is correctThe
BlockInfo
type has been properly migrated to thesealed
package as part of the refactoring.frac/sealed_loader.go (3)
24-24
: Method signature updated to use new data structuresThe
Load
method signature has been correctly updated to usesealed.BlocksData
andcommon.Info
, aligning with the new architecture.
81-81
: Type migration tosealed.BlockOffsets
The
BlockOffsets
type has been correctly migrated to thesealed
package.
34-40
: Data population correctly migrated toblocksData
All field assignments have been properly updated from
state
toblocksData
, maintaining the same loading logic with the new data structures.frac/remote.go (1)
192-213
: No nil-safety issue — Loader/Sealer initialize the tablesLoader.Load populates blocksData.IDsTable and blocksData.LIDsTable (see frac/sealed_loader.go: loadIDs / loadLIDsBlocksTable) and the sealing/preload path returns non‑nil tables (sealed/sealing/index.go and sealed.PreloadedData). DataProvider calls load() (or uses preloaded data) before createDataProvider and returns an EmptyDataProvider on load error, so passing &f.blocksData.IDsTable and f.blocksData.LIDsTable is safe.
thanks for the comments in the code, they are really helpful! |
489a401
to
a48a04f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (8)
fracmanager/sealed_frac_cache.go (1)
139-178
: Close/SYNC the temp file before rename; avoid FD leak and improve durability.tmp is never closed; on Windows, renaming an open file fails. Also fsyncing the file (and best‑effort syncing the directory) reduces risk of torn writes. Clean up the temp file on errors.
Apply this diff:
func (fc *sealedFracCache) SaveCacheToDisk(version uint64, content []byte) error { fc.saveMu.Lock() defer fc.saveMu.Unlock() @@ - tmp, err := os.CreateTemp(fc.dataDir, fc.fileName+".") + tmp, err := os.CreateTemp(fc.dataDir, fc.fileName+".") if err != nil { return fmt.Errorf("can't save frac-cache: %w", err) } - - err = tmp.Chmod(defaultFilePermission) - if err != nil { - return fmt.Errorf("can't change frac-cache file permission: %w", err) - } - - if _, err = tmp.Write(content); err != nil { - return fmt.Errorf("can't save frac-cache: %w", err) - } - - if err = os.Rename(tmp.Name(), fc.fullPath); err != nil { - return fmt.Errorf("can't rename tmp to actual frac-cache: %w", err) - } + tmpName := tmp.Name() + // Ensure no orphan temp files on early returns. + defer func() { _ = os.Remove(tmpName) }() + // Ensure descriptor is closed on all paths. + defer func() { _ = tmp.Close() }() + + if err = tmp.Chmod(defaultFilePermission); err != nil { + return fmt.Errorf("can't change frac-cache file permission: %w", err) + } + if _, err = tmp.Write(content); err != nil { + return fmt.Errorf("can't save frac-cache: %w", err) + } + // Flush file contents to disk before rename. + if err = tmp.Sync(); err != nil { + return fmt.Errorf("can't fsync frac-cache temp file: %w", err) + } + // Close before rename for Windows compatibility. + if err = tmp.Close(); err != nil { + return fmt.Errorf("can't close frac-cache temp file: %w", err) + } + if err = os.Rename(tmpName, fc.fullPath); err != nil { + return fmt.Errorf("can't rename tmp to actual frac-cache: %w", err) + } + // Best-effort fsync of the directory to persist the rename. + if dir, derr := os.Open(fc.dataDir); derr == nil { + _ = dir.Sync() + _ = dir.Close() + } @@ fc.savedVersion.Store(version) logger.Info("frac-cache saved to disk", zap.String("filepath", fc.fullPath), zap.Uint64("version", version)) return nil }frac/sealed/block_offsets.go (2)
17-21
: Use unsigned varints for non-negative deltas to avoid overflow/corruptionOffsets are monotonic and non-negative; encoding them with signed varints risks negative values on overflow and wrap in Unpack. Switch to Uvarint on both Pack/Unpack.
Apply:
var prev uint64 for _, pos := range b.Offsets { - buf = binary.AppendVarint(buf, int64(pos-prev)) + buf = binary.AppendUvarint(buf, uint64(pos-prev)) prev = pos }
41-51
: Harden varint decode; switch to Uvarint and validate nCurrent code ignores n from Varint; n==0 will loop forever, n<0 can panic on slicing. Also use Uvarint to match Pack.
Apply:
- delta, n := binary.Varint(data) - if n == 0 { - return errors.New("blocks offset decoding error: varint returned 0") - } - if n < 0 { - return errors.New("blocks offset decoding error: varint overflow") - } - data = data[n:] - offset += uint64(delta) + udelta, n := binary.Uvarint(data) + if n <= 0 { + return errors.New("blocks offset decoding error: uvarint underflow/overflow") + } + data = data[n:] + offset += udelta b.Offsets = append(b.Offsets, offset)cmd/distribution/main.go (2)
62-86
: Close index file handle in loadInfo to prevent FD leaksgetReader opens a file; loadInfo never closes it. Defer the close.
Apply:
func loadInfo(path string) *common.Info { - indexReader, f := getReader(path) + indexReader, f := getReader(path) + defer f.Close()
131-139
: Check Varint decode result to avoid infinite loop/panicn==0 causes an infinite loop; n<0 can slice with negative index. Validate before slicing.
Apply:
- for len(bytes) != 0 { - delta, n := binary.Varint(bytes) - bytes = bytes[n:] + for len(bytes) != 0 { + delta, n := binary.Varint(bytes) + if n == 0 { + logger.Fatal("varint decode underflow", zap.String("file", path), zap.Uint32("block_index", blockIndex)) + } + if n < 0 { + logger.Fatal("varint decode overflow", zap.String("file", path), zap.Uint32("block_index", blockIndex)) + } + bytes = bytes[n:] mid += uint64(delta)frac/active_docs_positions.go (2)
24-36
: Make Get concurrency-safe; simplify GetSyncPublic Get reads the map without a lock. Either document it as unsafe or add RLock and make GetSync a thin delegate.
Apply:
-func (dp *DocsPositions) Get(id seq.ID) seq.DocPos { - if val, ok := dp.idToPos[id]; ok { - return val - } - return seq.DocPosNotFound -} +func (dp *DocsPositions) Get(id seq.ID) seq.DocPos { + dp.mu.RLock() + defer dp.mu.RUnlock() + if val, ok := dp.idToPos[id]; ok { + return val + } + return seq.DocPosNotFound +} func (dp *DocsPositions) GetSync(id seq.ID) seq.DocPos { - dp.mu.RLock() - defer dp.mu.RUnlock() - - return dp.Get(id) + return dp.Get(id) }
38-46
: Guard against length mismatch between ids and posCurrent code will panic on pos[i] if lengths differ.
Apply:
func (dp *DocsPositions) SetMultiple(ids []seq.ID, pos []seq.DocPos) []seq.ID { dp.mu.Lock() defer dp.mu.Unlock() + if len(ids) != len(pos) { + panic("DocsPositions.SetMultiple: ids and pos length mismatch") + }fracmanager/proxy_frac.go (1)
121-146
: Seal error path leaks state: readonly stays true and sealWg never DoneIf NewActiveSealingSource/Seal fails, readonly remains set and sealWg.Done() is not called, blocking Offload/Suicide waits.
Apply defers and revert readonly on failures:
func (f *proxyFrac) Seal(params common.SealParams) (*frac.Sealed, error) { f.useMu.Lock() if f.isSuicidedState() { f.useMu.Unlock() return nil, ErrSealingFractionSuicided } if !f.isActiveState() { f.useMu.Unlock() return nil, errors.New("sealing fraction is not active") } f.readonly = true active := f.active f.sealWg.Add(1) // It's important to put wg.Add() inside a lock, otherwise we might call wg.Wait() before it f.useMu.Unlock() f.WaitWriteIdle() - src, err := frac.NewActiveSealingSource(active, params) - if err != nil { - return nil, err - } + // ensure Done on every exit path + defer f.sealWg.Done() + src, err := frac.NewActiveSealingSource(active, params) + if err != nil { + f.useMu.Lock() + f.readonly = false + f.useMu.Unlock() + return nil, err + } preloaded, err := sealing.Seal(src, params) if err != nil { - return nil, err + f.useMu.Lock() + f.readonly = false + f.useMu.Unlock() + return nil, err }
♻️ Duplicate comments (11)
fracmanager/proxy_frac.go (1)
138-144
: Consider sealing-source cleanup on failureIf src holds resources, add a Close() and call it when Seal fails.
frac/sealed/sealing/blocks_builder.go (2)
279-305
: Skip empty token batches to avoid degenerate blocksEmpty batches yield minTID > maxTID blocks and can confuse downstream table builders.
for tokens := range tokenBatches { + if len(tokens) == 0 { + continue + } idx++
165-222
: Validate capacity and fix LIDs continuation flag semantics
- Guard blockCapacity <= 0 to avoid endless loops.
- isContinued should mark a block that continues a token from the previous block; current code marks the block that ends mid‑token.
func (bb *blocksBuilder) BuildLIDsBlocks(tokenLIDs iter.Seq[[]uint32], blockCapacity int) iter.Seq[lidsSealBlock] { return func(yield func(lidsSealBlock) bool) { - var ( - currentTID uint32 // Current TID being processed - currentBlock lidsSealBlock // Current block under construction - isEndOfToken bool // Flag for end of current token's LIDs - continuesNext bool // Flag for block continuation - ) + if blockCapacity <= 0 { + bb.lastErr = errors.New("sealing: LID block size must be > 0") + return + } + var ( + currentTID uint32 // Current TID being processed + currentBlock lidsSealBlock // Current block under construction + isEndOfToken bool // Flag for end of current token's LIDs + continuedFromPrev bool // This block starts by continuing a previous token + ) @@ - finalizeBlock := func() bool { + finalizeBlock := func() bool { if !isEndOfToken { // Add final offset for current token if not already done currentBlock.payload.Offsets = append(currentBlock.payload.Offsets, uint32(len(currentBlock.payload.LIDs))) } currentBlock.payload.IsLastLID = isEndOfToken // TODO: Remove legacy field - currentBlock.ext.isContinued = continuesNext // TODO: Remove legacy field - continuesNext = !isEndOfToken + currentBlock.ext.isContinued = continuedFromPrev // TODO: Remove legacy field return yield(currentBlock) } @@ - if len(currentBlock.payload.LIDs) == blockCapacity { + if len(currentBlock.payload.LIDs) == blockCapacity { if !finalizeBlock() { return } // Initialize new block - currentBlock.ext.minTID = currentTID + continuedFromPrev = true // next block starts with continuation + currentBlock.ext.minTID = currentTID // same TID continues currentBlock.payload.LIDs = currentBlock.payload.LIDs[:0] currentBlock.payload.Offsets = currentBlock.payload.Offsets[:1] // Reset to initial offset } @@ - isEndOfToken = true + isEndOfToken = true + continuedFromPrev = false } // Yield the final block finalizeBlock() } }storage/index_block_header.go (1)
25-33
: Make on-disk types explicit to prevent truncation/wrapsize/rawSize are int and pos is int64 but header stores uint32/uint64. Use the exact types in the API to avoid silent narrowing and sign wrap.
Apply:
-func NewIndexBlockHeader(pos int64, ext1, ext2 uint64, size, rawSize int, codec Codec) IndexBlockHeader { +func NewIndexBlockHeader(pos uint64, ext1, ext2 uint64, size, rawSize uint32, codec Codec) IndexBlockHeader { header := NewEmptyIndexBlockHeader() header.SetExt1(ext1) header.SetExt2(ext2) - header.SetLen(uint32(size)) - header.SetRawLen(uint32(rawSize)) + header.SetLen(size) + header.SetRawLen(rawSize) header.SetCodec(codec) - header.SetPos(uint64(pos)) + header.SetPos(pos) return header }If changing the signature is not feasible now, at minimum add runtime guards to reject negative pos/oversized lengths instead of casting.
Run to find all call sites to update:
#!/bin/bash ast-grep --pattern $'NewIndexBlockHeader($_pos, $_e1, $_e2, $_size, $_raw, $_codec)'fracmanager/sealer_test.go (2)
121-125
: Don’t assert then continue; return on errorIf NewActiveSealingSource fails, the benchmark continues and may panic. Return early.
- src, err := frac.NewActiveSealingSource(active, params) - assert.NoError(b, err) - return sealing.Seal(src, params) + src, err := frac.NewActiveSealingSource(active, params) + if err != nil { + return nil, err + } + return sealing.Seal(src, params)
148-151
: Replace b.Loop() with standard benchmark looptesting.B has no Loop(); use the canonical pattern and fail fast on error.
- for b.Loop() { - _, err = seal(active, params) - assert.NoError(b, err) - } + b.ResetTimer() + for i := 0; i < b.N; i++ { + if _, err = seal(active, params); err != nil { + b.Fatal(err) + } + }frac/sealed/sealing/index.go (4)
403-406
: Use DocsPositionsZstdLevel for positions compression.Positions should use the docs/positions compression setting, not IDs.
- b := s.newIndexBlockZSTD(s.buf1, s.params.IDsZstdLevel) + b := s.newIndexBlockZSTD(s.buf1, s.params.DocsPositionsZstdLevel)
423-429
: Use LIDsZstdLevel for LIDs compression.The LIDs block currently uses IDs level.
- b := s.newIndexBlockZSTD(s.buf1, s.params.IDsZstdLevel) + b := s.newIndexBlockZSTD(s.buf1, s.params.LIDsZstdLevel)
90-101
: Propagate flush/release errors; use FlushReleaseWriter with a named return.Ensure the buffered writer is always flushed and its error is not lost on early returns.
-func (s *IndexSealer) WriteIndex(ws io.WriteSeeker, src Source) error { +func (s *IndexSealer) WriteIndex(ws io.WriteSeeker, src Source) (err error) { @@ - bw := bytespool.AcquireWriterSize(ws, int(units.MiB)) // Buffered writer for payload - defer bytespool.ReleaseWriter(bw) + bw := bytespool.AcquireWriterSize(ws, int(units.MiB)) // Buffered writer for payload + defer func() { + if e := bytespool.FlushReleaseWriter(bw); err == nil && e != nil { + err = e + } + }()
104-106
: Fix typo and use int64 for file offsets to avoid overflow.Rename writeBocks → writeBlocks and carry offsets as int64 to match on-disk positions.
- if err := s.writeBocks(prefixSize, bw, hw, src); err != nil { + if err := s.writeBlocks(int64(prefixSize), bw, hw, src); err != nil { return err }-func (s *IndexSealer) writeBocks(pos int, payloadWriter, headersWriter io.Writer, src Source) error { +func (s *IndexSealer) writeBlocks(pos int64, payloadWriter, headersWriter io.Writer, src Source) error { @@ - header, payload := block.Bin(int64(pos)) + header, payload := block.Bin(pos) @@ - pos += len(payload) // Advance position for next block + pos += int64(len(payload)) // Advance position for next blockAlso applies to: 150-168
frac/active_sealing_source.go (1)
264-276
: Always flush/release buffered writer and surface its error.Switch to FlushReleaseWriter and propagate its error using a named return. Keeps explicit bw.Flush() before stat/rename but also covers early returns.
-func (src *ActiveSealingSource) SortDocs() error { +func (src *ActiveSealingSource) SortDocs() (err error) { @@ - bw := bytespool.AcquireWriterSize(sdocsFile, int(units.MiB)) - defer bytespool.ReleaseWriter(bw) + bw := bytespool.AcquireWriterSize(sdocsFile, int(units.MiB)) + defer func() { + if e := bytespool.FlushReleaseWriter(bw); err == nil && e != nil { + err = e + } + }()
🧹 Nitpick comments (12)
bytespool/writer.go (1)
32-37
: Make ReleaseWriter idempotent and nil-safe.Double-release or releasing a partially initialized writer will pass nil into Release and may panic. Guard for nils before releasing.
func ReleaseWriter(w *Writer) { - Release(w.Buf) - w.Buf = nil - w.out = nil - writerPool.Put(w) + if w == nil { + return + } + if w.Buf != nil { + Release(w.Buf) + } + w.Buf = nil + w.out = nil + writerPool.Put(w) }fracmanager/sealed_frac_cache_test.go (2)
121-124
: Swap expected/actual in assert.Equal for clearer failures.Testify expects assert.Equal(t, expected, actual). Current order is reversed in these spots.
Apply this diff:
- assert.Equal(t, contents, []byte("{}")) + assert.Equal(t, []byte("{}"), contents) @@ - assert.Equal(t, contents, []byte("{}")) + assert.Equal(t, []byte("{}"), contents) @@ - assert.Equal(t, fracCacheFromDisk, []byte("{}")) + assert.Equal(t, []byte("{}"), fracCacheFromDisk)Also applies to: 153-156, 478-481
146-148
: *Avoid deep-equality on Info across persistence; assert key fields instead.Future additions to Info (timestamps, distribution) can make deep-equal brittle. Compare stable fields.
Example change:
- expected := map[string]*common.Info{"c": newInfo} - assert.Equal(t, expected, m) + got := m["c"] + assert.NotNil(t, got) + assert.Equal(t, newInfo.Name(), got.Name()) + assert.Equal(t, newInfo.Ver, got.Ver) + assert.Equal(t, newInfo.ConstLIDBlockCap, got.ConstLIDBlockCap) @@ - assert.Equal(t, mapFracCache["c"], newInfo) + got = mapFracCache["c"] + assert.NotNil(t, got) + assert.Equal(t, newInfo.Path, got.Path) + assert.Equal(t, newInfo.Ver, got.Ver) + assert.Equal(t, newInfo.ConstLIDBlockCap, got.ConstLIDBlockCap)Also applies to: 196-199
cmd/distribution/main.go (2)
31-35
: Fix log typo: “disribution” → “distribution”Minor polish.
Apply:
- logger.Info(" * disribution data", zap.Int("num", i), zap.Time("bucket", bucket)) + logger.Info(" * distribution data", zap.Int("num", i), zap.Time("bucket", bucket))
75-77
: Fix log typo: “bloc” → “block”Minor polish.
Apply:
- logger.Fatal("can't unpack info bloc of index file", zap.String("file", path), zap.Error(err)) + logger.Fatal("can't unpack info block of index file", zap.String("file", path), zap.Error(err))frac/active_docs_positions.go (1)
47-59
: Clarify duplicate handling semantics (same ID, same position)Today duplicates with the same position are re-appended to lidToPos, creating multiple LIDs for one ID. If uniqueness per ID is desired, skip appending when ok && p == pos[i].
Apply if uniqueness is intended:
- if ok { - if p != pos[i] { - // same ID but different position - // this is a duplicate ID, we can't append it - continue - } - } else { + if ok { + if p != pos[i] { + // same ID but different position — skip + continue + } + // same ID and same position — no-op + continue + } else { dp.idToPos[id] = pos[i] } - dp.lidToPos = append(dp.lidToPos, pos[i]) - appended = append(appended, id) + dp.lidToPos = append(dp.lidToPos, pos[i]) + appended = append(appended, id)frac/remote.go (2)
97-111
: Avoid nil deref when index open failsYou log the openIndex error but still call loadHeader on possibly nil f.indexFile/reader. Return early or guard the call.
Apply:
if err := f.openIndex(); err != nil { logger.Error( "cannot open index file: any subsequent operation will fail", zap.String("fraction", filepath.Base(f.BaseFileName)), zap.Error(err), ) - } - - f.info = loadHeader(f.indexFile, f.indexReader) + return f + } + f.info = loadHeader(f.indexFile, f.indexReader) return f
55-57
: Prefer value RWMutex over pointerUse sync.RWMutex as a value field to avoid extra allocation and simplify init.
- loadMu *sync.RWMutex + loadMu sync.RWMutexAnd in NewRemote:
- loadMu: &sync.RWMutex{}, + loadMu: sync.RWMutex{},Also applies to: 76-78
fracmanager/sealer_test.go (1)
119-121
: Remove duplicate Stop()fp.Stop() is already deferred; calling it again risks double-stop races.
- fp.Stop()
frac/sealed/sealing/blocks_builder.go (1)
76-123
: Optional: coalesce per-field entries within a blockBuildTokenBlocks appends a new FieldTable per slice; merging contiguous ranges for the same field reduces allocations and downstream merges.
Would you like a small helper that appends to the last FieldTable if the field matches and ranges are adjacent?
frac/active_sealing_source.go (2)
320-334
: Use int64 for accumulating on-disk offsets.offset can overflow int on 32‑bit or large files; carry it as int64.
-func (src *ActiveSealingSource) writeDocs(blocks iter.Seq2[[]byte, []seq.DocPos], w io.Writer) ([]uint64, []seq.DocPos, error) { - offset := 0 +func (src *ActiveSealingSource) writeDocs(blocks iter.Seq2[[]byte, []seq.DocPos], w io.Writer) ([]uint64, []seq.DocPos, error) { + var offset int64 @@ - blocksOffsets = append(blocksOffsets, uint64(offset)) + blocksOffsets = append(blocksOffsets, uint64(offset)) @@ - offset += len(buf) + offset += int64(len(buf))
200-214
: Avoid slice aliasing across yields.newLIDs is reused and grown in-place; consumers of TokenLIDs may observe mutated slices. Emit a fresh slice per yield.
-func (src *ActiveSealingSource) TokenLIDs() iter.Seq[[]uint32] { +func (src *ActiveSealingSource) TokenLIDs() iter.Seq[[]uint32] { return func(yield func([]uint32) bool) { - newLIDs := []uint32{} for _, tid := range src.tids { oldLIDs := src.lids[tid].GetLIDs(src.mids, src.rids) - newLIDs = slices.Grow(newLIDs[:0], len(oldLIDs)) - for _, lid := range oldLIDs { - newLIDs = append(newLIDs, src.oldToNewLIDs[lid]) - } - if !yield(newLIDs) { + translated := make([]uint32, len(oldLIDs)) + for i, lid := range oldLIDs { + translated[i] = src.oldToNewLIDs[lid] + } + if !yield(translated) { return } } } }If BuildLIDsBlocks guarantees immediate consumption without retaining slices, feel free to dismiss, but please confirm.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (47)
bytespool/writer.go
(1 hunks)cmd/distribution/main.go
(4 hunks)cmd/index_analyzer/main.go
(2 hunks)cmd/seq-db/seq-db.go
(2 hunks)consts/consts.go
(1 hunks)frac/active.go
(4 hunks)frac/active_docs_positions.go
(2 hunks)frac/active_index.go
(2 hunks)frac/active_sealer.go
(0 hunks)frac/active_sealer_test.go
(0 hunks)frac/active_sealing_source.go
(1 hunks)frac/common/info.go
(2 hunks)frac/common/seal_params.go
(1 hunks)frac/disk_blocks.go
(0 hunks)frac/disk_blocks_producer.go
(0 hunks)frac/disk_blocks_writer.go
(0 hunks)frac/fraction.go
(2 hunks)frac/remote.go
(7 hunks)frac/seal_stats.go
(0 hunks)frac/sealed.go
(9 hunks)frac/sealed/block_info.go
(2 hunks)frac/sealed/block_offsets.go
(1 hunks)frac/sealed/preloaded_data.go
(1 hunks)frac/sealed/sealing/blocks_builder.go
(1 hunks)frac/sealed/sealing/blocks_builder_test.go
(1 hunks)frac/sealed/sealing/index.go
(1 hunks)frac/sealed/sealing/sealer.go
(1 hunks)frac/sealed/sealing/stats.go
(1 hunks)frac/sealed/token/table.go
(1 hunks)frac/sealed/token/table_entry.go
(1 hunks)frac/sealed/token/table_loader.go
(1 hunks)frac/sealed_index.go
(2 hunks)frac/sealed_loader.go
(4 hunks)fracmanager/async_searcher_test.go
(2 hunks)fracmanager/config.go
(2 hunks)fracmanager/fracmanager.go
(1 hunks)fracmanager/fraction_provider.go
(4 hunks)fracmanager/proxy_frac.go
(7 hunks)fracmanager/sealed_frac_cache.go
(5 hunks)fracmanager/sealed_frac_cache_test.go
(14 hunks)fracmanager/sealer_test.go
(5 hunks)fracmanager/searcher_test.go
(2 hunks)storage/block_former.go
(0 hunks)storage/blocks_stats.go
(0 hunks)storage/blocks_writer.go
(0 hunks)storage/index_block_header.go
(1 hunks)tests/setup/env.go
(5 hunks)
💤 Files with no reviewable changes (9)
- frac/disk_blocks_producer.go
- storage/blocks_stats.go
- frac/seal_stats.go
- frac/disk_blocks.go
- storage/blocks_writer.go
- frac/active_sealer_test.go
- storage/block_former.go
- frac/active_sealer.go
- frac/disk_blocks_writer.go
✅ Files skipped from review due to trivial changes (1)
- frac/sealed/token/table_loader.go
🚧 Files skipped from review as they are similar to previous changes (22)
- fracmanager/config.go
- fracmanager/searcher_test.go
- frac/sealed/token/table.go
- frac/common/info.go
- frac/sealed_index.go
- cmd/index_analyzer/main.go
- consts/consts.go
- frac/sealed/token/table_entry.go
- frac/common/seal_params.go
- frac/active_index.go
- frac/sealed/sealing/stats.go
- frac/fraction.go
- fracmanager/fracmanager.go
- fracmanager/fraction_provider.go
- frac/active.go
- frac/sealed/sealing/blocks_builder_test.go
- frac/sealed/block_info.go
- frac/sealed/sealing/sealer.go
- tests/setup/env.go
- cmd/seq-db/seq-db.go
- fracmanager/async_searcher_test.go
- frac/sealed.go
🧰 Additional context used
🧬 Code graph analysis (14)
fracmanager/sealed_frac_cache_test.go (4)
frac/common/info.go (1)
Info
(22-41)tests/common/util.go (3)
GetTestTmpDir
(69-71)RecreateDir
(49-52)RemoveDir
(42-47)consts/consts.go (1)
FracCacheFileSuffix
(65-65)fracmanager/sealed_frac_cache.go (1)
NewSealedFracCache
(32-43)
bytespool/writer.go (1)
bytespool/bytespool.go (1)
Release
(28-30)
storage/index_block_header.go (1)
storage/codec.go (1)
Codec
(18-18)
frac/sealed_loader.go (3)
frac/sealed/preloaded_data.go (1)
BlocksData
(16-20)frac/common/info.go (1)
Info
(22-41)frac/sealed/block_offsets.go (1)
BlockOffsets
(8-11)
fracmanager/sealer_test.go (6)
frac/common/seal_params.go (1)
SealParams
(3-12)tests/common/util.go (1)
RecreateDir
(49-52)frac/active.go (2)
NewActive
(76-118)Active
(32-64)frac/sealed/preloaded_data.go (1)
PreloadedData
(10-14)frac/active_sealing_source.go (1)
NewActiveSealingSource
(45-78)frac/sealed/sealing/sealer.go (1)
Seal
(46-110)
frac/active_docs_positions.go (2)
seq/seq.go (1)
ID
(12-15)seq/doc_pos.go (1)
DocPos
(19-19)
frac/remote.go (3)
frac/common/info.go (1)
Info
(22-41)frac/sealed/preloaded_data.go (1)
BlocksData
(16-20)frac/sealed_loader.go (1)
Loader
(18-22)
fracmanager/sealed_frac_cache.go (1)
frac/common/info.go (1)
Info
(22-41)
cmd/distribution/main.go (2)
frac/common/info.go (1)
Info
(22-41)frac/sealed/block_info.go (1)
BlockInfo
(15-17)
frac/active_sealing_source.go (12)
frac/common/seal_params.go (1)
SealParams
(3-12)frac/common/info.go (1)
Info
(22-41)frac/active_lids.go (1)
TokenLIDs
(27-33)seq/doc_pos.go (2)
DocPos
(19-19)PackDocPos
(21-34)storage/docs_reader.go (1)
DocsReader
(11-14)frac/active.go (1)
Active
(32-64)frac/active_token_list.go (1)
TokenList
(60-74)frac/active_docs_positions.go (1)
DocsPositions
(9-13)consts/consts.go (3)
IDsPerBlock
(14-14)SdocsTmpFileSuffix
(56-56)SdocsFileSuffix
(55-55)bytespool/writer.go (3)
AcquireWriterSize
(18-30)ReleaseWriter
(32-37)Writer
(13-16)util/errors_collapser.go (1)
CollapseErrors
(25-27)storage/doc_block.go (1)
CompressDocBlock
(79-88)
fracmanager/proxy_frac.go (6)
frac/active.go (1)
Active
(32-64)frac/common/info.go (1)
Info
(22-41)frac/sealed/sealing/sealer.go (1)
Seal
(46-110)frac/common/seal_params.go (1)
SealParams
(3-12)frac/sealed.go (1)
Sealed
(31-57)frac/active_sealing_source.go (1)
NewActiveSealingSource
(45-78)
frac/sealed/sealing/index.go (9)
frac/common/seal_params.go (1)
SealParams
(3-12)storage/codec.go (3)
Codec
(18-18)CodecNo
(13-13)CodecZSTD
(15-15)storage/index_block_header.go (2)
IndexBlockHeader
(19-19)NewIndexBlockHeader
(25-34)frac/sealed/sealing/sealer.go (1)
Source
(22-30)bytespool/writer.go (3)
AcquireWriterSize
(18-30)ReleaseWriter
(32-37)Writer
(13-16)frac/sealed/token/table_loader.go (3)
FieldTable
(113-116)TableBlock
(109-111)TableFromBlocks
(49-71)consts/consts.go (3)
RegularBlockSize
(16-16)IDsPerBlock
(14-14)LIDBlockCap
(15-15)frac/sealed/block_offsets.go (1)
BlockOffsets
(8-11)seq/seq.go (3)
ID
(12-15)MID
(17-17)RID
(18-18)
frac/sealed/sealing/blocks_builder.go (5)
frac/sealed/seqids/blocks.go (3)
BlockMIDs
(10-12)BlockRIDs
(32-35)BlockParams
(57-59)frac/sealed/token/table_loader.go (1)
FieldTable
(113-116)frac/sealed/token/table_entry.go (1)
TableEntry
(4-12)seq/seq.go (3)
ID
(12-15)MID
(17-17)RID
(18-18)seq/doc_pos.go (1)
DocPos
(19-19)
frac/sealed/preloaded_data.go (2)
frac/common/info.go (1)
Info
(22-41)frac/sealed/token/table.go (1)
Table
(34-34)
🔇 Additional comments (4)
fracmanager/sealed_frac_cache.go (1)
24-40
: Type migration to common.Info is consistent.Cache map, constructor init, and public method signatures align with frac/common.Info. Tests reflect the change. LGTM.
Also applies to: 76-83, 97-103
frac/sealed_loader.go (1)
24-56
: LGTM: Loader API aligns with sealed.BlocksData/common.InfoSignature and usage look consistent with the refactor; logging is clear.
frac/sealed/preloaded_data.go (1)
10-20
: LGTM: Cohesive grouping via BlocksDataClean separation of preloaded state; exported fields are sensible.
fracmanager/proxy_frac.go (1)
154-158
: Incorrect — keep the explicit f.sealWg.Done()Seal() calls f.sealWg.Add(1) (fracmanager/proxy_frac.go:133) and later calls f.sealWg.Done() (line 154). Offload/Suicide wait on the same WaitGroup (lines 184, 200). There is no defer f.sealWg.Done(), so removing the explicit Done() would leave Wait() blocked.
Likely an incorrect or invalid review comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
cmd/distribution/main.go (2)
62-86
: Close the file returned by getReader to avoid fd leaksloadInfo opens a file and never closes it; on many fracs this can exhaust descriptors.
func loadInfo(path string) *common.Info { - indexReader, f := getReader(path) + indexReader, f := getReader(path) + defer f.Close() result, err := readBlock(indexReader, 0)
88-144
: Close reader, validate varint parsing, and use binary.Uvarint for non‑negative deltasFile: cmd/distribution/main.go Lines: 88-144
Unclosed reader and unchecked Varint parsing can hang/crash; close the returned file, fail on malformed varints (n <= 0), and prefer binary.Uvarint for monotonic non-negative deltas.
-func buildDist(dist *seq.MIDsDistribution, path string, _ *common.Info) { - blocksReader, _ := getReader(path) +func buildDist(dist *seq.MIDsDistribution, path string, _ *common.Info) { + blocksReader, f := getReader(path) + defer f.Close() @@ - for len(bytes) != 0 { - delta, n := binary.Varint(bytes) - bytes = bytes[n:] - mid += uint64(delta) - if cnt > 0 { // skip only first stub ID - dist.Add(seq.MID(mid)) - } - cnt++ - } + for len(bytes) != 0 { + delta, n := binary.Uvarint(bytes) + if n <= 0 { + logger.Fatal("corrupted uvarint while parsing IDs", zap.Uint32("block", blockIndex)) + } + bytes = bytes[n:] + mid += delta + if cnt > 0 { // skip only first stub ID + dist.Add(seq.MID(mid)) + } + cnt++ + }
♻️ Duplicate comments (4)
storage/index_block_header.go (1)
25-33
: Use on-disk types in the constructor to prevent wrap/truncation (same as earlier bot note).
pos int64
→uint64
andsize/rawSize int
→uint32
to match header fields and avoid silent narrowing or negative-to-unsigned wrap.Apply:
-func NewIndexBlockHeader(pos int64, ext1, ext2 uint64, size, rawSize int, codec Codec) IndexBlockHeader { +func NewIndexBlockHeader(pos uint64, ext1, ext2 uint64, size, rawSize uint32, codec Codec) IndexBlockHeader { header := NewEmptyIndexBlockHeader() header.SetExt1(ext1) header.SetExt2(ext2) - header.SetLen(uint32(size)) - header.SetRawLen(uint32(rawSize)) + header.SetLen(size) + header.SetRawLen(rawSize) header.SetCodec(codec) - header.SetPos(uint64(pos)) + header.SetPos(pos) return header }Run to locate call sites to update:
#!/bin/bash rg -n --type=go -C2 '\bNewIndexBlockHeader\('frac/active_sealing_source.go (1)
272-273
: Handle FlushReleaseWriter error.- defer bytespool.ReleaseWriter(bw) + defer func() { + bytespool.ReleaseWriter(bw) + }()Note: The lint error indicates
FlushReleaseWriter
should be called, but the code showsReleaseWriter
. Verify which function should be used.frac/sealed/sealing/blocks_builder_test.go (2)
32-32
: Fix invalid range syntax.The
for i := range len(m.fields)
syntax is invalid in most Go versions. Use a standard indexed loop.- for i := range len(m.fields) { + for i := 0; i < len(m.fields); i++ {
133-133
: Fix invalid range syntax in test loop.Same issue -
range
overint
won't compile.- for i := range block.payload.Len() { + for i := 0; i < block.payload.Len(); i++ {
🧹 Nitpick comments (2)
cmd/distribution/main.go (1)
31-35
: Fix log typo: “disribution” → “distribution”- logger.Info(" * disribution data", zap.Int("num", i), zap.Time("bucket", bucket)) + logger.Info(" * distribution data", zap.Int("num", i), zap.Time("bucket", bucket))frac/active_docs_positions.go (1)
16-21
: Consider the systemID initialization purpose.The code appends a zero entry for
systemID
but lacks documentation explaining why this reservation is needed or howsystemID
relates to the local ID indexing scheme.Add a brief comment explaining the systemID reservation:
func NewSyncDocsPositions() *DocsPositions { dp := DocsPositions{ lidToPos: make([]seq.DocPos, 0), idToPos: make(map[seq.ID]seq.DocPos), } - dp.lidToPos = append(dp.lidToPos, 0) // systemID + dp.lidToPos = append(dp.lidToPos, 0) // systemID reservation at index 0 return &dp }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (47)
bytespool/writer.go
(1 hunks)cmd/distribution/main.go
(4 hunks)cmd/index_analyzer/main.go
(2 hunks)cmd/seq-db/seq-db.go
(2 hunks)consts/consts.go
(1 hunks)frac/active.go
(4 hunks)frac/active_docs_positions.go
(2 hunks)frac/active_index.go
(2 hunks)frac/active_sealer.go
(0 hunks)frac/active_sealer_test.go
(0 hunks)frac/active_sealing_source.go
(1 hunks)frac/common/info.go
(2 hunks)frac/common/seal_params.go
(1 hunks)frac/disk_blocks.go
(0 hunks)frac/disk_blocks_producer.go
(0 hunks)frac/disk_blocks_writer.go
(0 hunks)frac/fraction.go
(2 hunks)frac/remote.go
(7 hunks)frac/seal_stats.go
(0 hunks)frac/sealed.go
(9 hunks)frac/sealed/block_info.go
(2 hunks)frac/sealed/block_offsets.go
(1 hunks)frac/sealed/preloaded_data.go
(1 hunks)frac/sealed/sealing/blocks_builder.go
(1 hunks)frac/sealed/sealing/blocks_builder_test.go
(1 hunks)frac/sealed/sealing/index.go
(1 hunks)frac/sealed/sealing/sealer.go
(1 hunks)frac/sealed/sealing/stats.go
(1 hunks)frac/sealed/token/table.go
(1 hunks)frac/sealed/token/table_entry.go
(1 hunks)frac/sealed/token/table_loader.go
(1 hunks)frac/sealed_index.go
(2 hunks)frac/sealed_loader.go
(4 hunks)fracmanager/async_searcher_test.go
(2 hunks)fracmanager/config.go
(2 hunks)fracmanager/fracmanager.go
(1 hunks)fracmanager/fraction_provider.go
(4 hunks)fracmanager/proxy_frac.go
(7 hunks)fracmanager/sealed_frac_cache.go
(5 hunks)fracmanager/sealed_frac_cache_test.go
(14 hunks)fracmanager/sealer_test.go
(5 hunks)fracmanager/searcher_test.go
(2 hunks)storage/block_former.go
(0 hunks)storage/blocks_stats.go
(0 hunks)storage/blocks_writer.go
(0 hunks)storage/index_block_header.go
(1 hunks)tests/setup/env.go
(5 hunks)
💤 Files with no reviewable changes (9)
- frac/seal_stats.go
- frac/disk_blocks_writer.go
- frac/active_sealer.go
- frac/disk_blocks.go
- frac/active_sealer_test.go
- storage/blocks_stats.go
- storage/blocks_writer.go
- frac/disk_blocks_producer.go
- storage/block_former.go
✅ Files skipped from review due to trivial changes (1)
- frac/sealed/token/table_loader.go
🚧 Files skipped from review as they are similar to previous changes (28)
- fracmanager/fracmanager.go
- consts/consts.go
- frac/remote.go
- fracmanager/searcher_test.go
- fracmanager/config.go
- frac/active_index.go
- frac/sealed_index.go
- bytespool/writer.go
- frac/sealed/preloaded_data.go
- frac/sealed/sealing/sealer.go
- frac/sealed/token/table.go
- frac/sealed/token/table_entry.go
- frac/active.go
- cmd/seq-db/seq-db.go
- fracmanager/sealed_frac_cache_test.go
- frac/sealed/sealing/index.go
- frac/common/info.go
- cmd/index_analyzer/main.go
- fracmanager/async_searcher_test.go
- frac/fraction.go
- tests/setup/env.go
- frac/sealed/sealing/stats.go
- frac/common/seal_params.go
- frac/sealed/block_info.go
- fracmanager/sealed_frac_cache.go
- fracmanager/proxy_frac.go
- frac/sealed.go
- fracmanager/sealer_test.go
🧰 Additional context used
🧬 Code graph analysis (8)
frac/sealed_loader.go (4)
frac/sealed/preloaded_data.go (1)
BlocksData
(16-20)frac/common/info.go (1)
Info
(22-41)storage/index_reader.go (1)
IndexReader
(13-20)frac/sealed/block_offsets.go (1)
BlockOffsets
(8-11)
storage/index_block_header.go (1)
storage/codec.go (1)
Codec
(18-18)
fracmanager/fraction_provider.go (3)
frac/sealed.go (3)
NewSealed
(67-98)Sealed
(31-57)NewSealedPreloaded
(129-172)frac/common/info.go (1)
Info
(22-41)frac/sealed/preloaded_data.go (1)
PreloadedData
(10-14)
frac/sealed/sealing/blocks_builder_test.go (5)
frac/common/info.go (1)
Info
(22-41)seq/seq.go (3)
ID
(12-15)MID
(17-17)RID
(18-18)seq/doc_pos.go (2)
DocPos
(19-19)PackDocPos
(21-34)frac/sealed/token/table_loader.go (2)
FieldTable
(113-116)TableBlock
(109-111)frac/sealed/token/table_entry.go (1)
TableEntry
(4-12)
frac/active_sealing_source.go (14)
frac/common/seal_params.go (1)
SealParams
(3-12)frac/common/info.go (1)
Info
(22-41)frac/active_lids.go (1)
TokenLIDs
(27-33)seq/doc_pos.go (2)
DocPos
(19-19)PackDocPos
(21-34)storage/docs_reader.go (1)
DocsReader
(11-14)frac/active.go (1)
Active
(32-64)frac/active_token_list.go (1)
TokenList
(60-74)frac/active_docs_positions.go (1)
DocsPositions
(9-13)fracmanager/config.go (1)
Config
(15-34)seq/seq.go (3)
ID
(12-15)MID
(17-17)RID
(18-18)consts/consts.go (3)
IDsPerBlock
(14-14)SdocsTmpFileSuffix
(56-56)SdocsFileSuffix
(55-55)bytespool/writer.go (3)
AcquireWriterSize
(18-30)ReleaseWriter
(32-37)Writer
(13-16)util/errors_collapser.go (1)
CollapseErrors
(25-27)storage/doc_block.go (1)
CompressDocBlock
(79-88)
frac/active_docs_positions.go (2)
seq/seq.go (1)
ID
(12-15)seq/doc_pos.go (1)
DocPos
(19-19)
cmd/distribution/main.go (2)
frac/common/info.go (1)
Info
(22-41)frac/sealed/block_info.go (1)
BlockInfo
(15-17)
frac/sealed/sealing/blocks_builder.go (5)
frac/sealed/seqids/blocks.go (3)
BlockMIDs
(10-12)BlockRIDs
(32-35)BlockParams
(57-59)frac/sealed/token/table_loader.go (1)
FieldTable
(113-116)frac/sealed/token/table_entry.go (1)
TableEntry
(4-12)seq/seq.go (3)
ID
(12-15)MID
(17-17)RID
(18-18)seq/doc_pos.go (1)
DocPos
(19-19)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test
🔇 Additional comments (17)
fracmanager/fraction_provider.go (4)
10-11
: Imports moved to common/sealed look correctSwitch to common.Info and sealed.PreloadedData aligns with the refactor and upstream types. LGTM.
68-77
: NewSealedPreloaded: correct switch to sealed.PreloadedDataMatches frac.NewSealedPreloaded(base, preloaded, rl, indexCache, docsCache, cfg). LGTM.
57-66
: Approve: NewSealed signature change — no stale call sites. Arg order matches frac.NewSealed(base, rl, indexCache, docsCache, info, cfg); rg found only the definition and the local return in fracmanager/fraction_provider.go — no other call sites.
80-92
: NewRemote signature matches current definition.
The call in fracmanager/fraction_provider.go passes arguments that match frac.NewRemote(ctx, baseFile string, readLimiter *storage.ReadLimiter, indexCache *IndexCache, docsCache *cache.Cache[[]byte], info *common.Info, config *Config, s3cli *s3.Client).cmd/distribution/main.go (1)
14-16
: Imports updated to common/sealed are consistentAligns with moved Info and BlockInfo types. LGTM.
frac/sealed/block_offsets.go (1)
1-1
: Package reorganization looks clean.The move from
package frac
topackage sealed
aligns well with the broader refactoring to separate sealing logic.frac/sealed_loader.go (4)
8-9
: Import additions support the refactoring.The new imports for
frac/common
andfrac/sealed
enable the updated type usage in the Load method.
24-24
: Method signature change aligns with new data structures.The transition from
sealedState
andInfo
tosealed.BlocksData
andcommon.Info
is consistent with the broader refactoring.
81-81
: Type usage updated correctly.The change from
BlockOffsets{}
tosealed.BlockOffsets{}
is consistent with the package reorganization.
34-40
: Confirm removal of sealedState references
I didn’t find any occurrences ofsealedState
or the oldLoad(...)
signature in the Go codebase; please manually verify all call sites now pass the updatedblocksData
type.frac/sealed/sealing/blocks_builder.go (2)
76-123
: Solid lazy block construction with error handling.The pull iterator pattern and field alignment logic are well-implemented. Proper error accumulation prevents silent failures.
271-305
: Token block encoding handles empty batches.The implementation correctly skips empty token batches and uses little-endian encoding for token lengths. Block indexing starts appropriately at 1.
frac/active_sealing_source.go (2)
45-78
: Well-designed sealing source initialization.The constructor properly handles field sorting, LID mapping, and conditional document sorting. The
prepareInfo()
call ensures metadata consistency.
164-194
: Resolve: system ID handling is correctmids[0]/rids[0] is the system sentinel (systemSeqID in frac/active.go); lid 0 is reserved for it (frac/active_docs_positions.go:20); the sealing consumer already checks for systemSeqID and clears curDoc (frac/active_sealing_source.go:231–235). No change required.
frac/active_docs_positions.go (2)
10-12
: LGTM! Clean dual-container refactor.The new two-container approach separating global ID lookups (
idToPos
) from sequential local ID tracking (lidToPos
) maintains the existing API while enabling efficient access patterns for both use cases.
47-55
: Logic is correct - duplicates are handled as documented.The behavior is intentional and consistent with the caller's expectations. When
SetMultiple
returns fewer IDs than provided, the caller inactive_indexer.go
correctly interprets this as duplicates and handles them appropriately. The asymmetric handling makes sense:
- Skip conflicting positions (different ID, different position)
- Process matching duplicates (same ID, same position) since they're idempotent
This prevents data corruption while allowing retry scenarios.
frac/sealed/sealing/blocks_builder_test.go (1)
390-391
: Good practice: Cloning to prevent aliasing.The explicit cloning of
LIDs
andOffsets
slices prevents test data corruption from shared references. This is particularly important when comparing expected vs actual results.
a48a04f
to
f667f4f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
cmd/index_analyzer/main.go (3)
77-83
: Close the opened index fileDefer closing f to avoid leaking FDs when processing multiple inputs.
f, err := os.Open(path) if err != nil { panic(err) } +defer func() { + if err := f.Close(); err != nil { + logger.Error("error closing file", zap.String("file", f.Name()), zap.Error(err)) + } +}() indexReader := storage.NewIndexReader(reader, f.Name(), f, cache.Registry)
112-114
: Fix invalid range over int (compile-time error)range over block.Len() is invalid; use a counted for.
- for i := range block.Len() { - tokens = append(tokens, block.GetToken(i)) - } + for i := 0; i < block.Len(); i++ { + tokens = append(tokens, block.GetToken(i)) + }
182-192
: getLIDsHash always returns zeroed hash — fix Sum usagehasher.Sum(res[:0]) return is ignored; res stays zeroed. Copy the bytes returned by Sum.
func getLIDsHash(tokenLIDs []uint32) [16]byte { hasher := fnv.New128a() buf := make([]byte, 4) for _, l := range tokenLIDs { binary.LittleEndian.PutUint32(buf, l) hasher.Write(buf) } - var res [16]byte - hasher.Sum(res[:0]) - return res + sum := hasher.Sum(nil) + var res [16]byte + copy(res[:], sum) + return res }fracmanager/proxy_frac.go (1)
121-159
: Fix WaitGroup leak and readonly stuck-on error pathsIf NewActiveSealingSource or sealing.Seal fails, sealWg.Done() isn’t called and readonly remains true, causing deadlocks and a wedged fraction.
Apply this diff:
func (f *proxyFrac) Seal(params common.SealParams) (*frac.Sealed, error) { f.useMu.Lock() @@ f.readonly = true active := f.active f.sealWg.Add(1) // It's important to put wg.Add() inside a lock, otherwise we might call wg.Wait() before it f.useMu.Unlock() + // Always mark sealing finished on any return path. + defer f.sealWg.Done() @@ - src, err := frac.NewActiveSealingSource(active, params) - if err != nil { - return nil, err - } + src, err := frac.NewActiveSealingSource(active, params) + if err != nil { + // revert readonly on failure + f.useMu.Lock() + f.readonly = false + f.useMu.Unlock() + return nil, err + } preloaded, err := sealing.Seal(src, params) if err != nil { - return nil, err + // cleanup + revert readonly on failure (also see comment about src.Close below) + f.useMu.Lock() + f.readonly = false + f.useMu.Unlock() + return nil, err } @@ - f.sealWg.Done() - active.Release()
♻️ Duplicate comments (4)
fracmanager/proxy_frac.go (1)
138-145
: Ensure sealing source is cleaned up on failureActiveSealingSource holds resources; it isn’t closed when Seal fails.
- Implement Close() on ActiveSealingSource to release buffers/files.
- Invoke it on error paths in Seal().
Example call-site tweak (after creating src):
- preloaded, err := sealing.Seal(src, params) + preloaded, err := sealing.Seal(src, params) if err != nil { + _ = src.Close() // no-op until implemented f.useMu.Lock() f.readonly = false f.useMu.Unlock() return nil, err }I can draft Close() for ActiveSealingSource if you want.
frac/sealed/sealing/blocks_builder.go (3)
276-286
: Skip empty token batches to prevent underflow and bogus empty blocksEmpty batches currently produce blocks with minTID > maxTID, causing firstIndex underflow later.
- for tokens := range tokenBatches { - idx++ + for tokens := range tokenBatches { + if len(tokens) == 0 { + continue + } + idx++
163-169
: Validate LID block capacity > 0Avoid infinite loops/empty pushes and undefined behavior when capacity <= 0.
func (bb *blocksBuilder) BuildLIDsBlocks(tokenLIDs iter.Seq[[]uint32], blockCapacity int) iter.Seq[lidsSealBlock] { return func(yield func(lidsSealBlock) bool) { + if blockCapacity <= 0 { + bb.lastErr = errors.New("sealing: LID block capacity must be > 0") + return + } var ( currentTID uint32 // Current TID being processed currentBlock lidsSealBlock // Current block under construction - isEndOfToken bool // Flag for end of current token's LIDs - continuesNext bool // Flag for block continuation + isEndOfToken bool // Flag for end of current token's LIDs )
178-186
: Fix LID continuation flag (off-by-one)isContinued should reflect whether THIS block continues into the next.
finalizeBlock := func() bool { if !isEndOfToken { // Add final offset for current token if not already done currentBlock.payload.Offsets = append(currentBlock.payload.Offsets, uint32(len(currentBlock.payload.LIDs))) } - currentBlock.payload.IsLastLID = isEndOfToken // TODO: Remove legacy field - currentBlock.ext.isContinued = continuesNext // TODO: Remove legacy field - continuesNext = !isEndOfToken + currentBlock.payload.IsLastLID = isEndOfToken // TODO: Remove legacy field + currentBlock.ext.isContinued = !isEndOfToken // continues into the next block return yield(currentBlock) }
🧹 Nitpick comments (8)
cmd/index_analyzer/main.go (2)
94-97
: Handled Unpack error — add file context to the logGood fix. Add file to the fatal for symmetry with read errors.
- if err := b.Unpack(readBlock()); err != nil { - logger.Fatal("error unpacking block info", zap.Error(err)) - } + if err := b.Unpack(readBlock()); err != nil { + logger.Fatal("error unpacking block info", zap.String("file", f.Name()), zap.Error(err)) + }
149-158
: Reuse lids.UnpackBuffer to reduce allocationsAllocate the buffer once and reuse per block.
lidsLens := make([]int, len(tokens)) tokenLIDs := []uint32{} + ubuf := &lids.UnpackBuffer{} for { data := readBlock() if len(data) == 0 { // empty block - is section separator break } block := &lids.Block{} - if err := block.Unpack(data, &lids.UnpackBuffer{}); err != nil { + if err := block.Unpack(data, ubuf); err != nil { logger.Fatal("error unpacking lids block", zap.Error(err)) }frac/sealed.go (1)
296-321
: Close files even when not fully loadedclose() returns early if isLoaded is false, leaking file descriptors opened by openDocs/openIndex in other paths (e.g., Offload then Suicide).
Apply this diff:
func (f *Sealed) close(hint string) { f.loadMu.Lock() defer f.loadMu.Unlock() - - if !f.isLoaded { - return - } - - if f.docsFile != nil { // docs file may not be opened since it's loaded lazily + if f.docsFile != nil { // docs file may not be opened since it's loaded lazily if err := f.docsFile.Close(); err != nil { logger.Error("can't close docs file", zap.String("frac", f.BaseFileName), zap.String("type", "sealed"), zap.String("hint", hint), zap.Error(err)) } } - if err := f.indexFile.Close(); err != nil { - logger.Error("can't close index file", - zap.String("frac", f.BaseFileName), - zap.String("type", "sealed"), - zap.String("hint", hint), - zap.Error(err)) - } + if f.indexFile != nil { + if err := f.indexFile.Close(); err != nil { + logger.Error("can't close index file", + zap.String("frac", f.BaseFileName), + zap.String("type", "sealed"), + zap.String("hint", hint), + zap.Error(err)) + } + } }frac/active_docs_positions.go (1)
45-55
: Consider extracting duplicate ID handling logic.The nested conditionals make this harder to follow than necessary. Also, the comment on line 50 could be clearer about what "duplicate ID" means in this context.
for i, id := range ids { - p, ok := dp.idToPos[id] - - if ok { - if p != pos[i] { - // same ID but different position - // this is a duplicate ID, we can't append it - continue - } - } else { - dp.idToPos[id] = pos[i] + if existingPos, exists := dp.idToPos[id]; exists && existingPos != pos[i] { + // Skip IDs that already exist with a different position (data integrity check) + continue } + dp.idToPos[id] = pos[i]frac/sealed/sealing/blocks_builder.go (4)
92-96
: Defensive: skip zero-token blocksEven with the upstream guard, keep this as a safety net.
- for idx, block := range createTokensSealBlocks(tokenBatches) { + for idx, block := range createTokensSealBlocks(tokenBatches) { + // skip empty blocks defensively + if block.ext.minTID > block.ext.maxTID { + continue + } table := []token.FieldTable{}
216-218
: Don’t emit an empty LIDs blockNothing consumed → no block.
- // Yield the final block - finalizeBlock() + // Yield the final block only if anything was processed + if currentTID != 0 { + finalizeBlock() + }
243-248
: Guard against ids/positions length mismatchPrevents OOB on positions[i].
- // Convert each ID and position to storage format - for i, id := range ids { + // Convert each ID and position to storage format + if len(positions) != len(ids) { + panic("sealing: ids/positions length mismatch") + } + for i, id := range ids {
92-114
: Reduce FieldTable duplication per blockCurrent code emits many FieldTable entries for the same field. Aggregate entries per field to cut allocations and simplify downstream.
Example sketch:
- table := []token.FieldTable{} + ftMap := map[string][]*token.TableEntry{} // Process all TIDs in current block (from currentTID to block.ext.maxTID) for currentTID <= block.ext.maxTID { ... - table = append(table, token.FieldTable{Field: fieldName, Entries: []*token.TableEntry{entry}}) + ftMap[fieldName] = append(ftMap[fieldName], entry) currentTID += entry.ValCount } + table := make([]token.FieldTable, 0, len(ftMap)) + for name, entries := range ftMap { + table = append(table, token.FieldTable{Field: name, Entries: entries}) + }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (47)
bytespool/writer.go
(1 hunks)cmd/distribution/main.go
(4 hunks)cmd/index_analyzer/main.go
(2 hunks)cmd/seq-db/seq-db.go
(2 hunks)consts/consts.go
(1 hunks)frac/active.go
(4 hunks)frac/active_docs_positions.go
(2 hunks)frac/active_index.go
(2 hunks)frac/active_sealer.go
(0 hunks)frac/active_sealer_test.go
(0 hunks)frac/active_sealing_source.go
(1 hunks)frac/common/info.go
(2 hunks)frac/common/seal_params.go
(1 hunks)frac/disk_blocks.go
(0 hunks)frac/disk_blocks_producer.go
(0 hunks)frac/disk_blocks_writer.go
(0 hunks)frac/fraction.go
(2 hunks)frac/remote.go
(7 hunks)frac/seal_stats.go
(0 hunks)frac/sealed.go
(9 hunks)frac/sealed/block_info.go
(2 hunks)frac/sealed/block_offsets.go
(1 hunks)frac/sealed/preloaded_data.go
(1 hunks)frac/sealed/sealing/blocks_builder.go
(1 hunks)frac/sealed/sealing/blocks_builder_test.go
(1 hunks)frac/sealed/sealing/index.go
(1 hunks)frac/sealed/sealing/sealer.go
(1 hunks)frac/sealed/sealing/stats.go
(1 hunks)frac/sealed/token/table.go
(1 hunks)frac/sealed/token/table_entry.go
(1 hunks)frac/sealed/token/table_loader.go
(1 hunks)frac/sealed_index.go
(2 hunks)frac/sealed_loader.go
(4 hunks)fracmanager/async_searcher_test.go
(2 hunks)fracmanager/config.go
(2 hunks)fracmanager/fracmanager.go
(1 hunks)fracmanager/fraction_provider.go
(4 hunks)fracmanager/proxy_frac.go
(7 hunks)fracmanager/sealed_frac_cache.go
(5 hunks)fracmanager/sealed_frac_cache_test.go
(14 hunks)fracmanager/sealer_test.go
(5 hunks)fracmanager/searcher_test.go
(2 hunks)storage/block_former.go
(0 hunks)storage/blocks_stats.go
(0 hunks)storage/blocks_writer.go
(0 hunks)storage/index_block_header.go
(1 hunks)tests/setup/env.go
(5 hunks)
💤 Files with no reviewable changes (9)
- frac/seal_stats.go
- frac/disk_blocks.go
- frac/active_sealer_test.go
- storage/blocks_stats.go
- storage/block_former.go
- frac/active_sealer.go
- storage/blocks_writer.go
- frac/disk_blocks_producer.go
- frac/disk_blocks_writer.go
✅ Files skipped from review due to trivial changes (2)
- frac/sealed/token/table_loader.go
- frac/sealed/token/table.go
🚧 Files skipped from review as they are similar to previous changes (27)
- frac/active_index.go
- frac/sealed/sealing/stats.go
- frac/active.go
- frac/common/seal_params.go
- fracmanager/config.go
- fracmanager/fracmanager.go
- frac/sealed/token/table_entry.go
- frac/common/info.go
- consts/consts.go
- frac/sealed_loader.go
- fracmanager/searcher_test.go
- frac/sealed/block_info.go
- cmd/seq-db/seq-db.go
- frac/sealed/block_offsets.go
- frac/sealed/preloaded_data.go
- frac/sealed_index.go
- frac/sealed/sealing/index.go
- storage/index_block_header.go
- tests/setup/env.go
- frac/sealed/sealing/blocks_builder_test.go
- frac/active_sealing_source.go
- fracmanager/async_searcher_test.go
- cmd/distribution/main.go
- frac/sealed/sealing/sealer.go
- bytespool/writer.go
- frac/fraction.go
- fracmanager/sealer_test.go
🧰 Additional context used
🧬 Code graph analysis (9)
fracmanager/sealed_frac_cache.go (1)
frac/common/info.go (1)
Info
(22-41)
frac/sealed/sealing/blocks_builder.go (5)
frac/sealed/seqids/blocks.go (3)
BlockMIDs
(10-12)BlockRIDs
(32-35)BlockParams
(57-59)frac/sealed/token/table_loader.go (1)
FieldTable
(113-116)frac/sealed/token/table_entry.go (1)
TableEntry
(4-12)seq/seq.go (3)
ID
(12-15)MID
(17-17)RID
(18-18)seq/doc_pos.go (1)
DocPos
(19-19)
frac/active_docs_positions.go (2)
seq/seq.go (1)
ID
(12-15)seq/doc_pos.go (1)
DocPos
(19-19)
cmd/index_analyzer/main.go (2)
frac/sealed/block_info.go (1)
BlockInfo
(15-17)logger/logger.go (2)
Fatal
(82-84)Error
(74-76)
fracmanager/fraction_provider.go (3)
frac/sealed.go (3)
NewSealed
(67-98)Sealed
(31-57)NewSealedPreloaded
(129-172)frac/common/info.go (1)
Info
(22-41)frac/sealed/preloaded_data.go (1)
PreloadedData
(10-14)
frac/sealed.go (6)
frac/common/info.go (1)
Info
(22-41)frac/sealed/preloaded_data.go (2)
BlocksData
(16-20)PreloadedData
(10-14)consts/consts.go (2)
SdocsFileSuffix
(55-55)DocsFileSuffix
(52-52)storage/docs_reader.go (1)
NewDocsReader
(16-21)frac/index_cache.go (1)
IndexCache
(10-18)frac/sealed/block_info.go (1)
BlockInfo
(15-17)
fracmanager/sealed_frac_cache_test.go (4)
frac/common/info.go (1)
Info
(22-41)tests/common/util.go (3)
GetTestTmpDir
(69-71)RecreateDir
(49-52)RemoveDir
(42-47)consts/consts.go (1)
FracCacheFileSuffix
(65-65)fracmanager/sealed_frac_cache.go (1)
NewSealedFracCache
(32-43)
frac/remote.go (2)
frac/common/info.go (1)
Info
(22-41)frac/sealed/preloaded_data.go (1)
BlocksData
(16-20)
fracmanager/proxy_frac.go (6)
frac/active.go (1)
Active
(32-64)frac/common/info.go (1)
Info
(22-41)frac/sealed/sealing/sealer.go (1)
Seal
(46-110)frac/common/seal_params.go (1)
SealParams
(3-12)frac/sealed.go (1)
Sealed
(31-57)frac/active_sealing_source.go (1)
NewActiveSealingSource
(45-78)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test
🔇 Additional comments (28)
cmd/index_analyzer/main.go (2)
13-13
: Sealed import path refactor — OKImport switch to frac/sealed is correct and consistent with downstream usage.
124-127
: Incorrect — TableBlock.Unpack has no error returnTableBlock.Unpack is declared as
func (b *TableBlock) Unpack(data []byte)
in frac/sealed/token/table_loader.go, so the call in cmd/index_analyzer/main.go (around lines 124–127) is correct and no error handling change is required.Likely an incorrect or invalid review comment.
fracmanager/sealed_frac_cache_test.go (9)
16-18
: LGTM! Clean import refactoring.The migration from
frac.Info
tocommon.Info
and the test utilities import alias are properly implemented.
29-42
: Function signature and implementation correctly updated.The return type change from
map[string]*frac.Info
tomap[string]*common.Info
is consistent with the refactoring.
125-139
: LGTM! Test data structure properly migrated.The
newInfo
variable correctly usescommon.Info
with all required fields.
146-146
: Expected map type correctly updated.The expected map now uses
*common.Info
consistently.
170-184
: Test data structure properly migrated.Another
newInfo
instance correctly usescommon.Info
.
237-237
: Expected map type updated.Consistent use of
map[string]*common.Info
throughout tests.
287-287
: Info map type correctly migrated.The
infos
map now uses*common.Info
as expected.
369-369
: Info map type updated.Consistent migration to
*common.Info
in test scenarios.
51-54
: Test utilities properly migrated.All test directory operations now correctly use
tests_common
functions.fracmanager/sealed_frac_cache.go (5)
13-13
: LGTM! Import path updated correctly.The import now references
frac/common
to use the refactoredcommon.Info
type.
25-26
: Cache type properly migrated to common.Info.The
fracCache
field now correctly stores*common.Info
instead of*frac.Info
, aligning with the refactoring.
34-39
: Initialization updated with new fields.The cache is correctly initialized with
*common.Info
type, and new fieldsdataDir
andversion
are properly set.
77-77
: AddFraction signature correctly updated.The method now accepts
*common.Info
parameter, consistent with the refactoring.
97-97
: GetFracInfo return type properly migrated.The method now returns
*common.Info
, maintaining consistency.fracmanager/fraction_provider.go (3)
57-66
: API migration to common.Info looks correctSignature change and wiring remain consistent with frac.NewSealed.
68-77
: Preloaded path wiring LGTMUses sealed.PreloadedData and passes caches/limiter in the right order.
79-92
: *Verify no remaining frac.Info referencesConstructor signature and argument flow look correct.
Search returned no matches; cannot confirm absence. Re-run locally to confirm no pointer or non-pointer references:
rg -nP --type=go '*frac.Info\b' -C2
rg -nP --type=go '\bfrac.Info\b' -C2frac/remote.go (3)
55-58
: State refactor to BlocksDataNew fields (loadMu/isLoaded/blocksData) align with the new loader path.
192-213
: createDataProvider migrated to BlocksDataAll tables/offsets sourced from f.blocksData; consistent with sealed.Sealed.
232-236
: Loader invocation updatedLoader now populates BlocksData; subsequent isLoaded set is correct.
fracmanager/proxy_frac.go (2)
54-60
: Cache fraction name for logsConstructor and field addition are straightforward and helpful for logging.
115-119
: Minor log polishSwitched to cached name for consistency; good.
frac/sealed.go (1)
115-123
: Docs open fallback messages are correct nowAccurate “sdocs” vs “docs” error messages. Nice.
frac/active_docs_positions.go (2)
10-12
: Good refactoring of the data structure.The separation of global ID mapping (
idToPos
) from local ID tracking (lidToPos
) aligns well with the new sealing pipeline architecture.
17-20
: Clarify or remove reserved lidToPos[0] (systemID)lidToPos is pre-seeded with 0 labeled "systemID" while idToPos has no corresponding entry — clarify intent or remove the sentinel.
- If index 0 is intentional: add a short comment documenting the invariant and either add an explicit mapping (if a systemID constant exists) or explain why idToPos omits it.
- If unused: remove the dp.lidToPos = append(dp.lidToPos, 0) line.
Location: frac/active_docs_positions.go (init, ~lines 17–21)
frac/sealed/sealing/blocks_builder.go (1)
233-256
: Ephemeral slice reuse across yields — verify no retention by consumersBlocks reuse underlying slices between yields. Ensure all consumers serialize/copy within the yield and do not retain references past the next yield; otherwise previously yielded data may be corrupted. If retention exists, we must deep-copy before yielding.
Also applies to: 163-219, 269-302
fa1313b
to
c5370b7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
cmd/distribution/main.go (1)
134-142
: Guard binary.Varint decode to avoid panic/loop on corrupt datan<=0 from binary.Varint can panic (bytes[n:]) or loop forever; negative deltas will wrap via uint64. Add checks.
Apply:
- delta, n := binary.Varint(bytes) - bytes = bytes[n:] - mid += uint64(delta) + delta, n := binary.Varint(bytes) + if n <= 0 { + logger.Fatal("invalid varint in IDs block", + zap.Int("n", n), zap.String("file", path), zap.Uint32("block_index", blockIndex)) + } + if delta < 0 { + logger.Fatal("negative delta in IDs block", + zap.Int64("delta", delta), zap.String("file", path), zap.Uint32("block_index", blockIndex)) + } + bytes = bytes[n:] + mid += uint64(delta)fracmanager/sealed_frac_cache.go (1)
151-171
: Close/sync tmp file and clean up on rename failureSaveCacheToDisk doesn’t close the temp file, doesn’t fsync, and leaves a stray tmp file if rename fails (Windows may also fail to rename open files).
Apply:
tmp, err := os.CreateTemp(fc.dataDir, fc.fileName+".") if err != nil { return fmt.Errorf("can't save frac-cache: %w", err) } err = tmp.Chmod(defaultFilePermission) if err != nil { return fmt.Errorf("can't change frac-cache file permission: %w", err) } if _, err = tmp.Write(content); err != nil { return fmt.Errorf("can't save frac-cache: %w", err) } - if err = os.Rename(tmp.Name(), fc.fullPath); err != nil { - return fmt.Errorf("can't rename tmp to actual frac-cache: %w", err) - } + if err = tmp.Sync(); err != nil { + return fmt.Errorf("can't fsync frac-cache tmp: %w", err) + } + if err = tmp.Close(); err != nil { + return fmt.Errorf("can't close frac-cache tmp: %w", err) + } + if err = os.Rename(tmp.Name(), fc.fullPath); err != nil { + _ = os.Remove(tmp.Name()) + return fmt.Errorf("can't rename tmp to actual frac-cache: %w", err) + }fracmanager/proxy_frac.go (1)
121-156
: Seal error paths leak state: sealWg not decremented; readonly stays true.If building the source or sealing fails,
sealWg.Done()
is never called and the fraction remainsreadonly
, causing deadlocks and a stuck “sealing” state.func (f *proxyFrac) Seal(params common.SealParams) (*frac.Sealed, error) { @@ f.readonly = true active := f.active f.sealWg.Add(1) // It's important to put wg.Add() inside a lock, otherwise we might call wg.Wait() before it f.useMu.Unlock() f.WaitWriteIdle() - src, err := frac.NewActiveSealingSource(active, params) - if err != nil { - return nil, err - } - preloaded, err := sealing.Seal(src, params) - if err != nil { - return nil, err - } + defer f.sealWg.Done() + + src, err := frac.NewActiveSealingSource(active, params) + if err != nil { + f.useMu.Lock() + f.readonly = false + f.useMu.Unlock() + return nil, err + } + preloaded, err := sealing.Seal(src, params) + if err != nil { + f.useMu.Lock() + f.readonly = false + f.useMu.Unlock() + return nil, err + } @@ - f.sealWg.Done()
♻️ Duplicate comments (7)
cmd/distribution/main.go (1)
62-65
: Good: readers are now closeddefer f.Close() added in both loadInfo and buildDist removes the leak flagged earlier.
Also applies to: 90-93
fracmanager/proxy_frac.go (1)
138-145
: Consider source cleanup on failure (add Close and defer it).
ActiveSealingSource
may hold resources; addClose()
on it and call on error.Proposed call-site change after
NewActiveSealingSource
:+ // If ActiveSealingSource implements Close(), ensure cleanup on failure + defer func() { + if err != nil { + if c, ok := any(src).(interface{ Close() error }); ok { + _ = c.Close() + } + } + }()Would you like me to add
Close()
to ActiveSealingSource and wire it through?fracmanager/sealer_test.go (2)
121-125
: Return on error when building sealing source (avoid nil deref).
assert.NoError
doesn’t stop execution; if construction fails,sealing.Seal(src, …)
may panic on nilsrc
. Return immediately on error.- src, err := frac.NewActiveSealingSource(active, params) - assert.NoError(b, err) - return sealing.Seal(src, params) + src, err := frac.NewActiveSealingSource(active, params) + if err != nil { + return nil, err + } + return sealing.Seal(src, params)
148-151
: Use the standard benchmark loop;b.Loop()
doesn’t exist.Switch to
for i := 0; i < b.N; i++ {}
and fail fast withb.Fatal
.-b.ReportAllocs() - - for b.Loop() { - _, err = seal(active, params) - assert.NoError(b, err) - } + b.ReportAllocs() + b.ResetTimer() + for i := 0; i < b.N; i++ { + if _, err = seal(active, params); err != nil { + b.Fatal(err) + } + }frac/sealed/sealing/blocks_builder.go (3)
3-12
: Missing import for cmp.
cmp.Min
is needed to replace the undefinedmin
.import ( + "cmp" "encoding/binary" "errors" "iter"
281-304
: Skip empty token batches to prevent invalid blocks (minTID > maxTID).- for tokens := range tokenBatches { + for tokens := range tokenBatches { + if len(tokens) == 0 { + continue + } idx++
133-149
: Replace undefined min() with cmp.Min (compile fix).- lastIndex := min(fieldMaxTID, block.ext.maxTID) - block.ext.minTID + lastIndex := cmp.Min(fieldMaxTID, block.ext.maxTID) - block.ext.minTID
🧹 Nitpick comments (5)
frac/sealed/preloaded_data.go (1)
10-14
: Add GoDoc for exported types; confirm pointer/value intent
- Please add brief comments for PreloadedData and BlocksData (exported API).
- Mixed pointer/value for LIDsTable vs IDsTable looks intentional—confirm this is by design.
Apply:
+// PreloadedData bundles immutable index metadata and prebuilt lookup tables for a sealed fraction. type PreloadedData struct { Info *common.Info BlocksData BlocksData TokenTable token.Table } +// BlocksData contains ID/LID tables and block offsets used to locate documents. type BlocksData struct { IDsTable seqids.Table LIDsTable *lids.Table BlocksOffsets []uint64 }Also applies to: 16-20
cmd/distribution/main.go (1)
90-93
: Drop unused buildDist argThe third parameter is unused; simplify the API and call site.
Apply:
-func buildDist(dist *seq.MIDsDistribution, path string, _ *common.Info) { +func buildDist(dist *seq.MIDsDistribution, path string) {And later:
- buildDist(info.Distribution, path, info) + buildDist(info.Distribution, path)Also applies to: 203-205
fracmanager/sealer_test.go (2)
60-76
: Handle scanner limits and errors.
- Increase the scanner buffer to avoid “token too long” on large log lines.
- Check
scanner.Err()
after the loop.- scanner := bufio.NewScanner(file) + scanner := bufio.NewScanner(file) + buf := make([]byte, 0, 1<<20) // 1 MiB + scanner.Buffer(buf, 1<<20) for scanner.Scan() { k++ doc := scanner.Bytes() if err := docRoot.DecodeBytes(doc); err != nil { return err } } + if err := scanner.Err(); err != nil { + return err + }
109-115
: Avoid double Stop() calls.You
defer fp.Stop()
and also callfp.Stop()
explicitly later. Drop the explicit call.- defer fp.Stop() + defer fp.Stop() ... - fp.Stop()Also applies to: 119-120
frac/active_sealing_source.go (1)
410-448
: Avoid emitting empty doc blocks.If no docs were buffered, skip the final yield.
- yield(buf, pos) + if len(buf) > 0 { + yield(buf, pos) + }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (47)
bytespool/writer.go
(1 hunks)cmd/distribution/main.go
(4 hunks)cmd/index_analyzer/main.go
(2 hunks)cmd/seq-db/seq-db.go
(2 hunks)consts/consts.go
(1 hunks)frac/active.go
(4 hunks)frac/active_docs_positions.go
(2 hunks)frac/active_index.go
(2 hunks)frac/active_sealer.go
(0 hunks)frac/active_sealer_test.go
(0 hunks)frac/active_sealing_source.go
(1 hunks)frac/common/info.go
(2 hunks)frac/common/seal_params.go
(1 hunks)frac/disk_blocks.go
(0 hunks)frac/disk_blocks_producer.go
(0 hunks)frac/disk_blocks_writer.go
(0 hunks)frac/fraction.go
(2 hunks)frac/remote.go
(7 hunks)frac/seal_stats.go
(0 hunks)frac/sealed.go
(9 hunks)frac/sealed/block_info.go
(2 hunks)frac/sealed/block_offsets.go
(1 hunks)frac/sealed/preloaded_data.go
(1 hunks)frac/sealed/sealing/blocks_builder.go
(1 hunks)frac/sealed/sealing/blocks_builder_test.go
(1 hunks)frac/sealed/sealing/index.go
(1 hunks)frac/sealed/sealing/sealer.go
(1 hunks)frac/sealed/sealing/stats.go
(1 hunks)frac/sealed/token/table.go
(1 hunks)frac/sealed/token/table_entry.go
(1 hunks)frac/sealed/token/table_loader.go
(1 hunks)frac/sealed_index.go
(2 hunks)frac/sealed_loader.go
(4 hunks)fracmanager/async_searcher_test.go
(2 hunks)fracmanager/config.go
(2 hunks)fracmanager/fracmanager.go
(1 hunks)fracmanager/fraction_provider.go
(4 hunks)fracmanager/proxy_frac.go
(7 hunks)fracmanager/sealed_frac_cache.go
(5 hunks)fracmanager/sealed_frac_cache_test.go
(14 hunks)fracmanager/sealer_test.go
(5 hunks)fracmanager/searcher_test.go
(2 hunks)storage/block_former.go
(0 hunks)storage/blocks_stats.go
(0 hunks)storage/blocks_writer.go
(0 hunks)storage/index_block_header.go
(1 hunks)tests/setup/env.go
(5 hunks)
💤 Files with no reviewable changes (9)
- frac/seal_stats.go
- frac/active_sealer.go
- storage/blocks_stats.go
- frac/disk_blocks_writer.go
- frac/disk_blocks.go
- frac/active_sealer_test.go
- storage/block_former.go
- storage/blocks_writer.go
- frac/disk_blocks_producer.go
✅ Files skipped from review due to trivial changes (1)
- frac/sealed/token/table.go
🚧 Files skipped from review as they are similar to previous changes (27)
- storage/index_block_header.go
- frac/sealed/token/table_loader.go
- frac/sealed/block_offsets.go
- fracmanager/searcher_test.go
- fracmanager/async_searcher_test.go
- frac/sealed/token/table_entry.go
- frac/sealed_index.go
- bytespool/writer.go
- fracmanager/config.go
- frac/common/info.go
- cmd/index_analyzer/main.go
- cmd/seq-db/seq-db.go
- tests/setup/env.go
- fracmanager/fraction_provider.go
- fracmanager/fracmanager.go
- frac/sealed/block_info.go
- frac/active_docs_positions.go
- frac/common/seal_params.go
- frac/sealed_loader.go
- consts/consts.go
- fracmanager/sealed_frac_cache_test.go
- frac/remote.go
- frac/sealed/sealing/stats.go
- frac/sealed/sealing/index.go
- frac/sealed/sealing/blocks_builder_test.go
- frac/sealed.go
- frac/active.go
🧰 Additional context used
🧬 Code graph analysis (10)
frac/active_index.go (1)
frac/common/info.go (1)
Info
(21-40)
frac/fraction.go (1)
frac/common/info.go (1)
Info
(21-40)
frac/sealed/preloaded_data.go (2)
frac/common/info.go (1)
Info
(21-40)frac/sealed/token/table.go (1)
Table
(34-34)
fracmanager/sealed_frac_cache.go (1)
frac/common/info.go (1)
Info
(21-40)
cmd/distribution/main.go (2)
frac/common/info.go (1)
Info
(21-40)frac/sealed/block_info.go (1)
BlockInfo
(15-17)
frac/sealed/sealing/blocks_builder.go (5)
frac/sealed/seqids/blocks.go (3)
BlockMIDs
(10-12)BlockRIDs
(32-35)BlockParams
(57-59)frac/sealed/token/table_loader.go (1)
FieldTable
(113-116)frac/sealed/token/table_entry.go (1)
TableEntry
(4-12)seq/seq.go (3)
ID
(12-15)MID
(17-17)RID
(18-18)seq/doc_pos.go (1)
DocPos
(19-19)
frac/active_sealing_source.go (14)
frac/common/seal_params.go (1)
SealParams
(3-12)frac/common/info.go (1)
Info
(21-40)frac/active_lids.go (1)
TokenLIDs
(27-33)seq/doc_pos.go (2)
DocPos
(19-19)PackDocPos
(21-34)storage/docs_reader.go (1)
DocsReader
(11-14)frac/active.go (1)
Active
(32-64)frac/active_token_list.go (1)
TokenList
(60-74)frac/active_docs_positions.go (1)
DocsPositions
(9-13)fracmanager/config.go (1)
Config
(15-34)seq/seq.go (3)
ID
(12-15)MID
(17-17)RID
(18-18)consts/consts.go (3)
IDsPerBlock
(14-14)SdocsTmpFileSuffix
(56-56)SdocsFileSuffix
(55-55)bytespool/writer.go (3)
AcquireWriterSize
(18-30)ReleaseWriter
(32-37)Writer
(13-16)util/errors_collapser.go (1)
CollapseErrors
(25-27)storage/doc_block.go (1)
CompressDocBlock
(79-88)
frac/sealed/sealing/sealer.go (8)
frac/common/info.go (1)
Info
(21-40)seq/seq.go (1)
ID
(12-15)seq/doc_pos.go (1)
DocPos
(19-19)frac/common/seal_params.go (1)
SealParams
(3-12)frac/sealed/preloaded_data.go (2)
PreloadedData
(10-14)BlocksData
(16-20)consts/consts.go (2)
IndexTmpFileSuffix
(60-60)IndexFileSuffix
(59-59)frac/sealed/sealing/index.go (1)
NewIndexSealer
(44-48)util/fs.go (1)
MustSyncPath
(15-28)
fracmanager/proxy_frac.go (5)
frac/active.go (1)
Active
(32-64)frac/common/info.go (1)
Info
(21-40)frac/sealed/sealing/sealer.go (1)
Seal
(46-110)frac/common/seal_params.go (1)
SealParams
(3-12)frac/active_sealing_source.go (1)
NewActiveSealingSource
(62-100)
fracmanager/sealer_test.go (4)
frac/common/seal_params.go (1)
SealParams
(3-12)tests/common/util.go (1)
RecreateDir
(49-52)frac/active_sealing_source.go (1)
NewActiveSealingSource
(62-100)frac/sealed/sealing/sealer.go (1)
Seal
(46-110)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test
🔇 Additional comments (2)
frac/active_index.go (1)
9-9
: Migration to common.Info looks consistentField type and import updated appropriately.
Also applies to: 51-51
frac/fraction.go (1)
21-21
: Verify common.Info exposes Name() and String()frac/fraction.go (lines 21 and 33–39) calls info.Name() and info.String(); confirm *common.Info implements Name() string and String() string (or update frac/fraction.go to use the available methods) to avoid build breaks.
c5370b7
to
970150f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
frac/sealed/token/table_loader.go (1)
49-68
: Guard against empty FieldTable to avoid panic.
ft.Entries[0]
will panic if a block contains a field with zero entries (corrupt data or edge case). Add a defensive check.func TableFromBlocks(blocks []TableBlock) Table { table := make(Table) for _, block := range blocks { for _, ft := range block.FieldsTables { + if len(ft.Entries) == 0 { + // Defensive: skip empty field tables to avoid panics on corrupted/empty data. + continue + } fd, ok := table[ft.Field] minVal := ft.Entries[0].MinVal if !ok { fd = &FieldData{ MinVal: minVal, Entries: make([]*TableEntry, 0, len(ft.Entries)), } - } else if minVal < fd.MinVal { + } else if fd.MinVal == "" || minVal < fd.MinVal { fd.MinVal = minVal } for _, e := range ft.Entries { e.MinVal = "" fd.Entries = append(fd.Entries, e) } table[ft.Field] = fd } } return table }fracmanager/sealed_frac_cache.go (2)
52-74
: Write-lock the map during load; mark version as saved to avoid needless rewrite
LoadFromDisk
mutatesfc.fracCache
without acquiring the write lock. Also, after loading,savedVersion
stays 0 so the next sync will rewrite the same content.func (fc *sealedFracCache) LoadFromDisk(fileName string) { - content, err := os.ReadFile(fileName) + content, err := os.ReadFile(fileName) if err != nil { logger.Info("frac-cache read error, empty cache will be created", zap.Error(err), zap.String("filename", fileName), ) return } - err = json.Unmarshal(content, &fc.fracCache) + fc.fracCacheMu.Lock() + defer fc.fracCacheMu.Unlock() + err = json.Unmarshal(content, &fc.fracCache) if err != nil { logger.Warn("can't unmarshal frac-cache, new frac-cache will be created later on", zap.Error(err), ) return } + // Loaded state is in sync with disk; avoid immediate redundant save. + fc.savedVersion.Store(fc.version) logger.Info("frac-cache loaded from disk", zap.String("filename", fileName), zap.Int("cache_entries", len(fc.fracCache)), ) }
139-178
: FD leak and durability gap: close+fsync temp file before rename (Windows portability, crash safety)
SaveCacheToDisk
never closestmp
; on repeated calls this leaks descriptors. Also nofsync
before rename, risking data loss on power failure. Rename with an open handle fails on Windows.tmp, err := os.CreateTemp(fc.dataDir, fc.fileName+".") if err != nil { return fmt.Errorf("can't save frac-cache: %w", err) } +defer func() { _ = tmp.Close() }() +defer os.Remove(tmp.Name()) err = tmp.Chmod(defaultFilePermission) if err != nil { return fmt.Errorf("can't change frac-cache file permission: %w", err) } if _, err = tmp.Write(content); err != nil { return fmt.Errorf("can't save frac-cache: %w", err) } +// Ensure data hits disk and close before rename (required on Windows). +if err = tmp.Sync(); err != nil { + return fmt.Errorf("can't fsync frac-cache: %w", err) +} +if err = tmp.Close(); err != nil { + return fmt.Errorf("can't close frac-cache: %w", err) +} + if err = os.Rename(tmp.Name(), fc.fullPath); err != nil { return fmt.Errorf("can't rename tmp to actual frac-cache: %w", err) }
🧹 Nitpick comments (12)
util/fs.go (1)
31-45
: Add doc comment and wrap errors with context; confirm OS expectations.
- Exported function lacks a GoDoc; add one for clarity and linting.
- Wrap errors with path context to improve call‑site logs/debugging.
- Directory fsync isn’t portable on Windows. If Windows is in scope, either gate with build tags or provide a no‑op/windows‑specific impl.
Apply within this hunk:
+// SyncPath fsyncs the given path (file or directory). +// For durability of create/rename operations, pass the parent directory. func SyncPath(path string) error { d, err := os.Open(path) if err != nil { - return err + return fmt.Errorf("open %q: %w", path, err) } if err := d.Sync(); err != nil { _ = d.Close() - return err + return fmt.Errorf("sync %q: %w", path, err) } if err := d.Close(); err != nil { - return err + return fmt.Errorf("close %q: %w", path, err) } return nil }Outside this hunk, add fmt to imports:
import ( "errors" "fmt" "os" "go.uber.org/zap" "github.com/ozontech/seq-db/logger" )If Windows builds are required, add a stub:
// util/fs_windows.go //go:build windows package util func SyncPath(string) error { return nil }frac/sealed/token/table_loader.go (1)
108-111
: Nice clarifier; please also document critical invariants.The new comment helps. Consider explicitly stating the on-disk invariants relied upon by loaders:
- FieldTable.Entries is non-empty.
- Entries are ordered by StartTID.
- Only the first entry in a field-table block carries MinVal (others may be empty).
This will prevent accidental regressions and aligns with how TableFromBlocks/Unpack currently behave.
frac/sealed/block_info.go (1)
37-39
: Preserve underlying JSON error for diagnosticsWrap the json.Unmarshal error to avoid losing context.
- if err := json.Unmarshal(data[4:], b.Info); err != nil { - return errors.New("stats unmarshaling error") - } + if err := json.Unmarshal(data[4:], b.Info); err != nil { + return fmt.Errorf("stats unmarshaling error: %w", err) + }(Remember to import fmt.)
frac/sealed_loader.go (1)
81-88
: Minor: sanity-check IDsTable vs OffsetsAfter unpack, validate len(b.Offsets) > 0 before using it; fail fast if empty.
- blocksOffsets = b.Offsets + if len(b.Offsets) == 0 { + return idsTable, nil, errors.New("empty BlockOffsets") + } + blocksOffsets = b.Offsetsfracmanager/proxy_frac.go (1)
3-20
: Add missing import if using the Close() interface assertionIf you adopt the conditional close, you’ll need io for the interface type or use
any(src).(interface{ Close() error })
as in the diff (no import). If you chooseio.Closer
, add:import ( "context" "errors" "fmt" + "io" "sync" "time"
frac/sealed/sealing/index.go (2)
150-168
: Use int64 for file offsets to avoid overflow; fix call sitePositions in index files should be int64 to match file offsets and avoid 32-bit overflow.
-func (s *IndexSealer) writeBlocks(pos int, payloadWriter, headersWriter io.Writer, src Source) error { +func (s *IndexSealer) writeBlocks(pos int64, payloadWriter, headersWriter io.Writer, src Source) error { // Process each index block from the source for block := range s.indexBlocks(src) { - header, payload := block.Bin(int64(pos)) + header, payload := block.Bin(pos) // Write payload to main data section if _, err := payloadWriter.Write(payload); err != nil { return err } // Write header to registry if _, err := headersWriter.Write(header); err != nil { return err } - pos += len(payload) // Advance position for next block + pos += int64(len(payload)) // Advance position for next block } if s.lastErr != nil { return s.lastErr } return nil }Also applies to: 104-106
405-410
: Use DocsPositionsZstdLevel for positions blockCompression level should match docs positions, not IDs.
func (s *IndexSealer) packPosBlock(block idsSealBlock) indexBlock { s.buf1 = block.params.Pack(s.buf1[:0]) - b := s.newIndexBlockZSTD(s.buf1, s.params.IDsZstdLevel) + b := s.newIndexBlockZSTD(s.buf1, s.params.DocsPositionsZstdLevel) return b }fracmanager/sealed_frac_cache_test.go (5)
51-55
: Temp dir helpers: OK; consider t.TempDir for simplicity.tests_common usage is fine. For less boilerplate and auto-cleanup, consider t.TempDir() in new tests.
Also applies to: 75-79, 104-108, 159-163, 229-233, 268-272, 349-353, 412-416
75-79
: Use Name() consistently instead of Path in TestLoadFromDisk.Reduces coupling to internal serialization (Path json tag). Suggest:
- assert.Equal(t, "b", el.Path) + assert.Equal(t, "b", el.Name())
170-185
: Prefer Name() over filepath.Base when querying the cache.Slightly clearer and consistent with other assertions.
- fracFromDisk, has := f.GetFracInfo(filepath.Base(newInfo.Path)) + fracFromDisk, has := f.GetFracInfo(newInfo.Name())
50-72
: Make JSON empty-cache checks less brittle and fix expected/actual order.Compare JSON semantically and keep expected first for clearer diffs.
- assert.Equal(t, []byte("{}"), content) + assert.JSONEq(t, `{}`, string(content)) - assert.Equal(t, []byte("{}"), content) + assert.JSONEq(t, `{}`, string(content)) - assert.Equal(t, contents, []byte("{}")) + assert.JSONEq(t, `{}`, string(contents)) - assert.Equal(t, contents, []byte("{}")) + assert.JSONEq(t, `{}`, string(contents)) - assert.Equal(t, fracCacheFromDisk, []byte("{}")) + assert.JSONEq(t, `{}`, string(fracCacheFromDisk))Also applies to: 118-124, 153-156, 477-481
369-369
: Remove unused infos map in TestExtraFractionsRemoved.It’s written to but never read; drop it and the write to reduce noise.
- infos := map[string]*common.Info{} ... - infos[info.Name()] = info
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (48)
bytespool/writer.go
(1 hunks)cmd/distribution/main.go
(4 hunks)cmd/index_analyzer/main.go
(2 hunks)cmd/seq-db/seq-db.go
(2 hunks)consts/consts.go
(1 hunks)frac/active.go
(4 hunks)frac/active_docs_positions.go
(2 hunks)frac/active_index.go
(2 hunks)frac/active_sealer.go
(0 hunks)frac/active_sealer_test.go
(0 hunks)frac/active_sealing_source.go
(1 hunks)frac/common/info.go
(2 hunks)frac/common/seal_params.go
(1 hunks)frac/disk_blocks.go
(0 hunks)frac/disk_blocks_producer.go
(0 hunks)frac/disk_blocks_writer.go
(0 hunks)frac/fraction.go
(2 hunks)frac/remote.go
(7 hunks)frac/seal_stats.go
(0 hunks)frac/sealed.go
(9 hunks)frac/sealed/block_info.go
(2 hunks)frac/sealed/block_offsets.go
(1 hunks)frac/sealed/preloaded_data.go
(1 hunks)frac/sealed/sealing/blocks_builder.go
(1 hunks)frac/sealed/sealing/blocks_builder_test.go
(1 hunks)frac/sealed/sealing/index.go
(1 hunks)frac/sealed/sealing/sealer.go
(1 hunks)frac/sealed/sealing/stats.go
(1 hunks)frac/sealed/token/table.go
(1 hunks)frac/sealed/token/table_entry.go
(1 hunks)frac/sealed/token/table_loader.go
(1 hunks)frac/sealed_index.go
(2 hunks)frac/sealed_loader.go
(4 hunks)fracmanager/async_searcher_test.go
(2 hunks)fracmanager/config.go
(2 hunks)fracmanager/fracmanager.go
(1 hunks)fracmanager/fraction_provider.go
(4 hunks)fracmanager/proxy_frac.go
(7 hunks)fracmanager/sealed_frac_cache.go
(5 hunks)fracmanager/sealed_frac_cache_test.go
(14 hunks)fracmanager/sealer_test.go
(5 hunks)fracmanager/searcher_test.go
(2 hunks)storage/block_former.go
(0 hunks)storage/blocks_stats.go
(0 hunks)storage/blocks_writer.go
(0 hunks)storage/index_block_header.go
(1 hunks)tests/setup/env.go
(5 hunks)util/fs.go
(2 hunks)
💤 Files with no reviewable changes (9)
- frac/disk_blocks_writer.go
- frac/disk_blocks_producer.go
- storage/block_former.go
- storage/blocks_writer.go
- frac/disk_blocks.go
- frac/active_sealer_test.go
- frac/active_sealer.go
- storage/blocks_stats.go
- frac/seal_stats.go
🚧 Files skipped from review as they are similar to previous changes (21)
- frac/sealed/token/table.go
- bytespool/writer.go
- frac/common/seal_params.go
- fracmanager/config.go
- frac/common/info.go
- fracmanager/fraction_provider.go
- frac/sealed/sealing/blocks_builder_test.go
- tests/setup/env.go
- frac/sealed/preloaded_data.go
- consts/consts.go
- frac/sealed/sealing/sealer.go
- cmd/distribution/main.go
- fracmanager/async_searcher_test.go
- fracmanager/fracmanager.go
- fracmanager/searcher_test.go
- cmd/seq-db/seq-db.go
- frac/active_sealing_source.go
- frac/sealed/block_offsets.go
- frac/sealed/sealing/stats.go
- frac/sealed/sealing/blocks_builder.go
- frac/sealed/token/table_entry.go
🧰 Additional context used
🧬 Code graph analysis (17)
frac/active.go (1)
frac/common/info.go (2)
Info
(21-40)NewInfo
(42-56)
cmd/index_analyzer/main.go (2)
frac/sealed/block_info.go (1)
BlockInfo
(15-17)logger/logger.go (2)
Fatal
(82-84)Error
(74-76)
frac/sealed_index.go (1)
frac/common/info.go (1)
Info
(21-40)
frac/sealed_loader.go (3)
frac/sealed/preloaded_data.go (1)
BlocksData
(16-20)frac/common/info.go (1)
Info
(21-40)frac/sealed/block_offsets.go (1)
BlockOffsets
(8-11)
storage/index_block_header.go (1)
storage/codec.go (1)
Codec
(18-18)
fracmanager/sealer_test.go (5)
frac/common/seal_params.go (1)
SealParams
(3-12)tests/common/util.go (1)
RecreateDir
(49-52)frac/sealed/preloaded_data.go (1)
PreloadedData
(10-14)frac/active_sealing_source.go (1)
NewActiveSealingSource
(64-102)frac/sealed/sealing/sealer.go (1)
Seal
(46-110)
fracmanager/sealed_frac_cache.go (1)
frac/common/info.go (1)
Info
(21-40)
fracmanager/sealed_frac_cache_test.go (4)
frac/common/info.go (1)
Info
(21-40)tests/common/util.go (3)
GetTestTmpDir
(69-71)RecreateDir
(49-52)RemoveDir
(42-47)consts/consts.go (1)
FracCacheFileSuffix
(65-65)fracmanager/sealed_frac_cache.go (1)
NewSealedFracCache
(32-43)
frac/active_docs_positions.go (2)
seq/seq.go (1)
ID
(12-15)seq/doc_pos.go (1)
DocPos
(19-19)
frac/fraction.go (1)
frac/common/info.go (1)
Info
(21-40)
frac/sealed/sealing/index.go (10)
frac/common/seal_params.go (1)
SealParams
(3-12)storage/codec.go (3)
Codec
(18-18)CodecNo
(13-13)CodecZSTD
(15-15)storage/index_block_header.go (2)
IndexBlockHeader
(19-19)NewIndexBlockHeader
(25-34)frac/sealed/sealing/sealer.go (1)
Source
(22-30)bytespool/writer.go (3)
AcquireWriterSize
(18-30)ReleaseWriter
(32-37)Writer
(13-16)frac/common/info.go (1)
Info
(21-40)frac/sealed/token/table_loader.go (3)
FieldTable
(113-116)TableBlock
(109-111)TableFromBlocks
(49-71)consts/consts.go (3)
RegularBlockSize
(16-16)IDsPerBlock
(14-14)LIDBlockCap
(15-15)frac/sealed/block_offsets.go (1)
BlockOffsets
(8-11)seq/seq.go (3)
ID
(12-15)MID
(17-17)RID
(18-18)
util/fs.go (1)
logger/logger.go (1)
Panic
(78-80)
frac/sealed/block_info.go (1)
frac/common/info.go (1)
Info
(21-40)
frac/sealed.go (6)
frac/common/info.go (1)
Info
(21-40)frac/sealed/preloaded_data.go (2)
BlocksData
(16-20)PreloadedData
(10-14)consts/consts.go (2)
SdocsFileSuffix
(55-55)DocsFileSuffix
(52-52)storage/docs_reader.go (1)
NewDocsReader
(16-21)frac/index_cache.go (1)
IndexCache
(10-18)frac/sealed/block_info.go (1)
BlockInfo
(15-17)
frac/remote.go (3)
frac/common/info.go (1)
Info
(21-40)frac/sealed/preloaded_data.go (1)
BlocksData
(16-20)frac/sealed_loader.go (1)
Loader
(18-22)
fracmanager/proxy_frac.go (6)
frac/active.go (1)
Active
(32-64)frac/common/info.go (1)
Info
(21-40)frac/sealed/sealing/sealer.go (1)
Seal
(46-110)frac/common/seal_params.go (1)
SealParams
(3-12)frac/sealed.go (1)
Sealed
(31-57)frac/active_sealing_source.go (1)
NewActiveSealingSource
(64-102)
frac/active_index.go (1)
frac/common/info.go (1)
Info
(21-40)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test
🔇 Additional comments (15)
util/fs.go (1)
16-19
: Good extraction; behavior preserved.Delegating to SyncPath centralizes fsync logic while keeping the panic behavior and structured logging. LGTM.
storage/index_block_header.go (1)
25-33
: Makepos
uint64 to match on-disk type and remove lossy cast
SetPos
stores an unsigned 64-bit value; acceptingint64
risks accidental negatives and silent wrap when cast touint64
.Apply:
-func NewIndexBlockHeader(pos int64, ext1, ext2 uint64, size, rawSize uint32, codec Codec) IndexBlockHeader { +func NewIndexBlockHeader(pos uint64, ext1, ext2 uint64, size, rawSize uint32, codec Codec) IndexBlockHeader { header := NewEmptyIndexBlockHeader() header.SetExt1(ext1) header.SetExt2(ext2) header.SetLen(size) header.SetRawLen(rawSize) header.SetCodec(codec) - header.SetPos(uint64(pos)) + header.SetPos(pos) return header }To find call sites that need updating:
#!/bin/bash rg -nP --type=go '\bNewIndexBlockHeader\(' -C2frac/fraction.go (1)
20-27
: *Interface break — verified: implementers use common.InfoRepo-wide search found no remaining Info() *Info signatures; sealed Source already uses Info() *common.Info (frac/sealed/sealing/sealer.go:23).
frac/sealed_index.go (1)
56-60
: Type switch to common.Info verified — no remaining frac.Info usagesAll Info() implementations return *common.Info; one test mock returns common.Info by value (frac/sealed/sealing/blocks_builder_test.go:28) — acceptable.
frac/active_index.go (1)
157-157
: Confirm common.Info.Name() exists or switch to PathIf common.Info doesn’t implement Name(), this line won’t compile. Prefer using info.Path (or add Name() string on common.Info).
#!/bin/bash # Verify common.Info has Name() rg -n -C2 -g '!**/vendor/**' '(?m)^func\s*\(\s*\*?\s*Info\s*\)\s*Name\s*\(' frac/common || true # Find all call sites of .Name() on Info to adjust if needed rg -n -C2 '\.Name\(\)' --type=go | sed -n '1,200p'If Name() is not present, apply:
- res.IDs.ApplyHint(dp.info.Name()) + res.IDs.ApplyHint(dp.info.Path)cmd/index_analyzer/main.go (1)
93-97
: Good fix: handle BlockInfo.Unpack errorCapturing and fatally logging the Unpack error prevents corrupt state usage. Looks solid.
frac/active.go (1)
304-310
: Info() copy-on-read is correctReturning a copy avoids external mutation. LGTM.
frac/sealed_loader.go (1)
24-56
: API shift to BlocksData/common.Info looks consistentThe loader now fills BlocksData and logs via common.Info; flow and error handling read clean.
Consider a quick runtime assert to ensure BlocksOffsets and IDsTable lengths align with expected block counts to catch malformed indexes early.
frac/active_docs_positions.go (1)
16-22
: lidToPos unbounded growth; confirm lifecycle reset or bound itlidToPos appends per document and can grow large on long‑running actives. If the lifecycle guarantees reset (e.g., Active.releaseMem sets DocsPositions = nil), fine; otherwise consider bounding or chunking.
#!/bin/bash # Verify DocsPositions is released on Active lifecycle rg -n -C2 'DocsPositions\s*=' frac/active.go # Optional: check any explicit trims/resets of lidToPos rg -n 'lidToPos\s*=' --type=go -C2If needed, a minimal safeguard:
- dp.lidToPos = append(dp.lidToPos, pos[i]) + dp.lidToPos = append(dp.lidToPos, pos[i]) + // Optional cap guard: trim capacity spikes occasionally + if cap(dp.lidToPos) > 2*len(dp.lidToPos) { + tmp := make([]seq.DocPos, len(dp.lidToPos)) + copy(tmp, dp.lidToPos) + dp.lidToPos = tmp + }Also applies to: 43-59
fracmanager/sealer_test.go (2)
121-125
: Don’t assert then continue; return on error to avoid nil derefOn NewActiveSealingSource failure, continuing may pass nil src to sealing.Seal.
- seal := func(active *frac.Active, params common.SealParams) (*sealed.PreloadedData, error) { - src, err := frac.NewActiveSealingSource(active, params) - assert.NoError(b, err) - return sealing.Seal(src, params) - } + seal := func(active *frac.Active, params common.SealParams) (*sealed.PreloadedData, error) { + src, err := frac.NewActiveSealingSource(active, params) + if err != nil { + return nil, err + } + return sealing.Seal(src, params) + }
148-151
: Invalid benchmark loop; use the canonical patterntesting.B has no Loop(); this won’t compile.
- for b.Loop() { - _, err = seal(active, params) - assert.NoError(b, err) - } + b.ResetTimer() + for i := 0; i < b.N; i++ { + if _, err = seal(active, params); err != nil { + b.Fatal(err) + } + }fracmanager/proxy_frac.go (1)
121-159
: Seal: ensure wg.Done on all paths, revert readonly on failure, and close src on errorOn error paths,
sealWg.Done()
is never called andreadonly
remains true, leaving the fraction stuck in “sealing” and deadlocking waiters. Also,ActiveSealingSource
isn’t cleaned up ifsealing.Seal
fails (same concern raised earlier). Fix with named returns, defers, and conditional close.-func (f *proxyFrac) Seal(params common.SealParams) (*frac.Sealed, error) { +func (f *proxyFrac) Seal(params common.SealParams) (sealed *frac.Sealed, err error) { f.useMu.Lock() if f.isSuicidedState() { f.useMu.Unlock() return nil, ErrSealingFractionSuicided } if !f.isActiveState() { f.useMu.Unlock() return nil, errors.New("sealing fraction is not active") } f.readonly = true active := f.active f.sealWg.Add(1) // It's important to put wg.Add() inside a lock, otherwise we might call wg.Wait() before it f.useMu.Unlock() + // Always unblock waiters + defer f.sealWg.Done() + // If sealing fails, restore writability + defer func() { + if err != nil { + f.useMu.Lock() + f.readonly = false + f.useMu.Unlock() + } + }() + f.WaitWriteIdle() - src, err := frac.NewActiveSealingSource(active, params) + src, err := frac.NewActiveSealingSource(active, params) if err != nil { return nil, err } - preloaded, err := sealing.Seal(src, params) + // Best-effort cleanup if the source implements Close() + if closer, ok := any(src).(interface{ Close() error }); ok { + defer func() { + if err != nil { + _ = closer.Close() + } + }() + } + + preloaded, err := sealing.Seal(src, params) if err != nil { return nil, err } - sealed := f.fp.NewSealedPreloaded(active.BaseFileName, preloaded) + sealed = f.fp.NewSealedPreloaded(active.BaseFileName, preloaded) f.useMu.Lock() f.sealed = sealed f.active = nil f.useMu.Unlock() - f.sealWg.Done() - active.Release() return sealed, nil }frac/remote.go (1)
192-214
: LGTM: BlocksData wiring in data providerIDs/LIDs/Offsets are correctly sourced from BlocksData and match the new loaders/providers.
frac/sealed.go (1)
353-375
: LGTM: sealed provider now sources BlocksData; token table cachedWiring aligns with new PreloadedData/BlocksData and loaders.
fracmanager/sealed_frac_cache_test.go (1)
16-16
: Migration to common.Info looks consistent — verification inconclusive, re-run searchesripgrep output showed instance calls like fm.active.frac.Info() (e.g. fracmanager/fracmanager_test.go:71; fracmanager/fracmanager.go:594,662) but the earlier pattern search was ambiguous. Re-run and confirm no remaining type refs:
rg -nP --type=go '\bfrac.Info\b'
rg -nP --type=go 'map\s*[\sstring\s]\s**\s*frac.Info'LGTM after confirmation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
fracmanager/sealed_frac_cache_test.go (1)
29-36
: *loadFracCache return type migrated to map[string]common.Info — OK; tiny nitLooks good. Minor nit: since err is guaranteed nil on the success path, consider returning (fracCache, nil) for clarity.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
frac/sealed/token/table.go
(1 hunks)fracmanager/sealed_frac_cache_test.go
(14 hunks)fracmanager/sealer_test.go
(5 hunks)tests/setup/env.go
(5 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- tests/setup/env.go
- frac/sealed/token/table.go
🧰 Additional context used
🧬 Code graph analysis (2)
fracmanager/sealer_test.go (5)
frac/common/seal_params.go (1)
SealParams
(3-12)tests/common/util.go (1)
RecreateDir
(49-52)frac/sealed/preloaded_data.go (1)
PreloadedData
(10-14)frac/active_sealing_source.go (1)
NewActiveSealingSource
(64-102)frac/sealed/sealing/sealer.go (1)
Seal
(46-110)
fracmanager/sealed_frac_cache_test.go (4)
frac/common/info.go (1)
Info
(21-40)tests/common/util.go (3)
GetTestTmpDir
(69-71)RecreateDir
(49-52)RemoveDir
(42-47)consts/consts.go (1)
FracCacheFileSuffix
(65-65)fracmanager/sealed_frac_cache.go (1)
NewSealedFracCache
(32-43)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test
🔇 Additional comments (7)
fracmanager/sealer_test.go (2)
121-125
: Return on error instead of asserting and continuing
Continuing after a failed NewActiveSealingSource risks a nil deref in sealing.Seal.- src, err := frac.NewActiveSealingSource(active, params) - assert.NoError(b, err) - return sealing.Seal(src, params) + src, err := frac.NewActiveSealingSource(active, params) + if err != nil { + return nil, err + } + return sealing.Seal(src, params)
148-151
: Replace b.Loop() with the standard benchmark loop
testing.B has no Loop(); this won’t compile. Use the canonical for i := 0; i < b.N; i++ pattern and fail fast on error.- for b.Loop() { - _, err = seal(active, params) - assert.NoError(b, err) - } + b.ResetTimer() + for i := 0; i < b.N; i++ { + if _, err = seal(active, params); err != nil { + b.Fatal(err) + } + }fracmanager/sealed_frac_cache_test.go (5)
16-19
: Migrate to frac/common and testscommon alias — LGTMImports align with the refactor. Using testscommon for test utils addresses prior feedback.
170-184
: Same concern about Path persisted as JSON "name"This repeats the Path-as-name persistence. See earlier comment; ensure consistency expectations are met.
51-55
: Switch to testscommon helpers for tmp dirs — LGTMConsistent usage across tests; setup/teardown is clear.
Also applies to: 75-79, 105-107, 159-163, 229-233, 349-353, 411-416
287-288
: *Info maps updated to map[string]common.Info — LGTMType migration is consistent with the new common.Info.
Also applies to: 369-370
125-139
: No backward‐compatibility issue with JSON “name” field
Thejson:"name"
tag onInfo.Path
intentionally stores the full path, whileInfo.Name()
returns its basename. Map keys remain bare names (e.g."c"
), and existing fixtures using short names continue to unmarshal correctly.
defer fp.Stop() | ||
|
||
dataDir := filepath.Join(b.TempDir(), "BenchmarkSealing") | ||
common.RecreateDir(dataDir) | ||
testscommon.RecreateDir(dataDir) | ||
|
||
active := fp.NewActive(filepath.Join(dataDir, "test")) | ||
err := fillActiveFraction(active) | ||
assert.NoError(b, err) | ||
fp.Stop() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid double Stop; call it once
You both defer Stop and call Stop explicitly. Double shutdown can cause races or errors if Stop isn’t idempotent. Remove the deferred call and keep the explicit Stop after filling.
- defer fp.Stop()
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
defer fp.Stop() | |
dataDir := filepath.Join(b.TempDir(), "BenchmarkSealing") | |
common.RecreateDir(dataDir) | |
testscommon.RecreateDir(dataDir) | |
active := fp.NewActive(filepath.Join(dataDir, "test")) | |
err := fillActiveFraction(active) | |
assert.NoError(b, err) | |
fp.Stop() | |
dataDir := filepath.Join(b.TempDir(), "BenchmarkSealing") | |
testscommon.RecreateDir(dataDir) | |
active := fp.NewActive(filepath.Join(dataDir, "test")) | |
err := fillActiveFraction(active) | |
assert.NoError(b, err) | |
fp.Stop() |
🤖 Prompt for AI Agents
In fracmanager/sealer_test.go around lines 111 to 119: the code both defers
fp.Stop() and calls fp.Stop() explicitly after fillActiveFraction, causing a
double shutdown; remove the deferred fp.Stop() and keep the explicit fp.Stop()
after filling to ensure Stop is called exactly once and avoid potential races or
errors.
Summary by CodeRabbit
New Features
Bug Fixes
Refactor
Tests
Chores