[SP-2993] feat: add support for recursive search #19

matiasdaloia · 2025-08-06T12:47:26Z

Summary by CodeRabbit

New Features
- Recursive scanning with per-node thresholds, early stopping, deduplicated component results, and richer result metadata (vendor, versions, rank). Import tool accepts qdrant-host/port flags.
Documentation
- Big overhaul: Quick Start, build/run, CSV import, deployment (systemd), troubleshooting, and contributor guides updated.
Refactor
- Switched to distance-based scoring and node-driven scanning; local-focused build/test/lint workflow.
Chores / Removed
- Removed Docker-first packaging/scripts, Dockerfile, docker‑compose; added streamlined Makefile, CI, and lint targets; updated changelog to 0.5.0.

coderabbitai · 2025-09-22T13:39:09Z

Walkthrough

Implements recursive, node-driven scanning with per-node thresholding and deduplication; switches to distance-based scoring; expands domain models and mapper; updates Qdrant repository interface and implementation; adds a CSV import CLI and README updates; removes Docker packaging/scripts and replaces CI/build with a streamlined Makefile and build variables.

Changes

Cohort / File(s)	Summary
Build & Tooling `Makefile`, `go.mod`, `Dockerfile`, `docker-compose.yml`, `.github/workflows/*`	Introduces a streamlined Makefile (new targets, VERSION defaulting, BUILD_DIR, LDFLAGS), updates go toolchain/deps, removes Dockerfile/docker-compose and many container scripts, and updates CI workflows to new make targets.
Documentation & Changelog `README.md`, `CHANGELOG.md`, `scripts/README.md`	Rewrites Quick Start and deployment docs (favor local builds/systemd), adds Importing Data section, updates changelog to v0.5.0 and historical entries, removes Docker-first instructions.
Removed Packaging & Deployment Scripts *`scripts/`** `scripts/create-collection-snapshots.sh`, `scripts/create-docker-package.sh`, `scripts/docker-deploy.sh`, `scripts/import-collections.sh`, `scripts/import-single-collection.sh`, `scripts/setup-tls.sh`	Deletes numerous Docker/deployment/packaging/snapshot/import/TLS scripts and related packaging tooling.
CLI Import Tool `cmd/import/main.go`	Adds `--qdrant-host`/`--qdrant-port` flags, wires them into Qdrant client, improves error wrapping/logging, batching, parsing safety, and resource handling.
Domain Model — Scan & Components `internal/domain/entities/scan_request.go`, `.../component.go`, `.../language.go`, `.../version.go`, `.../metrics.go`, `.../scan_response.go`	Adds `RecursiveThreshold`, `MinAcceptedScore`, `QueryLimit`; extends `FolderNode` with simhash/lang/children; enriches ComponentGroup/SearchResult fields; adds `IndexedLangExtensions`; minor metrics/comment fixes.
Mapper `internal/mapper/scan_mapper.go`, `internal/mapper/scan_mapper_impl.go`	Adds `ChildrenToDomain` to ScanMapper; maps new recursive/min-score fields in ProtoToDomain; minor import/order tweaks.
Handler `internal/handler/scan_handler.go`	Adds request validation, maps proto→domain, validates, invokes service `ScanFolder`, returns structured protobuf status with timing.
Repository Interface & Types `internal/repository/scan_repository.go`	Adds `CollectionStats` type; removes `topK` from `SearchByHashes` signature; adds `CollectionExists` and `GetAllSupportedCollections` methods; interface docs added.
Repository Impl (Qdrant) `internal/repository/scan_repository_qdrant_impl.go`	New constructor `NewScanRepositoryQdrantImpl`; switches scoring to distance-based (adds `ScoreToDistance`), updates `SearchByHashes` signature/error wrapping, adds collection-existence fallbacks, payload parsing, ranking bounds, and debug traces.
Service Logic `internal/service/scan_service.go`, `internal/service/scan_service_impl.go`	Renames repo field/constructor param; replaces flat top-K search with recursive `scanNode` traversal honoring `RecursiveThreshold` and `MinAcceptedScore`; adds `hasHighScoreMatch`, `deduplicateComponents`; changes `processComponentGroups` signature to include path and min score.
Server, Config & Validation `cmd/server/main.go`, `internal/config/config.go`, `internal/protocol/*`, `internal/validation/validator.go`	Adds `Config.SetDefaults()`, feeder import; minor startup/logging tweaks, nolint directives, package comments, and import reorderings.
Lint Config `.golangci.yml`	Overhauls golangci-lint config: enables many linters, adjusts timeouts, rules, and path exclusions.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant C as Client (gRPC/REST)
  participant H as Handler
  participant S as ScanService
  participant R as Repository (Qdrant)

  C->>H: HFHRequest (root, thresholds, rankThreshold...)
  H->>S: ScanFolder(domain req)
  S->>S: start scanNode(root)
  loop traverse nodes
    S->>R: SearchByHashes(dirHash,nameHash,contentHash,langExt,rankThreshold)
    R-->>S: ComponentGroups (distance-based)
    alt has match >= RecursiveThreshold
      Note right of S #dff0d8: Early-stop this branch
      S->>S: processComponentGroups(path, MinAcceptedScore)
    else
      S->>S: recurse into children
      S->>S: processComponentGroups(path, MinAcceptedScore)
    end
  end
  S->>S: deduplicateComponents (keep highest per component)
  S-->>H: Aggregated ScanResults
  H-->>C: HFHResponse (status, timing, results)

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

chore:SP-3376 update papi dependency to latest version licenses#19 — Updates Git dependency versions (papi) similar to go.mod dependency changes; likely related to dependency version management.
chore: add support for macos universal binary scanoss.cc#95 — Adjusts CI/release build targets and architecture outputs akin to the workflow and Makefile architecture-specific target changes.

Poem

I hop through folders, sniff and score,
I stop where thresholds sweep the floor.
I turn distances into tidy song,
I dedupe carrots, keep the strong.
A joyful rabbit builds and runs — hop along! 🐇🥕

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check	✅ Passed	The title succinctly reflects the primary enhancement of the pull request by indicating the addition of recursive search support, matching the extensive changes across service, repository, mapper, and handler layers.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch feature/mdaloia/SP-2993-Folder-Hashing-Service-Support-recursive-search

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4a1ce08 and ac56ea4.

📒 Files selected for processing (2)

Makefile (1 hunks)
internal/domain/entities/metrics.go (2 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

internal/domain/entities/metrics.go

🧰 Additional context used

🪛 checkmake (0.2.2)

Makefile

[warning] 14-14: Missing required phony target "all"

(minphony)

[warning] 14-14: Missing required phony target "clean"

(minphony)

[warning] 14-14: Missing required phony target "test"

(minphony)

[warning] 49-49: Target "build" should be declared PHONY.

(phonydeclared)

[warning] 80-80: Target "ci" should be declared PHONY.

(phonydeclared)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: build

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (7)

Makefile (1)

48-48: OK; consider embedding version for parity with builds (optional).

If helpful for local diagnostics, mirror ldflags used in build targets.

internal/domain/entities/scan_request.go (1)

7-10: Default not enforced; align code with comment or update comment.

Field MinAcceptedScore mentions default 0.15, but no defaulting here. Apply default in mapper/service (see suggested mapper change) or adjust the comment.
internal/repository/scan_repository_qdrant_impl.go (4)
187-193: Replace fmt.Printf with structured logging; avoid stdout in repository code.

Use ctxzap logger and pass ctx into helpers; gate at debug level.
-	fmt.Printf("Stage 1 query returned %d results\n", len(stage1Results))
+	s.Debugf("Stage 1 query returned %d results", len(stage1Results))
-		fmt.Printf("Returning early with %d results (<=20)\n", len(stage1Results))
+		s.Debugf("Returning early with %d results (<=20)", len(stage1Results))
-	fmt.Printf("Stage 2 query (dirs filtering) returned %d results\n", len(stage2Results))
+	s.Debugf("Stage 2 query (dirs filtering) returned %d results", len(stage2Results))
-		fmt.Printf("Returning with %d results after stage 2 (<=20)\n", len(stage2Results))
+		s.Debugf("Returning with %d results after stage 2 (<=20)", len(stage2Results))
And refactor processSearchResults to receive ctx and use logger:
-	return r.processSearchResults(searchResp)
+	return r.processSearchResults(ctx, searchResp)
-func (r *ScanRepositoryQdrantImpl) processSearchResults(searchResp []*qdrant.ScoredPoint) ([]entities.ComponentGroup, error) {
+func (r *ScanRepositoryQdrantImpl) processSearchResults(ctx context.Context, searchResp []*qdrant.ScoredPoint) ([]entities.ComponentGroup, error) {
+	s := ctxzap.Extract(ctx).Sugar()
 ...
-	fmt.Printf("processSearchResults: converted %d points to %d results\n", len(searchResp), len(allResults))
+	s.Debugf("processSearchResults: converted %d points to %d results", len(searchResp), len(allResults))
 ...
-	fmt.Printf("processSearchResults: grouped into %d component groups\n", len(purlGroups))
+	s.Debugf("processSearchResults: grouped into %d component groups", len(purlGroups))
Also applies to: 232-237, 407-413

227-248: Ensure order before thresholding; explicitly sort stage2Results by increasing distance.

Assume Qdrant returns sorted, but make it explicit to avoid brittle early-break logic.
 	stage2Results, err := r.client.Query(ctx, stage2Query)
 	if err != nil {
 		return nil, fmt.Errorf("stage 2 search failed: %w", err)
 	}
+	// Ensure results are ordered by ascending distance (lower is better)
+	sort.Slice(stage2Results, func(i, j int) bool { return stage2Results[i].Score < stage2Results[j].Score })
262-273: Stale comment: stage2IDs vs “stage3IDs”.

The note says it “should be stage3IDs,” but code uses stage2IDs. Either compute stage3IDs or remove/fix the comment to prevent confusion.

568-578: Constant/comment mismatch in distance/score conversion.

k = 0.065 doesn’t match “-ln(0.2)/30” (~0.0536). Align the constant or fix the comment; same in both functions.
-const k = 0.065 // -ln(0.2) / 30
+const k = 0.065 // Tuned decay; was previously documented as -ln(0.2)/30
internal/service/scan_service_impl.go (1)
102-110: Consider clarifying the threshold check logic.

The shouldCheckThreshold expression !(isRoot && len(node.Children) > 0) is logically correct but somewhat subtle. It means:

Root nodes with children always explore children (skip threshold check)

All other nodes check the threshold

Consider adding a brief inline comment explaining the boolean logic, or using a more explicit condition:
// Only check threshold for non-root nodes, or root nodes without children
shouldCheckThreshold := !isRoot || len(node.Children) == 0
This equivalent formulation may be clearer to readers.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c5fe09a and e7a97d6.

⛔ Files ignored due to path filters (1)

go.sum is excluded by !**/*.sum

📒 Files selected for processing (9)

Makefile (1 hunks)
README.md (1 hunks)
go.mod (1 hunks)
internal/domain/entities/scan_request.go (1 hunks)
internal/handler/scan_handler.go (0 hunks)
internal/mapper/scan_mapper_impl.go (1 hunks)
internal/repository/scan_repository.go (1 hunks)
internal/repository/scan_repository_qdrant_impl.go (6 hunks)
internal/service/scan_service_impl.go (2 hunks)

💤 Files with no reviewable changes (1)

internal/handler/scan_handler.go

🧰 Additional context used

🧬 Code graph analysis (3)

internal/service/scan_service_impl.go (5)

internal/repository/scan_repository.go (1)

ScanRepository (16-28)

internal/service/scan_service.go (1)

ScanService (9-12)

internal/domain/entities/scan_request.go (2)

ScanRequest (4-17)

FolderNode (20-33)

internal/domain/entities/scan_response.go (2)

ScanResponse (4-7)

ScanResult (10-15)

internal/domain/entities/component.go (2)

ComponentGroup (4-11)

Version (14-17)

internal/repository/scan_repository.go (2)

internal/domain/entities/scan_request.go (1)

LanguageExtensions (36-36)

internal/domain/entities/component.go (1)

ComponentGroup (4-11)

internal/repository/scan_repository_qdrant_impl.go (2)

internal/domain/entities/scan_request.go (1)

LanguageExtensions (36-36)

internal/domain/entities/component.go (1)

ComponentGroup (4-11)

🔇 Additional comments (9)

internal/repository/scan_repository_qdrant_impl.go (2)

551-552: OK on distance→score conversion for API output.

Returning match scores [0..1] externally while ranking internally by distance is sound.

151-179: Clarify ScoreThreshold semantics for distance metrics (L2/Hamming).
For distance-based scores, ScoreThreshold is a maximum acceptable distance—any result with distance > threshold is excluded. Ensure your thresholds (e.g. 30) match the maximum distances you intend to allow and adjust tighter (smaller) or looser (larger) as needed.

README.md (1)

109-152: README flags match implementation. The -dir, -top-purls, and -overwrite flags in cmd/import/main.go are present and correctly documented.

internal/repository/scan_repository.go (1)

18-18: All SearchByHashes call sites updated

Service layer and repository implementation use the new signature without topK; no further changes needed.

go.mod (1)

13-13: Tidy and verify github.com/scanoss/papi v0.22.0
Run go mod tidy, compile the code and run tests; confirm there are no breaking changes in the scanningv2 API.

internal/service/scan_service_impl.go (4)

13-20: LGTM! Clean field rename.

The renaming of scanRepo to repo improves readability and aligns with common Go conventions for shorter field names.

47-83: LGTM! Effective filtering logic.

The updated signature and filtering implementation correctly:

Filters versions by minAcceptedScore

Excludes groups with no qualifying versions

Only creates results when relevant matches exist

This prevents low-quality results from cluttering the response.

127-137: LGTM! Clean helper function.

The implementation correctly checks for any version meeting the threshold across all component groups. The use of >= is appropriate for threshold semantics.

23-45: Confirm recursiveThreshold short-circuits subtree scanning

Current logic stops descending into a node’s children once any component meets the recursiveThreshold, which may skip deeper, higher-scoring matches. No docs or tests explicitly define this behavior—confirm it’s intentional.

internal/mapper/scan_mapper_impl.go

internal/service/scan_service_impl.go

… duplicated components

…e readmes

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

scripts/README.md (1)
89-102: Add language identifier to fenced code block.

The fenced code block lacks a language identifier. Add text or an appropriate identifier to satisfy markdown linting.

Apply this diff:
-```
+```text
 /usr/local/etc/scanoss/folder-hashing-api/
   └── app-config-prod.json

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fd98961 and 4b6fd3c.

📒 Files selected for processing (12)

CHANGELOG.md (1 hunks)
Dockerfile (0 hunks)
README.md (3 hunks)
cmd/import/main.go (2 hunks)
docker-compose.yml (0 hunks)
scripts/README.md (1 hunks)
scripts/create-collection-snapshots.sh (0 hunks)
scripts/create-docker-package.sh (0 hunks)
scripts/docker-deploy.sh (0 hunks)
scripts/import-collections.sh (0 hunks)
scripts/import-single-collection.sh (0 hunks)
scripts/setup-tls.sh (0 hunks)

💤 Files with no reviewable changes (8)

scripts/create-docker-package.sh
scripts/docker-deploy.sh
scripts/setup-tls.sh
scripts/create-collection-snapshots.sh
scripts/import-single-collection.sh
scripts/import-collections.sh
docker-compose.yml
Dockerfile

🧰 Additional context used

🪛 markdownlint-cli2 (0.18.1)

README.md

40-40: Bare URL used

(MD034, no-bare-urls)

scripts/README.md

89-89: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (4)

cmd/import/main.go (1)

37-38: LGTM! Clean addition of configurable Qdrant connection parameters.

The new CLI flags for --qdrant-host and --qdrant-port are well-integrated, with sensible defaults from existing constants and proper usage in the client configuration.

Also applies to: 69-70

CHANGELOG.md (1)

9-9: Verify the release date.

The release date 2025-10-08 is set for October 2025, but this PR was created on 2025-08-06 (August 2025). Ensure this future date is intentional and not a typo.

README.md (2)

8-32: LGTM! Clear prerequisites and quick start guide.

The new Prerequisites and Quick Start sections provide a straightforward path for users to get up and running with the service. The step-by-step instructions are clear and include verification steps.

124-186: LGTM! Comprehensive import tool documentation.

The new "Importing Data" section provides excellent documentation for the CSV import tool, including:

Clear usage examples

Command-line options table

Explanation of how the tool works

End-to-end workflow example

This will help users understand how to populate the vector database.

coderabbitai

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

internal/config/config.go (1)
137-139: Reconsider enabling dynamic logging by default

SetDefaults now flips Logging.DynamicLogging from the old zero-value (false) to true. With zap-logging-helper this spins up the dynamic logging HTTP listener on every boot, which is a behavioral regression: deployments that relied on the prior default will start a new server on localhost:60061 and fail if that port is occupied (or expose runtime level toggling without opting in). Please keep the default disabled unless you also update the consuming code/docs to expect it.
-	c.Logging.DynamicLogging = true
-	c.Logging.DynamicPort = "localhost:60061"
+	c.Logging.DynamicLogging = false
+	c.Logging.DynamicPort = "localhost:60061"
Based on learnings
internal/repository/scan_repository_qdrant_impl.go (1)
263-276: Fix incorrect variable reference in finalFilterPreferred.

Line 269's comment (in Spanish) correctly notes that finalFilterPreferred should reference stage3IDs, not stage2IDs. The current code uses stage2IDs which includes candidates from an earlier stage, potentially causing incorrect filtering in the final query.

The fix requires defining stage3IDs from the appropriate result set before using it. However, the current code structure doesn't create a stage3 result set—it goes directly from stage2Results to finalFilter. You need to either:

Create an explicit stage3IDs slice from the filtered stage2Results (lines 243-249), or

Use a more descriptive variable name like topStage2IDs or filteredStage2IDs and update the comment accordingly

If option 1 is intended:
 	// Stage 3: Final refinement using nested prefetch for the best results
-	stage2IDs := make([]uint64, 0, len(stage2Results))
+	stage3IDs := make([]uint64, 0, len(stage2Results))
 	for _, point := range stage2Results {
 		if point.Score > (stage2Results[0].Score*1.1)+5 {
 			break
 		}
-		stage2IDs = append(stage2IDs, point.Id.GetNum())
+		stage3IDs = append(stage3IDs, point.Id.GetNum())
 	}
 
 	finalFilter := &qdrant.Filter{
 		Must: []*qdrant.Condition{
 			{
 				ConditionOneOf: &qdrant.Condition_HasId{
 					HasId: &qdrant.HasIdCondition{
-						HasId: r.convertIDsToPointIDs(stage2IDs),
+						HasId: r.convertIDsToPointIDs(stage3IDs),
 					},
 				},
 			},
 		},
 	}
 
 	finalFilterPreferred := &qdrant.Filter{
 		Must: append(
 			[]*qdrant.Condition{
 				{
 					ConditionOneOf: &qdrant.Condition_HasId{
 						HasId: &qdrant.HasIdCondition{
-							HasId: r.convertIDsToPointIDs(stage2IDs), // Nota: debería ser stage3IDs, no stage2IDs
+							HasId: r.convertIDsToPointIDs(stage3IDs),
 						},
 					},
 				},
 			},
 			exactRank1Conditions...,
 		),
 	}

♻️ Duplicate comments (1)

internal/mapper/scan_mapper_impl.go (1)
25-32: Restore the documented MinAcceptedScore default

Mapping a nil/omitted MinAcceptedScore straight through keeps it at zero, which contradicts the struct comment in entities.ScanRequest (“default: 0.15”) and materially widens the result set by accepting every low-score match. Apply the domain default before returning the request (same concern raised earlier).
-	return &entities.ScanRequest{
+	sr := &entities.ScanRequest{
 		RankThreshold:      int(req.RankThreshold),
 		RecursiveThreshold: req.RecursiveThreshold,
 		MinAcceptedScore:   req.MinAcceptedScore,
 		Category:           req.Category,
 		QueryLimit:         int(req.QueryLimit),
 		Root:               m.ChildrenToDomain(req.Root),
-	}
+	}
+	if sr.MinAcceptedScore == 0 {
+		sr.MinAcceptedScore = 0.15
+	}
+	return sr
+```

</blockquote></details>

</blockquote></details>

<details>
<summary>🧹 Nitpick comments (3)</summary><blockquote>

<details>
<summary>internal/repository/scan_repository_qdrant_impl.go (2)</summary><blockquote>

`398-398`: **Extract magic numbers to named constants.**

The scoring and filtering logic uses several magic numbers (1.1, 3.0, 0.2, 2, 5) that make the thresholding algorithm harder to understand and maintain. Consider extracting these as named constants with explanatory comments.



Add constants at the package level:

```go
const (
	// Scoring threshold multipliers and offsets
	scoreMultiplier        = 1.1  // Allow 10% score variance
	scoreOffsetSmall       = 2.0  // Offset for zero-score edge cases
	scoreOffsetMedium      = 3.0  // Offset for garbage result filtering
	scoreOffsetLarge       = 5.0  // Offset for stage filtering
	scoreSimilarityPercent = 0.2  // 20% similarity threshold
)
Then use them in the code:
-	scoreThreshold := searchResp[0].Score*1.1 + 3.0
+	scoreThreshold := searchResp[0].Score*scoreMultiplier + scoreOffsetMedium
Also applies to: 523-523, 279-279, 245-245

157-157: Clarify comment phrasing.

The comments "limit to garbage results" at lines 157 and 168 are ambiguous. Consider rephrasing to "exclude garbage results" or "filter out low-quality results" for clarity.

Also applies to: 168-168
internal/service/scan_service_impl.go (1)

46-47: Use Infof for formatted logging.

SugaredLogger.Info ignores %d, so this logs “Scan completed: found %d results ”. Switch to Infof (or Info with fmt.Sprintf) to format correctly.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4b6fd3c and aafe6c3.

📒 Files selected for processing (22)

.golangci.yml (1 hunks)
Makefile (1 hunks)
cmd/import/main.go (17 hunks)
cmd/server/main.go (4 hunks)
go.mod (1 hunks)
internal/config/config.go (4 hunks)
internal/domain/entities/component.go (2 hunks)
internal/domain/entities/language.go (4 hunks)
internal/domain/entities/metrics.go (1 hunks)
internal/domain/entities/scan_request.go (3 hunks)
internal/domain/entities/scan_response.go (1 hunks)
internal/domain/entities/version.go (1 hunks)
internal/handler/scan_handler.go (3 hunks)
internal/mapper/scan_mapper.go (1 hunks)
internal/mapper/scan_mapper_impl.go (1 hunks)
internal/protocol/grpc/server.go (1 hunks)
internal/protocol/rest/server.go (1 hunks)
internal/repository/scan_repository.go (2 hunks)
internal/repository/scan_repository_qdrant_impl.go (18 hunks)
internal/service/scan_service.go (1 hunks)
internal/service/scan_service_impl.go (2 hunks)
internal/validation/validator.go (1 hunks)

✅ Files skipped from review due to trivial changes (1)

internal/domain/entities/scan_response.go

🧰 Additional context used

🧬 Code graph analysis (6)

internal/domain/entities/language.go (1)

internal/domain/entities/scan_request.go (1)

LanguageExtensions (36-36)

internal/service/scan_service.go (2)

internal/domain/entities/scan_request.go (1)

ScanRequest (4-17)

internal/domain/entities/scan_response.go (1)

ScanResponse (4-7)

internal/protocol/grpc/server.go (2)

internal/protocol/rest/server.go (1)

RunServer (35-63)

internal/config/config.go (1)

Config (38-72)

internal/handler/scan_handler.go (2)

internal/service/scan_service.go (1)

ScanService (11-14)

internal/mapper/scan_mapper.go (1)

ScanMapper (11-15)

internal/service/scan_service_impl.go (5)

internal/repository/scan_repository.go (1)

ScanRepository (19-31)

internal/service/scan_service.go (1)

ScanService (11-14)

internal/domain/entities/scan_request.go (2)

ScanRequest (4-17)

FolderNode (20-33)

internal/domain/entities/scan_response.go (2)

ScanResponse (4-7)

ScanResult (10-15)

internal/domain/entities/component.go (2)

ComponentGroup (5-12)

Version (15-18)

internal/repository/scan_repository.go (2)

internal/domain/entities/scan_request.go (1)

LanguageExtensions (36-36)

internal/domain/entities/component.go (1)

ComponentGroup (5-12)

🪛 checkmake (0.2.2)

Makefile

[warning] 1-1: Missing required phony target "all"

(minphony)

[warning] 76-76: Target "ci" should be declared PHONY.

(phonydeclared)

🔇 Additional comments (12)

internal/domain/entities/metrics.go (1)

15-15: LGTM: Comment capitalization matches function name.

Good practice to capitalize the comment to match the exported function name.

internal/service/scan_service.go (1)

1-14: Docstring addition looks good

Thanks for adding the package-level context; it helps clarify the service surface.

go.mod (1)

3-13: papi upgrade acknowledged

Moving to go1.23.x and bumping github.com/scanoss/papi to v0.22.0 keeps the module in sync with the latest helper features. Based on learnings

internal/protocol/grpc/server.go (1)

27-65: Scoped shutdown helper reads cleaner

The short declaration keeps oltpShutdown local while preserving the teardown path—looks good.

internal/domain/entities/version.go (1)

19-21: Clearer AppVersion comment

Nice clarification on how to inject the version via ldflags; matches the build tooling updates.

.golangci.yml (1)

1-235: LGTM! Comprehensive linting configuration.

The expanded linting setup significantly improves code quality enforcement with appropriate timeouts, a well-chosen linter suite, sensible per-linter settings, and practical exclusion rules. This aligns well with the PR's broader refactoring and quality improvements.

internal/protocol/rest/server.go (1)

21-35: LGTM!

The import reordering improves organization, and the nolint directive appropriately documents the context parameter position constraint imposed by the existing API design.

internal/validation/validator.go (1)

1-26: LGTM!

The added package documentation and improved formatting enhance code clarity without altering behavior.

internal/handler/scan_handler.go (1)

1-104: LGTM!

The enhanced validation flow (protobuf validation → domain validation) and structured error responses improve robustness and API contract enforcement. The implementation correctly handles validation failures with appropriate status codes.

cmd/server/main.go (1)

20-110: LGTM!

The import consolidation improves readability, the nolint directive appropriately documents the early-exit pattern before defer setup, and the REST port check change (!= "" instead of len > 0) is more idiomatic for string emptiness checks in Go.

Makefile (1)

11-11: Confirmed entities.AppVersion exists
AppVersion is declared in internal/domain/entities/version.go (line 21); LDFLAGS embedding in the Makefile is valid.

internal/mapper/scan_mapper.go (1)

1-15: Approve ChildrenToDomain implementation

Found implementation at internal/mapper/scan_mapper_impl.go:55; no further changes needed.

cmd/import/main.go

internal/domain/entities/metrics.go

internal/repository/scan_repository_qdrant_impl.go

internal/service/scan_service_impl.go

Makefile

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

.github/workflows/release.yml (1)
24-45: Release artifacts no longer exist with the expected names.

make build_amd64/build_arm64 now emit target/folder-hashing-api-linux-* (no scanoss- prefix), yet the release step still uploads target/scanoss-folder-hashing-api-linux-*. On the next tagged release, action-gh-release will fail because those files are missing. Please align the file list with the new build output names.
-            target/scanoss-folder-hashing-api-linux-amd64
-            target/scanoss-folder-hashing-api-linux-arm64
-            target/scanoss-hfh-import-linux-amd64
-            target/scanoss-hfh-import-linux-arm64
+            target/folder-hashing-api-linux-amd64
+            target/folder-hashing-api-linux-arm64
+            target/hfh-import-linux-amd64
+            target/hfh-import-linux-arm64

♻️ Duplicate comments (1)

Makefile (1)
1-1: Expand the .PHONY list for the new targets.

We now call targets such as ci, build_amd64, build_import_amd64, test-coverage, etc., but they are not marked phony. Marking them prevents make from treating similarly named files as up-to-date and matches the last review’s guidance.
-.PHONY: help test lint fmt build clean run
+.PHONY: help test lint fmt build clean run ci version tidy build_amd64 build_arm64 build_import_amd64 build_import_arm64 test-coverage clean-testcache lint-fix vet package_amd64 package_arm64

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between aafe6c3 and 4a1ce08.

📒 Files selected for processing (3)

.github/workflows/go-ci.yml (1 hunks)
.github/workflows/release.yml (1 hunks)
Makefile (1 hunks)

🧰 Additional context used

🪛 checkmake (0.2.2)

Makefile

[warning] 76-76: Target "ci" should be declared PHONY.

(phonydeclared)

[warning] 1-1: Missing required phony target "all"

(minphony)

Makefile

matiasdaloia self-assigned this Aug 6, 2025

matiasdaloia added the enhancement New feature or request label Aug 6, 2025

matiasdaloia force-pushed the feature/mdaloia/SP-2993-Folder-Hashing-Service-Support-recursive-search branch 2 times, most recently from a26dae1 to 3619481 Compare September 22, 2025 13:23

coderabbitai bot reviewed Oct 7, 2025

View reviewed changes

internal/mapper/scan_mapper_impl.go Outdated Show resolved Hide resolved

internal/service/scan_service_impl.go Outdated Show resolved Hide resolved

matiasdaloia and others added 8 commits October 8, 2025 16:44

[SP-2993] feat: add recursive search

6202da8

[SP-2993] chore: remove unused code

119f1bb

chore: update readme

4a582e4

[SP-2993] chore: rename min_cutoff_threshold to recursive_threshold

0bee92e

[SP-2993] feat: always scan root node

f883662

[SP-2993] feat: add min_accepted_score and modify logic to not report…

d2988f9

… duplicated components

[SP-2993] chore: update papi version

56f34fb

scanning parameter tune-up

fd98961

matiasdaloia force-pushed the feature/mdaloia/SP-2993-Folder-Hashing-Service-Support-recursive-search branch from e7a97d6 to fd98961 Compare October 8, 2025 14:44

[SP-2993] chore: remove docker related files, update changelog, updat…

4b6fd3c

…e readmes

coderabbitai bot reviewed Oct 8, 2025

View reviewed changes

[SP-2993] chore: update makefile, fix linting errors

aafe6c3

coderabbitai bot reviewed Oct 8, 2025

View reviewed changes

[SP-2993] chore: update workflows and makefile

4a1ce08

coderabbitai bot reviewed Oct 8, 2025

View reviewed changes

Makefile Show resolved Hide resolved

[SP-2993] chore: final adjustments

ac56ea4

matiasdaloia merged commit 4240da0 into main Oct 8, 2025
3 checks passed

matiasdaloia deleted the feature/mdaloia/SP-2993-Folder-Hashing-Service-Support-recursive-search branch October 8, 2025 16:32

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[SP-2993] feat: add support for recursive search #19

[SP-2993] feat: add support for recursive search #19

Uh oh!

matiasdaloia commented Aug 6, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Sep 22, 2025 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

[SP-2993] feat: add support for recursive search #19

[SP-2993] feat: add support for recursive search #19

Uh oh!

Conversation

matiasdaloia commented Aug 6, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Sep 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

Pre-merge checks and finishing touches

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

matiasdaloia commented Aug 6, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Sep 22, 2025 •

edited

Loading