isasmendiagus
Contributor

SCANOSS snippet report generation

  • Implement snippet choice functionality for SCANOSS integration
  • Add support for processing exclude paths from ort.yml configuration
  • Add dedicated SCANOSS snippet report

@isasmendiagus isasmendiagus force-pushed the feat/scanoss/parse-ort-yml-file-on-scanoss-integration-cherry-pick branch from 29719d8 to aa42ab6 Compare March 5, 2025 15:04
@sschuberth sschuberth changed the title from "feat(scanoss-plugin): implement ort.yml configuration parsing & add SCANOSS snippet report generation" to "Implement ort.yml parsing snippet report generation for SCANOSS" Mar 5, 2025

codecov bot commented Mar 5, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 56.46%. Comparing base (b13bfe0) to head (2a8bda9).
Report is 13 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #10004      +/-   ##
============================================
+ Coverage     56.39%   56.46%   +0.06%     
- Complexity     1602     1603       +1     
============================================
  Files           331      331              
  Lines         12261    12261              
  Branches       1141     1141              
============================================
+ Hits           6915     6923       +8     
+ Misses         4897     4889       -8     
  Partials        449      449              
Flag Coverage Δ
funTest-non-docker 33.42% <ø> (ø)
test-ubuntu-24.04 40.23% <ø> (ø)
test-windows-2022 40.21% <ø> (ø)

@isasmendiagus isasmendiagus force-pushed the feat/scanoss/parse-ort-yml-file-on-scanoss-integration-cherry-pick branch from aa42ab6 to 5f67f5c Compare March 7, 2025 08:17
@isasmendiagus isasmendiagus force-pushed the feat/scanoss/parse-ort-yml-file-on-scanoss-integration-cherry-pick branch from 5f67f5c to e45cf6e Compare March 7, 2025 12:33
@isasmendiagus isasmendiagus force-pushed the feat/scanoss/parse-ort-yml-file-on-scanoss-integration-cherry-pick branch from 7e6ed63 to e45cf6e Compare March 12, 2025 07:18
@isasmendiagus isasmendiagus marked this pull request as ready for review March 12, 2025 07:19
@isasmendiagus isasmendiagus requested a review from a team as a code owner March 12, 2025 07:19
@isasmendiagus isasmendiagus marked this pull request as draft March 12, 2025 09:11
@isasmendiagus isasmendiagus force-pushed the feat/scanoss/parse-ort-yml-file-on-scanoss-integration-cherry-pick branch 6 times, most recently from 7f3895b to e36f5ec Compare March 12, 2025 22:59
@isasmendiagus isasmendiagus marked this pull request as ready for review March 12, 2025 23:00
val snippets = getSnippets(details)

if (sourceLocations.size != snippets.size) {
    logger.warn("number of local line ranges does not match with oss lines on file '$file'")
Member

@sschuberth In addition to the warnings, should an issue also be created?

Same comment for the other warning underneath.

Member

Why is it even a warning? What is "bad" about the condition being triggered? As a user, I could not tell from the message.
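
Both review comments could be addressed by a message that states what was compared and what follows from the mismatch, returned as an issue-like value in addition to being logged. The following is only a minimal sketch: `SketchIssue` and `checkLineRangeCounts` are hypothetical names, the wording of the consequence is an assumption, and ORT's real `Issue` class and logger are deliberately not used.

```kotlin
// Hypothetical stand-in for an issue record; this is not ORT's actual Issue class.
data class SketchIssue(val source: String, val message: String)

/**
 * Compare the number of local line ranges with the number of snippets reported
 * for [file]. Return null if they match, or an issue with an explanatory message.
 */
fun checkLineRangeCounts(file: String, localRangeCount: Int, snippetCount: Int): SketchIssue? {
    if (localRangeCount == snippetCount) return null

    // The stated consequence is an assumption made for illustration.
    val message = "File '$file' has $localRangeCount local line range(s) but $snippetCount matched " +
        "snippet(s); ranges and snippets cannot be paired one-to-one, so some findings may be incomplete."

    return SketchIssue(source = "SCANOSS", message = message)
}
```

A caller could log the message at warning level and also attach the returned value to the scan summary, so the condition is visible in reports and not only in logs.
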

@isasmendiagus isasmendiagus force-pushed the feat/scanoss/parse-ort-yml-file-on-scanoss-integration-cherry-pick branch from e36f5ec to 3ba0341 Compare March 13, 2025 15:47

)

ScanFileResult(fileName, it.fileDetails)
val rawResults: List<String> = when {
Member

Please drop the explicit : List<String> type declaration.

Member

Where did the whole logic to anonymize file names (in order to not expose them to the SCANOSS service) go?

Contributor Author

We've refactored the implementation by replacing the direct fingerprint generation and API calls from ORT with the Java SCANOSS SDK, which now handles these operations internally.

The path anonymization functionality isn't currently implemented in the SCANOSS Java SDK. As soon as this feature becomes available in the SDK, we plan to integrate it into our ORT implementation.

Contributor Author

As discussed in previous calls, we've agreed to proceed with integrating the SCANOSS Java SDK now, even without path anonymization functionality. We plan to implement this feature once it becomes available in the SDK.

Member

Hmm. I'm a bit surprised by this decision, but since the original request for anonymization came from Bosch via @nnobelis, I'm fine with it.

However, that's again something that could be done as a preparatory commit: Remove anonymization from the existing implementation beforehand, justifying it in the commit message with what you wrote above.

That way, the actual migration to the new SCANOSS SDK becomes more of a 1:1 migration feature-wise, and thus easier to compare and review.

* several snippets are created in ORT each containing a single Purl.
*/
- private fun getSnippets(details: ScanFileDetails): Set<Snippet> {
+ private fun getSnippets(details: ScanFileDetails): List<Snippet> {
Member

Why does this need to return a list now?

Contributor Author

We changed the return type from Set to List because we need to maintain a consistent ordering of snippets to ensure we always select the first PURL as the primary identifier.
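
The ordering argument can be illustrated with a small sketch. `SketchSnippet` and `toSnippets` are hypothetical, simplified names and not the code from this PR. Note that Kotlin's standard set builders happen to keep insertion order, but the `Set` interface itself does not promise it, whereas a `List` makes the order part of the function's contract.

```kotlin
// Hypothetical, simplified snippet model; ORT's Snippet class has more fields.
data class SketchSnippet(val purl: String, val isPrimary: Boolean)

/**
 * Create one snippet per PURL, marking the first PURL as the primary identifier.
 * Taking and returning a List keeps the "first" element well defined.
 */
fun toSnippets(purls: List<String>): List<SketchSnippet> =
    purls.mapIndexed { index, purl -> SketchSnippet(purl, isPrimary = index == 0) }
```

With this shape, `toSnippets(purls).first()` always refers to the snippet built from the first reported PURL.
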

@isasmendiagus isasmendiagus force-pushed the feat/scanoss/parse-ort-yml-file-on-scanoss-integration-cherry-pick branch from c546cdd to e38d058 Compare March 27, 2025 18:31
@isasmendiagus isasmendiagus force-pushed the feat/scanoss/parse-ort-yml-file-on-scanoss-integration-cherry-pick branch from e38d058 to 866648d Compare March 28, 2025 09:22
"plugins/reporters/asciidoc/src/main/resources/pdf-theme/pdf-theme.yml",
"plugins/reporters/asciidoc/src/main/resources/templates/freemarker_implicit.ftl",
"plugins/reporters/fossid/src/main/resources/templates/freemarker_implicit.ftl",
"plugins/reporters/scanoss/src/main/resources/templates/freemarker_implicit.ftl",
Member

Can we make a cut here and only keep the preceding 5 commits in this PR, and create separate PRs for the next 4 commits? (Probably 2 or 3 other PRs that each focus on report generation or refactoring.)

Contributor Author

Here's how I propose to organize them:

PR 1: Migrate to SCANOSS SDK

  • refactor(scanoss): Remove path anonymization from SCANOSS implementation
  • refactor(scanoss): Replace direct API calls with SCANOSS SDK
  • refactor(scanoss): Set SCANOSS matcher property to null

PR 2: Add Exclusion Pattern & Snippet Choice Support

  • feat(scanoss): Add exclusion pattern support to SCANOSS
  • feat(scanoss): Add snippet choice parsing for scan results

PR 3: Report Generation

  • feat(scanoss): Add snippet report generation
  • feat(scanoss): Add release date to snippet findings
  • refactor(scanoss): ScanOssResultParser to improve snippets findings

Since these changes build on each other, these PRs would need to be merged sequentially. Would you prefer that I create all three PRs at once and mark the dependent ones as "Draft" until their predecessors are merged, or should I submit them one at a time as each gets approved?

I can prepare these separate PRs right away once we agree on the approach.

Member

@sschuberth, May 5, 2025

> Would you prefer that I create all three PRs at once and mark the dependent ones as "Draft" until their predecessors are merged, or should I submit them one at a time as each gets approved?

Either way is fine with me. Maybe the latter creates a bit less "clutter" due to open draft PRs.

> I can prepare these separate PRs right away once we agree on the approach.

Great, thanks. Please go ahead as you see fit.

Contributor Author

First PR is #10265

Member

@isasmendiagus, is it ok to close this PR already to clean things up, as all its remainders are now included in #10287?

@isasmendiagus isasmendiagus force-pushed the feat/scanoss/parse-ort-yml-file-on-scanoss-integration-cherry-pick branch from 866648d to 145a225 Compare May 5, 2025 14:51

Remove the path anonymization functionality from the existing SCANOSS
implementation as preparation for migrating to the Java SCANOSS SDK.

This is a temporary removal. While path anonymization is not yet available
in the SDK, we plan to implement this feature in the upstream SDK in the
future.

This approach allows us to consolidate all SCANOSS functionality in the SDK
rather than maintaining custom implementations.

Signed-off-by: Agustin Isasmendi <[email protected]>

Replace custom direct API calls to SCANOSS with the official Java SDK.
This change improves maintainability by leveraging the SDK's functionality
instead of maintaining a custom implementation for API interactions.

Signed-off-by: Agustin Isasmendi <[email protected]>

Force the SCANOSS scanner's matcher property to null to prevent loading
results from scan storage. This follows the same approach implemented for
other snippet scanners, where the consensus was that snippet scanner
results should never come from scan storage.

This fixes an issue where `context.excludes` was being nullified in
`ScanOss.scanPath()`, preventing proper application of exclusion
patterns.

Signed-off-by: Agustin Isasmendi <[email protected]>

@isasmendiagus isasmendiagus force-pushed the feat/scanoss/parse-ort-yml-file-on-scanoss-integration-cherry-pick branch from 145a225 to 2a8bda9 Compare May 5, 2025 15:18

Implement exclusion filtering to respect path patterns specified in the
`.ort.yml` configuration. The scanner now properly excludes files matching
the patterns during the scan process.

Signed-off-by: Agustin Isasmendi <[email protected]>
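
The exclusion filtering described in this commit message can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: `filterExcluded` is a hypothetical helper, the patterns are assumed to be glob patterns as they would be listed under path excludes in `.ort.yml`, and matching is done with the JDK's glob `PathMatcher`, whose semantics may differ in details from ORT's own matching.

```kotlin
import java.nio.file.FileSystems
import java.nio.file.Path
import java.nio.file.Paths

/**
 * Return the paths that match none of the given glob [patterns].
 * Uses the JDK's glob matcher; ORT's own pattern matching may differ in details.
 */
fun filterExcluded(paths: List<String>, patterns: List<String>): List<String> {
    // Compile each pattern once instead of once per path.
    val matchers = patterns.map { FileSystems.getDefault().getPathMatcher("glob:$it") }

    return paths.filter { path ->
        val candidate: Path = Paths.get(path)
        matchers.none { it.matches(candidate) }
    }
}
```

Only the paths that survive this filter would then be handed to the scanner, so excluded files are never fingerprinted or sent to the service.
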
Implement snippet choice processing functionality. It handles findings
according to two different scenarios:
- Original findings that should be included
- Non-relevant findings that should be removed

The implementation converts ORT's SnippetChoices into SCANOSS-specific rule
types.

Signed-off-by: Agustin Isasmendi <[email protected]>
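
The two scenarios in this commit message amount to a mapping from a choice reason to a rule type. A minimal sketch, in which every type name (`ChoiceReason`, `RuleType`, `SketchChoice`, `SketchRule`) is a hypothetical stand-in modelled on the commit message and not the actual ORT or SCANOSS SDK API:

```kotlin
// Hypothetical enums for illustration; the names are assumptions modelled on the
// two scenarios in the commit message, not the actual ORT or SCANOSS SDK types.
enum class ChoiceReason { ORIGINAL_FINDING, NO_RELEVANT_FINDING }
enum class RuleType { INCLUDE, REMOVE }

data class SketchChoice(val path: String, val purl: String?, val reason: ChoiceReason)
data class SketchRule(val type: RuleType, val path: String, val purl: String?)

/** Convert each snippet choice into an include or remove rule. */
fun toRules(choices: List<SketchChoice>): List<SketchRule> =
    choices.map { choice ->
        // The exhaustive `when` fails to compile if a new reason is added without a mapping.
        val type = when (choice.reason) {
            ChoiceReason.ORIGINAL_FINDING -> RuleType.INCLUDE
            ChoiceReason.NO_RELEVANT_FINDING -> RuleType.REMOVE
        }
        SketchRule(type, choice.path, choice.purl)
    }
```
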
Implement functionality to generate snippet findings reports from
SCANOSS scan results.

Signed-off-by: Agustin Isasmendi <[email protected]>

Include releaseDate in snippetFindings additionalData and add a new
column to display this information in generated reports.

Signed-off-by: Agustin Isasmendi <[email protected]>

* Generate one Snippet for each detected line range
* Remove duplicate licenses to optimize results
* Remove identified snippets from the summary

Signed-off-by: Agustin Isasmendi <[email protected]>

Generate test files with random content using /dev/urandom to avoid
any potential matches when scanning ORT source code.

Signed-off-by: Agustin Isasmendi <[email protected]>
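
For comparison, random test content can also be produced from Kotlin itself. This sketch swaps in `kotlin.random.Random` for reading `/dev/urandom`, so it is a different technique from the one named in the commit message; `createRandomTestFile` and the file name prefix are hypothetical.

```kotlin
import java.io.File
import kotlin.random.Random

/**
 * Write [size] random bytes to a new temporary file and return it.
 * Uses kotlin.random.Random instead of reading /dev/urandom, so it also runs
 * on platforms without that device.
 */
fun createRandomTestFile(size: Int, random: Random = Random.Default): File {
    val file = File.createTempFile("scanoss-sketch-", ".bin")
    file.deleteOnExit()
    file.writeBytes(random.nextBytes(size))
    return file
}
```

Passing a seeded `Random` makes the content reproducible across runs, which a read from `/dev/urandom` does not offer.
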
@isasmendiagus isasmendiagus force-pushed the feat/scanoss/parse-ort-yml-file-on-scanoss-integration-cherry-pick branch from 2a8bda9 to 642f189 Compare May 6, 2025 11:26