Skip to content

Conversation

isasmendiagus
Copy link
Contributor

Key Changes

Corrected Path Mapping

  • Before: Source location path showed the OSS file while snippet location showed SCANOSS file content URL https://api.scanoss.com/file_contents/...

  • After: Source location now correctly shows the scanned file and snippet location shows the actual matched OSS file
    See atached image for check diff.

See the attached diff image for a visual representation of the path mapping changes
image

Fixed Snippet Generation Logic:

  • Before: Snippets were generated using a cartesian product of PURLs × line ranges, creating duplicate snippets for the same code match.

  • After: One snippet is created per line range, using the first PURL as primary identifier and storing all PURLs in metadata.

@isasmendiagus isasmendiagus force-pushed the refactor/scanoss/result-parser-function branch from 682bb7a to 96644b5 Compare May 12, 2025 16:01
@isasmendiagus isasmendiagus marked this pull request as ready for review May 12, 2025 16:01
@isasmendiagus isasmendiagus requested a review from a team as a code owner May 12, 2025 16:01
Copy link

codecov bot commented May 12, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 56.66%. Comparing base (feecb6d) to head (0bbb508).
Report is 15 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #10310      +/-   ##
============================================
+ Coverage     56.63%   56.66%   +0.03%     
- Complexity     1615     1617       +2     
============================================
  Files           332      332              
  Lines         12281    12281              
  Branches       1139     1139              
============================================
+ Hits           6955     6959       +4     
+ Misses         4881     4877       -4     
  Partials        445      445              
Flag Coverage Δ
funTest-docker 70.89% <ø> (ø)
funTest-non-docker 33.39% <ø> (ø)
test-ubuntu-24.04 40.62% <ø> (+0.02%) ⬆️
test-windows-2022 40.60% <ø> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@isasmendiagus isasmendiagus force-pushed the refactor/scanoss/result-parser-function branch from 96644b5 to da11101 Compare May 12, 2025 17:01
@nnobelis
Copy link
Member

@isasmendiagus Just a question since your are on the improvements. Is the anonymous mode still on your radar ?

@isasmendiagus
Copy link
Contributor Author

@isasmendiagus Just a question since your are on the improvements. Is the anonymous mode still on your radar ?

Hi @nnobelis, yes. We're about to release the scanoss.java library with path anonymization. Then adding the feature to ORT should be straight forward.

Question here about priorities, do you prefer the generic report first and then path anonymization? Or the other way around?

@nnobelis
Copy link
Member

Question here about priorities, do you prefer the generic report first and then path anonymization? Or the other way around?

For us, path anonymization is more important, if @sschuberth can live with delaying the generic report a bit more....

@sschuberth
Copy link
Member

For us, path anonymization is more important, if @sschuberth can live with delaying the generic report a bit more....

Fine with me.

@isasmendiagus isasmendiagus force-pushed the refactor/scanoss/result-parser-function branch from da11101 to 7631d6c Compare May 14, 2025 16:30
@sschuberth

This comment was marked as outdated.

@isasmendiagus isasmendiagus requested a review from sschuberth May 15, 2025 05:32
@isasmendiagus isasmendiagus force-pushed the refactor/scanoss/result-parser-function branch 2 times, most recently from 3990c8e to f36d535 Compare May 19, 2025 15:58
@isasmendiagus isasmendiagus requested a review from sschuberth May 19, 2025 15:59
Copy link
Member

@sschuberth sschuberth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, just some final minor tweaks required!

"response" : {
"status" : 200,
"body" : "{ \"utils/src/main/kotlin/random-data-05-06-11.kt\": [ { \"id\": \"snippet\", \"status\": \"pending\", \"lines\": \"1-240\", \"oss_lines\": \"128-367\", \"matched\": \"99%\", \"purl\": [ \"pkg:github/scanoss/ort\" ], \"vendor\": \"scanoss\", \"component\": \"ort\", \"version\": \"e654028\", \"latest\": \"b12f8ee\", \"url\": \"https://github.com/scanoss/ort\", \"release_date\": \"2021-03-18\", \"file\": \"utils/src/main/kotlin/random-data-05-06-11.kt\", \"url_hash\": \"37faa38a820322fa93bf7a8fa8290bb8\", \"file_hash\": \"871fb0c5188c2f620d9b997e225b0095\", \"source_hash\": \"2e91edbe430c4eb195a977d326d6d6c0\", \"file_url\": \"https://osskb.org/api/file_contents/871fb0c5188c2f620d9b997e225b0095\", \"licenses\": [ { \"name\": \"Apache-2.0\", \"patent_hints\": \"yes\", \"copyleft\": \"no\", \"checklist_url\": \"https://www.osadl.org/fileadmin/checklists/unreflicenses/Apache-2.0.txt\", \"osadl_updated\": \"2022-03-17 13:38\", \"source\": \"file_spdx_tag\" }, { \"name\": \"Apache-2.0\", \"patent_hints\": \"yes\", \"copyleft\": \"no\", \"checklist_url\": \"https://www.osadl.org/fileadmin/checklists/unreflicenses/Apache-2.0.txt\", \"osadl_updated\": \"2022-03-17 13:38\", \"source\": \"scancode\" } ], \"server\": { \"version\": \"4.4.2\", \"kb_version\": { \"monthly\": \"22.02\", \"daily\": \"22.03.25\" } } } ], \"5530105e-0752-4750-9c07-4e4604b879a5\": [ { \"id\": \"file\", \"status\": \"pending\", \"lines\": \"all\", \"oss_lines\": \"all\", \"matched\": \"100%\", \"purl\": [ \"pkg:github/scanoss/ort\" ], \"vendor\": \"scanoss\", \"component\": \"ort\", \"version\": \"e654028\", \"latest\": \"b12f8ee\", \"url\": \"https://github.com/scanoss/ort\", \"release_date\": \"2021-03-18\", \"file\": \"scanner/src/main/kotlin/random-data-05-07-04.kt\", \"url_hash\": \"37faa38a820322fa93bf7a8fa8290bb8\", \"file_hash\": \"5c8ab9be40df937e46c53509481107cd\", \"source_hash\": \"5c8ab9be40df937e46c53509481107cd\", \"file_url\": \"https://osskb.org/api/file_contents/5c8ab9be40df937e46c53509481107cd\", \"licenses\": [ { \"name\": \"Apache-2.0\", \"patent_hints\": \"yes\", \"copyleft\": \"no\", \"checklist_url\": \"https://www.osadl.org/fileadmin/checklists/unreflicenses/Apache-2.0.txt\", \"osadl_updated\": \"2022-03-17 13:38\", \"source\": \"file_spdx_tag\" }, { \"name\": \"Apache-2.0\", \"patent_hints\": \"yes\", \"copyleft\": \"no\", \"checklist_url\": \"https://www.osadl.org/fileadmin/checklists/unreflicenses/Apache-2.0.txt\", \"osadl_updated\": \"2022-03-17 13:38\", \"source\": \"scancode\" } ], \"server\": { \"version\": \"4.4.2\", \"kb_version\": { \"monthly\": \"22.02\", \"daily\": \"22.03.25\" } } } ]}",
"body" : "{ \"utils/src/main/kotlin/random-data-05-06-11.kt\": [ { \"id\": \"snippet\", \"status\": \"pending\", \"lines\": \"1-240\", \"oss_lines\": \"128-367\", \"matched\": \"99%\", \"purl\": [ \"pkg:github/scanoss/ort\" ], \"vendor\": \"scanoss\", \"component\": \"ort\", \"version\": \"e654028\", \"latest\": \"b12f8ee\", \"url\": \"https://github.com/scanoss/ort\", \"release_date\": \"2021-03-18\", \"file\": \"examples/example.rules.kts\", \"url_hash\": \"37faa38a820322fa93bf7a8fa8290bb8\", \"file_hash\": \"871fb0c5188c2f620d9b997e225b0095\", \"source_hash\": \"2e91edbe430c4eb195a977d326d6d6c0\", \"file_url\": \"https://osskb.org/api/file_contents/871fb0c5188c2f620d9b997e225b0095\", \"licenses\": [ { \"name\": \"Apache-2.0\", \"patent_hints\": \"yes\", \"copyleft\": \"no\", \"checklist_url\": \"https://www.osadl.org/fileadmin/checklists/unreflicenses/Apache-2.0.txt\", \"osadl_updated\": \"2022-03-17 13:38\", \"source\": \"file_spdx_tag\" }, { \"name\": \"Apache-2.0\", \"patent_hints\": \"yes\", \"copyleft\": \"no\", \"checklist_url\": \"https://www.osadl.org/fileadmin/checklists/unreflicenses/Apache-2.0.txt\", \"osadl_updated\": \"2022-03-17 13:38\", \"source\": \"scancode\" } ], \"server\": { \"version\": \"4.4.2\", \"kb_version\": { \"monthly\": \"22.02\", \"daily\": \"22.03.25\" } } } ], \"scanner/src/main/kotlin/random-data-05-07-04.kt\": [ { \"id\": \"file\", \"status\": \"pending\", \"lines\": \"all\", \"oss_lines\": \"all\", \"matched\": \"100%\", \"purl\": [ \"pkg:github/scanoss/ort\" ], \"vendor\": \"scanoss\", \"component\": \"ort\", \"version\": \"e654028\", \"latest\": \"b12f8ee\", \"url\": \"https://github.com/scanoss/ort\", \"release_date\": \"2021-03-18\", \"file\": \"scanner/src/main/kotlin/random-data-05-07-04.kt\", \"url_hash\": \"37faa38a820322fa93bf7a8fa8290bb8\", \"file_hash\": \"5c8ab9be40df937e46c53509481107cd\", \"source_hash\": \"5c8ab9be40df937e46c53509481107cd\", \"file_url\": \"https://osskb.org/api/file_contents/5c8ab9be40df937e46c53509481107cd\", \"licenses\": [ { \"name\": \"Apache-2.0\", \"patent_hints\": \"yes\", \"copyleft\": \"no\", \"checklist_url\": \"https://www.osadl.org/fileadmin/checklists/unreflicenses/Apache-2.0.txt\", \"osadl_updated\": \"2022-03-17 13:38\", \"source\": \"file_spdx_tag\" }, { \"name\": \"Apache-2.0\", \"patent_hints\": \"yes\", \"copyleft\": \"no\", \"checklist_url\": \"https://www.osadl.org/fileadmin/checklists/unreflicenses/Apache-2.0.txt\", \"osadl_updated\": \"2022-03-17 13:38\", \"source\": \"scancode\" } ], \"server\": { \"version\": \"4.4.2\", \"kb_version\": { \"monthly\": \"22.02\", \"daily\": \"22.03.25\" } } } ]}",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe something went wrong here: This file and the file below should not be edited as part of this commit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand your concern, but the changes to the test files are necessary because:

  1. After modifying ScanOssResultParser.kt to use result.filePath instead of details.file, the existing tests would fail since they expected the previous behavior.

  2. The changes in the test JSON files (scanoss-response.json and scanoss-multi-response.json) were specifically made to differentiate between local paths (what we now display to users) and remote paths

If you prefer, I can separate these changes into two distinct commits: one for the functional modification in the main code and another to update the corresponding tests.

Copy link
Member

@sschuberth sschuberth May 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you prefer, I can separate these changes into two distinct commits: one for the functional modification in the main code and another to update the corresponding tests.

No, commits should be atomic in the sense that cherry picking any of them individually should not break tests. So if code changes requires test (asset) changes, these should indeed be done in the same commit.

My mistake was that I did restore plugins/scanners/scanoss/src/test/resources from the previous commit to check if tests would still pass without the changes, but I did run the wrong test, i.e. ScanOssResultParserTest which got touched in this commit, not realizing that it's ScanOssScannerDirectoryTest / ScanOssScannerFileTest that would fail.

What also confused me is that some UUID only get replaced now, although the anonymization feature was removed before. But that's because due to the bug that's fixed in this commit this did not surface earlier. I've verified that now all UUIDs got replaced.

The `licenses` variable is constant in the loop, so extract it to determine
the resulting `license` only once.

Signed-off-by: Agustin Isasmendi <[email protected]>
Add test cases to verify behavior when combining the same license from
different sources and handling empty license arrays with NOASSERTION value.

Signed-off-by: Agustin Isasmendi <[email protected]>
Correct issue where `getLicenseFindings()`, `getCopyrightFindings()` and
`sourceLocations` incorrectly showed remote filepath instead of local file
path.

Signed-off-by: Agustin Isasmendi <[email protected]>
Correct issue where `snippetFindings` location incorrectly used OSSKB
URLs instead of remote file paths.

Signed-off-by: Agustin Isasmendi <[email protected]>
Rename variables to clearly identify local vs OSS contexts, making code
more readable and reducing confusion between different path references.

Signed-off-by: Agustin Isasmendi <[email protected]>
Replace cartesian product approach (one snippet per PURL-location pair)
with one snippet per location using primary PURL as identifier and storing
all related PURLs in metadata.

Signed-off-by: Agustin Isasmendi <[email protected]>
Replace separate collection iteration with explicit source-to-OSS location
pairing using zip operation.

Signed-off-by: Agustin Isasmendi <[email protected]>
Modify `generateSummary()` to filter out snippets that are already
identified.

Signed-off-by: Agustin Isasmendi <[email protected]>
Change all instances of "ScanOSS" to "SCANOSS" for consistent branding.

Signed-off-by: Agustin Isasmendi <[email protected]>
@isasmendiagus isasmendiagus force-pushed the refactor/scanoss/result-parser-function branch from f36d535 to 0bbb508 Compare May 20, 2025 09:20
@sschuberth sschuberth enabled auto-merge (rebase) May 20, 2025 09:51
@sschuberth sschuberth merged commit 3f4b2f6 into oss-review-toolkit:main May 20, 2025
34 of 38 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants