-
Notifications
You must be signed in to change notification settings - Fork 352
SCANOSS Scanner Improvements #10310
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SCANOSS Scanner Improvements #10310
Conversation
682bb7a
to
96644b5
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #10310 +/- ##
============================================
+ Coverage 56.63% 56.66% +0.03%
- Complexity 1615 1617 +2
============================================
Files 332 332
Lines 12281 12281
Branches 1139 1139
============================================
+ Hits 6955 6959 +4
+ Misses 4881 4877 -4
Partials 445 445
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
96644b5
to
da11101
Compare
plugins/scanners/scanoss/src/main/kotlin/ScanOssResultParser.kt
Outdated
Show resolved
Hide resolved
plugins/scanners/scanoss/src/main/kotlin/ScanOssResultParser.kt
Outdated
Show resolved
Hide resolved
plugins/scanners/scanoss/src/main/kotlin/ScanOssResultParser.kt
Outdated
Show resolved
Hide resolved
plugins/scanners/scanoss/src/main/kotlin/ScanOssResultParser.kt
Outdated
Show resolved
Hide resolved
plugins/scanners/scanoss/src/main/kotlin/ScanOssResultParser.kt
Outdated
Show resolved
Hide resolved
plugins/scanners/scanoss/src/main/kotlin/ScanOssResultParser.kt
Outdated
Show resolved
Hide resolved
plugins/scanners/scanoss/src/main/kotlin/ScanOssResultParser.kt
Outdated
Show resolved
Hide resolved
plugins/scanners/scanoss/src/test/assets/scanoss-identified-snippet.json
Outdated
Show resolved
Hide resolved
plugins/scanners/scanoss/src/test/kotlin/ScanOssResultParserTest.kt
Outdated
Show resolved
Hide resolved
plugins/scanners/scanoss/src/test/kotlin/ScanOssResultParserTest.kt
Outdated
Show resolved
Hide resolved
plugins/scanners/scanoss/src/test/kotlin/ScanOssResultParserTest.kt
Outdated
Show resolved
Hide resolved
@isasmendiagus Just a question since your are on the improvements. Is the anonymous mode still on your radar ? |
plugins/scanners/scanoss/src/test/kotlin/ScanOssResultParserTest.kt
Outdated
Show resolved
Hide resolved
Hi @nnobelis, yes. We're about to release the scanoss.java library with path anonymization. Then adding the feature to ORT should be straight forward. Question here about priorities, do you prefer the generic report first and then path anonymization? Or the other way around? |
For us, path anonymization is more important, if @sschuberth can live with delaying the generic report a bit more.... |
Fine with me. |
da11101
to
7631d6c
Compare
This comment was marked as outdated.
This comment was marked as outdated.
plugins/scanners/scanoss/src/test/kotlin/ScanOssResultParserTest.kt
Outdated
Show resolved
Hide resolved
plugins/scanners/scanoss/src/test/kotlin/ScanOssResultParserTest.kt
Outdated
Show resolved
Hide resolved
plugins/scanners/scanoss/src/main/kotlin/ScanOssResultParser.kt
Outdated
Show resolved
Hide resolved
plugins/scanners/scanoss/src/main/kotlin/ScanOssResultParser.kt
Outdated
Show resolved
Hide resolved
plugins/scanners/scanoss/src/test/kotlin/ScanOssResultParserTest.kt
Outdated
Show resolved
Hide resolved
plugins/scanners/scanoss/src/test/kotlin/ScanOssScannerDirectoryTest.kt
Outdated
Show resolved
Hide resolved
3990c8e
to
f36d535
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, just some final minor tweaks required!
"response" : { | ||
"status" : 200, | ||
"body" : "{ \"utils/src/main/kotlin/random-data-05-06-11.kt\": [ { \"id\": \"snippet\", \"status\": \"pending\", \"lines\": \"1-240\", \"oss_lines\": \"128-367\", \"matched\": \"99%\", \"purl\": [ \"pkg:github/scanoss/ort\" ], \"vendor\": \"scanoss\", \"component\": \"ort\", \"version\": \"e654028\", \"latest\": \"b12f8ee\", \"url\": \"https://github.com/scanoss/ort\", \"release_date\": \"2021-03-18\", \"file\": \"utils/src/main/kotlin/random-data-05-06-11.kt\", \"url_hash\": \"37faa38a820322fa93bf7a8fa8290bb8\", \"file_hash\": \"871fb0c5188c2f620d9b997e225b0095\", \"source_hash\": \"2e91edbe430c4eb195a977d326d6d6c0\", \"file_url\": \"https://osskb.org/api/file_contents/871fb0c5188c2f620d9b997e225b0095\", \"licenses\": [ { \"name\": \"Apache-2.0\", \"patent_hints\": \"yes\", \"copyleft\": \"no\", \"checklist_url\": \"https://www.osadl.org/fileadmin/checklists/unreflicenses/Apache-2.0.txt\", \"osadl_updated\": \"2022-03-17 13:38\", \"source\": \"file_spdx_tag\" }, { \"name\": \"Apache-2.0\", \"patent_hints\": \"yes\", \"copyleft\": \"no\", \"checklist_url\": \"https://www.osadl.org/fileadmin/checklists/unreflicenses/Apache-2.0.txt\", \"osadl_updated\": \"2022-03-17 13:38\", \"source\": \"scancode\" } ], \"server\": { \"version\": \"4.4.2\", \"kb_version\": { \"monthly\": \"22.02\", \"daily\": \"22.03.25\" } } } ], \"5530105e-0752-4750-9c07-4e4604b879a5\": [ { \"id\": \"file\", \"status\": \"pending\", \"lines\": \"all\", \"oss_lines\": \"all\", \"matched\": \"100%\", \"purl\": [ \"pkg:github/scanoss/ort\" ], \"vendor\": \"scanoss\", \"component\": \"ort\", \"version\": \"e654028\", \"latest\": \"b12f8ee\", \"url\": \"https://github.com/scanoss/ort\", \"release_date\": \"2021-03-18\", \"file\": \"scanner/src/main/kotlin/random-data-05-07-04.kt\", \"url_hash\": \"37faa38a820322fa93bf7a8fa8290bb8\", \"file_hash\": \"5c8ab9be40df937e46c53509481107cd\", \"source_hash\": \"5c8ab9be40df937e46c53509481107cd\", \"file_url\": \"https://osskb.org/api/file_contents/5c8ab9be40df937e46c53509481107cd\", \"licenses\": [ { \"name\": \"Apache-2.0\", \"patent_hints\": \"yes\", \"copyleft\": \"no\", \"checklist_url\": \"https://www.osadl.org/fileadmin/checklists/unreflicenses/Apache-2.0.txt\", \"osadl_updated\": \"2022-03-17 13:38\", \"source\": \"file_spdx_tag\" }, { \"name\": \"Apache-2.0\", \"patent_hints\": \"yes\", \"copyleft\": \"no\", \"checklist_url\": \"https://www.osadl.org/fileadmin/checklists/unreflicenses/Apache-2.0.txt\", \"osadl_updated\": \"2022-03-17 13:38\", \"source\": \"scancode\" } ], \"server\": { \"version\": \"4.4.2\", \"kb_version\": { \"monthly\": \"22.02\", \"daily\": \"22.03.25\" } } } ]}", | ||
"body" : "{ \"utils/src/main/kotlin/random-data-05-06-11.kt\": [ { \"id\": \"snippet\", \"status\": \"pending\", \"lines\": \"1-240\", \"oss_lines\": \"128-367\", \"matched\": \"99%\", \"purl\": [ \"pkg:github/scanoss/ort\" ], \"vendor\": \"scanoss\", \"component\": \"ort\", \"version\": \"e654028\", \"latest\": \"b12f8ee\", \"url\": \"https://github.com/scanoss/ort\", \"release_date\": \"2021-03-18\", \"file\": \"examples/example.rules.kts\", \"url_hash\": \"37faa38a820322fa93bf7a8fa8290bb8\", \"file_hash\": \"871fb0c5188c2f620d9b997e225b0095\", \"source_hash\": \"2e91edbe430c4eb195a977d326d6d6c0\", \"file_url\": \"https://osskb.org/api/file_contents/871fb0c5188c2f620d9b997e225b0095\", \"licenses\": [ { \"name\": \"Apache-2.0\", \"patent_hints\": \"yes\", \"copyleft\": \"no\", \"checklist_url\": \"https://www.osadl.org/fileadmin/checklists/unreflicenses/Apache-2.0.txt\", \"osadl_updated\": \"2022-03-17 13:38\", \"source\": \"file_spdx_tag\" }, { \"name\": \"Apache-2.0\", \"patent_hints\": \"yes\", \"copyleft\": \"no\", \"checklist_url\": \"https://www.osadl.org/fileadmin/checklists/unreflicenses/Apache-2.0.txt\", \"osadl_updated\": \"2022-03-17 13:38\", \"source\": \"scancode\" } ], \"server\": { \"version\": \"4.4.2\", \"kb_version\": { \"monthly\": \"22.02\", \"daily\": \"22.03.25\" } } } ], \"scanner/src/main/kotlin/random-data-05-07-04.kt\": [ { \"id\": \"file\", \"status\": \"pending\", \"lines\": \"all\", \"oss_lines\": \"all\", \"matched\": \"100%\", \"purl\": [ \"pkg:github/scanoss/ort\" ], \"vendor\": \"scanoss\", \"component\": \"ort\", \"version\": \"e654028\", \"latest\": \"b12f8ee\", \"url\": \"https://github.com/scanoss/ort\", \"release_date\": \"2021-03-18\", \"file\": \"scanner/src/main/kotlin/random-data-05-07-04.kt\", \"url_hash\": \"37faa38a820322fa93bf7a8fa8290bb8\", \"file_hash\": \"5c8ab9be40df937e46c53509481107cd\", \"source_hash\": \"5c8ab9be40df937e46c53509481107cd\", \"file_url\": \"https://osskb.org/api/file_contents/5c8ab9be40df937e46c53509481107cd\", \"licenses\": [ { \"name\": \"Apache-2.0\", \"patent_hints\": \"yes\", \"copyleft\": \"no\", \"checklist_url\": \"https://www.osadl.org/fileadmin/checklists/unreflicenses/Apache-2.0.txt\", \"osadl_updated\": \"2022-03-17 13:38\", \"source\": \"file_spdx_tag\" }, { \"name\": \"Apache-2.0\", \"patent_hints\": \"yes\", \"copyleft\": \"no\", \"checklist_url\": \"https://www.osadl.org/fileadmin/checklists/unreflicenses/Apache-2.0.txt\", \"osadl_updated\": \"2022-03-17 13:38\", \"source\": \"scancode\" } ], \"server\": { \"version\": \"4.4.2\", \"kb_version\": { \"monthly\": \"22.02\", \"daily\": \"22.03.25\" } } } ]}", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe something went wrong here: This file and the file below should not be edited as part of this commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand your concern, but the changes to the test files are necessary because:
-
After modifying
ScanOssResultParser.kt
to useresult.filePath
instead ofdetails.file
, the existing tests would fail since they expected the previous behavior. -
The changes in the test JSON files (
scanoss-response.json
andscanoss-multi-response.json
) were specifically made to differentiate between local paths (what we now display to users) and remote paths
If you prefer, I can separate these changes into two distinct commits: one for the functional modification in the main code and another to update the corresponding tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you prefer, I can separate these changes into two distinct commits: one for the functional modification in the main code and another to update the corresponding tests.
No, commits should be atomic in the sense that cherry picking any of them individually should not break tests. So if code changes requires test (asset) changes, these should indeed be done in the same commit.
My mistake was that I did restore plugins/scanners/scanoss/src/test/resources
from the previous commit to check if tests would still pass without the changes, but I did run the wrong test, i.e. ScanOssResultParserTest
which got touched in this commit, not realizing that it's ScanOssScannerDirectoryTest
/ ScanOssScannerFileTest
that would fail.
What also confused me is that some UUID only get replaced now, although the anonymization feature was removed before. But that's because due to the bug that's fixed in this commit this did not surface earlier. I've verified that now all UUIDs got replaced.
plugins/scanners/scanoss/src/test/kotlin/ScanOssScannerDirectoryTest.kt
Outdated
Show resolved
Hide resolved
plugins/scanners/scanoss/src/main/kotlin/ScanOssResultParser.kt
Outdated
Show resolved
Hide resolved
The `licenses` variable is constant in the loop, so extract it to determine the resulting `license` only once. Signed-off-by: Agustin Isasmendi <[email protected]>
Add test cases to verify behavior when combining the same license from different sources and handling empty license arrays with NOASSERTION value. Signed-off-by: Agustin Isasmendi <[email protected]>
Correct issue where `getLicenseFindings()`, `getCopyrightFindings()` and `sourceLocations` incorrectly showed remote filepath instead of local file path. Signed-off-by: Agustin Isasmendi <[email protected]>
Correct issue where `snippetFindings` location incorrectly used OSSKB URLs instead of remote file paths. Signed-off-by: Agustin Isasmendi <[email protected]>
Rename variables to clearly identify local vs OSS contexts, making code more readable and reducing confusion between different path references. Signed-off-by: Agustin Isasmendi <[email protected]>
Replace cartesian product approach (one snippet per PURL-location pair) with one snippet per location using primary PURL as identifier and storing all related PURLs in metadata. Signed-off-by: Agustin Isasmendi <[email protected]>
Replace separate collection iteration with explicit source-to-OSS location pairing using zip operation. Signed-off-by: Agustin Isasmendi <[email protected]>
Modify `generateSummary()` to filter out snippets that are already identified. Signed-off-by: Agustin Isasmendi <[email protected]>
Change all instances of "ScanOSS" to "SCANOSS" for consistent branding. Signed-off-by: Agustin Isasmendi <[email protected]>
f36d535
to
0bbb508
Compare
Key Changes
Corrected Path Mapping
Before: Source location path showed the OSS file while snippet location showed SCANOSS file content URL
https://api.scanoss.com/file_contents/...
After: Source location now correctly shows the scanned file and snippet location shows the actual matched OSS file
See atached image for check diff.
See the attached diff image for a visual representation of the path mapping changes

Fixed Snippet Generation Logic:
Before: Snippets were generated using a cartesian product of PURLs × line ranges, creating duplicate snippets for the same code match.
After: One snippet is created per line range, using the first PURL as primary identifier and storing all PURLs in metadata.