agschrei
Contributor

This PR adds additional methods to Scanner that accept a list of file paths to be scanned.

Why we think this is an added convenience for users:
We are currently using scanoss.py to build a git pre-commit hook. Ideally this hook should only scan files that are in the git staging area, which was difficult to achieve with the existing methods exposed by Scanner.
Some of this can be done by calling internal methods directly, but that's just asking for trouble.

Caveats:

  • In its current form the scan_files method is essentially a carbon copy of the scan_folder method, with the only difference that, instead of walking a filesystem directory, it simply iterates over a list of paths. The code may benefit from DRYing things up to reduce this duplication; I am happy to take a stab at it with some guidance.
  • The scan_files_with_options method will raise an exception if invoked with dependency scan mode enabled. This is because the interface for the dependency scan also expects a single root path that it will then scan with scancode. I have not refactored this yet, so flagging the scan mode as unsupported seemed like the best option for now.
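
To make the two caveats concrete, here is a minimal sketch of how the duplication could be removed: both entry points feed paths into one shared loop, and the file-list variant rejects dependency scanning up front. This is not the code in this PR. All names below (walk_folder, scan_paths, scan_folder_sketch, scan_files_sketch, the scan_one callback and the dependency_scan flag) are hypothetical and only illustrate the shape of the refactor.

```python
import os
from typing import Callable, Iterable, Iterator, List


def walk_folder(root: str) -> Iterator[str]:
    """Yield every file path below root (the directory-walking source)."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            yield os.path.join(dirpath, name)


def scan_paths(paths: Iterable[str], scan_one: Callable[[str], object]) -> List[str]:
    """Shared loop: call scan_one on each existing regular file.

    Returns the list of paths that were actually scanned.
    """
    scanned = []
    for path in paths:
        if not os.path.isfile(path):
            continue  # skip missing entries and directories
        scan_one(path)
        scanned.append(path)
    return scanned


def scan_folder_sketch(root: str, scan_one: Callable[[str], object]) -> List[str]:
    """Hypothetical folder variant: walk the tree, then reuse the shared loop."""
    return scan_paths(walk_folder(root), scan_one)


def scan_files_sketch(
    files: Iterable[str],
    scan_one: Callable[[str], object],
    dependency_scan: bool = False,
) -> List[str]:
    """Hypothetical file-list variant: reuse the shared loop on the given paths."""
    if dependency_scan:
        # Assumption: mirrors the PR's behaviour of flagging the mode as unsupported.
        raise ValueError("dependency scan needs a single root path, not a file list")
    return scan_paths(files, scan_one)
```

With this shape, the only difference between the two variants is where the paths come from, which is the duplication the first caveat describes.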

Any feedback on this is welcome. I am using my own custom build for our pre-commit hook right now, but would of course prefer to upstream this if possible.

@eeisegn eeisegn self-assigned this Jan 30, 2024
@eeisegn eeisegn self-requested a review January 30, 2024 10:17
@eeisegn
Contributor

eeisegn commented Jan 30, 2024

@agschrei Thank you very much for the submission!

How do you plan to use this function? From the CLI or as part of an SDK integration?
I'm trying to figure out whether you're looking for this feature to be exposed as part of the CLI class, or simply consumed as an SDK in another script.

@agschrei
Contributor Author

Thanks for the quick reply here @eeisegn - For my specific use case it's sufficient to have the feature exposed via the SDK.

I have a Python script that consumes the scanoss.py package to implement checks as part of a pre-commit hook.
It queries git for the staged files and the idea is to then only scan these files (in parallel). I couldn't find a way to do that with the existing Scanner interface that didn't feel hacky, so I figured I'd try to upstream it.
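
For readers who want to reproduce the hook side, the following is a rough sketch of how staged files can be collected from git and handed to a per-file callback in parallel. It is not the script described above. The helper names (parse_nul_separated, staged_files, scan_in_parallel), the use of a thread pool and the worker count are assumptions made for illustration, and the exact signature of the new Scanner.scan_files method is not shown in this conversation, so the call appears only as a comment.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def parse_nul_separated(output: str) -> List[str]:
    """Split NUL-separated git output into a list of non-empty paths."""
    return [path for path in output.split("\0") if path]


def staged_files(repo_dir: str = ".") -> List[str]:
    """Return paths staged for commit (added, copied, modified or renamed).

    Paths are reported relative to the repository root. The -z flag makes
    git separate names with NUL bytes, so unusual file names stay intact.
    """
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACMR", "-z"],
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_nul_separated(result.stdout)


def scan_in_parallel(
    paths: Sequence[str],
    scan_one: Callable[[str], object],
    workers: int = 4,
) -> list:
    """Run scan_one over paths with a thread pool, keeping input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(scan_one, paths))


# Assumed usage in a hook; the Scanner call is hypothetical and its
# argument list is an assumption, not confirmed by this conversation:
#   files = staged_files()
#   scanner.scan_files(files)
```

The --diff-filter value excludes deleted files, since there is nothing left on disk to scan for them.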

@eeisegn eeisegn merged commit 6afc5f8 into scanoss:main Feb 1, 2024