-
Notifications
You must be signed in to change notification settings - Fork 8
[RHAIENG-1115] Smoke tests for notebooks #20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,122 @@ | ||
| name: Smoke Tests for Notebooks | ||
|
|
||
| on: | ||
| pull_request: | ||
| types: [opened, synchronize, reopened] | ||
| paths: | ||
| - "notebooks/**/*.ipynb" | ||
| - ".github/workflows/validate-and-execute-notebooks.yml" | ||
| push: | ||
| branches: [ main ] | ||
| paths: | ||
| - "notebooks/**/*.ipynb" | ||
| - ".github/workflows/validate-and-execute-notebooks.yml" | ||
|
|
||
| permissions: | ||
| contents: read | ||
|
|
||
| jobs: | ||
| validate_tests: | ||
| runs-on: ubuntu-latest | ||
| strategy: | ||
| matrix: | ||
| # Set the notebooks to validate, wildcards are allowed | ||
| notebooks_to_validate: ["notebooks/**/*.ipynb"] | ||
| steps: | ||
| - uses: actions/checkout@v4 | ||
|
|
||
| - name: Set up Python | ||
| uses: actions/setup-python@v5 | ||
| with: | ||
| python-version: "3.12" | ||
| cache: pip | ||
|
|
||
| - name: Install Testing Tools | ||
| run: | | ||
| pip install nbformat ipykernel | ||
| ipython kernel install --name "python3" --user | ||
|
|
||
| - name: Validate Notebooks | ||
| run: | | ||
| set -ux | ||
|
|
||
| shopt -s globstar nullglob | ||
|
|
||
| for FILE in ${{ matrix.notebooks_to_validate }}; do | ||
| if [ -f "$FILE" ]; then | ||
| echo "Validating notebook '$FILE'..." | ||
|
|
||
| FILE="$FILE" python - <<'PY' | ||
| import os, sys, nbformat | ||
|
|
||
| f = os.environ['FILE'] | ||
|
|
||
| n = nbformat.read(f, nbformat.NO_CONVERT) | ||
|
|
||
| # Validate the notebook | ||
| nbformat.validate(n) | ||
|
|
||
| has_parameters_cell = False | ||
|
|
||
| for cell in n.cells: | ||
| if cell.cell_type == 'code': | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. also add s check to ensure that the notebook has atleast one parameter tag. This will make the notebook testable There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Added! |
||
|
|
||
| # Fail test if notebook has any code cell with outputs | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. For these checks here, I wonder if it would be cleaner to use |
||
| if cell.outputs: | ||
| print(f'Code cell in notebook {f} has outputs') | ||
| sys.exit(1) | ||
|
|
||
| # Fail test if notebook has any code cell with execution count | ||
| if cell.execution_count: | ||
| print(f'Code cell in notebook {f} has execution count') | ||
| sys.exit(1) | ||
|
|
||
| # Check for code cells tagged with 'parameters' | ||
| if 'tags' in cell.metadata and 'parameters' in cell.metadata.tags: | ||
| has_parameters_cell = True | ||
|
|
||
| # Fail test if notebook doesn't have any code cell tagged with 'parameters' | ||
| if not has_parameters_cell: | ||
| print(f'Notebook {f} does not have any cell tagged with \'parameters\'') | ||
| sys.exit(1) | ||
| PY | ||
|
|
||
| fi | ||
| done | ||
|
|
||
| execute_tests: | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should we run execute only if validate_tests pass? To do so There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good point, changed. |
||
| needs: validate_tests | ||
| runs-on: ubuntu-latest | ||
| strategy: | ||
| matrix: | ||
| # Set the notebooks to execute | ||
| notebook_to_execute: ["notebooks/use-cases/document-conversion-standard.ipynb"] | ||
|
|
||
| # Set the files use in each notebook execution | ||
| file_to_use: ["https://raw.githubusercontent.com/py-pdf/sample-files/refs/heads/main/001-trivial/minimal-document.pdf"] | ||
| steps: | ||
| - uses: actions/checkout@v4 | ||
|
|
||
| - name: Set up Python | ||
| uses: actions/setup-python@v5 | ||
| with: | ||
| python-version: "3.12" | ||
| cache: pip | ||
|
|
||
| - name: Install Testing Tools | ||
| run: | | ||
| pip install papermill ipykernel | ||
| ipython kernel install --name "python3" --user | ||
|
|
||
| - name: Execute Notebooks | ||
| run: | | ||
| set -ux | ||
|
|
||
| NOTEBOOK="${{ matrix.notebook_to_execute }}" | ||
| FILE="${{ matrix.file_to_use }}" | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is great to have something in CI to validate notebook. As a long term solution, I plan to create pytests and call them from workflows. This will lets us scale and maintain as the repo grows. |
||
|
|
||
| echo "Executing notebook '$NOTEBOOK' with file '$FILE'..." | ||
|
|
||
| papermill $NOTEBOOK $NOTEBOOK.tmp.ipynb -b $(echo -n "files: [\"$FILE\"]" | base64 -w 0) | ||
|
|
||
| echo "✓ Notebook $NOTEBOOK executed successfully" | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not a fan of having a python script in line within the action. As follow up work could we split this out into it's own file. I really like the
nbformat.validate()check here.