Skip to content

Conversation

@fabianofranz
Copy link
Contributor

@fabianofranz fabianofranz commented Oct 7, 2025

Description

Basic GitHub actions with smoke tests for notebooks. It performs basic validation with nbformat's validate on all notebooks in notebooks/**/*.ipynb and does a full run (with default settings) of selected notebooks using papermill.

How Has This Been Tested?

Tested using act CLI:

act -W .github/workflows/validate-and-execute-notebooks.yml

Workflow run:
https://github.com/opendatahub-io/odh-data-processing/actions/runs/18574482508?pr=20

Merge criteria:

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work

Summary by CodeRabbit

  • Tests

    • Added CI checks that validate notebook integrity (structure, no pre-existing outputs) and execute notebooks automatically, including parameterized runs and improved logging for troubleshooting.
  • Chores

    • Added a dedicated CI workflow to ensure a consistent Python environment and kernel registration to make notebook validation and execution more reliable.

@coderabbitai
Copy link

coderabbitai bot commented Oct 7, 2025

Walkthrough

Adds a GitHub Actions workflow that validates notebooks (ensures code cells have no outputs/execution counts) and executes notebooks via Papermill with a base64-encoded external-file parameter; both jobs run on Python 3.12, register an ipykernel, and use matrix strategies.

Changes

Cohort / File(s) Summary
CI workflow: notebook validation and execution
.github/workflows/validate-and-execute-notebooks.yml
Adds a workflow with two jobs: validate_tests (checkout, setup Python 3.12, pip cache, install nbformat and ipykernel, register kernel, matrix-driven notebook validation ensuring code cells have no outputs/execution counts) and execute_tests (checkout, setup Python 3.12, pip cache, install papermill and ipykernel, register kernel, run Papermill on a notebook passing a base64-encoded external-file parameter).

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Dev as Developer
  participant GH as GitHub Actions
  participant V as validate_tests
  participant E as execute_tests
  participant PM as Papermill
  participant NB as Notebook

  Dev->>GH: push / PR triggers workflow
  GH->>V: start validate_tests (matrix over notebooks/files)
  V->>V: checkout, setup-python 3.12, pip cache, install nbformat/ipykernel, register kernel
  loop per notebook
    V->>NB: load with nbformat, check validity, assert no code-cell outputs/execution counts
    NB-->>V: validation result (fail on violations)
  end
  GH->>E: start execute_tests (matrix selects notebook + external file)
  E->>E: checkout, setup-python 3.12, pip cache, install papermill/ipykernel, register kernel
  E->>PM: papermill run (pass base64-encoded external file param)
  PM->>NB: execute notebook with registered kernel
  NB-->>PM: execution outputs/logs
  PM-->>E: exit status / logs (success/failure)
Loading

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

I hop through YAML fields at night,
I sniff for stray outputs hidden from sight.
Kernels wake and papermill spins the day,
Base64 carrots show the data's way.
CI burrows on—green lights make me bright. 🥕

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Title Check ✅ Passed The PR title "[RHAIENG-1115] Smoke tests for notebooks" directly and accurately reflects the main change in the pull request, which is adding a GitHub Actions workflow for smoke testing notebooks. The title is concise, specific, and clearly communicates the primary intent without unnecessary detail or vague language. A reviewer scanning the pull request history would immediately understand that this change introduces automated smoke tests for notebooks in the CI/CD pipeline.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@fabianofranz
Copy link
Contributor Author

Depends on #19.

@fabianofranz
Copy link
Contributor Author

@alinaryan Still WIP, but let me know if you have comments about how this is going. Tks!

@fabianofranz
Copy link
Contributor Author

@shruthis4 Still WIP, but let me know if you have comments about how this is going. Tks!

@fabianofranz fabianofranz force-pushed the RHAIENG-1115-notebook-smoke-tests branch from d6a578f to d5d081a Compare October 14, 2025 19:40
@fabianofranz fabianofranz marked this pull request as ready for review October 14, 2025 19:43
@fabianofranz fabianofranz requested a review from a team as a code owner October 14, 2025 19:43
@fabianofranz
Copy link
Contributor Author

Ready for review.

@fabianofranz fabianofranz force-pushed the RHAIENG-1115-notebook-smoke-tests branch from d5d081a to 5b76306 Compare October 14, 2025 19:47
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 14cd429 and 5b76306.

📒 Files selected for processing (1)
  • .github/workflows/validate-and-execute-notebooks.yml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

@fabianofranz fabianofranz force-pushed the RHAIENG-1115-notebook-smoke-tests branch 2 times, most recently from 6bc5f85 to abe1ade Compare October 14, 2025 20:23

# Check if the notebook has any code cells with outputs or execution counts
for cell in n.cells:
if cell.cell_type == 'code':
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also add s check to ensure that the notebook has atleast one parameter tag. This will make the notebook testable

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added!

set -ux

NOTEBOOK="${{ matrix.notebook_to_execute }}"
FILE="${{ matrix.file_to_use }}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great to have something in CI to validate notebook. As a long term solution, I plan to create pytests and call them from workflows. This will lets us scale and maintain as the repo grows.

fi
done

execute_tests:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we run execute only if validate_tests pass? To do so
execute_tests:
needs: validate_tests

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, changed.

@fabianofranz fabianofranz force-pushed the RHAIENG-1115-notebook-smoke-tests branch from abe1ade to 0bbfd2e Compare October 16, 2025 19:52
@fabianofranz
Copy link
Contributor Author

@shruthis4 thanks for the review, comments addressed.

Copy link
Contributor

@shruthis4 shruthis4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@shruthis4
Copy link
Contributor

Forgot to ask, can you also add the workflow run to the PR?

Signed-off-by: Fabiano Franz <[email protected]>
Signed-off-by: Fabiano Franz <[email protected]>
@fabianofranz fabianofranz force-pushed the RHAIENG-1115-notebook-smoke-tests branch from 0bbfd2e to 7b1f2a1 Compare October 16, 2025 20:52
@fabianofranz
Copy link
Contributor Author

for cell in n.cells:
if cell.cell_type == 'code':

# Fail test if notebook has any code cell with outputs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For these checks here, I wonder if it would be cleaner to use https://pypi.org/project/nb-clean/ instead of writing our own script to do it.

if [ -f "$FILE" ]; then
echo "Validating notebook '$FILE'..."

FILE="$FILE" python - <<'PY'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of having a python script in line within the action. As follow up work could we split this out into it's own file. I really like the nbformat.validate() check here.

@alimaredia alimaredia merged commit 9a42a3a into opendatahub-io:main Oct 17, 2025
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants