Skip to content

Fix handling of null with spread operator and improve test suite using mutation testing #129

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

dwnusbaum
Copy link
Member

Each commit can be reviewed independently. 4a9c4e2 and c920ac6 were largely done by an AI agent. I used VS code with the GitHub Copilot Chat extension in agent mode with Claude Sonnet 4 as the model.

4a9c4e2 was done using the following initial prompt:

Some of my tests are flaky when I try to run them concurrently because of static state in ProxyGeneratorAdapter. Can you adjust my test assertion methods like assertIntercept or assertInterceptExact to match proxy classes that end with _groovyProxy in a way that the exact numeric ID used by ProxyGeneratorAdapter does not matter, and then remove the reflection-related code in the relevant tests? All other interception should still be matched exactly. Thanks.

The agent more or less figured out what to do, but was trying to introduce new variants of assertIntercept, so I had to prompt a bit further to refine things. There was also some back-and-forth where the agent got pretty confused about regex escaping, but it was able to clean things up with some further prompting and suggestions of what approach to use.

c920ac6 was done using the following initial prompt:

Can you please introduce an abstract superclass for SandboxTransformerTest and TheTest that has all of the utility methods and setup code from SandboxTransformerTest? Right now TheTest is running all of its tests as well as all of the tests from SandboxTransformerTest, so many of the tests are running twice for no reason

That more or less worked, but there were some missing imports. The agent wanted to reset SandboxTransformerTest using git history and try again, but I stopped it and suggested that the problem might just be some missing imports, at which point it made the necessary changes successfully.

58d6069 and 5df914c were mostly done by me, although I used the agent to set up Pitest (https://pitest.org/). I asked the agent to summarize the Pitest report and indicate important cases where the test suite could be improved. The summary and suggestions were fine but not great.

I asked the agent to write some new tests to cover the gaps show by the mutation testing, and it was able to create new tests following the pattern of the existing one, but it went a bit too far and introduced a lot of redundant tests. For example, it tried to generate 6 new spread operator-related tests when all we really needed was one to cover the case of null in the list being spread over (e.g. ['a', null, 'ab']*.length()). Interestingly, this specific case actually surfaced a bug in how groovy-sandbox handles null with the spread operator (doesn't really matter for Pipeline because spread method calls are not supported in CPS mode). It was interesting to see the agent check the expected results by running groovy, although it didn't notice on its own that our implementation had a bug. I tried getting the agent to refine the tests, but ultimately it was simpler to just do that on my own looking at the mutation testing report for interesting coverage gaps.

I did try to see if the agent could generate something like blocksUnintendedCallsToNonSyntheticConstructors2, and although it did try quite a lot of things, it never managed to trigger the critical case. This was not that surprising though to me - this case is pretty subtle, the logic is totally unique to our project, and it even took me a while to get the test right.

Testing done

See above as well as new automated tests.

Submitter checklist

  • Make sure you are opening from a topic/feature/bugfix branch (right side) and not your main branch!
  • Ensure that the pull request title represents the desired changelog entry
  • Please describe what you did
  • Link to relevant issues in GitHub or Jira
  • Link to relevant pull requests, esp. upstream and downstream changes
  • Ensure you have provided tests that demonstrate the feature works or the issue is fixed

@dwnusbaum dwnusbaum requested a review from a team as a code owner June 3, 2025 21:43
@dwnusbaum dwnusbaum added the bug label Jun 3, 2025
@dwnusbaum dwnusbaum requested review from Copilot and removed request for a team June 3, 2025 21:46
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This pull request fixes the handling of null values with the spread operator while improving the test suite through mutation testing and better test organization. Key changes include removing redundant reflection code in tests, introducing an abstract superclass for sandbox tests to avoid duplicate executions, and adding explicit handling of null values in the checkedCall method.

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

Show a summary per file
File Description
src/test/java/org/kohsuke/groovy/sandbox/TheTest.java Updates test base class usage and removes reflection-based counter resets.
src/test/java/org/kohsuke/groovy/sandbox/SandboxTransformerTest.java Cleans up test setup, updates interception comments, and adds new constructor and loop interception tests.
src/test/java/org/kohsuke/groovy/sandbox/AbstractSandboxTest.java Introduces an abstract superclass for sandbox tests to centralize utility methods and setup.
src/main/java/org/kohsuke/groovy/sandbox/impl/Checker.java Adds explicit handling of null values in the spread operator implementation.
pom.xml Adds Pitest configuration to support mutation testing with updated output formats.
Comments suppressed due to low confidence (4)

src/main/java/org/kohsuke/groovy/sandbox/impl/Checker.java:73

  • The added else branch to handle null values in spread operator calls correctly addresses the bug; please confirm that this behavior is consistent with similar cases elsewhere in the codebase.
if (it!=null) { ... } else { r.add(null); }

src/test/java/org/kohsuke/groovy/sandbox/SandboxTransformerTest.java:540

  • Ensure that the updated comment accurately reflects the interception methods currently in use, maintaining consistency between the tests and documented behavior.
// These next 10 interceptions are from Checker.checkedCreateRange and Checker.checkedComparison.

src/test/java/org/kohsuke/groovy/sandbox/SandboxTransformerTest.java:412

  • Review the expected exception message in blocksUnintendedCallsToNonSyntheticConstructors2 to ensure it reliably captures the intended behavior considering any dynamic proxy naming conventions.
e -> assertThat(e.getMessage(), equalTo(...

pom.xml:47

  • [nitpick] Consider adding a short comment describing the purpose of the pitest-maven configuration to aid future maintainers in understanding the mutation testing setup.
<build> ... </build>

@dwnusbaum
Copy link
Member Author

Let's try to get a code review from Copilot to see what it thinks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant