Fix handling of `null` with spread operator and improve test suite using mutation testing #129

dwnusbaum · 2025-06-03T21:43:24Z

Each commit can be reviewed independently. 4a9c4e2 and c920ac6 were largely done by an AI agent. I used VS code with the GitHub Copilot Chat extension in agent mode with Claude Sonnet 4 as the model.

4a9c4e2 was done using the following initial prompt:

Some of my tests are flaky when I try to run them concurrently because of static state in ProxyGeneratorAdapter. Can you adjust my test assertion methods like assertIntercept or assertInterceptExact to match proxy classes that end with _groovyProxy in a way that the exact numeric ID used by ProxyGeneratorAdapter does not matter, and then remove the reflection-related code in the relevant tests? All other interception should still be matched exactly. Thanks.

The agent more or less figured out what to do, but was trying to introduce new variants of assertIntercept, so I had to prompt a bit further to refine things. There was also some back-and-forth where the agent got pretty confused about regex escaping, but it was able to clean things up with some further prompting and suggestions of what approach to use.

c920ac6 was done using the following initial prompt:

Can you please introduce an abstract superclass for SandboxTransformerTest and TheTest that has all of the utility methods and setup code from SandboxTransformerTest? Right now TheTest is running all of its tests as well as all of the tests from SandboxTransformerTest, so many of the tests are running twice for no reason

That more or less worked, but there were some missing imports. The agent wanted to reset SandboxTransformerTest using git history and try again, but I stopped it and suggested that the problem might just be some missing imports, at which point it made the necessary changes successfully.

58d6069 and 5df914c were mostly done by me, although I used the agent to set up Pitest (https://pitest.org/). I asked the agent to summarize the Pitest report and indicate important cases where the test suite could be improved. The summary and suggestions were fine but not great.

I asked the agent to write some new tests to cover the gaps show by the mutation testing, and it was able to create new tests following the pattern of the existing one, but it went a bit too far and introduced a lot of redundant tests. For example, it tried to generate 6 new spread operator-related tests when all we really needed was one to cover the case of null in the list being spread over (e.g. ['a', null, 'ab']*.length()). Interestingly, this specific case actually surfaced a bug in how groovy-sandbox handles null with the spread operator (doesn't really matter for Pipeline because spread method calls are not supported in CPS mode). It was interesting to see the agent check the expected results by running groovy, although it didn't notice on its own that our implementation had a bug. I tried getting the agent to refine the tests, but ultimately it was simpler to just do that on my own looking at the mutation testing report for interesting coverage gaps.

I did try to see if the agent could generate something like blocksUnintendedCallsToNonSyntheticConstructors2, and although it did try quite a lot of things, it never managed to trigger the critical case. This was not that surprising though to me - this case is pretty subtle, the logic is totally unique to our project, and it even took me a while to get the test right.

Testing done

See above as well as new automated tests.

Submitter checklist

Make sure you are opening from a topic/feature/bugfix branch (right side) and not your main branch!
Ensure that the pull request title represents the desired changelog entry
Please describe what you did
Link to relevant issues in GitHub or Jira
Link to relevant pull requests, esp. upstream and downstream changes
Ensure you have provided tests that demonstrate the feature works or the issue is fixed

… SandboxTransformerTest do not run twice

Copilot

Pull Request Overview

This pull request fixes the handling of null values with the spread operator while improving the test suite through mutation testing and better test organization. Key changes include removing redundant reflection code in tests, introducing an abstract superclass for sandbox tests to avoid duplicate executions, and adding explicit handling of null values in the checkedCall method.

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

Show a summary per file

File	Description
src/test/java/org/kohsuke/groovy/sandbox/TheTest.java	Updates test base class usage and removes reflection-based counter resets.
src/test/java/org/kohsuke/groovy/sandbox/SandboxTransformerTest.java	Cleans up test setup, updates interception comments, and adds new constructor and loop interception tests.
src/test/java/org/kohsuke/groovy/sandbox/AbstractSandboxTest.java	Introduces an abstract superclass for sandbox tests to centralize utility methods and setup.
src/main/java/org/kohsuke/groovy/sandbox/impl/Checker.java	Adds explicit handling of null values in the spread operator implementation.
pom.xml	Adds Pitest configuration to support mutation testing with updated output formats.

Comments suppressed due to low confidence (4)

src/main/java/org/kohsuke/groovy/sandbox/impl/Checker.java:73

The added else branch to handle null values in spread operator calls correctly addresses the bug; please confirm that this behavior is consistent with similar cases elsewhere in the codebase.

if (it!=null) { ... } else { r.add(null); }

src/test/java/org/kohsuke/groovy/sandbox/SandboxTransformerTest.java:540

Ensure that the updated comment accurately reflects the interception methods currently in use, maintaining consistency between the tests and documented behavior.

// These next 10 interceptions are from Checker.checkedCreateRange and Checker.checkedComparison.

src/test/java/org/kohsuke/groovy/sandbox/SandboxTransformerTest.java:412

Review the expected exception message in blocksUnintendedCallsToNonSyntheticConstructors2 to ensure it reliably captures the intended behavior considering any dynamic proxy naming conventions.

e -> assertThat(e.getMessage(), equalTo(...

pom.xml:47

[nitpick] Consider adding a short comment describing the purpose of the pitest-maven configuration to aid future maintainers in understanding the mutation testing setup.

<build> ... </build>

dwnusbaum · 2025-06-03T21:46:40Z

Let's try to get a code review from Copilot to see what it thinks.

dwnusbaum added 4 commits June 3, 2025 12:24

Replace reflection for matching proxy classes with regex-based matcher

4a9c4e2

Refactor test classes to use new abstract base class so that tests in…

c920ac6

… SandboxTransformerTest do not run twice

Fix bug with handling of spread method calls

58d6069

Slightly improve strength of test suite using mutation testing

5df914c

dwnusbaum requested a review from a team as a code owner June 3, 2025 21:43

dwnusbaum added the bug label Jun 3, 2025

dwnusbaum requested review from Copilot and removed request for a team June 3, 2025 21:46

Copilot AI reviewed Jun 3, 2025

View reviewed changes

Add explanatory comment to pitest-maven config in pom.xml

e140945

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix handling of `null` with spread operator and improve test suite using mutation testing #129

Fix handling of `null` with spread operator and improve test suite using mutation testing #129

Uh oh!

dwnusbaum commented Jun 3, 2025

Uh oh!

Copilot AI left a comment

Uh oh!

dwnusbaum commented Jun 3, 2025

Uh oh!

Uh oh!

Fix handling of null with spread operator and improve test suite using mutation testing #129

Are you sure you want to change the base?

Fix handling of null with spread operator and improve test suite using mutation testing #129

Uh oh!

Conversation

dwnusbaum commented Jun 3, 2025

Testing done

Submitter checklist

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Reviewed Changes

Uh oh!

dwnusbaum commented Jun 3, 2025

Uh oh!

Uh oh!

Fix handling of `null` with spread operator and improve test suite using mutation testing #129

Fix handling of `null` with spread operator and improve test suite using mutation testing #129