Skip to content

Conversation

@stephentoub
Copy link
Member

The Regex compiler and source generator weren't uncapturing captures inside of a negative lookaround. That then leads both to subsequent backreferences matching when they shouldn't.

Fixes #97455

The Regex compiler and source generator weren't uncapturing captures inside of a negative lookaround. That then leads both to subsequent backreferences matching when they shouldn't.
@ghost
Copy link

ghost commented Jan 24, 2024

Tagging subscribers to this area: @dotnet/area-system-text-regularexpressions
See info in area-owners.md if you want to be subscribed.

Issue Details

The Regex compiler and source generator weren't uncapturing captures inside of a negative lookaround. That then leads both to subsequent backreferences matching when they shouldn't.

Fixes #97455

Author: stephentoub
Assignees: -
Labels:

area-System.Text.RegularExpressions

Milestone: -

Copy link
Member

@danmoseley danmoseley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable - I assume these tests still run and pass on .NET Framework..?

@danmoseley
Copy link
Member

No analogous issue with positive lookarounds?

@stephentoub
Copy link
Member Author

Seems reasonable - I assume these tests still run and pass on .NET Framework..?

Yes

@stephentoub
Copy link
Member Author

stephentoub commented Jan 25, 2024

No analogous issue with positive lookarounds?

No, those captures are supposed to persist (and do).

@stephentoub stephentoub merged commit 40387e9 into dotnet:main Jan 25, 2024
@stephentoub stephentoub deleted the fixnegativelookaroundcapture branch January 25, 2024 19:46
yield return (@"(?=(\d))4\1", "43", RegexOptions.None, 0, 2, false, "");

// Zero-width negative lookbehind assertion: Actual - "(\\w){6}(?<!XXX)def"
yield return (@"(\w){6}(?<!XXX)def", "XXXabcdef", RegexOptions.None, 0, 9, true, "XXXabcdef");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stephentoub are we covered for testing analogous case for negative lookbehind? eg., (?<!(b)a)\1 shouldn't match bb

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not from a test perspective; the src fix covers both directions. Feel free to submit a pr.

@danmoseley
Copy link
Member

does https://regex101.com/r/jsQ12z/1 mean that regex101 is still using 6.0?

@stephentoub
Copy link
Member Author

does https://regex101.com/r/jsQ12z/1 mean that regex101 is still using 6.0?

More likely is it's using the interpreter

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Regex evaluation bug - discrepancy between compiled and non-compiled regex

2 participants