Skip to content

Optimize symbolic regex Antimirov mode to avoid so much allocation #60918

@stephentoub

Description

@stephentoub

When in Antimirov mode, we logically maintain a list of states we're currently in and, for each step, generate a new list of states based on looking at each current state and determining all the states it could lead us to. However, rather than something like:

HashSet<Node> nextStates = _nextStates;
HashSet<Node> currentStates = _currentStates;
foreach (Node current in currentStates)
{
    foreach (Node next in Transition(current))
    {
        nextStates.Add(next);
    }
}
currentStates.Clear();
_nextStates = currentStates;
_currentStates = nextStates;

it's implemented using an immutable SymbolicRegexNode, where every additional state found involves unioning in the new state into the next states node, which means generating one or more new nodes for each additional state we union in.

Metadata

Metadata

Assignees

Type

No type

Projects

No projects

Milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions