Skip to content

Conversation

@mpdude
Copy link
Contributor

@mpdude mpdude commented Oct 1, 2025

When the Paginator creates the dedicated count query, it starts with a clone of the query underlying the pagination.

If that query has previously been executed, it contains a result set mapping (RSM) in the AbstractQuery::$_resultSetMapping property. That RSM may have been provided by the user, or, more probably, will have been created by the DQL parser when the query was first analyzed.

Either way, the RSM does not match or accurately reflect the actual count query. The count query modifies at least the select clause through either an output or a tree walker. That change needs to be reflected in the RSM for hydration to work correctly.

In the output walker case, the code in Paginator::getCountQuery even creates and sets a new RSM explicitly. In the tree walker case, the RSM was not taken care of.

My current suggestion is not to clone the query in preparation for counting the items, but instead create a copy of the query to start with. This will leave the RSM as null in the new query.

I was wondering whether there might be any good reasons to keep the existing RSM since it might have been provided by the user. But, since we modify the query and completely replace the select clause with the scalar expression for counting the rows, I don't see how any pre-existing RSM could apply to the new query.

Nevertheless, reviewers might want to take a step back and consider other options we might have to fix this:

  • Query::__clone marks a query as dirty when it is cloned. Should a dirty query drop its RSM to make sure it is re-created upon the next parse()?
  • Should we make a difference between RSMs provided by the client and those derived by the parser? Maybe only put user-provided ones in the $_resultSetMapping property and keep/use them unconditionally, but treat parser-derived RSMs differently and forget about them as soon as the query gets dirty?

@mpdude mpdude changed the base branch from 3.5.x to 2.20.x October 1, 2025 17:34
@mpdude mpdude changed the title paginator confused result set mapping initialized Paginator with output walker returns wrong result when the query has already been executed Oct 1, 2025
@mpdude mpdude force-pushed the paginator-confused-result-set-mapping-initialized branch from eef61d3 to 3e34b8e Compare October 1, 2025 17:35
@mpdude mpdude requested a review from Copilot October 1, 2025 17:47
Copilot

This comment was marked as resolved.

@mpdude mpdude changed the title Paginator with output walker returns wrong result when the query has already been executed Paginator with output walker returns count 0 when the query has previously been executed Oct 1, 2025
@janopae
Copy link

janopae commented Oct 2, 2025

Query::__clone marks a query as dirty when it is cloned. Should a dirty query drop its RSM to make sure it is re-created upon the next parse()?

I don't really think this is about cloning. In case we modified an existing query that has already been executed without cloning, we'd run into the same issue – say

$query = $queryBuilder->select('something')->getQuery();
$query->getResult();
$query->setHint(Query::HINT_CUSTOM_TREE_WALKERS, ChangeSelectClause::class);
$query->getResult();

The point is that we add a Tree Walker, and that might alter the SELECT clause (and therefore, the result set map).

If we wanted to solve this on the level of the Query, we'd need to distinguish between

  1. walkers that do not alter the SELECT clause (in this case, a user provided result set map must be kept)
  2. walkers that completely replace the selected columns (in this case, we should discard any previous result set map, user provided or not)
  3. walkers that change the SELECT clause but keep some of the original fields (in this case, we can't know what to do with a user-defined RSM – the only one knowing might be the Tree Walker itself. A parser generated RSM without user modifications could be discarded and re-generated.)

For the third case with a user-defined RSM, the current code structure is a bit problematic, because we always start with an empty RSM when parsing. So the walkers can't know any user-defined modifications. Can we change this without breaking existing Tree Walkers? I'm not sure.

In case we can't provide the TreeWalkers with information about a previous RSM with user modifications, and we can't get the TreeWalkers to tell us whether they modify the selected columns, I'd say we can't handle this on Query level. In this case, we should make providing the correct RSM the responsibility of the one setting the TreeWalker.

This is what this PR does: The Paginator knows it doesn't need any of the previous RSM for counting, as the TreeWalker it sets will completely replace the selected columns. Therefore, it creates a new Query with an empty RSM.

Copy link
Member

@greg0ire greg0ire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should docs/en/cookbook/dql-custom-walkers.rst also be updated to avoid cloning?

a potentially existing result set mapping (either set directly by the user,
or taken from the parser result from a previous invocation of Query::parse())
to the new query object. This is fine, since we are going to completely change the
select clause anyway, so a previously existing result set mapping (RSM) is probably wrong anyway.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's avoid repeating "anyway"?

Suggested change
select clause anyway, so a previously existing result set mapping (RSM) is probably wrong anyway.
select clause, so a previously existing result set mapping (RSM) is probably wrong anyway.

@janopae
Copy link

janopae commented Oct 16, 2025

Should docs/en/cookbook/dql-custom-walkers.rst also be updated to avoid cloning?

👍 for updating docs/en/cookbook/dql-custom-walkers.rst, but you shouldn't avoid cloning.

As I said earlier, this is not about cloning. It's about resetting the result set mapping.

If we operated on the same (uncloned) Query the user provided, we'd run into exactly the same problem: If you alter the SELECT clause of a query after a RSM has been set, the RSM will become incompatible.

There are 2 ways a RSM can be set:

  1. manually by the user
  2. when you execute the query and the RSM has not been set, a new RSM fitting to the current SELECT clause will be generated and saved to the Query object. Each subsequent execution will use that RSM unless a new one is set by the user.

In the Paginator, we don't know if either of these happened to the query earlier, but we want to execute a query based on the user query which selects something very different: The potential count of the results. Therefore, any RSM set or generated previously will be incompatible with the SELECT clause we want.

There is no way to reset the RSM to null and trigger the generation of a new, fitting RSM, because the RSM setter does not accept null. That's why this PR resorts to creating a new Query (which has its RSM set to null), which will automatically gereate a RSM fitting to our SELECT COUNT(*) once executed the first time.

If your TreeWalker only modifies conditional parts of the query and not the SELECT statement, then you can keep the RSM. This allows to proceed with user-defined modifications to the RSM. Cloning can absolutely make sense in those cases.

For example:

After this PR, the Paginator will still clone the main query to paginate.

The user specifies a query they want to be paginated. The Paginator will extend it with pagination, but clone it to avoid messing with the original object. It's basically the same query, but with LIMIT and OFFSET appended. User modifications to the RSM should be kept.

In order to calculate the LIMIT and OFFSET values, we need to count the potential results. This is a new Query, not a clone, because we ask for something different: The count, not the objects. Any previous RSM wouldn't make sense here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants