-
Notifications
You must be signed in to change notification settings - Fork 202
Entity query data producer. #911
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Entity query data producer. #911
Conversation
public function resolve(string $type, int $limit = 10, ?int $offset = NULL, ?array $conditions = NULL, ?string $language = NULL, ?array $bundles = NULL, RefinableCacheableDependencyInterface $metadata): array { | ||
|
||
// Make sure that max limit is not crossed. | ||
if ($limit > static::SIZE_MAX) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's remove that limit. It's up to the implementation to make sure it's not above a certain threshold.
} | ||
|
||
// Make sure offset is zero or positive. | ||
if (!isset($offset) || $offset < 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's use max() for this: https://www.php.net/manual/de/function.max.php
$offset = max($offset ?: 0, 0);
* id = "entity_query", | ||
* name = @Translation("Load entities"), | ||
* description = @Translation("Loads entities."), | ||
* produces = @ContextDefinition("entities", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"entities" is not a typed data type. It actuall produces a list of string. So the context definition here should be a "multiple" (multiple => true) of "string".
$query->condition('langcode', $language); | ||
} | ||
|
||
foreach ($conditions as $condition) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using assoc arrays for the conditions is not ideal but I don't have a better solution atm.
Addressed, please re-check |
Codecov Report
@@ Coverage Diff @@
## 8.x-4.x #911 +/- ##
==========================================
Coverage ? 52.79%
Complexity ? 648
==========================================
Files ? 114
Lines ? 1627
Branches ? 0
==========================================
Hits ? 859
Misses ? 768
Partials ? 0
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like this data producer is just too generic. If you need the results on an entity query, it sounds like you want to write a custom data producer that does exactly what you need. Modeling that with a data producer and abstract inputs just feels like the wrong approach and might lead to unreadable code and bad DX. Do you have a different opinion?
Instead, we could maybe come up with an abstract base class that can be used to create entity query data producers with ease. Also with functionality for pagination, etc. through helper methods on the base class. Thoughts? |
The DX is actually not that bad, it's still quite simple and easy to write. Look what we use now:
It's pretty clean code and also pretty clear what it does. The biggest downside here is the compose and duplicating arguments in the follow up data producer (type, language). So what would be the benefit of having abstract class then? What's your vision here? Because I see it actually more complicated, but it's also possible I missed some point :) So I'd create my specific entity query data producer by extending the abstract one. I call the resolve method of abstract one to execute entity query and get ids. But as I want to have an entities I need to instantiate entity_load_multiple resolver and resolve ids to entities. This seems to be more complicated that what we have now. Have I missed something? |
That validates my point actually. The code you just linked is not secure. You are opening of anyone to provide custom filters while also letting them choose the operator. We had a security issue for just that a while ago. By the way, it would be advisable to use ENUMs for your filters all the way. Not sure if you do that atm.? With this PR, you are basically just shadowing the normal Entity Query API by abstracting the nice method calls into harder to debug array structures. I'd really prefer an abstract class that easily allows people to roll their own custom entity query producers with custom input & output. Such a custom producer could then also decide to directly load the entities instead of just returning the ids. |
Agreed that this dataproducer is not secure by default, which is not good. My idea would be to add a context allowed_filters like this: $registry->addFieldResolver('Query', 'jobQuery',
$builder->compose(
$builder->produce('entity_query', [
'type' => $builder->fromValue('node'),
'limit' => $builder->fromArgument('limit'),
'offset' => $builder->fromArgument('offset'),
'language' => $builder->fromArgument('language'),
'conditions' => $builder->fromArgument('filters'),
'bundles' => $builder->fromValue(Job::JOB_TYPES),
'allowed_filters' => $builder->fromValue('title', 'field_job_region'),
]),
$builder->produce('entity_load_multiple', [
'type' => $builder->fromValue('node'),
'ids' => $builder->fromParent(),
'language' => $builder->fromArgument('language'),
])
)
); That way an attacker can only filter by title and field_job_region on our jobs. Otherwise we can throw an exception. I don't think we need to restrict the operators. The operators are only a security issue if they are used on access restricted fields. But here we have an allow list of fields that can be filtered, so any operator is fine. |
My point is that there's no good argument to shadow-wrap the entity query API using an abstract assoc-array configuration. Instead, let's just provide a convenient base class or trait that can be used to easily roll custom data producers with use-case-specific inputs. |
The problem with an abstract base class is that you always need to write code when you want to use entity_query. IMO that is not necessary - if we can restrict the allowed filters then that seems like a good enough generic solution. It is not perfect because it does not map out the condition filters as you say. Currently we are using our entity_query only in 2 places, so that might also be an indicator that we don't actually need it that often. So I'm a bit divided here. I think we can do the 'allowed_filters' addition in this pull request, even if we don't merge it we can keep it as patch for jobiqo. Then leave the abstract base class for another PR if somebody wants to work on that. |
af0fbac
to
502ae12
Compare
I was actually looking for this (or an example to try and recreate it). I think better documentation on how to make specific classes for this would be better. I can see how opening up the entire Entity Query API like this without requiring the creation of specific code makes it easy to shoot yourself in the foot. My biggest question is actually somewhat tangential (so not sure if this PR would be the best place to discuss). It's about how to handle the limits, since access controls could remove results which means that if 10 entities are requested you could end up only being able to load 8. I think the pattern suggested in this PR (query first -> then load) may also suffer from this since we could have more query results than we're allowed to view. Having a single class that does both the querying and loading could take care of this since it could detect that it has loaded fewer entities than the limit and possibly load more. |
Another thought: I believe it's common practice in GraphQL to use |
I implemented generic pagination support according to the Relay Connection specification in https://github.com/goalgorilla/open_social/blob/feature/graphql-base/modules/custom/social_graphql/src/Plugin/GraphQL/DataProducer/QueryEntityBase.php This is used https://github.com/goalgorilla/open_social/blob/feature/graphql-base/modules/social_features/social_user/src/Plugin/GraphQL/DataProducer/QueryUser.php and can easily be re-used in other entity query based data producers. This leaves the logic of the extending class mostly free for needed filtering and possibly makes this PR unnecessary :) |
* | ||
* @DataProducer( | ||
* id = "entity_query", | ||
* name = @Translation("Load entities"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This data producer doesn't actually load entities but only finds their IDs
* "limit" = @ContextDefinition("integer", | ||
* label = @Translation("Limit"), | ||
* required = FALSE, | ||
* default_value = 10 | ||
* ), | ||
* "offset" = @ContextDefinition("integer", | ||
* label = @Translation("Offset"), | ||
* required = FALSE, | ||
* default_value = 0 | ||
* ), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Offset based queries in Drupal are somewhat difficult in Drupal because Drupal allows access handlers that can not be put into a database query. This means that access filtering will be done after the fact which could mean that 10 entities are selected but none are actually sent back to the client.
For this same reason, generic "count" queries are not often seen in GraphQL schemas. On one hand because GraphQL is created for data sets where this isn't pratcial, on the other hand because access a single access check may make a count inaccurate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this can happen in some cases. I think we should add that to the docs as well.
Since we have been using these 2 dataproducers successfully in jobiqo for years I think we should merge an improved version of this. I think this should be done in a new pull request to preserve the state of this one for historic reasons. TODO:
|
Added some docs and fixed minor coding standard issues in #1360 , then merged it. As I wanted to have this change in a single commit for better using it as a patch I merged the other pull request. Sorry for the loss on attribution @rthideaway ! Closing this pull request as well, please open follow-ups if there are still issues. |
Absolutely no worries! :) I fully understand and agree with the procedure here. |
Splitting up new data producers as agreed in #859
This PR introduces Entity query data producer