fix(similarity): Fix backfill query retry logic #76081
Suspect Issues
This pull request was deployed and Sentry observed the following issues:
@jangjodi I don't think the retry will help. Here's the query plan (I can send you the query on Slack if you want to run it yourself). You can see that the filter (i.e. times_seen and type=1) removes 177k rows, which is why the query is timing out. I would recommend moving these filters into memory. One side effect of doing this is that you will sometimes end up with empty batches, which will cause you to exit early here: sentry/src/sentry/tasks/embeddings_grouping/utils.py, lines 199 to 214 in 10847d4
Instead, you should probably check for |
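A minimal sketch of the in-memory filtering idea, not the actual Sentry code. The `Group` dataclass, `keep`, `iter_filtered_batches`, the ascending-id pagination, and the `times_seen > 1` threshold are all assumptions made for illustration. It shows one way to avoid the early exit: stop only when the unfiltered page is empty, so a batch whose rows were all filtered out does not end the loop.

```python
from dataclasses import dataclass
from typing import Iterator, List, Sequence, Tuple


@dataclass
class Group:
    # Hypothetical stand-in for a group row; only the fields used here.
    id: int
    type: int
    times_seen: int


def keep(group: Group) -> bool:
    # Filters applied in memory instead of in SQL.
    # The times_seen threshold is an assumption, not taken from the PR.
    return group.type == 1 and group.times_seen > 1


def iter_filtered_batches(
    rows: Sequence[Group], batch_size: int
) -> Iterator[Tuple[List[Group], int]]:
    """Yield (filtered_batch, last_id) pairs; rows must be sorted by id."""
    last_id = 0
    while True:
        # Stand-in for an unfiltered, id-ordered database query.
        page = [g for g in rows if g.id > last_id][:batch_size]
        if not page:
            # Only an empty *unfiltered* page means there is nothing left.
            return
        last_id = page[-1].id
        # The filtered batch may be empty; the caller keeps iterating anyway.
        yield [g for g in page if keep(g)], last_id
```

With six rows where ids 3 and 4 fail the filter and a batch size of 2, the middle batch comes back empty but the loop still reaches ids 5 and 6.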
Move iteration over query result into retry function so that the query can be retried
Remove unneeded count call
Return batch end group id from retry function
I think it's ok to experiment with just the |
Move iteration over query result into `_make_postgres_call_with_filter` function so that the query actually runs in this function
This will allow the retry logic to actually work and for us to remove unneeded count call
Return batch end group id from `_make_postgres_call_with_filter` since it could be one of the groups that we filtered out
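The change described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the body of `_make_postgres_call_with_filter`: `QueryTimeout`, `run_batch_with_retry`, the tuple row layout `(id, type, times_seen)`, and the retry count are all hypothetical. The point is that a lazy query only executes when it is iterated, so the iteration has to happen inside the `try` block for a retry to have any effect, and the batch end id is taken from the unfiltered rows because the last row may itself be filtered out.

```python
import time
from typing import Callable, Iterable, List, Optional, Tuple

Row = Tuple[int, int, int]  # assumed layout: (id, type, times_seen)


class QueryTimeout(Exception):
    """Hypothetical stand-in for a database timeout error."""


def run_batch_with_retry(
    make_query: Callable[[], Iterable[Row]],
    keep: Callable[[Row], bool],
    max_retries: int = 3,
    delay: float = 0.0,
) -> Tuple[List[Row], Optional[int]]:
    rows: List[Row] = []
    for attempt in range(max_retries):
        try:
            # Materialize here: a lazy query runs on iteration, so a timeout
            # is raised inside this try block and can be retried.
            rows = list(make_query())
            break
        except QueryTimeout:
            if attempt == max_retries - 1:
                raise
            time.sleep(delay)
    # Use the last *unfiltered* row as the batch end, since that row
    # may be one of the rows removed by the in-memory filter.
    batch_end_id = rows[-1][0] if rows else None
    return [r for r in rows if keep(r)], batch_end_id
```

Returning the batch end id alongside the filtered rows lets the caller continue from the right position even when the filtered batch is empty.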