Skip to content

Conversation

@grahamalama
Copy link
Contributor

@grahamalama grahamalama commented Mar 20, 2024

Built on the groundwork of #903 (cheers @leplatrem, when I squash this I'll make sure you're a coauthor), this PR adds the classes that allow us to manage our Dead Letter Queue.

We included a basic heartbeat check in this PR to assert that we are able to read / write from the queue. I removed the integration with the app (in the /bugzilla_webhook route) for now. The idea is that we'll deploy this change first to validate that we can connect to the directory we're mounting on the infra side. Then, we'll follow on with how we integrate it into the /bugzilla_webhook route.

@grahamalama grahamalama requested a review from a team as a code owner March 20, 2024 18:32
@grahamalama grahamalama added the enhancement New feature or request label Mar 20, 2024
Copy link
Contributor

@alexcottner alexcottner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking good

- warn when we have a bug in the queue with no items
- various debug messages
Copy link
Contributor

@leplatrem leplatrem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you Graham for bringing this to another level 🙏 💯

@grahamalama
Copy link
Contributor Author

One more big though I'm having about the architecture of this PR:

We've been calling this a (singular) queue. But almost feels like we're building a queue manager. Like to me, what we return from get_all() is not a queue, but a collection of queues. And each bug has its own queue of failing messages. Thoughts?

@alexcottner
Copy link
Contributor

alexcottner commented Mar 22, 2024

We've been calling this a (singular) queue. But almost feels like we're building a queue manager. Like to me, what we return from get_all() is not a queue, but a collection of queues. And each bug has its own queue of failing messages. Thoughts?

In my head, this is a very simple partitioned queue. We have a single queue where things are going, but blocking only occurs within the partition (per bug). These terms would be more well defined if we were using an actual queue service with partitions and compute nodes.

- turn back into a method
- allow callers to pass a bug_id to filter size by bug
Return iterator of items from get()
Return dict of bug id, items from get_all()
Return backend.get_all() from retrieve (instead of flat list)
Copy link
Contributor

@leplatrem leplatrem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is great for a first iteration.

The retries field seems unused. We can drop it if we don't have any API to manipulate it.

Ideally I would like us to tackle these two in follow-ups:

Copy link
Contributor

@alexcottner alexcottner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking really good. Left comment about an additional possible test scenario.

This also means we're marking a webhook event's time property as not optional
This allows us to fetch the item identifiers in the queue without loading the items into memory
Also, document QueueItemRetrievalError
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants