Skip to content

Conversation

cjerad
Copy link
Contributor

@cjerad cjerad commented Apr 18, 2022

Issue #, if available:

#556 Initial NTH v2 design proposal

Description of changes:

Add matchLabels property to the Terminator spec.

NTH will take no action when an SQS message targets a node that does not have the requisite labels and values.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@cjerad cjerad requested a review from snay2 April 18, 2022 14:00
@cjerad cjerad marked this pull request as ready for review April 18, 2022 14:04
@cjerad cjerad requested a review from a team as a code owner April 18, 2022 14:04
Expect(drainedNodes).To(BeEmpty())
})

It("does not delete the message from the SQS queue", func() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How long do messages stay on the SQS queue? It seems like there could be a failure mode where NTH spends all its time re-processing messages on the queue (because, e.g., the labels didn't match) and isn't able to respond to new messages in time for k8s to gracefully handle a node termination. That's probably uncommon, but it is basically what mechanical-fish described in #494.

Is that a concern with this way of handling node labels? It seems like the philosophy of the terminator is "I only respond to requests that match my parameters; anything else I will leave for someone else to handle". So it may not be possible to prevent that situation. Curious what your thoughts are.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Queues have a message retention period setting. A dead-letter queue can also be configured.

Leaving the message in the queue gives another Terminator instance a chance to match and handle it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants