Skip to content

Conversation

@kaisecheng
Copy link
Contributor

@kaisecheng kaisecheng commented Aug 25, 2021

Fixed: #1034

Logstash fails to do post-register (fetch cluster_uuid, install index template, set up ilm) when Elasticsearch changes from unhealthy to healthy. Prior to the change, Logstash classifies a connection as alive when it fails to get license details. This commit enhances the connection check to include a license check, so post-register only execute once the license is valid

CI is red due to elasticsearch client gem issue.

steps to test

  1. start elasticsearch cluster
  2. remove master node
  3. start logstash
  4. curl 'logstash:9600/_node/stats?pretty' , cluster_uuid should be absent
  5. start master node
  6. curl 'logstash:9600/_node/stats?pretty' , cluster_uuid should exist

@kaisecheng kaisecheng marked this pull request as ready for review August 25, 2021 17:29

def successful_connection?
!!maximum_seen_major_version
!!maximum_seen_major_version && alive_urls_count > 0
Copy link
Contributor Author

@kaisecheng kaisecheng Aug 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

once the license check passed, alive count +1

Copy link
Contributor

@kares kares left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 implementation of the bug fix simple

although we're relying on the fact that a "cluster_uuid" : "_na_" is followed by a failing (50X from /_license) license - in theory we could end up with "_na_" in plugin-metadata if ES recovery happened between the / and /_license check ... but I guess that's very low risk atm.

also left one suggestion whether we want to assert the cluster_uuid in the integration test

subject.multi_receive([event1, event2])
@es.indices.refresh
r = @es.search(index: 'logstash-*')
expect(r).to have_hits(2)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since the issue relates to reporting incorrect cluster_uuid
could we assert that subject.plugin_metadata.get(:cluster_uuid) is as expected (not empty and != "_na_")

dump version
@kaisecheng
Copy link
Contributor Author

kaisecheng commented Aug 26, 2021

You are right. The goal of the improvement is not to give an atomic solution that guarantees cluster_uuid an ID of ES cluster. Metricbeat could still get "na" because connections can be dead anytime. Looking at the design of post-register, it aims to not blocking the process by minor flaw, so there is no retry for fetching uuid, installing template and setting ilm. Following the same mindset, this change is made for the best effort.

Thanks for reviewing!

@kaisecheng kaisecheng requested a review from kares August 26, 2021 15:44
Copy link
Contributor

@kares kares left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, one suggestion about testing string equality properly

r = @es.search(index: 'logstash-*')
expect(r).to have_hits(2)
expect(subject.plugin_metadata.get(:cluster_uuid)).not_to be_empty
expect(subject.plugin_metadata.get(:cluster_uuid)).not_to be("_na_")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
expect(subject.plugin_metadata.get(:cluster_uuid)).not_to be("_na_")
expect(subject.plugin_metadata.get(:cluster_uuid)).not_to eq("_na_")

@kaisecheng kaisecheng merged commit 91b2de5 into logstash-plugins:master Aug 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

unable to fetch cluster_uuid when Elasticsearch cluster has recovered

3 participants