Skip to content

Conversation

Blazer-007
Copy link
Member

@Blazer-007 Blazer-007 commented Jun 11, 2025

Dear Gobblin maintainers,

Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!

JIRA

Description

  • Here are some details about my PR, including screenshots (if applicable):

  • Added change to OpenTelemetryMetrics to emit ExponentialHistogramMetrics

  • Created package org.apache.gobblin.metrics.opentelemetry; which contains different classes required for OTelMetrics emission

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:

  • OpenTelemetryHelperTest

  • OpenTelemetryInstrumentationTest

  • OpenTelemetryMetricTest

Commits

  • My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

@Blazer-007 Blazer-007 requested a review from Copilot June 11, 2025 14:45
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds support for emitting OpenTelemetry metrics in Gobblin, including the integration of exponential histogram metrics and long counter metrics via newly introduced classes and activity types.

  • Added new activity and helper classes to emit and manage OpenTelemetry metrics (e.g., EmitOTelMetrics, EmitOTelMetricsImpl).
  • Updated workflows, workers, and job launchers to incorporate metric emission, and added unit tests for the new metrics functionality.

Reviewed Changes

Copilot reviewed 24 out of 24 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
CommitStepWorkflowImpl.java Added metric emission using EmitOTelMetrics with new attribute assignments.
WorkFulfillmentWorker.java Included EmitOTelMetricsImpl in the activity implementation array.
ExecuteGobblinJobLauncher.java Integrated metric emission before and after workflow execution, including latency histograms.
EmitOTelMetricsImpl.java & EmitOTelMetrics.java New activity interface and implementation for OpenTelemetry metric emission.
ActivityType.java & GobblinTemporalConfigurationKeys.java Updated to define new activity type EMIT_OTEL_METRICS and configuration keys.
Various files under gobblin-metrics Added support for OpenTelemetry metrics instrumentation, including counters, histograms, and tests.
ConfigurationKeys.java (gobblin-api) Updated to include OpenTelemetry-related configuration keys.
Comments suppressed due to low confidence (1)

gobblin-temporal/src/main/java/org/apache/gobblin/temporal/ddm/workflow/impl/CommitStepWorkflowImpl.java:51

  • Consider replacing the literal 'currState' with the constant GaaSOpenTelemetryMetricsConstants.DimensionKeys.CURR_STATE for consistency with the rest of the codebase.
attributes.put("currState", "processWUStart");

@Getter
@AllArgsConstructor
public enum GaaSOpenTelemetryMetrics {
GAAS_JOB_STATUS("gaas_job_status", "Gaas job status counter", "1", OpenTelemetryMetricType.LONG_COUNTER),
Copy link
Contributor

@abhishekmjain abhishekmjain Jul 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 is metric unit here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly s in the below one?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 is default unit here , although we should use curly annotation as mentioned in wiki but have kept default for simplicity
s represents second to measure latency metrics
https://opentelemetry.io/docs/specs/semconv/general/metrics/

String openTelemetryClassName = state.getProp(ConfigurationKeys.METRICS_REPORTING_OPENTELEMETRY_CLASSNAME,
ConfigurationKeys.DEFAULT_METRICS_REPORTING_OPENTELEMETRY_CLASSNAME);
Class<?> metricsClass = Class.forName(openTelemetryClassName);
Method getInstanceMethod = metricsClass.getMethod("getInstance", State.class);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we use MethodUtils helper class instead of invoking methods ourself?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

name,
attrs,
this.meter.histogramBuilder(name)
.setDescription(metric.getMetricDescription())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

description and unit can be used directly

log.info("FINISHED - ExecuteGobblinWorkflow.execute = {}", execGobblinStats);
attributes.put(CURR_STATE, JOB_COMPLETE);
emitOTelMetrics.emitLongCounterMetric(GaaSOpenTelemetryMetrics.GAAS_JOB_STATUS, 1L, attributes, finalProps);
attributes.remove(CURR_STATE);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you please elaborate on why are we doing this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing specific, just avoiding creating a new Map object

Comment on lines 23 to 24
public static final String STATE = "state";
public static final String CURR_STATE = "currState";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the difference between state and currState?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

State -- GenWU, ProcessWU, CommitStep
CurrState -- GenWUStart, GenWUComplete, ...

Comment on lines 108 to 110
public OpenTelemetryMetric getOrCreate(GaaSOpenTelemetryMetrics metric) {
return this.metrics.computeIfAbsent(metric.getMetricName(), name -> createMetric(metric));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the metrics map uses only metricName as key. If two metrics share the same name but differ in type (eg, LONG_COUNTER vs DOUBLE_HISTOGRAM), this can silently cause incorrect caching behavior. It would be better to use a composite key like metricName + "_" + metricType to avoid collisions

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two attributes, two metrics, or two events MUST NOT share the same name. Different entities (attribute and metric, metric and event) MAY share the same name.
https://opentelemetry.io/docs/specs/semconv/general/naming/#name-reuse-prohibition


@Getter
@AllArgsConstructor
public enum GaaSOpenTelemetryMetrics {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can move the metric creation logic into the GaaSOpenTelemetryMetrics enum itself instead of centralizing it in OpenTelemetryInstrumentation.createMetric.

Currently, the enum acts as a metadata holder, while the actual instantiation logic(via switch-case) is external and hence scattered. This becomes harder to extend if we introduce new metric types(eg gauges, timers, etc). A more extensible approach would be to let each enum constant hold a factory method and expose a createMetric() method

@Blazer-007 Blazer-007 force-pushed the virai_add_mdm_status_metric branch from d96262a to 832a86c Compare July 28, 2025 06:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants