@ardatan ardatan commented Oct 21, 2025

Hive Console Client integration
Ref ROUTER-102
Blocked by graphql-hive/console#7143

Documentation -> graphql-hive/console#7171

TODOs:

@ardatan ardatan marked this pull request as draft October 21, 2025 14:50
@gemini-code-assist
Contributor

Summary of Changes

Hello @ardatan, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant new feature: the ability for the GraphQL router to report detailed usage metrics to the GraphQL Hive Console. This integration provides valuable insights into how the router is being used, including operation names, execution times, and error rates. The reporting mechanism is highly configurable, allowing users to control aspects like sampling rates, excluded operations, and reporting intervals, ensuring efficient and relevant data collection. This enhancement is crucial for monitoring and optimizing GraphQL API performance and adoption.

Highlights

  • Hive Console Usage Reporting Integration: The router now integrates with the hive-console-client to report GraphQL operation usage data, providing insights into router activity.
  • Configurable Usage Settings: A new UsageConfig struct has been introduced, allowing detailed configuration of reporting parameters such as API token, endpoint, sample rate, excluded operations, client headers, buffer size, and various timeouts.
  • Dynamic Usage Agent Initialization and Background Task: A UsageAgent is initialized based on the provided UsageConfig and registered as a background task, ensuring that usage reports are periodically flushed to the Hive Console.
  • GraphQL Pipeline Integration: The GraphQL execution pipeline has been modified to capture operation details, execution duration, and error counts, which are then conditionally sent to the UsageAgent for reporting.
  • Dependency Updates and Forked Parser: The project's dependencies have been updated, notably replacing the graphql-parser crate with graphql-parser-hive-fork across Cargo.lock and Cargo.toml files, and adding new crates like md5 and webpki-roots.
  • Error Counting in Execution Output: The PlanExecutionOutput now includes an error_count field, which tracks the number of errors encountered during query plan execution, providing crucial data for usage reporting.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces Hive Console Client integration for usage reporting in the Hive Router. It includes changes to Cargo.lock and Cargo.toml files to add the new dependency, modifications to bin/router/src/lib.rs and bin/router/src/pipeline/mod.rs to implement the usage reporting logic, and a new file bin/router/src/pipeline/usage.rs for sending usage reports. The shared state is also updated to include the usage agent. I have provided review comments to address potential issues related to error handling and code clarity.

@ardatan ardatan force-pushed the hive-usage-reporting branch 2 times, most recently from b11336d to 7c73c86 Compare October 23, 2025 14:56
@ardatan ardatan force-pushed the hive-usage-reporting branch 3 times, most recently from 25e7e44 to 5c9a3ac Compare October 28, 2025 12:50
@ardatan ardatan marked this pull request as ready for review October 29, 2025 14:03
@ardatan ardatan force-pushed the hive-usage-reporting branch from 25e2b93 to 61308f1 Compare October 29, 2025 14:03
}
let client_name = get_header_value(req, &usage_config.client_name_header);
let client_version = get_header_value(req, &usage_config.client_version_header);
let timestamp = SystemTime::now()
Member

can be as_millis instead of sec*1000
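
To illustrate the suggestion, here is a minimal sketch of the timestamp conversion. The helper name `unix_millis` and the fallback to 0 for a pre-epoch clock are assumptions made for this example, not code from the PR.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Milliseconds since the Unix epoch for the given time.
/// Falls back to 0 if the clock is set before the epoch (an assumption of this sketch).
fn unix_millis(now: SystemTime) -> u64 {
    now.duration_since(UNIX_EPOCH)
        // `as_millis` keeps sub-second precision, unlike `as_secs() * 1000`.
        .map(|elapsed| elapsed.as_millis() as u64)
        .unwrap_or(0)
}
```

Calling `unix_millis(SystemTime::now())` gives the report timestamp; multiplying `as_secs()` by 1000 would round it down to the whole second.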

Member Author

Done 👍

})
}

pub fn send_usage_report(
Member

I feel like the name here is not good, as the function really collects the operation.
Maybe collect_usage_report?

Member

Also, this function is called from the hot path, so just for the sake of micro-perf, let's #[inline] it.

Member Author

Done 👍

}

fn get_header_value(req: &HttpRequest, header_name: &str) -> Option<String> {
req.headers()
Member

The value of the header doesn't have to be a String here. As long as you don't need ownership, it can remain a &str and be returned as such.
Even if it will eventually be cloned internally by the usage agent, I don't think that should happen here.
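
A rough sketch of the borrowed variant follows. The `Headers` alias is a hypothetical stand-in for the HTTP framework's header map (the real `HttpRequest` type is not reproduced here), so only the shape of the return type carries over.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the request's header map.
type Headers = HashMap<String, Vec<u8>>;

/// Borrows a header value as `&str` without allocating.
/// Returns `None` when the header is missing or is not valid UTF-8.
fn get_header_value<'a>(headers: &'a Headers, header_name: &str) -> Option<&'a str> {
    headers
        .get(header_name)
        .and_then(|raw| std::str::from_utf8(raw).ok())
}
```

The caller can still call `.to_owned()` at the point where an owned `String` is actually required.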

Member Author

Done 👍


use crate::background_tasks::BackgroundTask;

pub fn from_config(router_config: &HiveRouterConfig) -> Option<UsageAgent> {
Member

I think this should receive only the part of the config that's relevant to it, and then return a UsageAgent. The decision on whether to create the agent should happen in the caller function.
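
One possible shape for that split, as a sketch only: both structs are trimmed-down stand-ins, and `init_usage_agent` is a hypothetical name for the caller-side helper.

```rust
use std::sync::Arc;

/// Hypothetical, trimmed-down config section (the real one has more fields).
struct UsageReportingConfig {
    token: String,
}

/// Hypothetical stand-in for the SDK's agent type.
struct UsageAgent {
    token: String,
}

/// Builds the agent from only the config section it needs; it always returns an agent.
fn create_hive_usage_agent(config: &UsageReportingConfig) -> UsageAgent {
    UsageAgent {
        token: config.token.clone(),
    }
}

/// Caller side: the decision whether an agent exists at all lives here.
fn init_usage_agent(config: Option<&UsageReportingConfig>) -> Option<Arc<UsageAgent>> {
    config.map(|section| Arc::new(create_hive_usage_agent(section)))
}
```

With this layout the constructor stays trivially testable, and the router's startup code owns the on/off decision.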

Member

Also, naming! create_hive_usage_agent is probably a better name here.

Member Author

Done 👍

usage_config.request_timeout,
usage_config.accept_invalid_certs,
flush_interval,
"hive-router".to_string(),
Member

I assume this is the user-agent to use?
You can use ROUTER_VERSION const and append it here, so we'll have something like hive-router@VERSION or hive-router/VERSION.
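
As a small sketch of the second format: the constant's value below is a placeholder, since in the router it would be the existing `ROUTER_VERSION` constant, and `hive_user_agent` is a hypothetical helper name.

```rust
/// Placeholder value for this sketch; the router already defines the real constant.
const ROUTER_VERSION: &str = "0.0.0-example";

/// Builds a user-agent string in the `hive-router/VERSION` form.
fn hive_user_agent() -> String {
    format!("hive-router/{}", ROUTER_VERSION)
}
```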

Member Author

Done 👍


pub fn send_usage_report(
schema: Arc<Document<'static, String>>,
start: Instant,
Member

Instead of passing the Instant around, could we measure the total time in the caller and pass it here as a Duration?
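
A minimal sketch of that arrangement, assuming a simplified report type. `ExecutionReport`, `collect_usage_report`, and `run_and_report` are illustrative names here, not the PR's actual signatures.

```rust
use std::time::{Duration, Instant};

/// Hypothetical report shape; only the duration field matters for this sketch.
struct ExecutionReport {
    duration: Duration,
}

/// The collector receives an already measured `Duration`, not an `Instant`.
fn collect_usage_report(duration: Duration) -> ExecutionReport {
    ExecutionReport { duration }
}

/// Caller: measure once around the work, then hand the result over.
fn run_and_report<F: FnOnce()>(work: F) -> ExecutionReport {
    let start = Instant::now();
    work();
    collect_usage_report(start.elapsed())
}
```

Keeping the clock in one place also means the collector can be tested with fixed durations.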

Member Author

Done 👍

};
usage_agent
.add_report(execution_report)
.unwrap_or_else(|err| tracing::error!("Failed to send usage report: {}", err));
Member

nit: I feel like unwrap_or_else here could be more readable as an if let or a match on the Result.
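
For comparison, the `if let` form might look like the sketch below. The `UsageAgent` here is a hypothetical stand-in with a bounded buffer, and `eprintln!` replaces `tracing::error!` only to keep the example free of external crates.

```rust
/// Hypothetical stand-in for the agent; `add_report` fails when the buffer is full.
struct UsageAgent {
    capacity: usize,
    buffered: Vec<String>,
}

impl UsageAgent {
    fn add_report(&mut self, report: String) -> Result<(), String> {
        if self.buffered.len() >= self.capacity {
            return Err("buffer is full".to_string());
        }
        self.buffered.push(report);
        Ok(())
    }
}

/// Adds a report and logs on failure; returns whether the report was accepted.
fn record_report(agent: &mut UsageAgent, report: String) -> bool {
    if let Err(err) = agent.add_report(report) {
        // The PR logs through `tracing::error!`; `eprintln!` is used here instead.
        eprintln!("Failed to send usage report: {}", err);
        return false;
    }
    true
}
```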

Member Author

Done 👍

#[async_trait]
impl BackgroundTask for UsageAgent {
fn id(&self) -> &str {
"usage_report_flush_interval"
Member

hive_console_usage_report_task

Member Author

Done 👍

github-actions bot commented Oct 29, 2025

🐋 This PR was built and pushed to the following Docker images:

Image Names: ghcr.io/graphql-hive/router

Platforms: linux/amd64,linux/arm64

Image Tags: ghcr.io/graphql-hive/router:pr-499 ghcr.io/graphql-hive/router:sha-f37f916

Docker metadata
{
"buildx.build.ref": "builder-3120460a-6daa-42e0-a588-a96b5a30c41a/builder-3120460a-6daa-42e0-a588-a96b5a30c41a0/zxdo7heuqk99hrrp7r3fqyph5",
"containerimage.descriptor": {
  "mediaType": "application/vnd.oci.image.index.v1+json",
  "digest": "sha256:543ce52f3909f85c94da6848507319c6eae02efe51b57a5175d25e84a0a82f22",
  "size": 1609
},
"containerimage.digest": "sha256:543ce52f3909f85c94da6848507319c6eae02efe51b57a5175d25e84a0a82f22",
"image.name": "ghcr.io/graphql-hive/router:pr-499,ghcr.io/graphql-hive/router:sha-f37f916"
}

true => Some(JwtAuthRuntime::init(bg_tasks_manager, &router_config.jwt).await?),
false => None,
};
let usage_agent = pipeline::usage_reporting::from_config(&router_config).map(Arc::new);
Member

Replace with the new fn name I suggested above.

nit: use just the fn name, not full import here.

Member Author

Done 👍

github-actions bot commented Oct 29, 2025

k6-benchmark results

     ✓ response code was 200
     ✓ no graphql errors
     ✓ valid response structure

     █ setup

     checks.........................: 100.00% ✓ 224673      ✗ 0    
     data_received..................: 6.6 GB  219 MB/s
     data_sent......................: 88 MB   2.9 MB/s
     http_req_blocked...............: avg=3.26µs   min=686ns  med=1.75µs  max=5.63ms   p(90)=2.52µs  p(95)=2.87µs  
     http_req_connecting............: avg=810ns    min=0s     med=0s      max=2.43ms   p(90)=0s      p(95)=0s      
     http_req_duration..............: avg=19.57ms  min=2.06ms med=18.68ms max=105.13ms p(90)=26.79ms p(95)=29.75ms 
       { expected_response:true }...: avg=19.57ms  min=2.06ms med=18.68ms max=105.13ms p(90)=26.79ms p(95)=29.75ms 
     http_req_failed................: 0.00%   ✓ 0           ✗ 74911
     http_req_receiving.............: avg=138.29µs min=28.4µs med=45.29µs max=64.13ms  p(90)=79.71µs p(95)=382.57µs
     http_req_sending...............: avg=21.83µs  min=4.76µs med=10.23µs max=20.48ms  p(90)=15.04µs p(95)=22.09µs 
     http_req_tls_handshaking.......: avg=0s       min=0s     med=0s      max=0s       p(90)=0s      p(95)=0s      
     http_req_waiting...............: avg=19.41ms  min=2.01ms med=18.55ms max=104.61ms p(90)=26.53ms p(95)=29.49ms 
     http_reqs......................: 74911   2492.270504/s
     iteration_duration.............: avg=20.02ms  min=5.67ms med=19ms    max=245.4ms  p(90)=27.18ms p(95)=30.18ms 
     iterations.....................: 74891   2491.605109/s
     vus............................: 50      min=50        max=50 
     vus_max........................: 50      min=50        max=50 

pub metadata: SchemaMetadata,
pub planner: Planner,
pub subgraph_executor_map: SubgraphExecutorMap,
pub schema: Arc<Document<'static, String>>,
Member

Rename to supergraph_schema, as it's not clear whether that's a supergraph or a public-API schema.

Member Author

Done 👍

pub override_labels_evaluator: OverrideLabelsEvaluator,
pub cors_runtime: Option<Cors>,
pub jwt_auth_runtime: Option<JwtAuthRuntime>,
pub usage_agent: Option<Arc<UsageAgent>>,
Member

I think we should pick one path:
1. Call it usage_agent, and then we need to abstract it a bit (like we did with supergraph loading).
2. Call it hive_usage_agent here.

I tend to go with 2 for now

Member Author

Done 👍

CORSConfig(#[from] Box<CORSConfigError>),
#[error("invalid override labels config: {0}")]
OverrideLabelsCompile(#[from] Box<OverrideLabelsCompileError>),
#[error("error creating usage agent: {0}")]
Member

error creating hive usage agent:

Member Author

Done 👍

#[serde(deny_unknown_fields)]
pub struct UsageReportingConfig {
/// Your [Registry Access Token](https://the-guild.dev/graphql/hive/docs/management/targets#registry-access-tokens) with write permission.
pub token: String,
Member

I think this should be access_token as this is explicitly how we call it everywhere in Console.

Member Author

Done 👍


/// Configuration for usage reporting to GraphQL Hive.
#[serde(default)]
pub usage_reporting: Option<usage_reporting::UsageReportingConfig>,
Member

Since we have an enabled field in UsageReportingConfig, this one should not be wrapped with Option.
It should be:

#[serde(default = "usage_reporting::UsageReportingConfig::default")]
    pub usage_reporting: usage_reporting::UsageReportingConfig,

And the impl Default for UsageReportingConfig should be implemented to configure it with enabled: false

Member Author

Actually we don't have an enabled flag in UsageReportingConfig.

Member

Then I think we should. We need to be explicit about these, otherwise users may lose the ability to enable/disable it via things like env vars.

Member Author

@ardatan ardatan Oct 29, 2025

If we introduce an enabled flag, users will need to define enabled: true even if they provide the env variables. Hive Gateway doesn't have an enabled flag, for example. But let me try to enable it automatically when the env vars are set.
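
One way this compromise could look, sketched under assumptions: the struct is a trimmed-down stand-in, the variable name `HIVE_ACCESS_TOKEN` is hypothetical, and the thread does not show how the final implementation resolves the flag.

```rust
/// Hypothetical subset of the config section.
#[derive(Debug, PartialEq)]
struct UsageReportingConfig {
    enabled: bool,
    access_token: Option<String>,
}

impl Default for UsageReportingConfig {
    /// Reporting is off unless something turns it on explicitly.
    fn default() -> Self {
        Self {
            enabled: false,
            access_token: None,
        }
    }
}

/// Applies an env override: when the (hypothetical) token variable is set,
/// the token is taken from it and reporting is switched on.
fn apply_env_overrides(
    mut config: UsageReportingConfig,
    lookup: impl Fn(&str) -> Option<String>,
) -> UsageReportingConfig {
    if let Some(token) = lookup("HIVE_ACCESS_TOKEN") {
        config.access_token = Some(token);
        config.enabled = true;
    }
    config
}
```

In real code the lookup would be something like `|name| std::env::var(name).ok()`; passing it in as a closure keeps the sketch testable without touching the process environment.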

Member Author

Done!

/// 1.0 = 100% chance of being sent
/// Default: 1.0
#[serde(default = "default_sample_rate")]
pub sample_rate: f64,
Member

should this be a more user-friendly value?
I mean, if for durations we are using humantime and allow things like 10s, then why not allow user to write here 10% instead of 0.1?
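
A sketch of what accepting `10%` could involve. The function name `parse_sample_rate` and the error messages are assumptions; the PR's actual deserializer is not shown in this thread.

```rust
/// Parses a percentage such as "10%" into a fraction in `0.0..=1.0`.
fn parse_sample_rate(input: &str) -> Result<f64, String> {
    let number = input
        .trim()
        .strip_suffix('%')
        .ok_or_else(|| format!("expected a percentage like \"10%\", got {input:?}"))?;
    let percent: f64 = number
        .trim()
        .parse()
        .map_err(|err| format!("invalid percentage {input:?}: {err}"))?;
    // Rejects negative values, values above 100, and NaN.
    if !(0.0..=100.0).contains(&percent) {
        return Err(format!("percentage out of range: {input:?}"));
    }
    Ok(percent / 100.0)
}
```

Rejecting a bare `0.1` is a design choice of this sketch: it forces one unambiguous notation in the config file.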

Member Author

Done

/// Unit: seconds
/// Default: 15 (s)
#[serde(default = "default_request_timeout")]
pub request_timeout: u64,
Member

This should be humantime, see other plugins for example.
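
To show the kind of input this implies (`15s`, `500ms`, `2m`), here is a deliberately tiny stdlib-only parser. It is a stand-in for illustration, not the `humantime` crate and not the router's config code, and it supports only three units.

```rust
use std::time::Duration;

/// Parses "<integer><unit>" where the unit is `ms`, `s`, or `m`.
/// A small stand-in; real config code would rely on the `humantime` crate.
fn parse_duration(input: &str) -> Result<Duration, String> {
    let text = input.trim();
    // Check "ms" before "s", since "ms" also ends in 's'.
    let (digits, millis_per_unit): (&str, u64) = if let Some(rest) = text.strip_suffix("ms") {
        (rest, 1)
    } else if let Some(rest) = text.strip_suffix('s') {
        (rest, 1_000)
    } else if let Some(rest) = text.strip_suffix('m') {
        (rest, 60_000)
    } else {
        return Err(format!("missing unit in {text:?}"));
    };
    let value: u64 = digits
        .trim()
        .parse()
        .map_err(|err| format!("invalid number in {text:?}: {err}"))?;
    value
        .checked_mul(millis_per_unit)
        .map(Duration::from_millis)
        .ok_or_else(|| format!("duration too large: {text:?}"))
}
```

With durations written this way, the `Unit: seconds` doc comments on the fields become unnecessary, because the unit travels with the value.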

Member Author

This is needed
graphql-hive/console#7196

Member Author

Done!

/// Unit: seconds
/// Default: 5 (s)
#[serde(default = "default_connect_timeout")]
pub connect_timeout: u64,
Member

This should be humantime, see other plugins for example.

Member Author

This is needed
graphql-hive/console#7196

Member Author

Done!

/// Frequency of flushing the buffer to the server
/// Default: 5 seconds
#[serde(default = "default_flush_interval")]
pub flush_interval: u64,
Member

This should be humantime, see other plugins for example.

Member Author

Done 👍

/// Your [Registry Access Token](https://the-guild.dev/graphql/hive/docs/management/targets#registry-access-tokens) with write permission.
pub token: String,
/// A target ID, this can either be a slug following the format “$organizationSlug/$projectSlug/$targetSlug” (e.g “the-guild/graphql-hive/staging”) or an UUID (e.g. “a0f4c605-6541-4350-8cfe-b31f21a4bf80”). To be used when the token is configured with an organization access token.
pub target_id: Option<String>,
Member

nit: this one can be validated during de-serialization as either {string}/{string}/{string} or uuid.
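
A rough sketch of such a check, using only the standard library. The function names are hypothetical, and the UUID test only verifies the 8-4-4-4-12 hexadecimal layout rather than version or variant bits.

```rust
/// Returns true for "a/b/c" slugs with exactly three non-empty segments.
fn is_target_slug(value: &str) -> bool {
    let parts: Vec<&str> = value.split('/').collect();
    parts.len() == 3 && parts.iter().all(|part| !part.is_empty())
}

/// Returns true for the 8-4-4-4-12 hexadecimal UUID layout.
fn is_uuid(value: &str) -> bool {
    let groups: Vec<&str> = value.split('-').collect();
    let expected_lengths = [8, 4, 4, 4, 12];
    groups.len() == expected_lengths.len()
        && groups.iter().zip(expected_lengths).all(|(group, len)| {
            group.len() == len && group.chars().all(|c| c.is_ascii_hexdigit())
        })
}

/// Accepts either form; meant to be called from a deserialization hook.
fn validate_target_id(value: &str) -> Result<(), String> {
    if is_target_slug(value) || is_uuid(value) {
        Ok(())
    } else {
        Err(format!(
            "target_id must be \"org/project/target\" or a UUID, got {value:?}"
        ))
    }
}
```

Failing at config load time surfaces a malformed `target_id` immediately instead of as a rejected usage report later.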

Member Author

Done 👍

Cargo.toml Outdated
[workspace.dependencies]
graphql-tools = "0.4.0"
- graphql-parser = "0.4.1"
+ graphql-parser = { version = "0.5.0", package = "graphql-parser-hive-fork" }
Member

@dotansimha dotansimha Oct 29, 2025

why use the custom one? we dropped it on purpose.

Member Author

SDK uses this one.

Member

So we need to fix that in SDK. I don't think the SDK has any reason to still use it now.
Router shouldn't use this custom one

Member Author

Done!

@dotansimha
Member

One more thing: I think we might need to expose these by default using env vars. See env_var_override for an example of how to do it. We can align with how it looks in GW.

@ardatan ardatan force-pushed the hive-usage-reporting branch from 9d7a7d3 to cd15aa5 Compare October 30, 2025 10:38
Cargo.toml Outdated
[workspace.dependencies]
graphql-tools = "0.4.0"
- graphql-parser = "0.4.1"
+ graphql-parser = { version = "0.5.0", package = "graphql-parser-hive-fork" }
Member

can be reverted now

Member Author

Yes! Done!

hive-router-query-planner = { path = "../../lib/query-planner", version = "2.0.2" }
hive-router-plan-executor = { path = "../../lib/executor", version = "6.0.0" }
hive-router-config = { path = "../../lib/router-config", version = "0.0.10" }
hive-console-sdk = "0.0.0"
Member

0.1.0

Member Author

Done!

@ardatan ardatan requested a review from dotansimha October 31, 2025 01:52
3 participants