feat(router): Hive Console Usage Reporting #499
Summary of Changes
Hello @ardatan, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request introduces a significant new feature: the ability for the GraphQL router to report detailed usage metrics to the GraphQL Hive Console. This integration provides insight into how the router is being used, including operation names, execution times, and error rates. The reporting mechanism is configurable, allowing users to control aspects like sampling rates, excluded operations, and reporting intervals, so that data collection stays efficient and relevant. This helps with monitoring and optimizing GraphQL API performance and adoption.
Code Review
This pull request introduces Hive Console Client integration for usage reporting in the Hive Router. It includes changes to Cargo.lock and Cargo.toml files to add the new dependency, modifications to bin/router/src/lib.rs and bin/router/src/pipeline/mod.rs to implement the usage reporting logic, and a new file bin/router/src/pipeline/usage.rs for sending usage reports. The shared state is also updated to include the usage agent. I have provided review comments to address potential issues related to error handling and code clarity.
b11336d to 7c73c86
25e7e44 to 5c9a3ac
25e2b93 to 61308f1
  
}
let client_name = get_header_value(req, &usage_config.client_name_header);
let client_version = get_header_value(req, &usage_config.client_version_header);
let timestamp = SystemTime::now()
can be as_millis instead of sec*1000
Done 👍
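For reference, a minimal sketch of the `as_millis` variant, using only the standard library. The function name `unix_timestamp_millis` is made up for illustration and is not taken from the PR. Compared with `as_secs() * 1000`, `as_millis()` keeps the sub-second part of the timestamp.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Milliseconds since the Unix epoch.
/// Falls back to 0 if the system clock is set before the epoch
/// (an assumption of this sketch, not necessarily what the router does).
fn unix_timestamp_millis() -> u128 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|elapsed| elapsed.as_millis())
        .unwrap_or(0)
}
```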
})
}

pub fn send_usage_report(
I feel like the name here is not good, as the function really collects the operation.
Maybe collect_usage_report?
Also, this function is called from the hotpath, just for the sake of micro-perf, let's #[inline] it.
Done 👍
}

fn get_header_value(req: &HttpRequest, header_name: &str) -> Option<String> {
    req.headers()
The value of the header doesn't have to be a String here. As long as you don't need an owned value, it can remain a &str and be returned as such.
Even if it will eventually be cloned internally by the usage-agent, I don't think that should happen here.
Done 👍
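A rough sketch of the borrowed variant. To keep it self-contained, a plain `HashMap` stands in for the request's header map; the `Headers` alias is an assumption of this sketch and the real function takes `&HttpRequest`. The point is only the return type: `Option<&str>` tied to the lifetime of the headers, with no allocation.

```rust
use std::collections::HashMap;

/// Stand-in for the HTTP framework's header map (hypothetical, for illustration only).
type Headers = HashMap<String, String>;

/// Borrows the header value instead of allocating a new String.
/// The returned &str lives as long as the header map it was read from.
fn get_header_value<'a>(headers: &'a Headers, header_name: &str) -> Option<&'a str> {
    headers.get(header_name).map(String::as_str)
}
```

A caller that really needs ownership can still call `.map(str::to_owned)` at the point where the value is stored.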
use crate::background_tasks::BackgroundTask;

pub fn from_config(router_config: &HiveRouterConfig) -> Option<UsageAgent> {
I think this should receive only the part of the config that's relevant to it, and then return a UsageAgent. The decision on whether to create the agent should happen in the caller.
Also, naming! create_hive_usage_agent is probably a better name here.
Done 👍
usage_config.request_timeout,
usage_config.accept_invalid_certs,
flush_interval,
"hive-router".to_string(),
I assume this is the user-agent to use?
You can use ROUTER_VERSION const and append it here, so we'll have something like hive-router@VERSION or hive-router/VERSION.
Done 👍
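Something along these lines, assuming the `hive-router/VERSION` form is the one picked. The constant below is a placeholder so the sketch compiles on its own; in the router it would be the existing ROUTER_VERSION const, and `usage_user_agent` is a made-up helper name.

```rust
/// Placeholder value; the real ROUTER_VERSION lives in the router crate.
const ROUTER_VERSION: &str = "0.0.0";

/// Builds the user-agent string sent with usage reports.
fn usage_user_agent() -> String {
    format!("hive-router/{}", ROUTER_VERSION)
}
```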
pub fn send_usage_report(
    schema: Arc<Document<'static, String>>,
    start: Instant,
I feel like passing the Instant here over and over can be replaced with measuring the total time and then just pass it here as Duration?
Done 👍
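A small sketch of that shape: the caller measures once and hands over a `Duration`. `ExecutionReport` here is a trimmed-down, hypothetical struct, and `handle_request` is an invented caller; neither mirrors the actual router code.

```rust
use std::time::{Duration, Instant};

/// Hypothetical, trimmed-down report; the real report carries more fields.
struct ExecutionReport {
    duration: Duration,
}

/// Takes the already-measured duration rather than the start Instant.
fn collect_usage_report(duration: Duration) -> ExecutionReport {
    ExecutionReport { duration }
}

/// Caller side: measure once around the work, then pass the elapsed time along.
fn handle_request<F: FnOnce()>(work: F) -> ExecutionReport {
    let start = Instant::now();
    work();
    collect_usage_report(start.elapsed())
}
```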
};
usage_agent
    .add_report(execution_report)
    .unwrap_or_else(|err| tracing::error!("Failed to send usage report: {}", err));
nit: I feel like unwrap_or_else here could be more readable as an if let or a match on the Result.
Done 👍
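For illustration, the `if let` form could look roughly like this. The `add_report` function below is a stand-in that only imitates a fallible call, and `eprintln!` replaces `tracing::error!` so the sketch needs nothing outside the standard library.

```rust
/// Hypothetical stand-in for the agent's fallible `add_report`.
fn add_report(buffer: &mut Vec<String>, report: String, capacity: usize) -> Result<(), String> {
    if buffer.len() >= capacity {
        return Err("buffer is full".to_string());
    }
    buffer.push(report);
    Ok(())
}

/// Returns true when the report was accepted; logs the error otherwise.
fn try_add(buffer: &mut Vec<String>, report: String, capacity: usize) -> bool {
    if let Err(err) = add_report(buffer, report, capacity) {
        // The router would use tracing::error! here.
        eprintln!("Failed to send usage report: {}", err);
        return false;
    }
    true
}
```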
#[async_trait]
impl BackgroundTask for UsageAgent {
    fn id(&self) -> &str {
        "usage_report_flush_interval"
hive_console_usage_report_task
Done 👍
🐋 This PR was built and pushed to the following Docker images:
Docker metadata:
{
"buildx.build.ref": "builder-3120460a-6daa-42e0-a588-a96b5a30c41a/builder-3120460a-6daa-42e0-a588-a96b5a30c41a0/zxdo7heuqk99hrrp7r3fqyph5",
"containerimage.descriptor": {
  "mediaType": "application/vnd.oci.image.index.v1+json",
  "digest": "sha256:543ce52f3909f85c94da6848507319c6eae02efe51b57a5175d25e84a0a82f22",
  "size": 1609
},
"containerimage.digest": "sha256:543ce52f3909f85c94da6848507319c6eae02efe51b57a5175d25e84a0a82f22",
"image.name": "ghcr.io/graphql-hive/router:pr-499,ghcr.io/graphql-hive/router:sha-f37f916"
}
        
          
bin/router/src/lib.rs (Outdated)
          
        
true => Some(JwtAuthRuntime::init(bg_tasks_manager, &router_config.jwt).await?),
false => None,
};
let usage_agent = pipeline::usage_reporting::from_config(&router_config).map(Arc::new);
Replace with the new fn name I suggested above.
nit: import the fn and use just its name here, not the full path.
Done 👍
        
          
bin/router/src/schema_state.rs (Outdated)
          
        
pub metadata: SchemaMetadata,
pub planner: Planner,
pub subgraph_executor_map: SubgraphExecutorMap,
pub schema: Arc<Document<'static, String>>,
Rename to supergraph_schema, as it's not clear whether this is a supergraph or a public-API schema.
Done 👍
        
          
bin/router/src/shared_state.rs (Outdated)
          
        
pub override_labels_evaluator: OverrideLabelsEvaluator,
pub cors_runtime: Option<Cors>,
pub jwt_auth_runtime: Option<JwtAuthRuntime>,
pub usage_agent: Option<Arc<UsageAgent>>,
I think we should pick one path:
1. Call it usage_agent, and then we need to abstract it a bit (like we did with supergraph loading).
2. Call it hive_usage_agent here.
I tend to go with 2 for now.
Done 👍
        
          
bin/router/src/shared_state.rs (Outdated)
          
        
CORSConfig(#[from] Box<CORSConfigError>),
#[error("invalid override labels config: {0}")]
OverrideLabelsCompile(#[from] Box<OverrideLabelsCompileError>),
#[error("error creating usage agent: {0}")]
error creating hive usage agent:
Done 👍
#[serde(deny_unknown_fields)]
pub struct UsageReportingConfig {
    /// Your [Registry Access Token](https://the-guild.dev/graphql/hive/docs/management/targets#registry-access-tokens) with write permission.
    pub token: String,
I think this should be access_token as this is explicitly how we call it everywhere in Console.
Done 👍
        
          
lib/router-config/src/lib.rs (Outdated)
          
        
/// Configuration for usage reporting to GraphQL Hive.
#[serde(default)]
pub usage_reporting: Option<usage_reporting::UsageReportingConfig>,
Since we have an enabled field in UsageReportingConfig, this one should not be wrapped in Option.
It should be:
#[serde(default = "usage_reporting::UsageReportingConfig::default")]
pub usage_reporting: usage_reporting::UsageReportingConfig,
And Default should be implemented for UsageReportingConfig so that it is configured with enabled: false.
Actually we don't have an enabled flag in UsageReportingConfig.
Then I think we should. We need to be explicit about these, otherwise we might end up unable to enable/disable it via things like env vars.
If we introduce an enabled flag, users will need to set enabled: true even when they provide the env variables. Hive Gateway, for example, doesn't have an enabled flag. But let me try to enable it automatically when the env vars are set.
Done!
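One possible reading of the compromise discussed above, sketched as a pure function: an explicit `enabled` value wins, and when it is absent, reporting turns on only if an access token was supplied (for example through an env var). The function name and the exact rule are assumptions of this sketch; the thread does not spell out the final behavior.

```rust
/// Hypothetical resolution rule for whether usage reporting is active.
/// `explicit` is the optional `enabled` value from the config;
/// `access_token` is whatever token was resolved from config or env vars.
fn usage_reporting_enabled(explicit: Option<bool>, access_token: Option<&str>) -> bool {
    match explicit {
        // An explicit setting always wins.
        Some(flag) => flag,
        // Otherwise enable only when a non-blank token is present.
        None => access_token.map_or(false, |token| !token.trim().is_empty()),
    }
}
```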
/// 1.0 = 100% chance of being sent
/// Default: 1.0
#[serde(default = "default_sample_rate")]
pub sample_rate: f64,
Should this be a more user-friendly value?
I mean, if durations use humantime and accept things like 10s, why not let the user write 10% here instead of 0.1?
Done
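A minimal sketch of parsing a percentage string into the 0.0..=1.0 sample rate. `parse_percentage` is a hypothetical helper written for illustration; how the PR actually wires this into deserialization is not shown in this thread.

```rust
/// Parses a string such as "10%" into a fraction (0.1).
/// Rejects input without a trailing '%' and values outside 0%..=100%.
fn parse_percentage(input: &str) -> Result<f64, String> {
    let number = input
        .trim()
        .strip_suffix('%')
        .ok_or_else(|| format!("expected a percentage like \"10%\", got {:?}", input))?;
    let value: f64 = number
        .trim()
        .parse()
        .map_err(|err| format!("invalid percentage {:?}: {}", input, err))?;
    // NaN and infinities fail this range check as well.
    if !(0.0..=100.0).contains(&value) {
        return Err(format!("percentage out of range: {:?}", input));
    }
    Ok(value / 100.0)
}
```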
/// Unit: seconds
/// Default: 15 (s)
#[serde(default = "default_request_timeout")]
pub request_timeout: u64,
This should be humantime, see other plugins for example.
This is needed
graphql-hive/console#7196
Done!
/// Unit: seconds
/// Default: 5 (s)
#[serde(default = "default_connect_timeout")]
pub connect_timeout: u64,
This should be humantime, see other plugins for example.
This is needed
graphql-hive/console#7196
Done!
/// Frequency of flushing the buffer to the server
/// Default: 5 seconds
#[serde(default = "default_flush_interval")]
pub flush_interval: u64,
This should be humantime, see other plugins for example.
Done 👍
/// Your [Registry Access Token](https://the-guild.dev/graphql/hive/docs/management/targets#registry-access-tokens) with write permission.
pub token: String,
/// A target ID, this can either be a slug following the format “$organizationSlug/$projectSlug/$targetSlug” (e.g “the-guild/graphql-hive/staging”) or an UUID (e.g. “a0f4c605-6541-4350-8cfe-b31f21a4bf80”). To be used when the token is configured with an organization access token.
pub target_id: Option<String>,
nit: this one can be validated during de-serialization as either {string}/{string}/{string} or uuid.
Done 👍
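A standard-library sketch of the check described in the nit: accept either three non-empty slash-separated segments or the 8-4-4-4-12 hexadecimal UUID layout. All function names are hypothetical, and the validation merged in the PR may well differ (for instance, it might use a UUID crate or restrict slug characters).

```rust
/// True for "org/project/target" with exactly three non-empty segments.
fn is_target_slug(value: &str) -> bool {
    let parts: Vec<&str> = value.split('/').collect();
    parts.len() == 3 && parts.iter().all(|part| !part.is_empty())
}

/// True for the 8-4-4-4-12 hexadecimal UUID layout.
fn is_uuid(value: &str) -> bool {
    let groups: Vec<&str> = value.split('-').collect();
    let lengths = [8usize, 4, 4, 4, 12];
    groups.len() == lengths.len()
        && groups
            .iter()
            .zip(lengths.iter())
            .all(|(group, &len)| {
                group.len() == len && group.chars().all(|c| c.is_ascii_hexdigit())
            })
}

/// Accepts a slug triple or a UUID, rejects everything else.
fn validate_target_id(value: &str) -> Result<(), String> {
    if is_target_slug(value) || is_uuid(value) {
        Ok(())
    } else {
        Err(format!(
            "invalid target_id {:?}: expected org/project/target or a UUID",
            value
        ))
    }
}
```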
        
          
Cargo.toml (Outdated)
          
        
[workspace.dependencies]
graphql-tools = "0.4.0"
graphql-parser = "0.4.1"
graphql-parser = { version = "0.5.0", package = "graphql-parser-hive-fork" }
why use the custom one? we dropped it on purpose.
SDK uses this one.
So we need to fix that in SDK. I don't think the SDK has any reason to still use it now.
Router shouldn't use this custom one
Done!
One more thing: I think we might need to expose these by default using env vars. See env_var_override for an example of how to do it. We can align it with how it looks in GW.
9d7a7d3 to cd15aa5
  
            
          
Cargo.toml (Outdated)
          
        
[workspace.dependencies]
graphql-tools = "0.4.0"
graphql-parser = "0.4.1"
graphql-parser = { version = "0.5.0", package = "graphql-parser-hive-fork" }
can be reverted now
Yes! Done!
        
          
bin/router/Cargo.toml (Outdated)
          
        
hive-router-query-planner = { path = "../../lib/query-planner", version = "2.0.2" }
hive-router-plan-executor = { path = "../../lib/executor", version = "6.0.0" }
hive-router-config = { path = "../../lib/router-config", version = "0.0.10" }
hive-console-sdk = "0.0.0"
0.1.0
Done!
Hive Console Client integration
Ref ROUTER-102
Blocked by graphql-hive/console#7143
Documentation -> graphql-hive/console#7171
TODOs:
hive-console-sdk and add it to Cargo.toml here