-
Notifications
You must be signed in to change notification settings - Fork 8
Exporter timeout and grpc knobs #162
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
The version we were currently usign has a missing dependency which isn't available anymore.
WalkthroughAdds ExporterOptions to configuration and threads it through LoadConfiguration to services and ExporterReconciler; changes gRPC server option handling from a single option to a slice and extends gRPC keepalive/server-parameter fields; updates Helm model/schema/values and bumps golangci-lint in the Makefile. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Main as cmd/main.go
participant Cfg as internal/config.LoadConfiguration
participant Svc as ControllerService
participant Reco as ExporterReconciler
Main->>Cfg: LoadConfiguration(ctx,...)
Cfg-->>Main: ConfigurationResult{ AuthenticationConfigResult, Router, ServerOptions[], Provisioning, ExporterOptions }
Main->>Svc: New ControllerService{ ServerOptions: configResult.ServerOptions, Router: configResult.Router, ... }
Main->>Reco: New ExporterReconciler{ ExporterOptions: configResult.ExporterOptions, ... }
Note right of Reco: Reconciler uses ExporterOptions.GetOfflineTimeout() for scheduling
sequenceDiagram
autonumber
participant Reco as ExporterReconciler
participant Cfg as ExporterOptions
participant K8s as Kubernetes API
Reco->>Cfg: GetOfflineTimeout()
Reco->>K8s: Read Exporter.Status.LastSeen
alt LastSeen older than offlineTimeout
Reco->>K8s: Set status Offline
Reco-->>K8s: Requeue after safetyMargin (10s)
else LastSeen within offlineTimeout
Reco->>Reco: expiration = LastSeen + offlineTimeout
Reco-->>K8s: RequeueAfter = min(expiration + margin - now, cap)
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (4)
🧰 Additional context used🧬 Code graph analysis (4)internal/config/config.go (4)
internal/config/types.go (1)
cmd/main.go (5)
internal/config/oidc.go (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (9)
internal/config/types.go (1)
31-46
: Keepalive timeout fields stay strings—shouldn’t we parse durations earlier?All keepalive fields remain
string
, so invalid or malformed values will pass through config loading and only explode at runtime when gRPC parses them. Consider switching these totime.Duration
(withmetav1.Duration
or custom parsing) during config load so we fail fast with a clear error.cmd/router/main.go (1)
63-65
: Variable naming still singular despite sliceMinor nit:
serverOption
is now a slice. Consider renaming toserverOptions
for clarity, but optional.deploy/helm/jumpstarter/values.yaml (1)
45-46
: Document warnings about low offline timeout valuesMaybe mention any operational caveats (e.g., too low timeout triggering flapping) beside the default? Optional but could help operators.
cmd/main.go (3)
157-171
: Rename localoption
toserverOptions
for clarity.It holds a slice of grpc.ServerOption. Renaming avoids confusion with the previous single-option field and improves readability.
- authenticator, prefix, router, option, provisioning, exporterOptions, err := config.LoadConfiguration( + authenticator, prefix, router, serverOptions, provisioning, exporterOptions, err := config.LoadConfiguration(
176-181
: Defensive nil-safety when dereferencingexporterOptions
.Today
LoadConfiguration
always returns a non-nil pointer, but a small guard avoids future NPEs if that changes.Example (outside this block, just before constructing the reconciler):
expOpts := config.ExporterOptions{} if exporterOptions != nil { expOpts = *exporterOptions }Then here:
- ExporterOptions: *exporterOptions, + ExporterOptions: expOpts,
218-220
: Rename localoption
toserverOptions
Keeps naming consistent with the slice-based gRPC API. Update both the return binding and the SetupWithManager call:
- authenticator, prefix, router, option, provisioning, exporterOptions, err := config.LoadConfiguration(...) + authenticator, prefix, router, serverOptions, provisioning, exporterOptions, err := config.LoadConfiguration(...) … - ServerOptions: option, + ServerOptions: serverOptions,internal/config/grpc.go (2)
68-75
: Avoid shadowing thetime
package.
time, err := time.ParseDuration(...)
shadows the imported package. Rename the local var.- if config.Keepalive.Time != "" { - time, err := time.ParseDuration(config.Keepalive.Time) + if config.Keepalive.Time != "" { + pingInterval, err := time.ParseDuration(config.Keepalive.Time) if err != nil { return nil, err } - serverParams.Time = time + serverParams.Time = pingInterval }
77-81
: Comment and condition mismatch; ServerParameters are always added.You always set
Timeout
(defaultGrpcTimeout), makingserverParams
non-zero. Either remove the condition or gate setting the default.- // Only add ServerParameters if at least one parameter is set - if serverParams != (keepalive.ServerParameters{}) { - serverOptions = append(serverOptions, grpc.KeepaliveParams(serverParams)) - } + // Always add ServerParameters (Timeout defaults to defaultGrpcTimeout when unset) + serverOptions = append(serverOptions, grpc.KeepaliveParams(serverParams))internal/config/config.go (1)
94-95
: Avoid returning pointers to fields of a local struct (optional).Returning
&config.Provisioning
and&config.ExporterOptions
is safe (heap escape), but returning by value simplifies ownership and avoids interior pointers.Consider changing the signature to return
Provisioning
andExporterOptions
by value and adjust call sites accordingly. If you keep pointers, fine as-is.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
Makefile
(1 hunks)cmd/main.go
(3 hunks)cmd/router/main.go
(1 hunks)deploy/helm/jumpstarter/charts/jumpstarter-controller/model.py
(4 hunks)deploy/helm/jumpstarter/charts/jumpstarter-controller/values.schema.json
(3 hunks)deploy/helm/jumpstarter/values.yaml
(2 hunks)internal/config/config.go
(3 hunks)internal/config/grpc.go
(1 hunks)internal/config/types.go
(2 hunks)internal/controller/exporter_controller.go
(4 hunks)internal/service/controller_service.go
(3 hunks)internal/service/router_service.go
(2 hunks)
🧰 Additional context used
🧬 Code graph analysis (7)
internal/config/grpc.go (2)
deploy/helm/jumpstarter/charts/jumpstarter-controller/model.py (2)
Grpc
(65-68)Keepalive
(29-62)internal/config/types.go (2)
Grpc
(27-29)Keepalive
(31-42)
internal/service/controller_service.go (3)
internal/authentication/types.go (1)
ContextAuthenticator
(9-11)internal/authorization/types.go (1)
ContextAttributesGetter
(10-12)internal/config/types.go (1)
Router
(48-48)
internal/config/types.go (1)
deploy/helm/jumpstarter/charts/jumpstarter-controller/model.py (4)
Authentication
(215-222)Provisioning
(17-20)Grpc
(65-68)ExporterOptions
(206-212)
deploy/helm/jumpstarter/charts/jumpstarter-controller/model.py (1)
internal/config/types.go (1)
ExporterOptions
(44-46)
cmd/main.go (2)
internal/config/config.go (1)
LoadConfiguration
(45-95)internal/config/types.go (2)
ExporterOptions
(44-46)Router
(48-48)
internal/config/config.go (2)
internal/config/types.go (5)
Router
(48-48)Provisioning
(19-21)ExporterOptions
(44-46)Config
(7-12)Grpc
(27-29)internal/config/grpc.go (1)
LoadGrpcConfiguration
(13-83)
internal/controller/exporter_controller.go (2)
deploy/helm/jumpstarter/charts/jumpstarter-controller/model.py (1)
ExporterOptions
(206-212)internal/config/types.go (1)
ExporterOptions
(44-46)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: e2e-tests (ubuntu-24.04-arm)
- GitHub Check: e2e-tests (ubuntu-24.04)
- GitHub Check: e2e-tests-28d6b1cc3b49ab9ae176918ab9709a2e2522c97e
- GitHub Check: deploy-kind
- GitHub Check: tests
- GitHub Check: lint-go
🔇 Additional comments (11)
deploy/helm/jumpstarter/charts/jumpstarter-controller/model.py (1)
206-232
: Schema mirrors Go types—looks goodThe new
ExporterOptions
model lines up with the Go struct and Helm schema. No issues spotted.internal/service/controller_service.go (2)
71-83
: ServerOptions slice integrates cleanlySwitching to a slice cleanly preserves existing interceptors while letting callers append more options. Looks solid.
703-724
: Base options rebuilt each start—works as intendedCollecting interceptors into a slice before appending external options keeps behavior identical while supporting multiple additional options. All good here.
deploy/helm/jumpstarter/charts/jumpstarter-controller/values.schema.json (3)
217-236
: ExporterOptions schema matches controller configThe schema addition for exporter options aligns with the Go and Python models, enforcing the null-or-string contract as expected.
626-672
: Config schema extension is consistentWiring the new
exporterOptions
property intoJumpstarterConfig
keeps the schema coherent. Nothing to fix.
678-770
: Keepalive schema expanded appropriatelyThe additional keepalive fields line up with the backend changes and include clear descriptions. No issues spotted.
internal/service/router_service.go (2)
38-43
: RouterService ServerOptions change looks goodField rename to a slice keeps behavior while enabling composition—no concerns.
123-131
: Start now supports multiple server options cleanlyBuilding a local slice before appending injected options mirrors the controller change—implementation looks correct.
Makefile (1)
188-189
: Confirm golangci-lint v2.5.0 compatibility
Ensure CI and local contributor environments install golangci-lint v2.5.0 and that runningmake golangci-lint && ./bin/golangci-lint version && ./bin/golangci-lint run ./...yields no new lint failures.
internal/controller/exporter_controller.go (1)
187-234
: Requeue scheduling logic looks solid.
- Uses configurable timeout.
- Schedules just after expiry with a safety margin.
- Caps requeue to avoid long sleeps.
Please confirm the desired defaults:
- safetyMargin = 10s
- max requeue cap = 5m
If different, I can parameterize them.
internal/config/config.go (1)
20-20
: Return type change to []grpc.ServerOption looks good.Aligns with downstream slice-based ServerOptions.
d69ad62
to
9e0e98a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (2)
cmd/main.go (2)
157-170
: Rename localoption
toserverOptions
for clarity.It holds []grpc.ServerOption. Renaming improves readability and matches field name ServerOptions.
- authenticator, prefix, router, option, provisioning, exporterOptions, err := config.LoadConfiguration( + authenticator, prefix, router, serverOptions, provisioning, exporterOptions, err := config.LoadConfiguration( ... - ServerOptions: option, + ServerOptions: serverOptions,Also applies to: 218-220
176-181
: Nil-safety check (defensive).
exporterOptions
currently comes from&config.ExporterOptions
and won’t be nil, but if the source changes later this would panic. Consider guarding or passing the pointer instead to keep the contract flexible.Example:
- ExporterOptions: *exporterOptions, + ExporterOptions: func() config.ExporterOptions { + if exporterOptions == nil { + return config.ExporterOptions{} + } + return *exporterOptions + }(),
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
.github/workflows/e2e.yaml
(1 hunks)cmd/main.go
(3 hunks)deploy/helm/jumpstarter/charts/jumpstarter-controller/model.py
(4 hunks)deploy/helm/jumpstarter/charts/jumpstarter-controller/values.schema.json
(3 hunks)deploy/helm/jumpstarter/values.yaml
(2 hunks)internal/config/config.go
(3 hunks)internal/config/types.go
(2 hunks)internal/controller/exporter_controller.go
(4 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- deploy/helm/jumpstarter/values.yaml
- deploy/helm/jumpstarter/charts/jumpstarter-controller/values.schema.json
- deploy/helm/jumpstarter/charts/jumpstarter-controller/model.py
🧰 Additional context used
🧬 Code graph analysis (4)
internal/config/config.go (2)
internal/config/types.go (5)
Router
(48-48)Provisioning
(19-21)ExporterOptions
(44-46)Config
(7-12)Grpc
(27-29)internal/config/grpc.go (1)
LoadGrpcConfiguration
(13-83)
internal/controller/exporter_controller.go (3)
deploy/helm/jumpstarter/charts/jumpstarter-controller/model.py (1)
ExporterOptions
(206-212)internal/config/types.go (1)
ExporterOptions
(44-46)internal/oidc/op.go (1)
Signer
(20-24)
cmd/main.go (2)
internal/config/config.go (1)
LoadConfiguration
(45-95)internal/config/types.go (2)
ExporterOptions
(44-46)Router
(48-48)
internal/config/types.go (1)
deploy/helm/jumpstarter/charts/jumpstarter-controller/model.py (4)
Authentication
(215-222)Provisioning
(17-20)Grpc
(65-68)ExporterOptions
(206-212)
🪛 actionlint (1.7.7)
.github/workflows/e2e.yaml
22-22: invalid job ID "e2e-tests-release-0.7". job ID must start with a letter or _ and contain only alphanumeric characters, -, or _
(id)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: lint-go
- GitHub Check: tests
- GitHub Check: deploy-kind
🔇 Additional comments (5)
internal/config/types.go (1)
7-12
: LGTM: New ExporterOptions surface.Adding
ExporterOptions
toConfig
and definingOfflineTimeout
looks consistent with downstream usage.Also applies to: 44-46
internal/controller/exporter_controller.go (2)
186-234
: Requeue logic: sensible and capped.Using LastSeen + offlineTimeout with a safety margin and a 5m cap looks good.
37-58
: Data race: cachedofflineTimeout
written without synchronization. Also guard non-positive durations.Concurrent reconciles can race on first write. Use
sync.Once
to init the cache, and treat empty/invalid/non-positive durations as default.Apply this diff within the method:
-func (r *ExporterReconciler) getOfflineTimeout() time.Duration { - // Return cached value if already parsed - if r.offlineTimeout != 0 { - return r.offlineTimeout - } - - // Parse and cache the duration - if r.ExporterOptions.OfflineTimeout == "" { - r.offlineTimeout = time.Minute // Default fallback - } else { - duration, err := time.ParseDuration(r.ExporterOptions.OfflineTimeout) - if err != nil { - // If parsing fails, use default fallback - r.offlineTimeout = time.Minute - } else { - r.offlineTimeout = duration - } - } - - return r.offlineTimeout -} +func (r *ExporterReconciler) getOfflineTimeout() time.Duration { + r.offlineOnce.Do(func() { + const def = time.Minute + if r.ExporterOptions.OfflineTimeout == "" { + r.offlineTimeout = def + return + } + if d, err := time.ParseDuration(r.ExporterOptions.OfflineTimeout); err == nil && d > 0 { + r.offlineTimeout = d + } else { + r.offlineTimeout = def + } + }) + return r.offlineTimeout +}Additions outside this hunk:
- Import:
import "sync"
- Struct field:
type ExporterReconciler struct { ... - offlineTimeout time.Duration // Cached parsed duration + offlineTimeout time.Duration // Cached parsed duration + offlineOnce sync.Once }internal/config/config.go (2)
20-43
: LGTM: Router server options now a slice.Returning []grpc.ServerOption from
LoadRouterConfiguration
aligns with updated server code.
52-95
: LGTM: Added ExporterOptions and serverOptions propagation.Return shape updates and nil placements look correct; errors propagate properly.
9e0e98a
to
c1c96c6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (8)
.github/workflows/e2e.yaml (2)
1-8
: Harden workflow token scope.Add explicit minimal permissions (e.g., contents: read) at the workflow level.
name: End-to-end tests on: workflow_dispatch: pull_request: branches: - main - 'release-*' +permissions: + contents: read
24-31
: Optional: run the compat job on both architectures.To mirror the primary job’s coverage, consider adding a strategy matrix (ubuntu-24.04 and ubuntu-24.04-arm) here as well.
-e2e-tests-release-0-7: - runs-on: ubuntu-latest +e2e-tests-release-0-7: + strategy: + matrix: + os: [ubuntu-24.04, ubuntu-24.04-arm] + runs-on: ${{ matrix.os }}deploy/helm/jumpstarter/values.yaml (1)
82-84
: Default consistency: chart vs code fallbackChart defaults offlineTimeout to 180s, but code currently falls back to 1m if unset/invalid. Either align the code fallback to 180s or make the field required in the schema to avoid divergent behavior outside Helm. I’ve proposed a small code tweak in internal/config/types.go to align to 180s.
internal/config/types.go (2)
51-70
: Align fallback to chart (180s) and guard negative durationsSwitch fallback from 1m to 3m (180s) to match values.yaml, and treat non-positive parsed values as invalid to avoid “immediate offline.”
func (e *ExporterOptions) PreprocessConfig() { // Parse and cache the offline timeout duration if e.OfflineTimeout == "" { - e.offlineTimeoutDur = time.Minute // Default fallback + e.offlineTimeoutDur = 3 * time.Minute // Align with chart default (180s) } else { duration, err := time.ParseDuration(e.OfflineTimeout) if err != nil { - // If parsing fails, use default fallback - e.offlineTimeoutDur = time.Minute + // If parsing fails, use default fallback + e.offlineTimeoutDur = 3 * time.Minute } else { - e.offlineTimeoutDur = duration + if duration <= 0 { + // Prevent non-positive durations from causing immediate-offline behavior + e.offlineTimeoutDur = 3 * time.Minute + } else { + e.offlineTimeoutDur = duration + } } } }
71-74
: Defensive getter when preprocessing wasn’t calledMake GetOfflineTimeout resilient (use same defaults/parse if offlineTimeoutDur is zero). This helps tests or alternative initialization paths.
func (e *ExporterOptions) GetOfflineTimeout() time.Duration { - return e.offlineTimeoutDur + if e.offlineTimeoutDur != 0 { + return e.offlineTimeoutDur + } + if e.OfflineTimeout == "" { + return 3 * time.Minute + } + if d, err := time.ParseDuration(e.OfflineTimeout); err == nil && d > 0 { + return d + } + return 3 * time.Minute }deploy/helm/jumpstarter/charts/jumpstarter-controller/model.py (2)
42-63
: Add JSON‑Schema pattern for duration stringsTo fail fast on typos, add a regex pattern for Go-style durations (ns|us|ms|s|m|h). This propagates into values.schema.json.
- timeout: Optional[str] = Field( - None, - description="How long the server waits for a ping response before closing the connection", - ) + timeout: Optional[str] = Field( + None, + description="How long the server waits for a ping response before closing the connection", + json_schema_extra={"pattern": r"^\d+(ns|us|ms|s|m|h)$"}, + ) @@ - maxConnectionIdle: Optional[str] = Field( + maxConnectionIdle: Optional[str] = Field( None, - description="Maximum time a connection can be idle before being closed", + description="Maximum time a connection can be idle before being closed", + json_schema_extra={"pattern": r"^\d+(ns|us|ms|s|m|h)$"}, ) @@ - maxConnectionAge: Optional[str] = Field( + maxConnectionAge: Optional[str] = Field( None, - description="Maximum lifetime of a connection before it's closed", + description="Maximum lifetime of a connection before it's closed", + json_schema_extra={"pattern": r"^\d+(ns|us|ms|s|m|h)$"}, ) @@ - maxConnectionAgeGrace: Optional[str] = Field( + maxConnectionAgeGrace: Optional[str] = Field( None, - description="Grace period after max connection age before forcible closure", + description="Grace period after max connection age before forcible closure", + json_schema_extra={"pattern": r"^\d+(ns|us|ms|s|m|h)$"}, ) @@ - time: Optional[str] = Field( + time: Optional[str] = Field( None, - description="How often the server sends keepalive pings to clients", + description="How often the server sends keepalive pings to clients", + json_schema_extra={"pattern": r"^\d+(ns|us|ms|s|m|h)$"}, )
206-213
: Validate exporter offlineTimeout formatApply the same duration pattern here to catch invalid values early.
- offlineTimeout: Optional[str] = Field( - None, - description="How long to wait before marking the exporter as offline", - ) + offlineTimeout: Optional[str] = Field( + None, + description="How long to wait before marking the exporter as offline", + json_schema_extra={"pattern": r"^\d+(ns|us|ms|s|m|h)$"}, + )internal/controller/exporter_controller.go (1)
192-209
: Nice dynamic requeue; consider constants for tuningRequeue with a 10s margin and 5m cap is sensible. Optionally lift these literals to package‑level consts for easier tuning and testability.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
.github/workflows/e2e.yaml
(1 hunks)cmd/main.go
(3 hunks)deploy/helm/jumpstarter/charts/jumpstarter-controller/model.py
(4 hunks)deploy/helm/jumpstarter/charts/jumpstarter-controller/values.schema.json
(3 hunks)deploy/helm/jumpstarter/values.yaml
(2 hunks)internal/config/config.go
(3 hunks)internal/config/types.go
(2 hunks)internal/controller/exporter_controller.go
(4 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- deploy/helm/jumpstarter/charts/jumpstarter-controller/values.schema.json
- cmd/main.go
🧰 Additional context used
🧬 Code graph analysis (4)
internal/config/types.go (1)
deploy/helm/jumpstarter/charts/jumpstarter-controller/model.py (1)
ExporterOptions
(206-212)
internal/controller/exporter_controller.go (3)
deploy/helm/jumpstarter/charts/jumpstarter-controller/model.py (1)
ExporterOptions
(206-212)internal/config/types.go (1)
ExporterOptions
(46-49)api/v1alpha1/exporter_types.go (1)
ExporterConditionTypeOnline
(48-48)
deploy/helm/jumpstarter/charts/jumpstarter-controller/model.py (1)
internal/config/types.go (1)
ExporterOptions
(46-49)
internal/config/config.go (3)
deploy/helm/jumpstarter/charts/jumpstarter-controller/model.py (4)
Router
(289-294)Provisioning
(17-20)ExporterOptions
(206-212)Grpc
(65-68)internal/config/types.go (5)
Router
(76-76)Provisioning
(21-23)ExporterOptions
(46-49)Config
(9-14)Grpc
(29-31)internal/config/grpc.go (1)
LoadGrpcConfiguration
(13-83)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: tests
- GitHub Check: e2e-tests (ubuntu-24.04-arm)
- GitHub Check: e2e-tests-release-0-7
- GitHub Check: e2e-tests (ubuntu-24.04)
- GitHub Check: deploy-kind
- GitHub Check: lint-go
🔇 Additional comments (8)
.github/workflows/e2e.yaml (2)
24-24
: Job ID fix LGTM; no lingering references toe2e-tests-release-0.7
found.
No updates needed for external badges or docs.
27-30
: Verify and pin action SHAs
The tagrelease-0.7
could not be resolved injumpstarter-dev/jumpstarter-e2e
(404); please confirm the correct tag name and update both steps to use immutable commit SHAs (for example, pinmain
to743645c760a96250d7497652ac28e393e91e6a7f
).deploy/helm/jumpstarter/values.yaml (1)
93-99
: Keepalive timeout docs vs controller defaultsComment says “default: 20s” while our controller sets a default when unspecified (defaultGrpcTimeout). Please confirm defaultGrpcTimeout matches 120s (or the upstream default you intend) and update either the code default or this comment to avoid confusion.
internal/config/types.go (1)
34-44
: MinTime requirement remains implicitParsing of keepalive MinTime is still unconditional downstream. Ensure schemas/values guarantee a non-empty minTime to prevent parse errors (previously discussed).
internal/controller/exporter_controller.go (2)
41-43
: Plumbing looks goodExporterOptions injected into the reconciler is clean and keeps policy separate from logic.
163-190
: Safe use of preprocessed timeout; small robustness noteUsing GetOfflineTimeout centralizes policy. Ensure tests constructing ExporterReconciler outside LoadConfiguration either call PreprocessConfig or rely on the defensive getter (see suggested change in types.go).
internal/config/config.go (2)
16-22
: Switch to []grpc.ServerOption is appropriateThis unblocks composing multiple keepalive/server params. Callers updated accordingly in services.
94-98
: Preprocessing placement is correct and all LoadConfiguration callers handle the extended return signature
internal/config/types.go
Outdated
func (e *ExporterOptions) PreprocessConfig() { | ||
// Parse and cache the offline timeout duration | ||
if e.OfflineTimeout == "" { | ||
e.offlineTimeoutDur = time.Minute // Default fallback |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the chart the default is indeed 3 minutes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, just trying to be a bit defensive here, perhaps someone not even using the charts, but I should probably use the same default value, that's true.
internal/config/types.go
Outdated
} else { | ||
duration, err := time.ParseDuration(e.OfflineTimeout) | ||
if err != nil { | ||
// If parsing fails, use default fallback |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and we should probably fail here or log a warning/error
4d29efd
to
cbad11b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
cmd/main.go (1)
157-157
: Renameoption
toserverOptions
for clarity.The variable
option
now holds a slice of gRPC server options ([]grpc.ServerOption
), but the singular name suggests a single option. This creates a mismatch between the name and the actual type, reducing code readability.Apply this diff to rename the variable:
- authenticator, prefix, router, option, provisioning, exporterOptions, err := config.LoadConfiguration( + authenticator, prefix, router, serverOptions, provisioning, exporterOptions, err := config.LoadConfiguration(Also update line 219:
- ServerOptions: option, + ServerOptions: serverOptions,internal/config/types.go (1)
55-70
: LGTM! Solid preprocessing with defensive defaults.The PreprocessConfig method correctly parses and caches the offline timeout duration, with a sensible 3-minute default when the configuration is empty. This aligns with the Helm chart defaults and provides efficient access via GetOfflineTimeout.
Minor cleanup: The
else
clause at line 63 is unnecessary after thereturn
in the error case. You can simplify:} else { duration, err := time.ParseDuration(e.OfflineTimeout) if err != nil { return fmt.Errorf("PreprocessConfig: failed to parse exporter offline timeout: %w", err) - } else { - e.offlineTimeoutDur = duration - } + } + e.offlineTimeoutDur = duration }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
.github/workflows/e2e.yaml
(1 hunks)cmd/main.go
(3 hunks)deploy/helm/jumpstarter/charts/jumpstarter-controller/model.py
(4 hunks)deploy/helm/jumpstarter/charts/jumpstarter-controller/values.schema.json
(3 hunks)deploy/helm/jumpstarter/values.yaml
(2 hunks)internal/config/config.go
(3 hunks)internal/config/types.go
(2 hunks)internal/controller/exporter_controller.go
(4 hunks)internal/controller/lease_controller_test.go
(5 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- deploy/helm/jumpstarter/values.yaml
- deploy/helm/jumpstarter/charts/jumpstarter-controller/model.py
🧰 Additional context used
🧬 Code graph analysis (5)
internal/controller/exporter_controller.go (2)
deploy/helm/jumpstarter/charts/jumpstarter-controller/model.py (1)
ExporterOptions
(206-212)internal/config/types.go (1)
ExporterOptions
(47-50)
internal/controller/lease_controller_test.go (2)
deploy/helm/jumpstarter/charts/jumpstarter-controller/model.py (1)
ExporterOptions
(206-212)internal/config/types.go (1)
ExporterOptions
(47-50)
cmd/main.go (2)
internal/config/config.go (1)
LoadConfiguration
(45-100)internal/config/types.go (2)
ExporterOptions
(47-50)Router
(77-77)
internal/config/config.go (2)
internal/config/types.go (5)
Router
(77-77)Provisioning
(22-24)ExporterOptions
(47-50)Config
(10-15)Grpc
(30-32)internal/config/grpc.go (1)
LoadGrpcConfiguration
(13-83)
internal/config/types.go (1)
deploy/helm/jumpstarter/charts/jumpstarter-controller/model.py (3)
Provisioning
(17-20)Grpc
(65-68)ExporterOptions
(206-212)
🔇 Additional comments (15)
.github/workflows/e2e.yaml (1)
22-30
: LGTM! Good practice for backwards compatibility testing.The addition of a dedicated job to test the current controller against the previous release (0.7) ensures backwards compatibility. The job ID correctly uses hyphens instead of dots, addressing the previous actionlint issue.
deploy/helm/jumpstarter/charts/jumpstarter-controller/values.schema.json (3)
217-236
: LGTM! Schema correctly defines ExporterOptions.The new
ExporterOptions
schema definition properly models theofflineTimeout
field with appropriate type constraints (string|null) and descriptive metadata.
662-672
: LGTM! ExporterOptions integrated into JumpstarterConfig.The
exporterOptions
field is correctly added toJumpstarterConfig
with proper reference to theExporterOptions
definition.
706-770
: LGTM! Keepalive fields expanded appropriately.The five new keepalive parameters (timeout, maxConnectionIdle, maxConnectionAge, maxConnectionAgeGrace, time) are properly defined with:
- Correct type constraints (string|null)
- Sensible defaults (null)
- Clear descriptions of their purpose
These align with gRPC keepalive best practices and match the implementation in
internal/config/grpc.go
.internal/config/config.go (2)
16-43
: LGTM! LoadRouterConfiguration correctly updated to return slice.The function signature properly returns
[]grpc.ServerOption
instead of a single option, aligning with the pattern used throughout this PR for handling multiple gRPC server options.
45-99
: LGTM! LoadConfiguration properly extended with ExporterOptions.Key improvements:
- Signature correctly returns
*ExporterOptions
as an additional value- All error returns properly updated with nil placeholders in the correct positions
PreprocessConfig()
is called onExporterOptions
before returning (line 95), ensuring duration parsing happens at load time rather than lazy initialization- Consistent error handling throughout
The preprocessing approach addresses the concurrency concerns raised in past reviews by parsing configuration values once during initialization rather than at runtime.
internal/controller/lease_controller_test.go (1)
379-385
: LGTM! Test properly configures ExporterOptions.The test correctly:
- Initializes
ExporterReconciler
withExporterOptions
containing a 1-minute offline timeout- Calls
PreprocessConfig()
to parse the duration before reconciliation- Validates the error from preprocessing
This mirrors the production initialization pattern and ensures tests exercise the real configuration path.
internal/controller/exporter_controller.go (3)
38-42
: LGTM! ExporterOptions field added correctly.The
ExporterOptions
field is properly added to the reconciler struct. Based on the discussion in past reviews, the preprocessing approach (parsing the duration duringLoadConfiguration
viaPreprocessConfig()
) is the correct solution rather than lazy initialization withsync.Once
.
158-190
: LGTM! Offline timeout now configurable.The reconciliation logic correctly:
- Retrieves the preprocessed offline timeout via
GetOfflineTimeout()
- Uses it consistently for all LastSeen comparisons
- Updates condition messages to reflect the actual configured timeout
This replaces the previous hardcoded 1-minute timeout with a configurable value.
192-209
: LGTM! Dynamic requeue logic is well-designed.The new requeue calculation is sophisticated and correct:
- Computes expiration time as
LastSeen + offlineTimeout
- Schedules requeue for expiration + 10-second safety margin
- Includes a 5-minute cap to avoid excessive wait times
- Handles the edge case where expiration has already passed
This ensures the controller efficiently re-evaluates exporter status around the time they should transition offline, rather than using a fixed requeue interval.
cmd/main.go (1)
176-184
: LGTM! ExporterOptions correctly threaded through.The ExporterOptions field is properly populated with the dereferenced exporterOptions returned from LoadConfiguration. The integration aligns with the changes in the ExporterReconciler struct.
internal/config/types.go (4)
4-5
: LGTM! Required imports for duration handling.The fmt and time imports are necessary for the new duration parsing and error handling in ExporterOptions.PreprocessConfig.
10-15
: LGTM! ExporterOptions integrated into Config.The ExporterOptions field is properly added to the Config struct with appropriate JSON tags, enabling serialization/deserialization of exporter-specific configuration.
34-45
: LGTM! Comprehensive gRPC keepalive and timeout configuration.The expanded Keepalive struct provides granular control over gRPC connection lifecycle through standard server parameters (Timeout, MaxConnectionIdle, MaxConnectionAge, MaxConnectionAgeGrace, Time). The inline comments clearly document each field's purpose, and the
omitempty
tags appropriately mark these as optional configuration values.
72-75
: LGTM! Clean getter for cached duration.The GetOfflineTimeout method provides efficient access to the pre-calculated duration. Since PreprocessConfig is called during LoadConfiguration (as shown in the relevant code snippets), the cached value is guaranteed to be initialized before any consumers access it.
4d4a555
to
c1579ae
Compare
c1579ae
to
cec95ca
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
internal/config/grpc.go (1)
16-25
: Make EnforcementPolicy optional; default MinTime to 5m when only PermitWithoutStream is setParsing MinTime unconditionally fails on empty config, making keepalive effectively mandatory. Build the policy only when configured; if only PermitWithoutStream is true, use grpc-go’s default 5m MinTime.
- // Parse EnforcementPolicy parameters - minTime, err := time.ParseDuration(config.Keepalive.MinTime) - if err != nil { - return nil, err - } - - serverOptions = append(serverOptions, grpc.KeepaliveEnforcementPolicy(keepalive.EnforcementPolicy{ - MinTime: minTime, - PermitWithoutStream: config.Keepalive.PermitWithoutStream, - })) + // Build EnforcementPolicy only when MinTime or PermitWithoutStream is configured + if config.Keepalive.MinTime != "" || config.Keepalive.PermitWithoutStream { + var minTime time.Duration + if config.Keepalive.MinTime != "" { + var err error + minTime, err = time.ParseDuration(config.Keepalive.MinTime) + if err != nil { + return nil, fmt.Errorf("LoadGrpcConfiguration: invalid Keepalive MinTime: %w", err) + } + } else { + minTime = 5 * time.Minute // grpc-go default + } + serverOptions = append(serverOptions, grpc.KeepaliveEnforcementPolicy(keepalive.EnforcementPolicy{ + MinTime: minTime, + PermitWithoutStream: config.Keepalive.PermitWithoutStream, + })) + }Based on learnings
🧹 Nitpick comments (9)
.github/workflows/e2e.yaml (1)
27-27
: Pin action to a commit SHA for supply‑chain safety.Tags can be retagged; prefer an immutable SHA.
- - uses: jumpstarter-dev/[email protected] + - uses: jumpstarter-dev/jumpstarter-e2e@<commit-sha-for-release-0.7>internal/config/types.go (1)
75-77
: Defensive fallback in GetOfflineTimeout (optional).If PreprocessConfig is skipped accidentally, return a sane default.
func (e *ExporterOptions) GetOfflineTimeout() time.Duration { - return e.offlineTimeoutDur + if e.offlineTimeoutDur <= 0 { + // fallback; PreprocessConfig should normally set this + return 3 * time.Minute + } + return e.offlineTimeoutDur }cmd/main.go (2)
19-26
: Add time import for configuration timeout (see below).import ( "context" "crypto/tls" "encoding/pem" "flag" "net" "os" + "time"
157-171
: Bound LoadConfiguration with a timeout.Prevents hanging at startup if the API server is unreachable.
- configResult, err := config.LoadConfiguration( - context.Background(), + cfgCtx, cancel := context.WithTimeout(context.Background(), 30*time.Second) + defer cancel() + configResult, err := config.LoadConfiguration( + cfgCtx, mgr.GetAPIReader(), mgr.GetScheme(), client.ObjectKey{ Namespace: os.Getenv("NAMESPACE"), Name: "jumpstarter-controller", }, oidcSigner, string(pem.EncodeToMemory(&pem.Block{ Type: "CERTIFICATE", Bytes: oidcCert.Certificate[0], })), )internal/config/grpc.go (4)
69-75
: Avoid shadowing the time packageLocal var named time shadows the imported package; linters will flag this and it’s confusing.
- if config.Keepalive.Time != "" { - time, err := time.ParseDuration(config.Keepalive.Time) + if config.Keepalive.Time != "" { + pingInterval, err := time.ParseDuration(config.Keepalive.Time) if err != nil { - return nil, err + return nil, fmt.Errorf("LoadGrpcConfiguration: failed to parse Keepalive Time: %w", err) } - serverParams.Time = time + serverParams.Time = pingInterval }
43-48
: Consistently wrap parse errors with contextOnly Timeout parse errors are wrapped. Wrap all parse errors to aid debugging.
- maxIdle, err := time.ParseDuration(config.Keepalive.MaxConnectionIdle) - if err != nil { - return nil, err + maxIdle, err := time.ParseDuration(config.Keepalive.MaxConnectionIdle) + if err != nil { + return nil, fmt.Errorf("LoadGrpcConfiguration: failed to parse Keepalive MaxConnectionIdle: %w", err) } ... - maxAge, err := time.ParseDuration(config.Keepalive.MaxConnectionAge) - if err != nil { - return nil, err + maxAge, err := time.ParseDuration(config.Keepalive.MaxConnectionAge) + if err != nil { + return nil, fmt.Errorf("LoadGrpcConfiguration: failed to parse Keepalive MaxConnectionAge: %w", err) } ... - maxAgeGrace, err := time.ParseDuration(config.Keepalive.MaxConnectionAgeGrace) - if err != nil { - return nil, err + maxAgeGrace, err := time.ParseDuration(config.Keepalive.MaxConnectionAgeGrace) + if err != nil { + return nil, fmt.Errorf("LoadGrpcConfiguration: failed to parse Keepalive MaxConnectionAgeGrace: %w", err) } ... - if err != nil { - return nil, err + if err != nil { + return nil, fmt.Errorf("LoadGrpcConfiguration: failed to parse Keepalive Time: %w", err) }Also applies to: 52-57, 60-66, 71-72
31-39
: Confirm intent: KeepaliveParams is always applied due to default TimeoutSetting a default Timeout makes serverParams non-empty, so KeepaliveParams will always be appended. If that’s intended, drop the comment “Only add … if at least one parameter is set”. If not intended, consider gating the default behind an explicit keepalive config.
Option A (intended always-on): remove the guard/comment.
- // Only add ServerParameters if at least one parameter is set - if serverParams != (keepalive.ServerParameters{}) { - serverOptions = append(serverOptions, grpc.KeepaliveParams(serverParams)) - } + serverOptions = append(serverOptions, grpc.KeepaliveParams(serverParams))Option B (opt-in only): only set default when any field explicitly set; otherwise skip KeepaliveParams. Happy to draft this if you prefer.
Also applies to: 77-80
31-37
: Validate non-negative durationsGuard against negative durations (accepted by time.ParseDuration) which lead to surprising behavior at runtime.
Example for Timeout (apply similarly to others as appropriate):
if config.Keepalive.Timeout != "" { timeout, err := time.ParseDuration(config.Keepalive.Timeout) if err != nil { return nil, fmt.Errorf("LoadGrpcConfiguration: failed to parse Keepalive Timeout: %w", err) } + if timeout < 0 { + return nil, fmt.Errorf("LoadGrpcConfiguration: Keepalive Timeout must be >= 0") + } serverParams.Timeout = timeout } else { serverParams.Timeout = defaultGrpcTimeout }If you want stricter validation (e.g., > 0), clarify and I can update all fields accordingly.
Also applies to: 42-48, 51-57, 60-66, 69-75
internal/config/config.go (1)
65-65
: Wrap decode errors with contextAdd function/field context to errors for easier triage.
- if err := yaml.Unmarshal([]byte(rawRouter), &router); err != nil { - return nil, err + if err := yaml.Unmarshal([]byte(rawRouter), &router); err != nil { + return nil, fmt.Errorf("LoadConfiguration: failed to unmarshal router: %w", err) } ... - if err := yaml.UnmarshalStrict([]byte(rawConfig), &config); err != nil { - return nil, err + if err := yaml.UnmarshalStrict([]byte(rawConfig), &config); err != nil { + return nil, fmt.Errorf("LoadConfiguration: failed to unmarshal config: %w", err) }Also applies to: 75-75
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
.github/workflows/e2e.yaml
(1 hunks)cmd/main.go
(3 hunks)deploy/helm/jumpstarter/charts/jumpstarter-controller/model.py
(4 hunks)deploy/helm/jumpstarter/charts/jumpstarter-controller/values.schema.json
(3 hunks)deploy/helm/jumpstarter/values.yaml
(2 hunks)internal/config/config.go
(2 hunks)internal/config/grpc.go
(1 hunks)internal/config/oidc.go
(2 hunks)internal/config/types.go
(3 hunks)internal/controller/exporter_controller.go
(4 hunks)internal/controller/lease_controller_test.go
(4 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- deploy/helm/jumpstarter/values.yaml
- deploy/helm/jumpstarter/charts/jumpstarter-controller/model.py
- deploy/helm/jumpstarter/charts/jumpstarter-controller/values.schema.json
- internal/controller/lease_controller_test.go
🧰 Additional context used
🧬 Code graph analysis (6)
internal/controller/exporter_controller.go (1)
internal/config/types.go (1)
ExporterOptions
(49-52)
cmd/main.go (5)
internal/config/config.go (1)
LoadConfiguration
(44-105)internal/oidc/op.go (1)
Signer
(20-24)internal/config/types.go (3)
ExporterOptions
(49-52)Provisioning
(24-26)Router
(79-79)internal/authentication/bearer.go (1)
NewBearerTokenAuthenticator
(19-21)internal/authorization/basic.go (1)
NewBasicAuthorizer
(21-27)
internal/config/types.go (1)
deploy/helm/jumpstarter/charts/jumpstarter-controller/model.py (5)
Authentication
(215-222)Provisioning
(17-20)Grpc
(65-68)ExporterOptions
(206-212)Router
(289-294)
internal/config/config.go (3)
internal/config/types.go (8)
ConfigurationResult
(93-99)Router
(79-79)Config
(12-17)Authentication
(19-22)Grpc
(32-34)ExporterOptions
(49-52)AuthenticationConfigResult
(87-90)Provisioning
(24-26)internal/config/oidc.go (1)
LoadAuthenticationConfiguration
(17-55)internal/config/grpc.go (1)
LoadGrpcConfiguration
(13-83)
internal/config/grpc.go (2)
internal/config/types.go (2)
Grpc
(32-34)Keepalive
(36-47)deploy/helm/jumpstarter/charts/jumpstarter-controller/model.py (2)
Grpc
(65-68)Keepalive
(29-62)
internal/config/oidc.go (1)
internal/config/types.go (2)
AuthenticationConfigResult
(87-90)Internal
(28-30)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: lint-go
- GitHub Check: tests
- GitHub Check: deploy-kind
- GitHub Check: e2e-tests-release-0-7
- GitHub Check: e2e-tests (ubuntu-24.04)
- GitHub Check: e2e-tests (ubuntu-24.04-arm)
🔇 Additional comments (5)
internal/config/oidc.go (1)
17-55
: Good refactor to structured result and internal JWT wiring.Clear defaulting of prefix, safe CA handling (file vs inline), and cohesive return type.
internal/controller/exporter_controller.go (1)
163-209
: Dynamic offline threshold and requeue logic looks solid.Uses preprocessed timeout, clear condition messages, and capped requeue with safety margin.
Please confirm tests cover:
- negative/zero offlineTimeout rejected by PreprocessConfig (after applying the guard),
- online→offline transition timing near the boundary (within ±safetyMargin).
internal/config/config.go (3)
19-19
: API change to []grpc.ServerOption looks goodReturning a slice aligns with grpc.NewServer(...opts...). LGTM.
59-60
: Router unmarshal is non-strict — confirm this is intentionalUnmarshal (not Strict) allows unknown keys/typos in router. If you want to catch typos early, switch to UnmarshalStrict; if router is intentionally flexible, leave as-is.
I can provide a quick diff either way.
98-104
: Nice consolidation into ConfigurationResultCleaner API and easier to evolve. LGTM.
cec95ca
to
31165c8
Compare
Summary by CodeRabbit
New Features
Refactor
Chores
Tests