
Conversation

@doxxx93 (Contributor) commented Oct 23, 2025

Motivation

Addresses #1837

The predicate cache grows unbounded over the controller's lifetime. This becomes problematic for resources with auto-generated names (like Pods) or resources recreated with different UIDs (following #1830). Long-running controllers will accumulate entries and consume unnecessary memory.

Solution

Added configurable TTL for the predicate cache:

  • Introduced Config struct with ttl: Duration (default: 1 hour)
  • Cache entries now store hash + last_seen timestamp
  • Expired entries evicted automatically during polling
  • Updated API: predicate_filter(predicate, config) - use Default::default() for the default behavior (usage sketch after this list)
  • ~130 lines added (including test code), kept inline in the predicate module as suggested
  • Added test for TTL expiration, all existing tests pass
  • Updated examples and documentation
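
For reference, a minimal usage sketch of the updated call site (modeled on the existing pod_watcher example; beyond the two-argument predicate_filter and Default::default() described above, anything else about constructing the new predicate-cache Config is an assumption, not final API):

use futures::prelude::*;
use k8s_openapi::api::core::v1::Pod;
use kube::{
    Client,
    api::{Api, ResourceExt},
    runtime::{WatchStreamExt, predicates, watcher},
};
use tracing::*;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    tracing_subscriber::fmt::init();
    let client = Client::try_default().await?;
    let pods = Api::<Pod>::default_namespaced(client);

    watcher(pods, watcher::Config::default())
        .applied_objects()
        // Default::default() gives the default predicate-cache behaviour
        // (1 hour TTL); a Config with a custom ttl would be passed here instead.
        .predicate_filter(predicates::generation, Default::default())
        .try_for_each(|p| async move {
            info!("saw spec change for {}", p.name_any());
            Ok(())
        })
        .await?;
    Ok(())
}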

Questions

@clux Before finalizing:

  1. Default TTL: 1 hour appropriate, or prefer different duration?
  2. Eviction timing: Every poll (current) vs. periodic vs. lazy?
  3. TTL semantics: Time-since-last-seen (current) vs. time-since-first-insertion? (See the sketch after this list.)
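
To make questions 2 and 3 concrete, here is a simplified model of what the current behaviour does (an illustrative sketch for discussion only, not the actual patch; the type and method names are made up for this comment):

use std::collections::HashMap;
use std::collections::hash_map::Entry;
use std::time::{Duration, Instant};

// One entry per object, keyed by an object reference string.
struct CacheEntry {
    hash: u64,
    last_seen: Instant,
}

struct PredicateCache {
    ttl: Duration,
    entries: HashMap<String, CacheEntry>,
}

impl PredicateCache {
    // Called for every object the stream yields: refresh last_seen, report
    // whether the predicate hash changed, and evict expired entries.
    fn observe(&mut self, key: String, hash: u64) -> bool {
        let now = Instant::now();
        // Time-since-last-seen semantics: seeing an object resets its clock.
        let changed = match self.entries.entry(key) {
            Entry::Occupied(mut o) => {
                let e = o.get_mut();
                let changed = e.hash != hash;
                e.hash = hash;
                e.last_seen = now;
                changed
            }
            Entry::Vacant(v) => {
                v.insert(CacheEntry { hash, last_seen: now });
                true
            }
        };
        // Eviction on every poll: drop anything not seen within the TTL.
        let ttl = self.ttl;
        self.entries.retain(|_, e| now.duration_since(e.last_seen) < ttl);
        changed
    }
}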

Happy to adjust based on your preferences.

codecov bot commented Oct 23, 2025

Codecov Report

❌ Patch coverage is 87.80488% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.7%. Comparing base (7c63f56) to head (dbd0c51).

Files with missing lines Patch % Lines
kube-runtime/src/utils/watch_ext.rs 0.0% 3 Missing ⚠️
kube-runtime/src/utils/predicate.rs 94.8% 2 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##            main   #1838     +/-   ##
=======================================
+ Coverage   74.5%   74.7%   +0.2%     
=======================================
  Files         84      84             
  Lines       7877    7910     +33     
=======================================
+ Hits        5867    5902     +35     
+ Misses      2010    2008      -2     
Files with missing lines Coverage Δ
kube-runtime/src/controller/mod.rs 31.6% <ø> (ø)
kube-runtime/src/utils/mod.rs 63.5% <ø> (ø)
kube-runtime/src/utils/predicate.rs 82.5% <94.8%> (+4.7%) ⬆️
kube-runtime/src/utils/watch_ext.rs 21.1% <0.0%> (ø)

... and 3 files with indirect coverage changes

@clux linked an issue Oct 24, 2025 that may be closed by this pull request

@clux (Member) left a comment

Thanks a lot. I have some initial comments for now. Need to verify something regarding the default value, but in theory this is exactly what I want 😄

Comment on lines +131 to +132
// Default to 1 hour TTL - long enough to avoid unnecessary reconciles
// but short enough to prevent unbounded memory growth
Member

I am a bit uncertain about this value still. Because we bump the last_seen timestamp every time we see an object, if we see the object again during relists every 5m, then this setup (and value) works great because we will in practice never see the same object again unless the hash/uid changes.

However, I am noticing some differences in behaviour with long watches with and without using .streaming_lists(), and would like to confirm whether the behavior is correct there first.

@doxxx93 (Contributor Author) Oct 25, 2025

Good point. Since we update last_seen on every encounter, objects that reappear during periodic relists will keep their timestamps fresh and won't expire.

Regarding the streaming_lists() behavior - are you planning to verify this yourself, or would you like me to test both strategies (ListWatch vs StreamingList) to see how frequently objects are re-encountered? I can adjust the default TTL based on the findings.

@doxxx93 (Contributor Author)

If you'd like me to test this, I'm thinking of something like:

use std::collections::HashMap;
use std::time::Instant;
use futures::{TryStreamExt, pin_mut};

// Track when we see each pod to measure relist frequency
let mut seen_times: HashMap<String, Vec<Instant>> = HashMap::new();

// try_for_each can't mutably borrow seen_times from inside its closure,
// so drive the stream with a while-let loop instead.
let stream = watcher(api, watcher_config)
    .applied_objects()
    .predicate_filter(predicates::generation, Default::default());
pin_mut!(stream);

while let Some(p) = stream.try_next().await? {
    let name = p.name_any();
    let now = Instant::now();

    if let Some(last_seen) = seen_times.get(&name).and_then(|v| v.last()) {
        let elapsed = now.duration_since(*last_seen);
        info!("Pod {} re-encountered after {:?}", name, elapsed);
    }
    seen_times.entry(name).or_default().push(now);
}

Run this for ~10 minutes with both Config::default() (ListWatch) and Config::default().streaming_lists() to compare how often objects are re-sent.

Expected behavior (my assumption):

  • ListWatch: Objects re-appear during periodic relists (every ~5min?) → last_seen keeps updating → 1 hour TTL means entries effectively never expire for active resources
  • StreamingList: Objects might only appear once unless changed → last_seen doesn't update → TTL actually matters for eviction

I'm not entirely certain about StreamingList's relist behavior, though, so please let me know if this test approach makes sense or if you have better insights on this.

Member

Yeah, I was going to have a look at it. My approach is similar. I don't think we even need to include predicates to see it. Running this modified example of pod_watcher;

use futures::prelude::*;
use kube::{
    Client,
    api::{Api, ResourceExt},
    runtime::{WatchStreamExt, watcher},
};
use tracing::*;

type X = k8s_openapi::api::networking::v1::Ingress;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    tracing_subscriber::fmt::init();
    let client = Client::try_default().await?;
    let api = Api::<X>::default_namespaced(client);
    let use_watchlist = std::env::var("WATCHLIST").map(|s| s == "1").unwrap_or(false);
    let wc = if use_watchlist {
        // requires WatchList feature gate on 1.27 or later
        watcher::Config::default().streaming_lists()
    } else {
        watcher::Config::default()
    };

    watcher(api, wc)
        .applied_objects()
        .default_backoff()
        .try_for_each(|p| async move {
            info!("saw {}", p.name_any());
            Ok(())
        })
        .await?;
    Ok(())
}

and seeing re-updates from streaming_lists;

WATCHLIST=1 cargo run --example pod_watcher
    Blocking waiting for file lock on build directory
   Compiling kube-examples v2.0.1 (/home/clux/kube/kube/examples)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 3.44s
     Running `/home/clux/kube/kube/target/debug/examples/pod_watcher`
2025-10-25T19:51:33.102534Z  INFO pod_watcher: saw five-e
2025-10-25T19:56:23.104634Z  INFO pod_watcher: saw five-e
2025-10-25T20:01:13.106317Z  INFO pod_watcher: saw five-e
2025-10-25T20:06:03.107981Z  INFO pod_watcher: saw five-e

but NOT for the old way;

cargo run --example pod_watcher
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.12s
     Running `/home/clux/kube/kube/target/debug/examples/pod_watcher`
2025-10-25T20:06:58.630438Z  INFO pod_watcher: saw five-e
... no further updates

Unfortunately, I am not sure if this is a bug yet.

@clux added the changelog-change label Oct 24, 2025
@clux added this to the 3.0.0 milestone Oct 24, 2025
@doxxx93 (Contributor Author) commented Oct 31, 2025

Hi @clux, just checking in on this - how's the investigation into the watch behavior going?

Labels

changelog-change (changelog change category for prs)

Development

Successfully merging this pull request may close these issues.

Add a TTL on the runtime predicate cache

2 participants