
Conversation

@doxxx93 (Contributor) commented Oct 23, 2025

Motivation

Addresses #1837

The predicate cache grows unbounded over the controller's lifetime. This becomes problematic for resources with auto-generated names (like Pods) or resources recreated with different UIDs (following #1830). Long-running controllers will accumulate entries and consume unnecessary memory.

Solution

Added configurable TTL for the predicate cache:

  • Introduced Config struct with ttl: Duration (default: 1 hour)
  • Cache entries now store hash + last_seen timestamp
  • Expired entries evicted automatically during polling
  • Updated API: predicate_filter(predicate, config) - use Default::default() for the default behavior (usage sketch after this list)
  • ~130 lines added (including test code), kept inline in the predicate module as suggested
  • Added test for TTL expiration, all existing tests pass
  • Updated examples and documentation
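
For reference, a minimal usage sketch of the updated call site (modeled on the existing pod_watcher example; beyond the two-argument predicate_filter and Default::default() described above, anything else about constructing the new predicate-cache Config is an assumption, not final API):

use futures::prelude::*;
use k8s_openapi::api::core::v1::Pod;
use kube::{
    Client,
    api::{Api, ResourceExt},
    runtime::{WatchStreamExt, predicates, watcher},
};
use tracing::*;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    tracing_subscriber::fmt::init();
    let client = Client::try_default().await?;
    let pods = Api::<Pod>::default_namespaced(client);

    watcher(pods, watcher::Config::default())
        .applied_objects()
        // Default::default() gives the default predicate-cache behaviour
        // (1 hour TTL); a Config with a custom ttl would be passed here instead.
        .predicate_filter(predicates::generation, Default::default())
        .try_for_each(|p| async move {
            info!("saw spec change for {}", p.name_any());
            Ok(())
        })
        .await?;
    Ok(())
}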

Questions

@clux Before finalizing:

  1. Default TTL: 1 hour appropriate, or prefer different duration?
  2. Eviction timing: Every poll (current) vs. periodic vs. lazy?
  3. TTL semantics: Time-since-last-seen (current) vs. time-since-first-insertion? (See the sketch after this list.)
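
To make questions 2 and 3 concrete, here is a simplified model of what the current behaviour does (an illustrative sketch for discussion only, not the actual patch; the type and method names are made up for this comment):

use std::collections::HashMap;
use std::collections::hash_map::Entry;
use std::time::{Duration, Instant};

// One entry per object, keyed by an object reference string.
struct CacheEntry {
    hash: u64,
    last_seen: Instant,
}

struct PredicateCache {
    ttl: Duration,
    entries: HashMap<String, CacheEntry>,
}

impl PredicateCache {
    // Called for every object the stream yields: refresh last_seen, report
    // whether the predicate hash changed, and evict expired entries.
    fn observe(&mut self, key: String, hash: u64) -> bool {
        let now = Instant::now();
        // Time-since-last-seen semantics: seeing an object resets its clock.
        let changed = match self.entries.entry(key) {
            Entry::Occupied(mut o) => {
                let e = o.get_mut();
                let changed = e.hash != hash;
                e.hash = hash;
                e.last_seen = now;
                changed
            }
            Entry::Vacant(v) => {
                v.insert(CacheEntry { hash, last_seen: now });
                true
            }
        };
        // Eviction on every poll: drop anything not seen within the TTL.
        let ttl = self.ttl;
        self.entries.retain(|_, e| now.duration_since(e.last_seen) < ttl);
        changed
    }
}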

Happy to adjust based on your preferences.

codecov bot commented Oct 23, 2025

Codecov Report

❌ Patch coverage is 87.80488% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.7%. Comparing base (7c63f56) to head (dbd0c51).

Files with missing lines Patch % Lines
kube-runtime/src/utils/watch_ext.rs 0.0% 3 Missing ⚠️
kube-runtime/src/utils/predicate.rs 94.8% 2 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##            main   #1838     +/-   ##
=======================================
+ Coverage   74.5%   74.7%   +0.2%     
=======================================
  Files         84      84             
  Lines       7877    7910     +33     
=======================================
+ Hits        5867    5902     +35     
+ Misses      2010    2008      -2     
Files with missing lines Coverage Δ
kube-runtime/src/controller/mod.rs 31.6% <ø> (ø)
kube-runtime/src/utils/mod.rs 63.5% <ø> (ø)
kube-runtime/src/utils/predicate.rs 82.5% <94.8%> (+4.7%) ⬆️
kube-runtime/src/utils/watch_ext.rs 21.1% <0.0%> (ø)

... and 3 files with indirect coverage changes

@clux linked an issue Oct 24, 2025 that may be closed by this pull request

@clux (Member) left a comment

Thanks a lot. I have some initial comments for now. Need to verify something regarding the default value, but in theory this is exactly what I want 😄

Comment on lines +131 to +132
// Default to 1 hour TTL - long enough to avoid unnecessary reconciles
// but short enough to prevent unbounded memory growth
Member

I am a bit uncertain about this value still. Because we bump the last_seen timestamp every time we see an object, if we see the object again during relists every 5m, then this setup (and value) works great because we will in practice never see the same object again unless the hash/uid changes.

However, I am noticing some differences in behaviour with long watches with and without using .streaming_lists(), and would like to confirm whether the behavior is correct there first.

@doxxx93 (Contributor Author) Oct 25, 2025

Good point. Since we update last_seen on every encounter, objects that reappear during periodic relists will keep their timestamps fresh and won't expire.

Regarding the streaming_lists() behavior - are you planning to verify this yourself, or would you like me to test both strategies (ListWatch vs StreamingList) to see how frequently objects are re-encountered? I can adjust the default TTL based on the findings.

@doxxx93 (Contributor Author)

If you'd like me to test this, I'm thinking of something like:

use std::collections::HashMap;
use std::time::Instant;
use futures::{TryStreamExt, pin_mut};

// Track when we see each pod to measure relist frequency
let mut seen_times: HashMap<String, Vec<Instant>> = HashMap::new();

// try_for_each can't mutably borrow seen_times from inside its closure,
// so drive the stream with a while-let loop instead.
let stream = watcher(api, watcher_config)
    .applied_objects()
    .predicate_filter(predicates::generation, Default::default());
pin_mut!(stream);

while let Some(p) = stream.try_next().await? {
    let name = p.name_any();
    let now = Instant::now();

    if let Some(last_seen) = seen_times.get(&name).and_then(|v| v.last()) {
        let elapsed = now.duration_since(*last_seen);
        info!("Pod {} re-encountered after {:?}", name, elapsed);
    }
    seen_times.entry(name).or_default().push(now);
}

Run this for ~10 minutes with both Config::default() (ListWatch) and Config::default().streaming_lists() to compare how often objects are re-sent.

Expected behavior (my assumption):

  • ListWatch: Objects re-appear during periodic relists (every ~5min?) → last_seen keeps updating → 1 hour TTL means entries effectively never expire for active resources
  • StreamingList: Objects might only appear once unless changed → last_seen doesn't update → TTL actually matters for eviction

I'm not entirely certain about StreamingList's relist behavior, though, so please let me know if this test approach makes sense or if you have better insights on this.

Member

Yeah, I was going to have a look at it. My approach is similar. I don't think we even need to include predicates to see it. Running this modified example of pod_watcher;

use futures::prelude::*;
use kube::{
    Client,
    api::{Api, ResourceExt},
    runtime::{WatchStreamExt, watcher},
};
use tracing::*;

type X = k8s_openapi::api::networking::v1::Ingress;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    tracing_subscriber::fmt::init();
    let client = Client::try_default().await?;
    let api = Api::<X>::default_namespaced(client);
    let use_watchlist = std::env::var("WATCHLIST").map(|s| s == "1").unwrap_or(false);
    let wc = if use_watchlist {
        // requires WatchList feature gate on 1.27 or later
        watcher::Config::default().streaming_lists()
    } else {
        watcher::Config::default()
    };

    watcher(api, wc)
        .applied_objects()
        .default_backoff()
        .try_for_each(|p| async move {
            info!("saw {}", p.name_any());
            Ok(())
        })
        .await?;
    Ok(())
}

and seeing re-updates from streaming_lists;

WATCHLIST=1 cargo run --example pod_watcher
    Blocking waiting for file lock on build directory
   Compiling kube-examples v2.0.1 (/home/clux/kube/kube/examples)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 3.44s
     Running `/home/clux/kube/kube/target/debug/examples/pod_watcher`
2025-10-25T19:51:33.102534Z  INFO pod_watcher: saw five-e
2025-10-25T19:56:23.104634Z  INFO pod_watcher: saw five-e
2025-10-25T20:01:13.106317Z  INFO pod_watcher: saw five-e
2025-10-25T20:06:03.107981Z  INFO pod_watcher: saw five-e

but NOT for the old way;

cargo run --example pod_watcher
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.12s
     Running `/home/clux/kube/kube/target/debug/examples/pod_watcher`
2025-10-25T20:06:58.630438Z  INFO pod_watcher: saw five-e
... no further updates

Unfortunately, I am not sure if this is a bug yet.

@clux added the changelog-change label Oct 24, 2025
@clux added this to the 3.0.0 milestone Oct 24, 2025
@doxxx93 (Contributor Author) commented Oct 31, 2025

Hi @clux, just checking in on this - how's the investigation into the watch behavior going?

Labels

changelog-change (changelog change category for prs)

Development

Successfully merging this pull request may close these issues.

Add a TTL on the runtime predicate cache

2 participants