Decoupling Cache and Eviction Strategies #18405

@abhita

Currently, for a cache such as DefaultFilesMetadataCache, the storage and retrieval mechanisms are tightly coupled with the eviction strategy:

struct DefaultFilesMetadataCacheState {
    lru_queue: LruQueue<Path, (ObjectMeta, Arc<dyn FileMetadata>)>,
    memory_limit: usize,
    memory_used: usize,
    cache_hits: HashMap<Path, usize>,
}

Any change in eviction policy therefore requires implementing a new data structure with its own implementation of the cache access methods.

Proposed Flow:

Instead, we can decouple the cache data structure from the eviction strategy with something like the following:

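pub struct CustomMetadataCache {
    /// Entry storage (DashMap-based, already thread-safe).
    /// Key and value types mirror DefaultFilesMetadataCacheState above,
    /// since FileMetadata is used there as a trait object.
    inner_cache: DashMap<Path, (ObjectMeta, Arc<dyn FileMetadata>)>,
    /// The eviction policy (thread-safe)
    eviction_strategy: Arc<Mutex<Box<dyn EvictionStrategy>>>,
    // ...
}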

This would be accompanied by a pluggable eviction strategy that listens to events from the cache storage and selects entries for eviction accordingly:

// Core trait for cache eviction strategy
pub trait EvictionStrategy: Send + Sync {
    /// Called when a cache entry is accessed
    fn on_access(&mut self, key: &str, size: usize);

    /// Called when a cache entry is inserted
    fn on_insert(&mut self, key: &str, size: usize);

    /// Called when a cache entry is removed
    fn on_remove(&mut self, key: &str);

    /// Select entries for eviction to reach target size
    /// Returns keys to evict, ordered by eviction priority
    fn select_for_eviction(&self, target_size: usize) -> Vec<String>;

    /// Reset policy state
    fn clear(&mut self);

    /// Get the name of this strategy
    fn strategy_name(&self) -> &'static str;
}
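
To make the contract concrete, here is a minimal sketch of what an LRU policy behind this trait might look like. `LruStrategy` and its fields are hypothetical names, not existing DataFusion types. The issue does not pin down what `target_size` means, so the sketch assumes it is the total tracked size the cache should shrink to.

```rust
use std::collections::HashMap;

// The proposed trait, repeated so this sketch compiles on its own.
pub trait EvictionStrategy: Send + Sync {
    fn on_access(&mut self, key: &str, size: usize);
    fn on_insert(&mut self, key: &str, size: usize);
    fn on_remove(&mut self, key: &str);
    fn select_for_eviction(&self, target_size: usize) -> Vec<String>;
    fn clear(&mut self);
    fn strategy_name(&self) -> &'static str;
}

/// Hypothetical LRU policy: remembers a logical timestamp and a size per key.
#[derive(Default)]
pub struct LruStrategy {
    tick: u64,
    entries: HashMap<String, (u64, usize)>, // key -> (last_used, size)
}

impl LruStrategy {
    /// Marks `key` as the most recently used entry.
    fn touch(&mut self, key: &str, size: usize) {
        self.tick += 1;
        self.entries.insert(key.to_owned(), (self.tick, size));
    }
}

impl EvictionStrategy for LruStrategy {
    fn on_access(&mut self, key: &str, size: usize) {
        self.touch(key, size);
    }

    fn on_insert(&mut self, key: &str, size: usize) {
        self.touch(key, size);
    }

    fn on_remove(&mut self, key: &str) {
        self.entries.remove(key);
    }

    fn select_for_eviction(&self, target_size: usize) -> Vec<String> {
        // Assumption: `target_size` is the total size to shrink down to.
        let mut used: usize = self.entries.values().map(|&(_, size)| size).sum();
        let mut by_age: Vec<(&String, u64, usize)> = self
            .entries
            .iter()
            .map(|(key, &(last_used, size))| (key, last_used, size))
            .collect();
        by_age.sort_by_key(|&(_, last_used, _)| last_used); // oldest first

        let mut victims = Vec::new();
        for (key, _, size) in by_age {
            if used <= target_size {
                break;
            }
            used -= size;
            victims.push(key.clone());
        }
        victims
    }

    fn clear(&mut self) {
        self.entries.clear();
        self.tick = 0;
    }

    fn strategy_name(&self) -> &'static str {
        "lru"
    }
}
```

Sorting on every call keeps the sketch short. A real implementation would more likely keep an ordered structure so that selection does not cost O(n log n).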

Benefits

  • Separation of Concerns: Cache storage logic is independent of eviction policy
  • Strategy Hot-Swapping: Replace the eviction strategy behind the trait object without changing the cache implementation
  • Multiple Implementations: LRU, LFU, FIFO, ARC, LIRS, or custom strategies can all be implemented against the same trait
  • Per-Cache Policies and Code Re-usability: Different cache instances can use different strategies
  • Reduced Duplication: Eliminate duplicated cache access code across implementations
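
As a rough illustration of the separation of concerns, the sketch below wires a cache to a boxed strategy. Everything here is an assumption for illustration: `MetadataCache` and `FifoStrategy` are hypothetical names, a `Mutex<HashMap<..>>` from the standard library stands in for the DashMap so the example has no external dependencies, values are plain byte vectors, and `target_size` is again taken to mean the total size to shrink to.

```rust
use std::collections::{HashMap, VecDeque};
use std::sync::Mutex;

// The proposed trait, repeated so this sketch compiles on its own.
pub trait EvictionStrategy: Send + Sync {
    fn on_access(&mut self, key: &str, size: usize);
    fn on_insert(&mut self, key: &str, size: usize);
    fn on_remove(&mut self, key: &str);
    fn select_for_eviction(&self, target_size: usize) -> Vec<String>;
    fn clear(&mut self);
    fn strategy_name(&self) -> &'static str;
}

/// Hypothetical FIFO policy: evicts in insertion order and ignores reads.
#[derive(Default)]
pub struct FifoStrategy {
    order: VecDeque<(String, usize)>, // (key, size), oldest first
}

impl EvictionStrategy for FifoStrategy {
    fn on_access(&mut self, _key: &str, _size: usize) {}
    fn on_insert(&mut self, key: &str, size: usize) {
        self.on_remove(key); // re-inserting a key moves it to the back
        self.order.push_back((key.to_owned(), size));
    }
    fn on_remove(&mut self, key: &str) {
        self.order.retain(|(k, _)| k.as_str() != key);
    }
    fn select_for_eviction(&self, target_size: usize) -> Vec<String> {
        let mut used: usize = self.order.iter().map(|(_, size)| *size).sum();
        let mut victims = Vec::new();
        for (key, size) in &self.order {
            if used <= target_size {
                break;
            }
            used -= *size;
            victims.push(key.clone());
        }
        victims
    }
    fn clear(&mut self) {
        self.order.clear();
    }
    fn strategy_name(&self) -> &'static str {
        "fifo"
    }
}

/// Hypothetical cache: the storage code never inspects eviction order.
pub struct MetadataCache {
    entries: Mutex<HashMap<String, Vec<u8>>>, // stand-in for the DashMap
    strategy: Mutex<Box<dyn EvictionStrategy>>,
    memory_limit: usize,
}

impl MetadataCache {
    pub fn new(memory_limit: usize, strategy: Box<dyn EvictionStrategy>) -> Self {
        Self {
            entries: Mutex::new(HashMap::new()),
            strategy: Mutex::new(strategy),
            memory_limit,
        }
    }

    pub fn get(&self, key: &str) -> Option<Vec<u8>> {
        let value = self.entries.lock().unwrap().get(key).cloned()?;
        self.strategy.lock().unwrap().on_access(key, value.len());
        Some(value)
    }

    pub fn put(&self, key: &str, value: Vec<u8>) {
        let mut entries = self.entries.lock().unwrap();
        let mut strategy = self.strategy.lock().unwrap();
        strategy.on_insert(key, value.len());
        entries.insert(key.to_owned(), value);
        // Ask the policy which keys to drop to get back under the limit.
        for victim in strategy.select_for_eviction(self.memory_limit) {
            entries.remove(&victim);
            strategy.on_remove(&victim);
        }
    }

    pub fn strategy_name(&self) -> &'static str {
        self.strategy.lock().unwrap().strategy_name()
    }
}
```

With this shape, `MetadataCache::new(limit, Box::new(FifoStrategy::default()))` and a second instance built with a different boxed strategy would share all of the `get`/`put` code, which is the per-cache policy benefit listed above.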

@alamb @nuno-faria
