Add memory size estimation for ParquetMetadata #1729

@alamb

Description

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
In https://github.com/influxdata/influxdb_iox, caching is very important to performance. Part of what is cached in memory is a ParquetMetadata structure. To cache this data effectively (and free it under memory pressure), we need an accurate estimate of the heap memory it uses.

Describe the solution you'd like
I would like a function such as the following that accurately estimates the memory usage of parquet metadata:

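impl ParquetMetadata {
    // ...

    /// In-memory size in bytes, including `self`.
    pub fn size(&self) -> usize {
        // Recursively sum the heap usage of every nested structure,
        // plus `std::mem::size_of::<Self>()` for `self` itself.
        todo!()
    }

    // ...
}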
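
To make the "recursive" part concrete, here is a minimal sketch of one way such an estimate could be structured. Everything in it is an assumption for illustration: the `HeapSize` trait and the `FileMeta`, `RowGroupMeta`, and `ColumnMeta` types are hypothetical stand-ins, not the real parquet structures or an existing API. Each type reports only the heap bytes it owns, and the top-level `size()` adds `size_of::<Self>()` once.

```rust
use std::mem::size_of;

/// Hypothetical trait: heap bytes owned by a value, NOT counting
/// `size_of::<Self>()` (the parent container accounts for that).
trait HeapSize {
    fn heap_size(&self) -> usize;
}

impl HeapSize for String {
    fn heap_size(&self) -> usize {
        // Count the allocated buffer (capacity), not just the used length.
        self.capacity()
    }
}

impl<T: HeapSize> HeapSize for Vec<T> {
    fn heap_size(&self) -> usize {
        // The Vec's own buffer, plus whatever each element owns on the heap.
        self.capacity() * size_of::<T>()
            + self.iter().map(|item| item.heap_size()).sum::<usize>()
    }
}

// Stand-in types for illustration only; the real metadata has many more fields.
#[allow(dead_code)]
struct ColumnMeta {
    path: String,
    num_values: i64,
}

#[allow(dead_code)]
struct RowGroupMeta {
    columns: Vec<ColumnMeta>,
    num_rows: i64,
}

struct FileMeta {
    created_by: String,
    row_groups: Vec<RowGroupMeta>,
}

impl HeapSize for ColumnMeta {
    fn heap_size(&self) -> usize {
        self.path.heap_size()
    }
}

impl HeapSize for RowGroupMeta {
    fn heap_size(&self) -> usize {
        self.columns.heap_size()
    }
}

impl HeapSize for FileMeta {
    fn heap_size(&self) -> usize {
        self.created_by.heap_size() + self.row_groups.heap_size()
    }
}

impl FileMeta {
    /// In-memory size in bytes, including `self`.
    pub fn size(&self) -> usize {
        size_of::<Self>() + self.heap_size()
    }
}
```

Splitting "heap owned" from "size of self" avoids double counting: a `Vec<T>` already includes `size_of::<T>()` for each slot in its buffer, so elements only add what they own beyond that. The result is still an estimate, since it ignores allocator overhead and any sharing through `Arc`.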

Describe alternatives you've considered
I believe @domodwyer is considering caching only the parts that we need (rather than the ParquetMetadata object itself), in which case we would likely not need this feature in IOx. I think it would still be generally helpful, though.

Additional context
See https://github.com/influxdata/influxdb_iox/pull/4661 for an example of the use case.

Labels

enhancement: Any new improvement worthy of an entry in the changelog
