Description
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
In low-latency parquet-based query applications, it is important to be able to cache / reuse the `ParquetMetaData` from parquet files (supplying it via `ArrowReaderBuilder::new_with_metadata` instead of re-reading / re-parsing it from the parquet footer each time the data is read).
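For illustration, here is a minimal sketch of that caching pattern with the existing API (the file name and the surrounding `main` are placeholders; in a real system the `ArrowReaderMetadata` would live in a cache keyed by file):

```rust
use std::fs::File;
use parquet::arrow::arrow_reader::{
    ArrowReaderMetadata, ArrowReaderOptions, ParquetRecordBatchReaderBuilder,
};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Parse the footer once, then keep the result (e.g. in a cache keyed by file)
    let file = File::open("data.parquet")?;
    let cached = ArrowReaderMetadata::load(&file, ArrowReaderOptions::new())?;

    // Later reads reuse the cached metadata instead of re-parsing the footer
    let file = File::open("data.parquet")?;
    let reader = ParquetRecordBatchReaderBuilder::new_with_metadata(file, cached.clone())
        .build()?;
    for batch in reader {
        let _batch = batch?; // process each RecordBatch here
    }
    Ok(())
}
```

Because `ArrowReaderMetadata` is `Clone`, one parsed footer can serve many readers.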
For many such systems (including InfluxDB 3.0), many of the files have the same schema, so storing the same schema information for each parquet file is wasteful.
Describe the solution you'd like
I would like a way to share the `SchemaDescPtr` -- the schema is already wrapped in an `Arc`, so it is likely possible to avoid storing the same schema over and over again:
https://docs.rs/parquet/latest/src/parquet/file/metadata.rs.html#197
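For context, `SchemaDescPtr` is defined in `parquet::schema::types` as `Arc<SchemaDescriptor>`, so sharing is a matter of pointing multiple `ParquetMetaData` instances at one allocation. A small sketch (the `shares_schema` helper is hypothetical) of how such sharing could be verified:

```rust
use std::sync::Arc;
use parquet::file::metadata::ParquetMetaData;

// `SchemaDescPtr` is an alias for `Arc<SchemaDescriptor>`, so two
// ParquetMetaData instances can in principle point at one allocation.
// Whether they actually do can be checked with pointer equality:
fn shares_schema(a: &ParquetMetaData, b: &ParquetMetaData) -> bool {
    Arc::ptr_eq(
        &a.file_metadata().schema_descr_ptr(),
        &b.file_metadata().schema_descr_ptr(),
    )
}
```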
Describe alternatives you've considered
Perhaps we could add an API like `with_schema` to `ParquetMetaData`:

```rust
impl ParquetMetaData {
    ...
    /// Set the internal schema pointer
    fn with_schema(self, schema_descr: SchemaDescPtr) -> Self {
        ..
    }
    ...
}
```

It could be used like this:
```rust
let mut metadata: ParquetMetaData = ...; // load metadata from a parquet file

// Check if we already have the same schema loaded
if let Some(existing_schema) = find_existing_schema(&catalog, &metadata) {
    // if so, use the existing schema
    metadata = metadata.with_schema(existing_schema);
}
```
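The `catalog` and `find_existing_schema` above are hypothetical. As a sketch of what they might look like, one could intern schemas keyed by their printed representation (all names here are illustrative, not proposed API):

```rust
use std::collections::HashMap;
use parquet::file::metadata::ParquetMetaData;
use parquet::schema::types::SchemaDescPtr;

/// Hypothetical catalog that interns one `SchemaDescPtr` per distinct schema
#[derive(Default)]
struct Catalog {
    schemas: HashMap<String, SchemaDescPtr>,
}

/// Look up a previously seen schema equal to the one in `metadata`
fn find_existing_schema(catalog: &Catalog, metadata: &ParquetMetaData) -> Option<SchemaDescPtr> {
    // Keyed by the schema's debug representation for simplicity; a real
    // system might key by table id or a hash of the serialized schema
    let key = format!("{:?}", metadata.file_metadata().schema_descr().root_schema());
    catalog.schemas.get(&key).cloned()
}

/// Record the schema from `metadata` so later files can share it
fn remember_schema(catalog: &mut Catalog, metadata: &ParquetMetaData) {
    let key = format!("{:?}", metadata.file_metadata().schema_descr().root_schema());
    catalog
        .schemas
        .entry(key)
        .or_insert_with(|| metadata.file_metadata().schema_descr_ptr());
}
```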
Additional context
This infrastructure is a natural follow-on to #1729 to track the memory used.
This API would likely be tricky to implement given there are several references to the schema in `ParquetMetaData`'s child fields (e.g. https://docs.rs/parquet/latest/src/parquet/file/metadata.rs.html#299).
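To make the difficulty concrete, here is a toy mirror of the pointer layout (struct and field names are illustrative only, not the crate's actual private fields): a correct `with_schema` would have to swap every copy of the pointer, including the ones held by each row group's metadata.

```rust
use std::sync::Arc;

// Toy mirror of the structure: the schema Arc is referenced from the
// file-level metadata *and* from every row group, so `with_schema` must
// swap every pointer consistently, not just the one in FileMeta.
struct Schema;
struct FileMeta {
    schema: Arc<Schema>,
}
struct RowGroupMeta {
    schema: Arc<Schema>, // each row group holds its own pointer
}
struct Meta {
    file: FileMeta,
    row_groups: Vec<RowGroupMeta>,
}

impl Meta {
    fn with_schema(mut self, schema: Arc<Schema>) -> Self {
        self.file.schema = Arc::clone(&schema);
        for rg in &mut self.row_groups {
            rg.schema = Arc::clone(&schema);
        }
        self
    }
}
```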