Consider adding BloomFilter reading support to ParquetMetadataReader #6514

@alamb

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

Parquet now has the wonderful ParquetMetaDataReader structure from @adriangb and @etseidl

This handles reading the footer metadata as well as the page indexes.

@progval noted in #6505 (comment) that BloomFilters are similar to the PageIndex, but are not currently read/written by the ParquetMetaDataReader.

Describe the solution you'd like
I would like to be able to configure the ParquetMetaDataReader (and writer) to read BloomFilters as well

Describe alternatives you've considered
This might look something like

// Read Parquet metadata, including bloom filters.
// `open_parquet_file` is a placeholder helper; `with_bloom_filters` is the
// proposed option and does not exist yet.
let file = open_parquet_file("some_path.parquet");
let mut reader = ParquetMetaDataReader::new()
    .with_bloom_filters(true);
reader.try_parse(&file).unwrap();
let metadata = reader.finish().unwrap();
// Somehow get access to the bloom filters (not sure what that API would look like)
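
To make the open question about the accessor more concrete, here is one possible shape, sketched with self-contained stand-in types rather than the real parquet crate. Everything in it is an assumption for illustration: `MetaDataReaderSketch`, `MetaDataSketch`, `BloomFilterSketch`, `FileSketch`, and the `bloom_filter(row_group, column)` accessor are hypothetical names, and the "file" is just an in-memory struct. The idea is that the finished metadata holds an optional `[row_group][column]` table of filters, which stays `None` unless `with_bloom_filters(true)` was set.

```rust
// Hypothetical stand-ins only; none of these types exist in the parquet crate.

/// Stand-in for a decoded bloom filter (just the raw bitset here).
#[derive(Debug, Clone, PartialEq)]
struct BloomFilterSketch {
    bitset: Vec<u8>,
}

/// Stand-in for a column chunk that may or may not carry a bloom filter.
struct ColumnChunkSketch {
    bloom_filter_bytes: Option<Vec<u8>>,
}

struct RowGroupSketch {
    columns: Vec<ColumnChunkSketch>,
}

/// Stand-in for an opened Parquet file.
struct FileSketch {
    row_groups: Vec<RowGroupSketch>,
}

/// Bloom filters indexed as [row_group][column].
type BloomFilterTable = Vec<Vec<Option<BloomFilterSketch>>>;

struct MetaDataSketch {
    num_row_groups: usize,
    /// `None` when bloom filters were not requested.
    bloom_filters: Option<BloomFilterTable>,
}

impl MetaDataSketch {
    /// One possible accessor: returns `None` if filters were not loaded,
    /// the indexes are out of range, or the column has no filter.
    fn bloom_filter(&self, row_group: usize, column: usize) -> Option<&BloomFilterSketch> {
        self.bloom_filters.as_ref()?.get(row_group)?.get(column)?.as_ref()
    }
}

#[derive(Default)]
struct MetaDataReaderSketch {
    bloom_filters: bool,
    metadata: Option<MetaDataSketch>,
}

impl MetaDataReaderSketch {
    fn new() -> Self {
        Self::default()
    }

    /// Mirrors the option proposed in this issue.
    fn with_bloom_filters(mut self, enabled: bool) -> Self {
        self.bloom_filters = enabled;
        self
    }

    fn try_parse(&mut self, file: &FileSketch) -> Result<(), String> {
        // Only build the filter table when it was requested.
        let bloom_filters: Option<BloomFilterTable> = if self.bloom_filters {
            let table = file
                .row_groups
                .iter()
                .map(|rg| {
                    rg.columns
                        .iter()
                        .map(|c| {
                            c.bloom_filter_bytes
                                .clone()
                                .map(|bitset| BloomFilterSketch { bitset })
                        })
                        .collect()
                })
                .collect();
            Some(table)
        } else {
            None
        };
        self.metadata = Some(MetaDataSketch {
            num_row_groups: file.row_groups.len(),
            bloom_filters,
        });
        Ok(())
    }

    fn finish(&mut self) -> Result<MetaDataSketch, String> {
        self.metadata
            .take()
            .ok_or_else(|| "try_parse must be called before finish".to_string())
    }
}

/// Builds a one-row-group file: column 0 has a filter, column 1 does not.
fn sample_file() -> FileSketch {
    FileSketch {
        row_groups: vec![RowGroupSketch {
            columns: vec![
                ColumnChunkSketch { bloom_filter_bytes: Some(vec![0b1010_0000; 32]) },
                ColumnChunkSketch { bloom_filter_bytes: None },
            ],
        }],
    }
}
```

With this shape, a caller would write `metadata.bloom_filter(0, 0)` after `finish()`. Whether the filters should live inside the metadata struct or be returned separately is exactly the design question this issue leaves open.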

Additional context

Labels: enhancement (Any new improvement worthy of an entry in the changelog), parquet (Changes to the parquet crate)