Make parquet support optional  #7653

@alamb

Is your feature request related to a problem or challenge?

DataFusion aspires to be a modular query engine, and not all users need support for parquet.

The parquet crate has a non-trivial number of dependencies, some of which prevent compiling DataFusion to WASM (see #7652).

There have also been reports like #2042 where some of the native dependencies, such as zstd, cause build issues.

Describe the solution you'd like

I would like to make parquet support optional, the same way avro support already is.

Avro is marked as optional: https://github.com/apache/arrow-datafusion/blob/5f38135d5d21160d6b1ef7213578dd5eddfa4f95/datafusion/core/Cargo.toml#L53

It would be great to mark parquet as optional too
https://github.com/apache/arrow-datafusion/blob/5f38135d5d21160d6b1ef7213578dd5eddfa4f95/datafusion/core/Cargo.toml#L82

To make this work, we would likely need to encapsulate the parquet code in a more modular fashion, rather than sprinkling #[cfg(feature = "parquet")] all over the code.
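
As a rough illustration of the encapsulation idea, here is a minimal sketch (not DataFusion's actual layout). It assumes a Cargo feature named `parquet`; the module `parquet_support`, the function `supported_formats`, and the list of built-in formats are all hypothetical names chosen for the example. All parquet-specific code sits behind one gated module, and the rest of the crate touches it through a single registration function:

```rust
// Hypothetical sketch: every parquet-specific item lives in one module that
// is compiled only when the (assumed) `parquet` Cargo feature is enabled.
#[cfg(feature = "parquet")]
mod parquet_support {
    pub const FORMAT_NAME: &str = "parquet";
}

// Single registration point. With the feature enabled it adds parquet;
// without it, the function is a no-op and the module above is not compiled.
#[cfg(feature = "parquet")]
fn register_parquet(formats: &mut Vec<&'static str>) {
    formats.push(parquet_support::FORMAT_NAME);
}

#[cfg(not(feature = "parquet"))]
fn register_parquet(_formats: &mut Vec<&'static str>) {}

/// Returns the file formats available in this build.
/// The "csv" and "json" entries are placeholders for always-on formats.
pub fn supported_formats() -> Vec<&'static str> {
    let mut formats = vec!["csv", "json"];
    register_parquet(&mut formats);
    formats
}

fn main() {
    println!("{:?}", supported_formats());
}
```

With this shape, callers such as `supported_formats` contain no cfg attributes at all; the conditional compilation is confined to the gated module and its registration function.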

Describe alternatives you've considered

No response

Additional context

No response
