GroupedHashAggregate in row format #2452

@yjshen

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Employing a row format in GroupedHashAggregate could both improve its performance and reduce its memory usage.
With Vec<u8>-backed rows, we can:

  1. Compare compound grouping keys by comparing their raw bytes directly.
  2. Create all accumulator states for a key by allocating a single Vec<u8>, and update its contents in place.
  3. Reduce the memory footprint of each group state by switching from Vec<ScalarValue>-based state to Vec<u8>-based state, which carries less datatype information.
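
To make the three points above concrete, here is a minimal, standalone sketch in plain Rust. It is not DataFusion code: the byte layouts, `encode_key`, `update_state`, `aggregate`, and `read_state` are all hypothetical names and assumptions chosen for illustration. It groups by a compound `(i64, &str)` key encoded as bytes, and keeps COUNT and SUM for each group in one 16-byte `Vec<u8>` that is updated in place.

```rust
use std::collections::HashMap;

/// Width of one state row in this sketch (an assumption):
/// bytes 0..8 hold COUNT as u64, bytes 8..16 hold SUM as i64.
const STATE_WIDTH: usize = 16;

/// Encode a compound grouping key into a single byte row.
/// Assumed layout: 8-byte little-endian integer, 4-byte little-endian
/// string length, then the UTF-8 bytes. The length prefix keeps two
/// different keys from ever producing the same bytes.
fn encode_key(a: i64, b: &str) -> Vec<u8> {
    let mut row = Vec::with_capacity(12 + b.len());
    row.extend_from_slice(&a.to_le_bytes());
    row.extend_from_slice(&(b.len() as u32).to_le_bytes());
    row.extend_from_slice(b.as_bytes());
    row
}

/// Copy 8 bytes starting at `offset` out of the row.
fn read_8(buf: &[u8], offset: usize) -> [u8; 8] {
    let mut bytes = [0u8; 8];
    bytes.copy_from_slice(&buf[offset..offset + 8]);
    bytes
}

/// Fold one input value into the state row, in place.
fn update_state(state: &mut [u8], value: i64) {
    let count = u64::from_le_bytes(read_8(state, 0)) + 1;
    let sum = i64::from_le_bytes(read_8(state, 8)).wrapping_add(value);
    state[0..8].copy_from_slice(&count.to_le_bytes());
    state[8..16].copy_from_slice(&sum.to_le_bytes());
}

/// Decode the state row into (count, sum).
fn read_state(state: &[u8]) -> (u64, i64) {
    (
        u64::from_le_bytes(read_8(state, 0)),
        i64::from_le_bytes(read_8(state, 8)),
    )
}

/// Group `(key_a, key_b, value)` rows. Keys are hashed and compared as
/// raw bytes; each group's state is a single zero-initialized Vec<u8>.
fn aggregate(rows: &[(i64, &str, i64)]) -> HashMap<Vec<u8>, Vec<u8>> {
    let mut groups: HashMap<Vec<u8>, Vec<u8>> = HashMap::new();
    for &(a, b, v) in rows {
        let state = groups
            .entry(encode_key(a, b))
            .or_insert_with(|| vec![0u8; STATE_WIDTH]);
        update_state(state, v);
    }
    groups
}
```

Calling `aggregate(&[(1, "a", 10), (1, "a", 5), (2, "b", 7)])` yields two groups. Looking up `encode_key(1, "a")` and passing the state to `read_state` gives a count of 2 and a sum of 15. A real implementation would also need to handle nulls and variable-width state, which this sketch ignores.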

Describe the solution you'd like

  1. A new Accumulator trait that updates and merges state stored in a Vec<u8>.
  2. A branch in AggregateExec::execute that uses row-based aggregation when applicable.
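
One possible shape for the two pieces above, sketched in standalone Rust. Everything here is hypothetical and not an existing DataFusion API: the trait name `RowAccumulator`, its methods, the `StateType` and `AggregateMode` enums, and `choose_mode`. The issue does not say when the row path is "applicable", so the fixed-width rule below is only an assumption.

```rust
/// Read an i64 from the first 8 bytes of `buf`.
fn read_i64(buf: &[u8]) -> i64 {
    let mut bytes = [0u8; 8];
    bytes.copy_from_slice(&buf[..8]);
    i64::from_le_bytes(bytes)
}

/// Write an i64 into the first 8 bytes of `buf`.
fn write_i64(buf: &mut [u8], v: i64) {
    buf[..8].copy_from_slice(&v.to_le_bytes());
}

/// Hypothetical trait: the accumulator holds no state of its own and
/// operates on a byte slice carved out of the group's state row.
trait RowAccumulator {
    /// Number of bytes this accumulator occupies in the state row.
    fn state_width(&self) -> usize;
    /// Fold one input value into `state`, in place.
    fn update(&self, state: &mut [u8], value: i64);
    /// Merge a partial state `other` into `state`, in place.
    fn merge(&self, state: &mut [u8], other: &[u8]);
    /// Decode the final result from `state`.
    fn evaluate(&self, state: &[u8]) -> i64;
}

struct Sum;
impl RowAccumulator for Sum {
    fn state_width(&self) -> usize { 8 }
    fn update(&self, state: &mut [u8], value: i64) {
        let next = read_i64(state).wrapping_add(value);
        write_i64(state, next);
    }
    fn merge(&self, state: &mut [u8], other: &[u8]) {
        let next = read_i64(state).wrapping_add(read_i64(other));
        write_i64(state, next);
    }
    fn evaluate(&self, state: &[u8]) -> i64 { read_i64(state) }
}

struct Count;
impl RowAccumulator for Count {
    fn state_width(&self) -> usize { 8 }
    fn update(&self, state: &mut [u8], _value: i64) {
        let next = read_i64(state) + 1;
        write_i64(state, next);
    }
    fn merge(&self, state: &mut [u8], other: &[u8]) {
        let next = read_i64(state) + read_i64(other);
        write_i64(state, next);
    }
    fn evaluate(&self, state: &[u8]) -> i64 { read_i64(state) }
}

/// Apply every accumulator to its own slice of the shared state row.
fn update_row(accs: &[Box<dyn RowAccumulator>], state: &mut [u8], value: i64) {
    let mut offset = 0;
    for acc in accs {
        let w = acc.state_width();
        acc.update(&mut state[offset..offset + w], value);
        offset += w;
    }
}

/// Merge a partial state row into `state`, accumulator by accumulator.
fn merge_row(accs: &[Box<dyn RowAccumulator>], state: &mut [u8], other: &[u8]) {
    let mut offset = 0;
    for acc in accs {
        let w = acc.state_width();
        acc.merge(&mut state[offset..offset + w], &other[offset..offset + w]);
        offset += w;
    }
}

/// Decode one result per accumulator from the state row.
fn evaluate_row(accs: &[Box<dyn RowAccumulator>], state: &[u8]) -> Vec<i64> {
    let mut offset = 0;
    let mut out = Vec::with_capacity(accs.len());
    for acc in accs {
        let w = acc.state_width();
        out.push(acc.evaluate(&state[offset..offset + w]));
        offset += w;
    }
    out
}

/// Hypothetical stand-ins for the planner's view of state column types.
#[derive(Debug, PartialEq)]
enum StateType { Int64, Utf8 }

#[derive(Debug, PartialEq)]
enum AggregateMode { RowBased, ScalarBased }

/// Assumed rule: take the row path only when every state column is fixed-width.
fn choose_mode(state_types: &[StateType]) -> AggregateMode {
    if state_types.iter().all(|t| *t == StateType::Int64) {
        AggregateMode::RowBased
    } else {
        AggregateMode::ScalarBased
    }
}
```

Because the accumulators are stateless, one set of them can serve every group, and the per-group cost is just the bytes of the state row. Falling back to the existing ScalarValue-based path for unsupported types would let the row path be introduced incrementally.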

Describe alternatives you've considered

Additional context
#1708 and #2188
