deprecate P_STORAGE_UPLOAD_INTERVAL  #651

@nitisht

Description

We explored making the Parquet file size configurable by making the upload interval configurable through the environment variable P_STORAGE_UPLOAD_INTERVAL, but this approach has several problems:

  • The time range of data held in the staging area becomes variable and can span hours or days. This makes the query code very complicated.
  • The benefit for Parquet file size is small, because the volume of logs is difficult to predict.
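To illustrate the second point, here is a minimal sketch of the arithmetic, not code from the project. The function names and the flat bytes-per-second ingest rate are assumptions made for illustration, and compression is ignored. With a fixed interval, the staged size scales directly with the ingest rate, so the interval needed to reach a given size depends on a rate that is not known in advance.

```rust
/// Bytes that accumulate in staging over one upload interval.
/// Hypothetical helper: assumes a constant ingest rate and ignores compression.
fn staged_bytes(ingest_bytes_per_sec: u64, interval_secs: u64) -> u64 {
    ingest_bytes_per_sec.saturating_mul(interval_secs)
}

/// Interval (in seconds, rounded up) needed to stage `target_bytes`.
/// Returns None when nothing is being ingested, since the target is never reached.
fn interval_for_target(target_bytes: u64, ingest_bytes_per_sec: u64) -> Option<u64> {
    if ingest_bytes_per_sec == 0 {
        return None;
    }
    // Ceiling division without overflow.
    let whole = target_bytes / ingest_bytes_per_sec;
    let has_remainder = target_bytes % ingest_bytes_per_sec != 0;
    Some(whole + u64::from(has_remainder))
}
```

Under these assumptions, a stream ingesting 1 KiB/s needs 131072 seconds (about 36 hours) to stage 128 MiB, while a stream ingesting 1 MiB/s needs 128 seconds. No single interval suits both.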

A better approach would be to add a separate compaction engine that compacts historical data into better-compressed Parquet files. We'll take that up as a separate exercise. For now, we need to revert the changes in #616 and remove the P_STORAGE_UPLOAD_INTERVAL option completely.
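As a rough sketch of what the planning step of such a compaction engine might look like (purely hypothetical, since no design has been decided here), the function below greedily groups consecutive files, given only their sizes, into batches that stay within a target size. The name `plan_compaction` and the greedy strategy are assumptions for illustration. Reading and rewriting the Parquet data itself is out of scope.

```rust
/// Groups consecutive files into compaction batches of at most `target_bytes`.
/// Returns the indices of the files in each batch, in their original order.
/// Hypothetical sketch: a file larger than the target gets a batch of its own.
fn plan_compaction(file_sizes: &[u64], target_bytes: u64) -> Vec<Vec<usize>> {
    let mut batches: Vec<Vec<usize>> = Vec::new();
    let mut current: Vec<usize> = Vec::new();
    let mut current_bytes: u64 = 0;

    for (idx, &size) in file_sizes.iter().enumerate() {
        // Close the current batch if adding this file would exceed the target.
        if !current.is_empty() && current_bytes.saturating_add(size) > target_bytes {
            batches.push(std::mem::take(&mut current));
            current_bytes = 0;
        }
        current.push(idx);
        current_bytes = current_bytes.saturating_add(size);
    }

    // Flush the final, possibly undersized, batch.
    if !current.is_empty() {
        batches.push(current);
    }
    batches
}
```

Keeping files in order means each batch covers a contiguous time range, provided the input is sorted by time. That keeps time-based lookups simple and leaves the upload interval fixed.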

Metadata

Labels

enhancement: New feature or request
