Skip to content

Conversation

arthurpassos
Copy link
Collaborator

@arthurpassos arthurpassos commented Sep 16, 2025

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

  1. Properly implement system.exports table
  2. Add profileevents
  3. Remove importer sink, simplified the progress callback
  4. Add setting to overwrite file if it already exists in remote storage
  5. Store both parallel formatting and setting in the manifest
  6. Add part checksum to the filename

Documentation entry for user-facing changes

...

Exclude tests:

  • Fast test
  • Integration Tests
  • Stateless tests
  • Stateful tests
  • Performance tests
  • All with ASAN
  • All with TSAN
  • All with MSAN
  • All with UBSAN
  • All with Coverage
  • All with Aarch64
  • All Regression
  • Disable CI Cache

Copy link

github-actions bot commented Sep 16, 2025

Workflow [PR], commit [437e67c]

tryLogCurrentException(__PRETTY_FUNCTION__, "Exception is in export part task");

std::lock_guard inner_lock(export_manifests_mutex);
writePartLog(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what is going on here... Are you trying to do the same thing in case writePartLog throws? Or is it only exec.execute() that can throw in this context?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should only be exec.execute() that throws. And we need to write the log in both cases

}
catch (const Exception &)
{
tryLogCurrentException(__PRETTY_FUNCTION__, "Exception is in export part task");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe smth like:

Suggested change
tryLogCurrentException(__PRETTY_FUNCTION__, "Exception is in export part task");
tryLogCurrentException(__PRETTY_FUNCTION__, fmt::format("while exporting the part {}. User should retry.", manifest.data_part->name));

M(Merge, "Number of executing background merges") \
M(MergeParts, "Number of source parts participating in current background merges") \
M(Move, "Number of currently executing moves") \
M(Export, "Number of currently executing exports") \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i would probably add few counters to profile events smth like
PartsExports (counter of successfull exports)
PartsExportFailures (counter of exports with exception)
PartsExportDuplicated (when target exists)
PartsExportBytes (sum of the bytes written by exports)
PartsExportTotalMilliseconds (overall time of exports)

(Names - if you will have better ideas - feel free :) )

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added these except for PartsExportBytes because I still don't have the writing stats integrated.

@filimonov
Copy link
Member

filimonov commented Sep 17, 2025

If you can populate ProfileEvents -it would be awesome. (it also strored in thread context, probably not too hard to add but not 100% sure, but it's not a blocker, can go w/o)

Additionally i would like to see in system.exports the target file name. Ideally in the part_log too (maybe reuse path_on_disk column?)

@arthurpassos
Copy link
Collaborator Author

Also added the setting to overwrite the file

@Enmk
Copy link
Member

Enmk commented Sep 29, 2025

Implementation overview

---
  config:
    class:
      hideEmptyMembersBox: true
---
classDiagram
    ExportsList *--> "*" ExportsListElement
    ExportsListElement *--> ThreadGroup
    ContextShared *--> ExportsList
    Context --> ExportsList : exposes

    ExportsListElement --> ExportInfo : creates
    StorageSystemExports --> ExportInfo : reads

    MergeTreeData "*" *--> MergeTreeExportManifest
    MergeTreeData --> ExportsList : adds elemets
Loading

@arthurpassos
Copy link
Collaborator Author

Implementation overview

---
  config:
    class:
      hideEmptyMembersBox: true
---
classDiagram
    ExportsListElement --> ExportInfo : creates

    ExportsList *--> "*" ExportsListElement
    ContextShared *--> ExportsList
    Context --> ExportsList : exposes

    StorageSystemExports --> ExportInfo : reads
    MergeTreeData --> MergeTreeExportManifest
    MergeTreeData --> ExportsList : adds elemets
Loading

That seems about right

Copy link
Member

@Enmk Enmk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Enmk Enmk merged commit 51862f0 into antalya-25.6.5 Sep 29, 2025
131 of 136 checks passed
arthurpassos pushed a commit that referenced this pull request Sep 29, 2025
improve observability a bit, simplify sink
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants