3 changes: 3 additions & 0 deletions README/ReleaseNotes/v638/index.md
@@ -115,7 +115,10 @@ If you want to keep using `TList*` return values, you can write a small adapter
RDF uses one copy of each histogram per thread. Now, RDataFrame can reduce the number of clones using `ROOT::RDF::Experimental::ThreadsPerTH3()`. Setting this
to a number such as 8 shares one 3D histogram among 8 threads, greatly reducing memory consumption. This might slow down execution if the histograms
are filled at very high rates; use a lower number in that case.
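
The memory trade-off can be estimated with a little arithmetic. The sketch below is not ROOT code: it assumes that the number of clones is the thread count divided by the sharing factor, rounded up (which the text implies but does not state), and that a `TH3D`-like histogram stores one `double` per bin plus one underflow and one overflow bin per axis. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: one clone is shared by `threadsPerHisto` threads, so the number
// of clones is ceil(nThreads / threadsPerHisto).
constexpr std::size_t EstimateHistogramClones(std::size_t nThreads, std::size_t threadsPerHisto)
{
   assert(threadsPerHisto > 0);
   return (nThreads + threadsPerHisto - 1) / threadsPerHisto;
}

// Rough bin storage for `clones` copies of a 3D histogram of doubles.
// Each axis carries two extra bins (underflow and overflow).
// Per-bin error arrays and other bookkeeping are ignored.
constexpr std::size_t EstimateHistogramBytes(std::size_t nx, std::size_t ny, std::size_t nz, std::size_t clones)
{
   return (nx + 2) * (ny + 2) * (nz + 2) * sizeof(double) * clones;
}
```

For a hypothetical job with 64 threads, a sharing factor of 8 gives 8 clones instead of 64, so the bin storage shrinks by the same factor of 8.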

### Snapshot
- The Snapshot method has been refactored so that it no longer needs compile-time information (i.e. either template arguments or JIT-ting) to know the input column types. This means that any Snapshot call that specifies the template arguments, e.g. `Snapshot<int, float>(..., {"intCol", "floatCol"})`, is now redundant and the template arguments can safely be removed from the call. At the same time, Snapshot no longer needs to JIT-compile the column types, which can yield large speedups depending on the number of columns that need to be written to disk. In certain cases (e.g. when writing O(10000) columns) the speedup can be larger than an order of magnitude. The templated Snapshot overload is now deprecated and issues a compile-time warning when called. It is scheduled for removal in ROOT 6.40.
- The default compression setting for the output dataset used by Snapshot has been changed from 101 (ZLIB level 1, the TTree default) to 505 (ZSTD level 5). This is a better setting on average, and makes more sense for RDataFrame since Snapshot now supports more than just the TTree output data format. This change may result in smaller output file sizes for analyses that use Snapshot with default settings. During the 6.38 development release cycle, Snapshot will print information about this change once per program run. Starting from 6.40.00, the information will not be printed. The message can be suppressed by setting `ROOT_RDF_SILENCE_SNAPSHOT_INFO=1` in your environment.
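
The numbers 101 and 505 pack the algorithm and the level into one integer, as 100 times the algorithm code plus the level. The following sketch illustrates that encoding without depending on ROOT. The enum is a local mirror of the algorithm codes in `ROOT::RCompressionSetting::EAlgorithm`, and the helper names `EncodeCompressionSetting` and `DecodeCompressionSetting` are hypothetical, not ROOT API.

```cpp
#include <cassert>

// Local mirror of ROOT's algorithm codes, for illustration only.
enum class CompressionAlgo : int { kZLIB = 1, kLZMA = 2, kLZ4 = 4, kZSTD = 5 };

struct CompressionChoice {
   CompressionAlgo algo;
   int level;
};

// Combined setting = 100 * algorithm + level.
constexpr int EncodeCompressionSetting(CompressionAlgo algo, int level)
{
   // The level must fit in two decimal digits for the encoding to be reversible.
   assert(level >= 0 && level < 100);
   return 100 * static_cast<int>(algo) + level;
}

// Inverse of EncodeCompressionSetting.
constexpr CompressionChoice DecodeCompressionSetting(int setting)
{
   return {static_cast<CompressionAlgo>(setting / 100), setting % 100};
}

static_assert(EncodeCompressionSetting(CompressionAlgo::kZLIB, 1) == 101, "previous Snapshot default");
static_assert(EncodeCompressionSetting(CompressionAlgo::kZSTD, 5) == 505, "new Snapshot default");
```

Analyses that need the previous behaviour can presumably set `fCompressionAlgorithm` and `fCompressionLevel` in an `RSnapshotOptions` object back to ZLIB and 1 and pass it to Snapshot.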

## Python Interface

21 changes: 21 additions & 0 deletions tree/dataframe/inc/ROOT/RDF/RInterface.hxx
@@ -32,6 +32,7 @@
#include "ROOT/RResultPtr.hxx"
#include "ROOT/RSnapshotOptions.hxx"
#include <string_view>
#include "ROOT/RLogger.hxx"
#include "ROOT/RVec.hxx"
#include "ROOT/TypeTraits.hxx"
#include "RtypesCore.h" // for ULong64_t
@@ -44,8 +45,12 @@
#include "TProfile2D.h"
#include "TStatistic.h"

#include "ROOT/RVersion.hxx"

#include <algorithm>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <initializer_list>
#include <iterator> // std::back_inserter
#include <limits>
@@ -1331,6 +1336,22 @@ public:
const ColumnNames_t &columnList,
const RSnapshotOptions &options = RSnapshotOptions())
{
// TODO: Remove before releasing 6.40.00
#if ROOT_VERSION_CODE >= ROOT_VERSION(6, 40, 0)
static_assert(false && "Remove information about change of Snapshot default compression settings.");
#endif
[[maybe_unused]] static bool once = []() {
   if (const char *suppress = std::getenv("ROOT_RDF_SILENCE_SNAPSHOT_INFO"))
      if (std::strcmp(suppress, "1") == 0)
         return true;
   RLogScopedVerbosity showInfo{ROOT::Detail::RDF::RDFLogChannel(), ROOT::ELogLevel::kInfo};
   R__LOG_INFO(ROOT::Detail::RDF::RDFLogChannel())
      << "In ROOT 6.38, the default compression settings of Snapshot have been changed from 101 (ZLIB with "
         "compression level 1, the TTree default) to 505 (ZSTD with compression level 5). This change may result "
         "in smaller Snapshot output dataset size by default. In order to suppress this message, set "
         "ROOT_RDF_SILENCE_SNAPSHOT_INFO=1 in your environment.";
   return true;
}();
// like columnList but with `#var` columns removed
auto colListNoPoundSizes = RDFInternal::FilterArraySizeColNames(columnList, "Snapshot");
// like columnListWithoutSizeColumns but with aliases resolved
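
The hunk above relies on a function-local `static` initialised by an immediately invoked lambda, so the notice is emitted at most once per process and can be switched off through the environment. The standalone sketch below reproduces that pattern with the standard library only. The variable name `EXAMPLE_SILENCE_INFO`, the helper names, and the use of `std::cerr` in place of ROOT's logging channel are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <iostream>

// True only when the environment variable exists and is exactly "1".
inline bool IsSilenced(const char *envVarName)
{
   assert(envVarName != nullptr);
   const char *value = std::getenv(envVarName);
   return value != nullptr && std::strcmp(value, "1") == 0;
}

// Emits the notice on the first call unless silenced.
// Returns true only for the call that actually printed it.
inline bool NotifyDefaultChangeOnce()
{
   bool printedNow = false;
   // The initialiser of a function-local static runs exactly once, on the
   // first call, and C++11 guarantees that this is thread-safe.
   static const bool initialized = [&printedNow]() {
      if (!IsSilenced("EXAMPLE_SILENCE_INFO")) {
         std::cerr << "Info: a default setting has changed; set EXAMPLE_SILENCE_INFO=1 to hide this notice.\n";
         printedNow = true;
      }
      return true;
   }();
   (void)initialized; // only the side effect of the initialisation matters
   return printedNow;
}
```

Because the check happens inside the one-time initialiser, the environment variable is read only on the first call; changing it later in the same process has no effect.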
4 changes: 2 additions & 2 deletions tree/dataframe/inc/ROOT/RSnapshotOptions.hxx
@@ -46,8 +46,8 @@ struct RSnapshotOptions {
}
std::string fMode = "RECREATE"; ///< Mode of creation of output file
ECAlgo fCompressionAlgorithm =
- ROOT::RCompressionSetting::EAlgorithm::kZLIB; ///< Compression algorithm of output file
- int fCompressionLevel = 1; ///< Compression level of output file
+ ROOT::RCompressionSetting::EAlgorithm::kZSTD; ///< Compression algorithm of output file
+ int fCompressionLevel = 5; ///< Compression level of output file
int fAutoFlush = 0; ///< AutoFlush value for output tree
int fSplitLevel = 99; ///< Split level of output tree
bool fLazy = false; ///< Do not start the event loop when Snapshot is called