Use callbacks to update a plot and a progress bar during the event loop.
Showcase registration of callback functions that act on partial results while the event-loop is running, using OnPartialResult and OnPartialResultSlot.
This tutorial is not meant to run in batch mode.
Author: Enrico Guiraud (CERN)
This notebook tutorial was automatically generated with ROOTBOOK-izer from the macro found in the ROOT repository on Wednesday, April 17, 2024 at 11:07 AM.
using namespace ROOT; // RDataFrame lives in here
ROOT::EnableImplicitMT();
const auto poolSize = ROOT::GetThreadPoolSize();
const auto nSlots = 0 == poolSize ? 1 : poolSize;
We start by creating an RDataFrame with a good number of empty events
const auto nEvents = nSlots * 10000ull;
RDataFrame d(nEvents);
heavyWork is a lambda that fakes some interesting computation and just returns a normally distributed double
TRandom r;
auto heavyWork = [&r]() {
for (volatile int i = 0; i < 1000000; ++i)
;
return r.Gaus();
};
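A side note on heavyWork: it captures a single random generator by reference, and with implicit multi-threading enabled the lambda may be called from several worker threads at once. For a toy workload this is tolerable, but when reproducible or race-free random numbers matter, one common alternative is to keep one generator per processing slot. The following is a minimal, ROOT-free sketch of that idea, using the standard library instead of TRandom. SlotGaus is a hypothetical helper, not part of ROOT; in RDataFrame, a callable that receives the slot number as its first argument can be registered with DefineSlot.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper (not part of ROOT): one independent generator per
// processing slot, so that no two threads ever share generator state.
class SlotGaus {
public:
   explicit SlotGaus(std::size_t nSlots, unsigned int baseSeed = 1u)
   {
      fGenerators.reserve(nSlots);
      for (std::size_t slot = 0; slot < nSlots; ++slot)
         fGenerators.emplace_back(baseSeed + static_cast<unsigned int>(slot)); // distinct seed per slot
      fDistributions.resize(nSlots); // default: mean 0, standard deviation 1
   }

   // Returns a normally distributed double drawn from the generator owned by `slot`.
   double operator()(std::size_t slot)
   {
      assert(slot < fGenerators.size());
      return fDistributions[slot](fGenerators[slot]);
   }

   std::size_t size() const { return fGenerators.size(); }

private:
   std::vector<std::mt19937> fGenerators;
   std::vector<std::normal_distribution<double>> fDistributions;
};
```

Each thread only ever touches the entry that belongs to its own slot, so no locking is needed, and two SlotGaus objects built with the same seed produce the same per-slot sequences.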
Let's define a column "x" produced by invoking heavyWork for each event.
df stores a modified data-frame that contains "x"
auto df = d.Define("x", heavyWork);
Now we register a histogram-filling action with the RDataFrame.
h can be used just like a pointer to TH1D but it is actually a TResultProxy (named RResultPtr in current ROOT releases)
auto h = df.Histo1D<double>({"browserHisto", "", 100, -2., 2.}, "x");
So far we have registered a column "x" to a data-frame with nEvents events and we registered the filling of a histogram with the values of column "x".
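In the following we will register three functions for execution during the event-loop: one that runs once, right before the loop starts, and registers the partially filled histogram in the TBrowser's directory; one that redraws the partial histogram every 50 events; and one that updates a progress bar from multiple threads.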
First off we create a TBrowser that contains a "RDFResults" directory
auto dfDirectory = new TMemFile("RDFResults", "RECREATE");
auto browser = new TBrowser("b", dfDirectory);
The global pad should now be set to the TBrowser's canvas, let's store its value in a local variable
auto browserPad = gPad;
A useful feature of TResultProxy is its OnPartialResult method: it allows us to register a callback that is executed once per specified number of events during the event-loop, on "partial" versions of the result objects contained in the TResultProxy. In this case, the partial result is going to be a histogram filled with an increasing number of events.
Rather than requesting that the callback run every N entries, here we pass the special value kOnce to request that it run once, right before the event-loop starts.
The callback is a lambda that registers the partial result object in dfDirectory. The init-capture (C++14) copies the pointer into the lambda; unlike a plain capture, it is also valid when dfDirectory is a global variable, which is the case for variables declared in earlier notebook cells.
h.OnPartialResult(h.kOnce, [dfDirectory = dfDirectory](TH1D &h_) { dfDirectory->Add(&h_); });
(Note that we called OnPartialResult with a dot, ".", since this is a method of TResultProxy itself. We do not want to call OnPartialResult on the pointee histogram!)
Multiple callbacks can be registered on the same TResultProxy (they are executed one after the other in the same order as they were registered). We now request that the partial result is drawn and the TBrowser's TPad is updated every 50 events.
// The init-capture copies the pad pointer; it is also valid when browserPad is a global (e.g. in a notebook)
h.OnPartialResult(50, [browserPad = browserPad](TH1D &hist) {
   if (!browserPad)
      return; // in case root -b was invoked
   browserPad->cd();
   hist.Draw();
   browserPad->Update();
   // This call tells ROOT to process all pending GUI events
   // It allows users to use the TBrowser as usual while the event-loop is running
   gSystem->ProcessEvents();
});
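To make the callback mechanics concrete, here is a minimal, ROOT-free sketch of a result object that accepts callbacks and fires them while a loop is running. ToyCounterResult and its Run method are hypothetical names and this is not how RDataFrame is implemented; only the names OnPartialResult and kOnce mirror the tutorial. The sketch assumes what the text states: a kOnce callback runs once before the first event, and callbacks registered with the same frequency run in registration order. The "partial result" is simply the number of events processed so far.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Toy stand-in for a result that supports partial-result callbacks (hypothetical, not ROOT code).
class ToyCounterResult {
public:
   using Callback = std::function<void(std::uint64_t &)>;
   static constexpr std::uint64_t kOnce = 0; // special value: run once, right before the loop

   void OnPartialResult(std::uint64_t everyN, Callback c)
   {
      assert(c); // an empty std::function cannot be invoked
      fCallbacks.emplace_back(everyN, std::move(c));
   }

   // Processes nEvents fake events and returns the final result (the event count).
   std::uint64_t Run(std::uint64_t nEvents)
   {
      std::uint64_t partial = 0;
      for (auto &entry : fCallbacks)
         if (entry.first == kOnce)
            entry.second(partial); // executed once, before any event is processed
      for (std::uint64_t i = 0; i < nEvents; ++i) {
         ++partial; // "fill" the partial result
         for (auto &entry : fCallbacks) // registration order is preserved
            if (entry.first != kOnce && partial % entry.first == 0)
               entry.second(partial);
      }
      return partial;
   }

private:
   std::vector<std::pair<std::uint64_t, Callback>> fCallbacks;
};
```

Registering two callbacks with everyN = 50 and running over 120 events invokes both at 50 and at 100 events, the first-registered one first each time.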
Finally, we would like to print a progress bar on the terminal to show how the event-loop is progressing.
To take into account all events we use OnPartialResultSlot: when Implicit Multi-Threading is enabled, OnPartialResult invokes the callback in only one of the worker threads, and always on that worker thread's partial result. This is useful because it means we don't have to worry about concurrent execution and thread-safety of the callbacks if we are happy with just one thread's partial result.
OnPartialResultSlot, on the other hand, invokes the callback in each one of the worker threads, every time a thread finishes processing a batch of everyN events. This is what we want for the progress bar, but we need to take care that two threads do not print to the terminal at the same time: we need a std::mutex for synchronization.
std::string progressBar;
std::mutex barMutex; // Only one thread at a time can lock a mutex. Let's use this to avoid concurrent printing.
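The per-slot behaviour described above can be illustrated without ROOT. The sketch below is a toy model under stated assumptions: RunSlots and BuildProgressBar are hypothetical helpers, plain std::thread objects stand in for ROOT's worker threads, and every slot processes the same number of events. Each thread invokes the shared callback after every batch of everyN events, so the callback can run concurrently and must protect shared state with a mutex.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Toy multi-threaded "event loop" (hypothetical, not ROOT code): each slot runs in
// its own thread and calls onSlotBatch(slot) after every batch of everyN events.
inline void RunSlots(unsigned int nSlots, std::size_t eventsPerSlot, std::size_t everyN,
                     const std::function<void(unsigned int)> &onSlotBatch)
{
   assert(everyN > 0);
   std::vector<std::thread> workers;
   for (unsigned int slot = 0; slot < nSlots; ++slot) {
      workers.emplace_back([=, &onSlotBatch] {
         for (std::size_t i = 1; i <= eventsPerSlot; ++i)
            if (i % everyN == 0)
               onSlotBatch(slot); // may execute in several threads at the same time
      });
   }
   for (auto &w : workers)
      w.join();
}

// Builds a progress bar with one '#' per completed batch, summed over all slots.
inline std::string BuildProgressBar(unsigned int nSlots, std::size_t eventsPerSlot, std::size_t everyN)
{
   std::string progressBar;
   std::mutex barMutex;
   RunSlots(nSlots, eventsPerSlot, everyN, [&progressBar, &barMutex](unsigned int /*slot*/) {
      std::lock_guard<std::mutex> lock(barMutex); // only one thread at a time may modify the string
      progressBar.push_back('#');
   });
   return progressBar;
}
```

Without the lock_guard, concurrent push_back calls on the same std::string would be a data race; with it, the final bar contains exactly one character per batch regardless of how the threads interleave.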
Magic numbers that yield good progress bars for nSlots = 1,2,4,8
const auto everyN = nSlots == 8 ? 1000 : 100ull * nSlots;
const auto barWidth = nEvents / everyN;
// Init-captures (C++14) are also valid when the captured variables are globals (e.g. in a notebook)
h.OnPartialResultSlot(everyN, [barWidth = barWidth, &progressBar = progressBar, &barMutex = barMutex](unsigned int /*slot*/, TH1D & /*partialHist*/) {
   std::lock_guard<std::mutex> l(barMutex); // lock_guard locks the mutex at construction, releases it at destruction
   progressBar.push_back('#');
   // re-print the line with the progress bar
   std::cout << "\r[" << std::left << std::setw(barWidth) << progressBar << ']' << std::flush;
});
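As a quick check of the "magic numbers", the arithmetic behind nEvents, everyN and barWidth can be reproduced in isolation. BarLayout and ComputeBarLayout are hypothetical names introduced only for this illustration; the formulas are copied from the code above.

```cpp
#include <cassert>
#include <cstdint>

struct BarLayout {
   std::uint64_t nEvents;  // total number of events in the data-frame
   std::uint64_t everyN;   // events processed by a slot between two callback invocations
   std::uint64_t barWidth; // number of '#' characters in a complete bar
};

// Same formulas as in the tutorial: nEvents = nSlots * 10000,
// everyN = 1000 for 8 slots and 100 * nSlots otherwise, barWidth = nEvents / everyN.
inline BarLayout ComputeBarLayout(std::uint64_t nSlots)
{
   assert(nSlots > 0);
   const std::uint64_t nEvents = nSlots * 10000ull;
   const std::uint64_t everyN = nSlots == 8 ? 1000ull : 100ull * nSlots;
   return BarLayout{nEvents, everyN, nEvents / everyN};
}
```

For nSlots = 1, 2 and 4 this gives a bar 100 characters wide (for example 40000 / 400 with 4 slots); for nSlots = 8 it gives 80000 / 1000 = 80 characters.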
So far we told RDataFrame what we want to happen during the event-loop, but we have not actually run any of those actions: the TBrowser is still empty, the progress bar has not been printed even once, and we haven't produced a single data-point! As usual with RDataFrame, the event-loop is triggered by accessing the contents of a TResultProxy for the first time. Let's run!
std::cout << "Analysis running..." << std::endl;
h->Draw(); // the final, complete result will be drawn after the event-loop has completed.
std::cout << "\nDone!" << std::endl;
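The lazy behaviour used here, where nothing runs until the result is first accessed, can be sketched with a small ROOT-free class. LazyResult is a hypothetical stand-in and not the actual TResultProxy implementation; it only illustrates the idea that operator-> triggers the computation on first use, while methods called with a dot act on the proxy itself.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical lazy result (not ROOT code): the computation passed to the
// constructor runs the first time the result is dereferenced, and only once.
template <typename T>
class LazyResult {
public:
   explicit LazyResult(std::function<T()> compute) : fCompute(std::move(compute))
   {
      assert(fCompute); // the computation must be callable
   }

   T *operator->() { return Get(); } // access to the pointee: triggers the computation
   T &operator*() { return *Get(); }
   bool IsReady() const { return static_cast<bool>(fValue); } // a method of the proxy itself

private:
   T *Get()
   {
      if (!fValue)
         fValue.reset(new T(fCompute())); // the "event loop" runs here
      return fValue.get();
   }

   std::function<T()> fCompute;
   std::unique_ptr<T> fValue;
};
```

With such a class, result.IsReady() inspects the proxy without running anything, whereas result->someMember forces the computation, which is the same distinction as between h.OnPartialResult(...) and h->Draw() in this tutorial.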
Finally, some book-keeping: in the TMemFile that we are using as TBrowser directory, we substitute the partial result with a clone of the final result (the "original" final result will be deleted at the end of the macro).
dfDirectory->Clear();
auto clone = static_cast<TH1D *>(h->Clone());
clone->SetDirectory(nullptr);
dfDirectory->Add(clone);
if (!browserPad)
return; // in case root -b was invoked
browserPad->cd();
clone->Draw();
browserPad->Update();
Draw all canvases
gROOT->GetListOfCanvases()->Draw()