rf404_categories

Data and categories: working with RooCategory objects to describe discrete variables

Author: Wouter Verkerke
This notebook tutorial was automatically generated with ROOTBOOK-izer from the macro found in the ROOT repository on Monday, January 17, 2022 at 10:02 AM.

In [1]:
%%cpp -d
#include "RooRealVar.h"
#include "RooDataSet.h"
#include "RooPolynomial.h"
#include "RooCategory.h"
#include "Roo1DTable.h"
#include "RooGaussian.h"
#include "TCanvas.h"
#include "TAxis.h"
#include "RooPlot.h"
#include <iostream>
In [2]:
%%cpp -d
// This is a workaround to make sure the namespace is used inside functions
using namespace RooFit;

Construct a category with labels

Define a category with labels only

In [3]:
RooCategory tagCat("tagCat", "Tagging category");
tagCat.defineType("Lepton");
tagCat.defineType("Kaon");
tagCat.defineType("NetTagger-1");
tagCat.defineType("NetTagger-2");
tagCat.Print();
RooFit v3.60 -- Developed by Wouter Verkerke and David Kirkby 
                Copyright (C) 2000-2013 NIKHEF, University of California & Stanford University
                All rights reserved, please read http://roofit.sourceforge.net/license.txt

RooCategory::tagCat = Lepton(idx = 0)

Construct a category with labels and indices

Define a category with explicitly numbered states

In [4]:
RooCategory b0flav("b0flav", "B0 flavour eigenstate");
b0flav["B0"] = -1;
b0flav["B0bar"] = 1;

Print it in "verbose" mode to see all states.

In [5]:
b0flav.Print("V");
--- RooAbsArg ---
  Value State: clean
  Shape State: clean
  Attributes: 
  Address: 0x7f62f46b73c8
  Clients: 
  Servers: 
  Proxies: 
--- RooAbsCategory ---
  Value = -1 "B0)
  Possible states:
    B0	-1
    B0bar	1

Alternatively, define many states at once. The function takes a map from std::string labels to integer indices.

In [6]:
RooCategory largeCat("largeCat", "A category with many states");
largeCat.defineTypes({
    {"A", 0}, {"b", 2}, {"c", 8}, {"dee", 4},
    {"F", 133}, {"g", 15}, {"H", -20}
});

Iterate, query and set states

One can iterate through the {name, index} pairs of a category object: the first element of each pair is the label, the second is the index.

In [7]:
std::cout << "\nThis is the for loop over states of 'largeCat':";
for (const auto& idxAndName : largeCat)
  std::cout << "\n\t" << idxAndName.first << "\t" << idxAndName.second;
std::cout << '\n' << std::endl;
This is the for loop over states of 'largeCat':
	A	0
	F	133
	H	-20
	b	2
	c	8
	dee	4
	g	15
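
The labels above come out in sorted order (uppercase before lowercase), not in the order in which they were defined. That is what one would see if the states were held in an ordered map keyed by label. The following is a minimal sketch using a plain `std::map`, not RooFit itself; the helper name `labelsInIterationOrder` is hypothetical, and the idea that RooCategory keeps its states in such a container is an assumption based only on the output shown.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical helper (not part of RooFit): fills an ordered map with the
// same {label, index} pairs as 'largeCat' and returns the labels in the
// order in which a range-based for loop visits them.
std::vector<std::string> labelsInIterationOrder() {
    const std::map<std::string, int> states{
        {"A", 0}, {"b", 2}, {"c", 8}, {"dee", 4},
        {"F", 133}, {"g", 15}, {"H", -20}};

    std::vector<std::string> labels;
    for (const auto& labelAndIndex : states) {
        // 'first' is the label (the map key), 'second' is the index.
        labels.push_back(labelAndIndex.first);
    }
    return labels;
}
```

Calling `labelsInIterationOrder()` yields the labels sorted by character code, which is why "F" and "H" precede "b" in the listing.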

To ask whether a state is valid use:

In [8]:
std::cout <<   "Has label 'A': " << largeCat.hasLabel("A");
std::cout << "\nHas index '-20': " << largeCat.hasIndex(-20);
Has label 'A': 1
Has index '-20': 1

To retrieve names or state numbers:

In [9]:
std::cout << "\nLabel corresponding to '2' is " << largeCat.lookupName(2);
std::cout << "\nIndex corresponding to 'A' is " << largeCat.lookupIndex("A");
Label corresponding to '2' is b
Index corresponding to 'A' is 0

To get the current state:

In [10]:
std::cout << "\nCurrent index is " << largeCat.getCurrentIndex();
std::cout << "\nCurrent label is " << largeCat.getCurrentLabel();
std::cout << std::endl;
Current index is 0
Current label is A

To set the state, use one of the two:

In [11]:
largeCat.setIndex(8);
largeCat.setLabel("c");

Generate dummy data for tabulation demo

Generate a dummy dataset

In [12]:
RooRealVar x("x", "x", 0, 10);
RooDataSet *data = RooPolynomial("p", "p", x).generate(RooArgSet(x, b0flav, tagCat), 10000);

Tables are the equivalent of plots for categories

In [13]:
Roo1DTable *btable = data->table(b0flav);
btable->Print();
btable->Print("v");
input_line_63:2:23: error: reference to 'data' is ambiguous
 Roo1DTable *btable = data->table(b0flav);
                      ^
input_line_62:3:13: note: candidate found by name lookup is '__cling_N530::data'
RooDataSet *data = RooPolynomial("p", "p", x).generate(RooArgSet(x, b0flav, tagCat), 10000);
            ^
/usr/include/c++/9/bits/range_access.h:318:5: note: candidate found by name lookup is 'std::data'
    data(initializer_list<_Tp> __il) noexcept
    ^
/usr/include/c++/9/bits/range_access.h:289:5: note: candidate found by name lookup is 'std::data'
    data(_Container& __cont) noexcept(noexcept(__cont.data()))
    ^
/usr/include/c++/9/bits/range_access.h:299:5: note: candidate found by name lookup is 'std::data'
    data(const _Container& __cont) noexcept(noexcept(__cont.data()))
    ^
/usr/include/c++/9/bits/range_access.h:309:5: note: candidate found by name lookup is 'std::data'
    data(_Tp (&__array)[_Nm]) noexcept
    ^
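
The error message reports that name lookup finds both the session variable `data` and the C++17 function templates `std::data`, which suggests that names from `std` are visible unqualified in this interpreter session. The sketch below reproduces that situation in plain C++ without ROOT and shows two ways around it. The namespace `demo`, the variable `storage` and both function names are hypothetical, and whether either workaround suits a given notebook session is an assumption.

```cpp
#include <cstddef>
#include <iterator>  // declares std::data when compiled as C++17 or later
#include <vector>

using namespace std;  // assumption: mimics a session where std names are visible unqualified

namespace demo {
vector<int> storage{1, 2, 3};
vector<int>* data = &storage;  // same name as the std::data function templates
}
using namespace demo;

// Writing plain `data->size()` below would be rejected as ambiguous in C++17,
// because unqualified lookup finds both demo::data and std::data.

// Option 1: qualify the name explicitly.
std::size_t sizeViaQualifiedName() {
    return demo::data->size();
}

// Option 2: refer to the object through a name that does not clash.
std::size_t sizeViaRenamedPointer() {
    vector<int>* sample = demo::data;
    return sample->size();
}
```

In a notebook, the second option corresponds to declaring the dataset pointer under a different name and using that name in every later cell.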

Create table for subset of events matching cut expression

In [14]:
Roo1DTable *ttable = data->table(tagCat, "x>8.23");
ttable->Print();
ttable->Print("v");
input_line_64:2:23: error: reference to 'data' is ambiguous
 Roo1DTable *ttable = data->table(tagCat, "x>8.23");
                      ^
input_line_62:3:13: note: candidate found by name lookup is '__cling_N530::data'
RooDataSet *data = RooPolynomial("p", "p", x).generate(RooArgSet(x, b0flav, tagCat), 10000);
            ^
/usr/include/c++/9/bits/range_access.h:318:5: note: candidate found by name lookup is 'std::data'
    data(initializer_list<_Tp> __il) noexcept
    ^
/usr/include/c++/9/bits/range_access.h:289:5: note: candidate found by name lookup is 'std::data'
    data(_Container& __cont) noexcept(noexcept(__cont.data()))
    ^
/usr/include/c++/9/bits/range_access.h:299:5: note: candidate found by name lookup is 'std::data'
    data(const _Container& __cont) noexcept(noexcept(__cont.data()))
    ^
/usr/include/c++/9/bits/range_access.h:309:5: note: candidate found by name lookup is 'std::data'
    data(_Tp (&__array)[_Nm]) noexcept
    ^

Create table for all (tagCat x b0flav) state combinations

In [15]:
Roo1DTable *bttable = data->table(RooArgSet(tagCat, b0flav));
bttable->Print("v");
input_line_65:2:24: error: reference to 'data' is ambiguous
 Roo1DTable *bttable = data->table(RooArgSet(tagCat, b0flav));
                       ^
input_line_62:3:13: note: candidate found by name lookup is '__cling_N530::data'
RooDataSet *data = RooPolynomial("p", "p", x).generate(RooArgSet(x, b0flav, tagCat), 10000);
            ^
/usr/include/c++/9/bits/range_access.h:318:5: note: candidate found by name lookup is 'std::data'
    data(initializer_list<_Tp> __il) noexcept
    ^
/usr/include/c++/9/bits/range_access.h:289:5: note: candidate found by name lookup is 'std::data'
    data(_Container& __cont) noexcept(noexcept(__cont.data()))
    ^
/usr/include/c++/9/bits/range_access.h:299:5: note: candidate found by name lookup is 'std::data'
    data(const _Container& __cont) noexcept(noexcept(__cont.data()))
    ^
/usr/include/c++/9/bits/range_access.h:309:5: note: candidate found by name lookup is 'std::data'
    data(_Tp (&__array)[_Nm]) noexcept
    ^

Retrieve the number of events from the table. The number can be non-integer if the source dataset has weighted events.

In [16]:
Double_t nb0 = btable->get("B0");
std::cout << "Number of events with B0 flavor is " << nb0 << std::endl;
input_line_67:2:3: error: use of undeclared identifier 'btable'
 (btable->get("B0"))
  ^
Error in <HandleInterpreterException>: Error evaluating expression (btable->get("B0"))
Execution of your code was aborted.

Retrieve fraction of events with "Lepton" tag

In [17]:
Double_t fracLep = ttable->getFrac("Lepton");
std::cout << "Fraction of events tagged with Lepton tag is " << fracLep << std::endl;
input_line_69:2:3: error: use of undeclared identifier 'ttable'
 (ttable->getFrac("Lepton"))
  ^
Error in <HandleInterpreterException>: Error evaluating expression (ttable->getFrac("Lepton"))
Execution of your code was aborted.
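
To illustrate why a table entry can be non-integer and how a fraction relates to it, here is a minimal sketch in plain C++. It is not the Roo1DTable implementation: the struct `WeightedTable`, its members and the helper `exampleTable` are hypothetical, and the only idea taken from the text is that each entry is a sum of event weights and a fraction is that sum divided by the total.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a one-dimensional table: sum of event weights per label.
struct WeightedTable {
    std::map<std::string, double> sumOfWeights;
    double total = 0.0;

    // Record one event with the given label and weight.
    void fill(const std::string& label, double weight = 1.0) {
        sumOfWeights[label] += weight;
        total += weight;
    }

    // Sum of weights recorded for a label (0 if the label was never filled).
    double get(const std::string& label) const {
        const auto it = sumOfWeights.find(label);
        return it == sumOfWeights.end() ? 0.0 : it->second;
    }

    // Share of the total weight carried by a label (0 for an empty table).
    double getFrac(const std::string& label) const {
        return total > 0.0 ? get(label) / total : 0.0;
    }
};

// Fills three weighted events and returns the resulting table.
WeightedTable exampleTable() {
    WeightedTable table;
    table.fill("B0", 0.5);
    table.fill("B0", 1.0);
    table.fill("B0bar", 1.5);
    return table;
}
```

With these three events, `get("B0")` is the non-integer sum 0.5 + 1.0, and `getFrac("B0")` divides that sum by the total weight of 3.0.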

Defining ranges for plotting and fitting on categories

Define a named range as a comma-separated list of labels

In [18]:
tagCat.setRange("good", "Lepton,Kaon");

Or add state names one by one

In [19]:
tagCat.addToRange("soso", "NetTagger-1");
tagCat.addToRange("soso", "NetTagger-2");
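
Conceptually, a named range on a category is a named set of labels, and selecting by range keeps the events whose label belongs to that set. The sketch below models this in plain C++ without ROOT. The struct `LabelRanges`, its method `isInRange` and the helper `tutorialRanges` are hypothetical, and the choice that `setRange` replaces any earlier definition of the same range is an assumption rather than documented RooCategory behaviour.

```cpp
#include <map>
#include <set>
#include <sstream>
#include <string>

// Hypothetical model of named category ranges: range name -> set of labels.
struct LabelRanges {
    std::map<std::string, std::set<std::string>> ranges;

    // Define a range from a comma-separated list of labels,
    // replacing any earlier definition (assumption).
    void setRange(const std::string& rangeName, const std::string& labelList) {
        std::set<std::string>& labels = ranges[rangeName];
        labels.clear();
        std::istringstream stream(labelList);
        std::string label;
        while (std::getline(stream, label, ',')) {
            if (!label.empty()) labels.insert(label);
        }
    }

    // Add a single label to a range, creating the range if needed.
    void addToRange(const std::string& rangeName, const std::string& label) {
        ranges[rangeName].insert(label);
    }

    // True if the label belongs to the named range.
    bool isInRange(const std::string& rangeName, const std::string& label) const {
        const auto it = ranges.find(rangeName);
        return it != ranges.end() && it->second.count(label) > 0;
    }
};

// Builds the two ranges used in this tutorial.
LabelRanges tutorialRanges() {
    LabelRanges r;
    r.setRange("good", "Lepton,Kaon");
    r.addToRange("soso", "NetTagger-1");
    r.addToRange("soso", "NetTagger-2");
    return r;
}
```

In this model, an event tagged "Kaon" passes a selection on "good", while one tagged "NetTagger-1" passes only a selection on "soso".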

Use category range in dataset reduction specification

In [20]:
RooDataSet *goodData = (RooDataSet *)data->reduce(CutRange("good"));
goodData->table(tagCat)->Print("v");
input_line_72:2:39: error: reference to 'data' is ambiguous
 RooDataSet *goodData = (RooDataSet *)data->reduce(CutRange("good"));
                                      ^
input_line_62:3:13: note: candidate found by name lookup is '__cling_N530::data'
RooDataSet *data = RooPolynomial("p", "p", x).generate(RooArgSet(x, b0flav, tagCat), 10000);
            ^
/usr/include/c++/9/bits/range_access.h:318:5: note: candidate found by name lookup is 'std::data'
    data(initializer_list<_Tp> __il) noexcept
    ^
/usr/include/c++/9/bits/range_access.h:289:5: note: candidate found by name lookup is 'std::data'
    data(_Container& __cont) noexcept(noexcept(__cont.data()))
    ^
/usr/include/c++/9/bits/range_access.h:299:5: note: candidate found by name lookup is 'std::data'
    data(const _Container& __cont) noexcept(noexcept(__cont.data()))
    ^
/usr/include/c++/9/bits/range_access.h:309:5: note: candidate found by name lookup is 'std::data'
    data(_Tp (&__array)[_Nm]) noexcept
    ^