Validation and MC studies: RooMCStudy - using separate fit and generator models, using the chi^2 calculator module. Running a biased fit model against an optimal fit.

Author: Wouter Verkerke
This notebook tutorial was automatically generated with ROOTBOOK-izer from the macro found in the ROOT repository on Tuesday, December 06, 2022 at 12:08 PM.

In [1]:
%%cpp -d
#include "RooRealVar.h"
#include "RooDataSet.h"
#include "RooGaussian.h"
#include "RooChebychev.h"
#include "RooMCStudy.h"
#include "RooChi2MCSModule.h"
#include "RooPlot.h"
#include "TCanvas.h"
#include "TAxis.h"
#include "TH1.h"
#include "TDirectory.h"
#include "TLegend.h"

using namespace RooFit;


## Create model

Observables, parameters

In [2]:
RooRealVar x("x", "x", -10, 10);
x.setBins(10);
RooRealVar mean("mean", "mean of gaussian", 0, -2., 1.8);
RooRealVar sigma("sigma", "width of gaussian", 5, 1, 10);


Create Gaussian pdf

In [3]:
RooGaussian gauss("gauss", "gaussian PDF", x, mean, sigma);


## Create manager with chi^2 add-on module

Create study manager for binned likelihood fits of a Gaussian pdf in 10 bins

In [4]:
RooMCStudy *mcs = new RooMCStudy(gauss, x, Silence(), Binned());
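
Because `x.setBins(10)` splits [-10, 10] into 10 bins, a binned fit compares the event count in each bin with the model's prediction for that bin. Below is a minimal standalone sketch of that prediction in plain C++ (no ROOT). `gaussIntegral` and `expectedCounts` are hypothetical helper names, not RooFit functions, and the sketch assumes the Gaussian is normalised over the observable range.

```cpp
#include <cmath>
#include <vector>

// Integral of a unit-area Gaussian(mean, sigma) between a and b.
// Hypothetical helper, not part of RooFit.
double gaussIntegral(double a, double b, double mean, double sigma) {
    const double s = sigma * std::sqrt(2.0);
    return 0.5 * (std::erf((b - mean) / s) - std::erf((a - mean) / s));
}

// Expected number of events in each of nBins equal-width bins on [lo, hi],
// for nEvents drawn from a Gaussian normalised on [lo, hi] (assumption).
std::vector<double> expectedCounts(int nBins, double lo, double hi,
                                   double mean, double sigma, double nEvents) {
    const double norm = gaussIntegral(lo, hi, mean, sigma);
    const double width = (hi - lo) / nBins;
    std::vector<double> counts(nBins);
    for (int i = 0; i < nBins; ++i) {
        const double a = lo + i * width;
        counts[i] = nEvents * gaussIntegral(a, a + width, mean, sigma) / norm;
    }
    return counts;
}
```

For the setup used here, `expectedCounts(10, -10, 10, 0, 5, 1000)` would give the predicted content of the 10 bins; by construction the entries sum to the number of events.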


Add chi^2 calculator module to mcs

In [5]:
RooChi2MCSModule chi2mod;
mcs->addModule(chi2mod);


Generate 2000 samples of 1000 events

In [6]:
mcs->generateAndFit(2000, 1000);

[#0] PROGRESS:Generation -- RooMCStudy::run: sample 1980
[#0] PROGRESS:Generation -- RooMCStudy::run: sample 1960
[#0] PROGRESS:Generation -- RooMCStudy::run: sample 1940
...
[#0] PROGRESS:Generation -- RooMCStudy::run: sample 0


Number of bins for chi2 plots

In [7]:
int nBins = 100;


Fill histograms with the distributions of chi2 and prob(chi2,ndf) that are calculated by RooChi2MCSModule

In [8]:
TH1 *hist_chi2 = mcs->fitParDataSet().createHistogram("chi2", AutoBinning(nBins));
hist_chi2->SetTitle("#chi^{2} values of all toy runs;#chi^{2}");
TH1 *hist_prob = mcs->fitParDataSet().createHistogram("prob", AutoBinning(nBins));
hist_prob->SetTitle("Corresponding #chi^{2} probability;Prob(#chi^{2},ndof)");
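
To make the two histogrammed quantities concrete, here is a standalone sketch in plain C++ (no ROOT) of a chi^2 value for binned data and the matching probability Prob(chi^2, ndof), i.e. the chance of observing a chi^2 at least this large. The names `neymanChi2` and `chi2Prob` are hypothetical. The error definition (variance taken as the observed bin content) and the choice of ndof are assumptions and may differ from what RooChi2MCSModule does internally.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// chi^2 with the per-bin variance taken as the observed count (assumption).
// Empty bins are skipped to avoid dividing by zero.
double neymanChi2(const std::vector<double>& observed,
                  const std::vector<double>& expected) {
    double chi2 = 0.0;
    const std::size_t n = std::min(observed.size(), expected.size());
    for (std::size_t i = 0; i < n; ++i) {
        if (observed[i] <= 0.0) continue;
        const double d = observed[i] - expected[i];
        chi2 += d * d / observed[i];
    }
    return chi2;
}

// Upper-tail probability of the chi^2 distribution for integer ndof >= 1,
// built from the recurrence
//   Q(x; k+2) = Q(x; k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1),
// starting from Q(x; 1) = erfc(sqrt(x/2)) or Q(x; 2) = exp(-x/2).
double chi2Prob(double chi2, int ndof) {
    if (ndof <= 0) return 0.0;   // undefined; returned as 0 by convention here
    if (chi2 <= 0.0) return 1.0;
    const double h = 0.5 * chi2;
    int k = (ndof % 2 == 0) ? 2 : 1;
    double q = (k == 2) ? std::exp(-h) : std::erfc(std::sqrt(h));
    for (; k < ndof; k += 2) {
        q += std::exp(0.5 * k * std::log(h) - h - std::lgamma(0.5 * k + 1.0));
    }
    return q;
}
```

If the fit model describes the data, the probability values should be roughly uniform between 0 and 1; a mismatched model pushes them towards 0.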


## Create manager with separate fit model

Create alternate pdf with shifted mean

In [9]:
RooRealVar mean2("mean2", "mean of gaussian 2", 2.);
RooGaussian gauss2("gauss2", "gaussian PDF2", x, mean2, sigma);


Create a study manager with separate generation and fit models. This configuration is set up to produce biased fits: the generator model has its mean fixed at 2, whereas the mean parameter of the fit model is limited to [-2., 1.8], so it just misses the optimal mean value of 2 in the data.

In [10]:
RooMCStudy *mcs2 = new RooMCStudy(gauss2, x, FitModel(gauss), Silence(), Binned());


Add chi^2 calculator module to mcs2

In [11]:
RooChi2MCSModule chi2mod2;
mcs2->addModule(chi2mod2);


Generate 2000 samples of 1000 events

In [12]:
mcs2->generateAndFit(2000, 1000);

[#0] PROGRESS:Generation -- RooMCStudy::run: sample 1980
[#0] PROGRESS:Generation -- RooMCStudy::run: sample 1960
[#0] PROGRESS:Generation -- RooMCStudy::run: sample 1940
...
[#0] PROGRESS:Generation -- RooMCStudy::run: sample 0
[#0] WARNING:Generation -- The fit parameter 'mean' is not in the model that was used to generate toy data. The parameter 'mean2'=2 was found at the same position in the generator model. It will be used to compute pulls.
If this is not desired, the parameters of the generator model need to be renamed or reordered.


Request the pull plot of mean. The pulls will be one-sided because mean is limited to 1.8. Note that RooFit will have trouble computing the pulls because the parameter is called mean in the fit model but mean2 in the generator model, and it is not obvious that these are related. RooFit will nevertheless compute the pulls, but will warn that this is risky.

In [13]:
auto pullMeanFrame = mcs2->plotPull(mean);


Fill histograms with the distributions of chi2 and prob(chi2,ndf) that are calculated by RooChi2MCSModule

In [14]:
TH1 *hist2_chi2 = mcs2->fitParDataSet().createHistogram("chi2", AutoBinning(nBins));
TH1 *hist2_prob = mcs2->fitParDataSet().createHistogram("prob", AutoBinning(nBins));
hist2_chi2->SetLineColor(kRed);
hist2_prob->SetLineColor(kRed);

TLegend leg;
leg.AddEntry(hist_chi2, "Optimal fit", "L");
leg.AddEntry(hist2_chi2, "Biased fit", "L");
leg.SetBorderSize(0);
leg.SetFillStyle(0);

TCanvas *c = new TCanvas("rf802_mcstudy_addons", "rf802_mcstudy_addons", 800, 400);
c->Divide(3);
c->cd(1);
hist_chi2->GetYaxis()->SetTitleOffset(1.4);
hist_chi2->Draw();
hist2_chi2->Draw("esame");
leg.DrawClone();
c->cd(2);
hist_prob->GetYaxis()->SetTitleOffset(1.4);
hist_prob->Draw();
hist2_prob->Draw("esame");
c->cd(3);
pullMeanFrame->Draw();

Info in <THttpEngine::Create>: Starting HTTP server on port 9353


Make RooMCStudy object available on command line after macro finishes

In [15]:
gDirectory->Add(mcs);


Draw all canvases

In [16]:
%jsroot on
gROOT->GetListOfCanvases()->Draw()