Rf 6 0 2_Chi 2Fit

Likelihood and minimization: setting up a chi^2 fit to a binned dataset

Author: Wouter Verkerke
This notebook tutorial was automatically generated with ROOTBOOK-izer from the macro found in the ROOT repository on Monday, January 17, 2022 at 10:07 AM.

In [1]:
%%cpp -d
#include "RooRealVar.h"
#include "RooDataSet.h"
#include "RooGaussian.h"
#include "RooChebychev.h"
#include "RooAddPdf.h"
#include "RooChi2Var.h"
#include "TCanvas.h"
#include "TAxis.h"
#include "RooPlot.h"
In [2]:
%%cpp -d
// This is a workaround to make sure the namespace is used inside functions
using namespace RooFit;

Setup model

Declare observable x

In [3]:
RooRealVar x("x", "x", 0, 10);
RooFit v3.60 -- Developed by Wouter Verkerke and David Kirkby 
                Copyright (C) 2000-2013 NIKHEF, University of California & Stanford University
                All rights reserved, please read http://roofit.sourceforge.net/license.txt

Create two gaussian pdfs g1(x,mean1,sigma) anf g2(x,mean2,sigma) and their parameters

In [4]:
RooRealVar mean("mean", "mean of gaussians", 5);
RooRealVar sigma1("sigma1", "width of gaussians", 0.5);
RooRealVar sigma2("sigma2", "width of gaussians", 1);

RooGaussian sig1("sig1", "Signal component 1", x, mean, sigma1);
RooGaussian sig2("sig2", "Signal component 2", x, mean, sigma2);
[#0] WARNING:InputArguments -- The parameter 'sigma1' with range [-1e+30, 1e+30] of the RooGaussian 'sig1' exceeds the safe range of (0, inf). Advise to limit its range.
[#0] WARNING:InputArguments -- The parameter 'sigma2' with range [-1e+30, 1e+30] of the RooGaussian 'sig2' exceeds the safe range of (0, inf). Advise to limit its range.

Build chebychev polynomial pdf

In [5]:
RooRealVar a0("a0", "a0", 0.5, 0., 1.);
RooRealVar a1("a1", "a1", 0.2, 0., 1.);
RooChebychev bkg("bkg", "Background", x, RooArgSet(a0, a1));

Sum the signal components into a composite signal pdf

In [6]:
RooRealVar sig1frac("sig1frac", "fraction of component 1 in signal", 0.8, 0., 1.);
RooAddPdf sig("sig", "Signal", RooArgList(sig1, sig2), sig1frac);

Sum the composite signal and background

In [7]:
RooRealVar bkgfrac("bkgfrac", "fraction of background", 0.5, 0., 1.);
RooAddPdf model("model", "g1+g2+a", RooArgList(bkg, sig), bkgfrac);

Create binned dataset

In [8]:
RooDataSet *d = model.generate(x, 10000);
RooDataHist *dh = d->binnedClone();
[#0] WARNING:Eval -- Evaluating RooAddPdf without a defined normalization set. This can lead to ambiguos coefficients definition and incorrect results. Use RooAddPdf::fixCoefNormalization(nset) to provide a normalization set for defining uniquely RooAddPdf coefficients!

Construct a chi^2 of the data and the model. When a pdf is used in a chi^2 fit, the probability density scaled by the number of events in the dataset to obtain the fit function If model is an extended pdf, the expected number events is used instead of the observed number of events.

In [9]:
model.chi2FitTo(*dh);
[#1] INFO:Minimization -- RooAbsMinimizerFcn::setOptimizeConst: activating const optimization
[#1] INFO:Minimization --  The following expressions have been identified as constant and will be precalculated and cached: (sig1,sig2)
 **********
 **    1 **SET PRINT           1
 **********
 **********
 **    2 **SET NOGRAD
 **********
 PARAMETER DEFINITIONS:
    NO.   NAME         VALUE      STEP SIZE      LIMITS
     1 a0           5.00000e-01  1.00000e-01    0.00000e+00  1.00000e+00
     2 a1           2.00000e-01  1.00000e-01    0.00000e+00  1.00000e+00
     3 bkgfrac      5.00000e-01  1.00000e-01    0.00000e+00  1.00000e+00
     4 sig1frac     8.00000e-01  1.00000e-01    0.00000e+00  1.00000e+00
 **********
 **    3 **SET ERR           1
 **********
 **********
 **    4 **SET PRINT           1
 **********
 **********
 **    5 **SET STR           1
 **********
 NOW USING STRATEGY  1: TRY TO BALANCE SPEED AGAINST RELIABILITY
 **********
 **    6 **MIGRAD        2000           1
 **********
 FIRST CALL TO USER FUNCTION AT NEW START POINT, WITH IFLAG=4.
 START MIGRAD MINIMIZATION.  STRATEGY  1.  CONVERGENCE WHEN EDM .LT. 1.00e-03
 FCN=106.175 FROM MIGRAD    STATUS=INITIATE       16 CALLS          17 TOTAL
                     EDM= unknown      STRATEGY= 1      NO ERROR MATRIX       
  EXT PARAMETER               CURRENT GUESS       STEP         FIRST   
  NO.   NAME      VALUE            ERROR          SIZE      DERIVATIVE 
   1  a0           5.00000e-01   1.00000e-01   2.01358e-01  -9.22919e+00
   2  a1           2.00000e-01   1.00000e-01   2.57889e-01   3.56932e+01
   3  bkgfrac      5.00000e-01   1.00000e-01   2.01358e-01   2.47046e+01
   4  sig1frac     8.00000e-01   1.00000e-01   2.57889e-01   9.71383e+00
 MIGRAD MINIMIZATION HAS CONVERGED.
 MIGRAD WILL VERIFY CONVERGENCE AND ERROR MATRIX.
 COVARIANCE MATRIX CALCULATED SUCCESSFULLY
 FCN=104.639 FROM MIGRAD    STATUS=CONVERGED      78 CALLS          79 TOTAL
                     EDM=5.30614e-06    STRATEGY= 1      ERROR MATRIX ACCURATE 
  EXT PARAMETER                                   STEP         FIRST   
  NO.   NAME      VALUE            ERROR          SIZE      DERIVATIVE 
   1  a0           5.00975e-01   2.29160e-02   2.25835e-04  -3.66627e-02
   2  a1           1.58055e-01   3.68598e-02   3.29973e-04   9.16128e-03
   3  bkgfrac      5.06726e-01   1.13595e-02   6.50594e-05   4.90783e-02
   4  sig1frac     8.16064e-01   3.74255e-02   3.05383e-04  -3.32372e-02
 EXTERNAL ERROR MATRIX.    NDIM=  25    NPAR=  4    ERR DEF=1
  5.255e-04  1.573e-04 -3.988e-05 -1.315e-04 
  1.573e-04  1.363e-03 -3.100e-04 -9.244e-04 
 -3.988e-05 -3.100e-04  1.291e-04  3.237e-04 
 -1.315e-04 -9.244e-04  3.237e-04  1.405e-03 
 PARAMETER  CORRELATION COEFFICIENTS  
       NO.  GLOBAL      1      2      3      4
        1  0.18992   1.000  0.186 -0.153 -0.153
        2  0.75966   0.186  1.000 -0.739 -0.668
        3  0.82130  -0.153 -0.739  1.000  0.760
        4  0.77660  -0.153 -0.668  0.760  1.000
 **********
 **    7 **SET ERR           1
 **********
 **********
 **    8 **SET PRINT           1
 **********
 **********
 **    9 **HESSE        2000
 **********
 COVARIANCE MATRIX CALCULATED SUCCESSFULLY
 FCN=104.639 FROM HESSE     STATUS=OK             23 CALLS         102 TOTAL
                     EDM=5.30457e-06    STRATEGY= 1      ERROR MATRIX ACCURATE 
  EXT PARAMETER                                INTERNAL      INTERNAL  
  NO.   NAME      VALUE            ERROR       STEP SIZE       VALUE   
   1  a0           5.00975e-01   2.29157e-02   4.51671e-05   1.95002e-03
   2  a1           1.58055e-01   3.68515e-02   6.59945e-05  -7.53081e-01
   3  bkgfrac      5.06726e-01   1.13574e-02   1.30119e-05   1.34521e-02
   4  sig1frac     8.16064e-01   3.74193e-02   6.10765e-05   6.84296e-01
 EXTERNAL ERROR MATRIX.    NDIM=  25    NPAR=  4    ERR DEF=1
  5.255e-04  1.572e-04 -3.986e-05 -1.314e-04 
  1.572e-04  1.363e-03 -3.098e-04 -9.239e-04 
 -3.986e-05 -3.098e-04  1.290e-04  3.236e-04 
 -1.314e-04 -9.239e-04  3.236e-04  1.405e-03 
 PARAMETER  CORRELATION COEFFICIENTS  
       NO.  GLOBAL      1      2      3      4
        1  0.18986   1.000  0.186 -0.153 -0.153
        2  0.75953   0.186  1.000 -0.739 -0.668
        3  0.82122  -0.153 -0.739  1.000  0.760
        4  0.77651  -0.153 -0.668  0.760  1.000

Nb: it is also possible to fit a rooabsreal function to a roodatahist using chi2FitTo().

Note that entries with zero bins are not allowed for a proper chi^2 calculation and will give error messages

In [10]:
RooDataSet *dsmall = (RooDataSet *)d->reduce(EventRange(1, 100));
RooDataHist *dhsmall = dsmall->binnedClone();
RooChi2Var chi2_lowstat("chi2_lowstat", "chi2", model, *dhsmall);
cout << chi2_lowstat.getVal() << endl;
29.4863