rf202_extendedmlfit

Addition and convolution: setting up an extended maximum likelihood fit

Author: Clemens Lange, Wouter Verkerke (C++ version)
This notebook tutorial was automatically generated with ROOTBOOK-izer from the macro found in the ROOT repository on Sunday, November 27, 2022 at 11:06 AM.

In [1]:
import ROOT
Welcome to JupyROOT 6.27/01

Set up component pdfs

Declare observable x

In [2]:
x = ROOT.RooRealVar("x", "x", 0, 10)

Create two Gaussian PDFs g1(x,mean1,sigma) anf g2(x,mean2,sigma) and their parameters

In [3]:
mean = ROOT.RooRealVar("mean", "mean of gaussians", 5)
sigma1 = ROOT.RooRealVar("sigma1", "width of gaussians", 0.5)
sigma2 = ROOT.RooRealVar("sigma2", "width of gaussians", 1)

sig1 = ROOT.RooGaussian("sig1", "Signal component 1", x, mean, sigma1)
sig2 = ROOT.RooGaussian("sig2", "Signal component 2", x, mean, sigma2)
[#0] WARNING:InputArguments -- The parameter 'sigma1' with range [-1e+30, 1e+30] of the RooGaussian 'sig1' exceeds the safe range of (0, inf). Advise to limit its range.
[#0] WARNING:InputArguments -- The parameter 'sigma2' with range [-1e+30, 1e+30] of the RooGaussian 'sig2' exceeds the safe range of (0, inf). Advise to limit its range.

Build Chebychev polynomial pdf

In [4]:
a0 = ROOT.RooRealVar("a0", "a0", 0.5, 0.0, 1.0)
a1 = ROOT.RooRealVar("a1", "a1", -0.2, 0.0, 1.0)
bkg = ROOT.RooChebychev("bkg", "Background", x, [a0, a1])

Sum the signal components into a composite signal pdf

In [5]:
sig1frac = ROOT.RooRealVar("sig1frac", "fraction of component 1 in signal", 0.8, 0.0, 1.0)
sig = ROOT.RooAddPdf("sig", "Signal", [sig1, sig2], [sig1frac])

Method 1 - Construct extended composite model

Sum the composite signal and background into an extended pdf nsigsig+nbkgbkg

In [6]:
nsig = ROOT.RooRealVar("nsig", "number of signal events", 500, 0.0, 10000)
nbkg = ROOT.RooRealVar("nbkg", "number of background events", 500, 0, 10000)
model = ROOT.RooAddPdf("model", "(g1+g2)+a", [bkg, sig], [nbkg, nsig])

Sample, fit and plot extended model

Generate a data sample of expected number events in x from model = model.expectedEvents() = nsig+nbkg

In [7]:
data = model.generate({x})

Fit model to data, ML term automatically included

In [8]:
model.fitTo(data)
Out[8]:
<cppyy.gbl.RooFitResult object at 0x(nil)>
[#1] INFO:Minimization -- p.d.f. provides expected number of events, including extended term in likelihood.
[#1] INFO:Minimization -- RooAbsMinimizerFcn::setOptimizeConst: activating const optimization
[#1] INFO:Minimization --  The following expressions have been identified as constant and will be precalculated and cached: (sig1,sig2)
[#1] INFO:Minimization --  The following expressions will be evaluated in cache-and-track mode: (bkg)
 **********
 **    1 **SET PRINT           1
 **********
 **********
 **    2 **SET NOGRAD
 **********
 PARAMETER DEFINITIONS:
    NO.   NAME         VALUE      STEP SIZE      LIMITS
     1 a0           5.00000e-01  1.00000e-01    0.00000e+00  1.00000e+00
     2 a1           0.00000e+00  1.00000e-01    0.00000e+00  1.00000e+00
 MINUIT WARNING IN PARAM DEF
 ============== STARTING VALUE IS AT LIMIT.
 MINUIT WARNING IN PARAMETR
 ============== VARIABLE2 IS AT ITS LOWER ALLOWED LIMIT.
 MINUIT WARNING IN PARAMETR
 ============== VARIABLE2 BROUGHT BACK INSIDE LIMITS.
     3 nbkg         5.00000e+02  2.50000e+02    0.00000e+00  1.00000e+04
     4 nsig         5.00000e+02  2.50000e+02    0.00000e+00  1.00000e+04
     5 sig1frac     8.00000e-01  1.00000e-01    0.00000e+00  1.00000e+00
 **********
 **    3 **SET ERR         0.5
 **********
 **********
 **    4 **SET PRINT           1
 **********
 **********
 **    5 **SET STR           1
 **********
 NOW USING STRATEGY  1: TRY TO BALANCE SPEED AGAINST RELIABILITY
 **********
 **    6 **MIGRAD        2500           1
 **********
 FIRST CALL TO USER FUNCTION AT NEW START POINT, WITH IFLAG=4.
 MINUIT WARNING IN MIGrad    
 ============== VARIABLE2 IS AT ITS LOWER ALLOWED LIMIT.
 START MIGRAD MINIMIZATION.  STRATEGY  1.  CONVERGENCE WHEN EDM .LT. 1.00e-03
 FCN=-3981.56 FROM MIGRAD    STATUS=INITIATE       33 CALLS          34 TOTAL
                     EDM= unknown      STRATEGY= 1      NO ERROR MATRIX       
  EXT PARAMETER               CURRENT GUESS       STEP         FIRST   
  NO.   NAME      VALUE            ERROR          SIZE      DERIVATIVE 
   1  a0           5.00000e-01   1.00000e-01   0.00000e+00   2.42158e+00
   2  a1           6.91266e-02   1.00000e-01   5.30141e-01  -2.47844e-01
   3  nbkg         5.00000e+02   2.50000e+02   0.00000e+00   5.64537e+01
   4  nsig         5.00000e+02   2.50000e+02   0.00000e+00  -5.64546e+01
   5  sig1frac     8.00000e-01   1.00000e-01   0.00000e+00   2.68987e-01
                               ERR DEF= 0.5
 MIGRAD MINIMIZATION HAS CONVERGED.
 MIGRAD WILL VERIFY CONVERGENCE AND ERROR MATRIX.
 COVARIANCE MATRIX CALCULATED SUCCESSFULLY
 FCN=-3983.59 FROM MIGRAD    STATUS=CONVERGED     132 CALLS         133 TOTAL
                     EDM=8.19227e-06    STRATEGY= 1      ERROR MATRIX ACCURATE 
  EXT PARAMETER                                   STEP         FIRST   
  NO.   NAME      VALUE            ERROR          SIZE      DERIVATIVE 
   1  a0           5.07582e-01   7.96182e-02   6.78893e-03  -3.57814e-03
   2  a1           2.66323e-01   1.34788e-01   8.23878e-03   4.57147e-04
   3  nbkg         4.27828e+02   3.83165e+01   5.09588e-04  -1.07634e-01
   4  nsig         5.72096e+02   4.01499e+01   4.94609e-04  -2.13397e-01
   5  sig1frac     6.41354e-01   9.73632e-02   6.07877e-03   6.56989e-04
                               ERR DEF= 0.5
 EXTERNAL ERROR MATRIX.    NDIM=  25    NPAR=  5    ERR DEF=0.5
  6.393e-03  2.444e-03 -5.472e-01  5.471e-01 -1.360e-03 
  2.444e-03  1.876e-02 -3.816e+00  3.815e+00 -8.807e-03 
 -5.472e-01 -3.816e+00  1.468e+03 -1.040e+03  2.504e+00 
  5.471e-01  3.815e+00 -1.040e+03  1.612e+03 -2.504e+00 
 -1.360e-03 -8.807e-03  2.504e+00 -2.504e+00  9.613e-03 
 PARAMETER  CORRELATION COEFFICIENTS  
       NO.  GLOBAL      1      2      3      4      5
        1  0.22641   1.000  0.223 -0.179  0.170 -0.173
        2  0.79364   0.223  1.000 -0.727  0.694 -0.656
        3  0.78664  -0.179 -0.727  1.000 -0.676  0.667
        4  0.75448   0.170  0.694 -0.676  1.000 -0.636
        5  0.73098  -0.173 -0.656  0.667 -0.636  1.000
 **********
 **    7 **SET ERR         0.5
 **********
 **********
 **    8 **SET PRINT           1
 **********
 **********
 **    9 **HESSE        2500
 **********
 COVARIANCE MATRIX CALCULATED SUCCESSFULLY
 FCN=-3983.59 FROM HESSE     STATUS=OK             31 CALLS         164 TOTAL
                     EDM=8.15795e-06    STRATEGY= 1      ERROR MATRIX ACCURATE 
  EXT PARAMETER                                INTERNAL      INTERNAL  
  NO.   NAME      VALUE            ERROR       STEP SIZE       VALUE   
   1  a0           5.07582e-01   7.95756e-02   1.35779e-03   1.51648e-02
   2  a1           2.66323e-01   1.33724e-01   3.29551e-04  -4.86296e-01
   3  nbkg         4.27828e+02   3.80516e+01   1.01918e-04  -1.15411e+00
   4  nsig         5.72096e+02   3.99008e+01   9.89219e-05  -1.08774e+00
   5  sig1frac     6.41354e-01   9.67277e-02   2.43151e-04   2.86616e-01
                               ERR DEF= 0.5
 EXTERNAL ERROR MATRIX.    NDIM=  25    NPAR=  5    ERR DEF=0.5
  6.386e-03  2.398e-03 -5.351e-01  5.351e-01 -1.328e-03 
  2.398e-03  1.846e-02 -3.736e+00  3.736e+00 -8.604e-03 
 -5.351e-01 -3.736e+00  1.448e+03 -1.020e+03  2.453e+00 
  5.351e-01  3.736e+00 -1.020e+03  1.592e+03 -2.453e+00 
 -1.328e-03 -8.604e-03  2.453e+00 -2.453e+00  9.486e-03 
 PARAMETER  CORRELATION COEFFICIENTS  
       NO.  GLOBAL      1      2      3      4      5
        1  0.22416   1.000  0.221 -0.176  0.168 -0.171
        2  0.78979   0.221  1.000 -0.723  0.689 -0.650
        3  0.78325  -0.176 -0.723  1.000 -0.672  0.662
        4  0.75090   0.168  0.689 -0.672  1.000 -0.631
        5  0.72671  -0.171 -0.650  0.662 -0.631  1.000
[#1] INFO:Minimization -- RooAbsMinimizerFcn::setOptimizeConst: deactivating const optimization

Plot data and PDF overlaid, expected number of events for pdf projection normalization rather than observed number of events (==data.numEntries())

In [9]:
xframe = x.frame(Title="extended ML fit example")
data.plotOn(xframe)
model.plotOn(xframe, Normalization=dict(scaleFactor=1.0, scaleType=ROOT.RooAbsReal.RelativeExpected))
Out[9]:
<cppyy.gbl.RooPlot object at 0x9c6dcb0>

Overlay the background component of model with a dashed line

In [10]:
model.plotOn(
    xframe,
    Components={bkg},
    LineStyle=":",
    Normalization=dict(scaleFactor=1.0, scaleType=ROOT.RooAbsReal.RelativeExpected),
)
Out[10]:
<cppyy.gbl.RooPlot object at 0x9c6dcb0>
[#1] INFO:Plotting -- RooAbsPdf::plotOn(model) directly selected PDF components: (bkg)
[#1] INFO:Plotting -- RooAbsPdf::plotOn(model) indirectly selected PDF components: ()

Overlay the background+sig2 components of model with a dotted line

In [11]:
ras_bkg_sig2 = {bkg, sig2}
model.plotOn(
    xframe,
    Components=ras_bkg_sig2,
    LineStyle=":",
    Normalization=dict(scaleFactor=1.0, scaleType=ROOT.RooAbsReal.RelativeExpected),
)
Out[11]:
<cppyy.gbl.RooPlot object at 0x9c6dcb0>
[#1] INFO:Plotting -- RooAbsPdf::plotOn(model) directly selected PDF components: (bkg,sig2)
[#1] INFO:Plotting -- RooAbsPdf::plotOn(model) indirectly selected PDF components: (sig)

Print structure of composite pdf

In [12]:
model.Print("t")
0x96cbfa0 RooAddPdf::model = 0.886051/1 [Auto,Clean] 
  0x94d7670/V- RooChebychev::bkg = 0.733677 [Auto,Dirty] 
    0x91ba190/V- RooRealVar::x = 5
    0x27e77f0/V- RooRealVar::a0 = 0.507582 +/- 0.0795756
    0x274b330/V- RooRealVar::a1 = 0.266323 +/- 0.133724
  0x9643730/V- RooRealVar::nbkg = 427.828 +/- 38.0516
  0x96b2910/V- RooAddPdf::sig = 1/1 [Auto,Clean] 
    0x9317f40/V- RooGaussian::sig1 = 1 [Auto,Dirty] 
      0x91ba190/V- RooRealVar::x = 5
      0x90e16d0/V- RooRealVar::mean = 5
      0x8f38500/V- RooRealVar::sigma1 = 0.5
    0x228abd0/V- RooRealVar::sig1frac = 0.641354 +/- 0.0967277
    0x92f9130/V- RooGaussian::sig2 = 1 [Auto,Dirty] 
      0x91ba190/V- RooRealVar::x = 5
      0x90e16d0/V- RooRealVar::mean = 5
      0x8ce1980/V- RooRealVar::sigma2 = 1
  0x9625d90/V- RooRealVar::nsig = 572.096 +/- 39.9008

Method 2 - Construct extended components first

Associated nsig/nbkg as expected number of events with sig/bkg

In [13]:
esig = ROOT.RooExtendPdf("esig", "extended signal pdf", sig, nsig)
ebkg = ROOT.RooExtendPdf("ebkg", "extended background pdf", bkg, nbkg)

Sum extended components without coefs

Construct sum of two extended pdf (no coefficients required)

In [14]:
model2 = ROOT.RooAddPdf("model2", "(g1+g2)+a", [ebkg, esig])

Draw the frame on the canvas

In [15]:
c = ROOT.TCanvas("rf202_extendedmlfit", "rf202_extendedmlfit", 600, 600)
ROOT.gPad.SetLeftMargin(0.15)
xframe.GetYaxis().SetTitleOffset(1.4)
xframe.Draw()

c.SaveAs("rf202_extendedmlfit.png")
Info in <TCanvas::Print>: png file rf202_extendedmlfit.png has been created

Draw all canvases

In [16]:
from ROOT import gROOT 
gROOT.GetListOfCanvases().Draw()