rs302_JeffreysPriorDemo

tutorial demonstrating and validates the RooJeffreysPrior class

Jeffreys's prior is an 'objective prior' based on formal rules. It is calculated from the Fisher information matrix.

Read more: http://en.wikipedia.org/wiki/Jeffreys_prior

The analytic form is not known for most PDFs, but it is for simple cases like the Poisson mean, Gaussian mean, Gaussian sigma.

This class uses numerical tricks to calculate the Fisher Information Matrix efficiently. In particular, it takes advantage of a property of the 'Asimov data' as described in Asymptotic formulae for likelihood-based tests of new physics Glen Cowan, Kyle Cranmer, Eilam Gross, Ofer Vitells http://arxiv.org/abs/arXiv:1007.1727

This Demo has four parts:

  1. TestJeffreysPriorDemo -- validates Poisson mean case 1/sqrt(mu)
  2. TestJeffreysGaussMean -- validates Gaussian mean case
  3. TestJeffreysGaussSigma -- validates Gaussian sigma case 1/sigma
  4. TestJeffreysGaussMeanAndSigma -- demonstrates 2-d example

Author: Kyle Cranmer
This notebook tutorial was automatically generated with ROOTBOOK-izer from the macro found in the ROOT repository on Sunday, August 14, 2022 at 09:40 AM.

In [1]:
%%cpp -d
#include "RooJeffreysPrior.h"

#include "RooWorkspace.h"
#include "RooDataHist.h"
#include "RooGenericPdf.h"
#include "RooPlot.h"
#include "RooFitResult.h"
#include "RooRealVar.h"
#include "RooAbsPdf.h"
#include "RooNumIntConfig.h"

#include "TH1F.h"
#include "TCanvas.h"
#include "TLegend.h"
#include "TMatrixDSym.h"

using namespace RooFit;

In [2]:
%%cpp -d
void TestJeffreysGaussMean()
{
   RooWorkspace w("w");
   w.factory("Gaussian::g(x[0,-20,20],mu[0,-5.,5],sigma[1,0,10])");
   w.factory("n[10,.1,200]");
   w.factory("ExtendPdf::p(g,n)");
   w.var("sigma")->setConstant();
   w.var("n")->setConstant();

   RooDataHist *asimov = w.pdf("p")->generateBinned(*w.var("x"), ExpectedData());

   RooFitResult *res = w.pdf("p")->fitTo(*asimov, Save(), SumW2Error(kTRUE));

   asimov->Print();
   res->Print();
   TMatrixDSym cov = res->covarianceMatrix();
   cout << "variance = " << (cov.Determinant()) << endl;
   cout << "stdev = " << sqrt(cov.Determinant()) << endl;
   cov.Invert();
   cout << "jeffreys = " << sqrt(cov.Determinant()) << endl;

   w.defineSet("poi", "mu");
   w.defineSet("obs", "x");

   RooJeffreysPrior pi("jeffreys", "jeffreys", *w.pdf("p"), *w.set("poi"), *w.set("obs"));

   const RooArgSet *temp = w.set("poi");
   pi.getParameters(*temp)->Print();

   //  return;
   RooGenericPdf *test = new RooGenericPdf("constant", "Expected = constant", "1", *w.set("poi"));

   TCanvas *c1 = new TCanvas;
   RooPlot *plot = w.var("mu")->frame();
   pi.plotOn(plot);
   test->plotOn(plot, LineColor(kRed), LineStyle(kDashDotted));
   plot->Draw();

   auto legend = plot->BuildLegend();
   legend->DrawClone();
}

In [3]:
%%cpp -d
void TestJeffreysGaussSigma()
{
   // this one is VERY sensitive
   // if the Gaussian is narrow ~ range(x)/nbins(x) then the peak isn't resolved
   //   and you get really bizarre shapes
   // if the Gaussian is too wide range(x) ~ sigma then PDF gets renormalized
   //   and the PDF falls off too fast at high sigma
   RooWorkspace w("w");
   w.factory("Gaussian::g(x[0,-20,20],mu[0,-5,5],sigma[1,1,5])");
   w.factory("n[100,.1,2000]");
   w.factory("ExtendPdf::p(g,n)");
   //  w.var("sigma")->setConstant();
   w.var("mu")->setConstant();
   w.var("n")->setConstant();
   w.var("x")->setBins(301);

   RooDataHist *asimov = w.pdf("p")->generateBinned(*w.var("x"), ExpectedData());

   RooFitResult *res = w.pdf("p")->fitTo(*asimov, Save(), SumW2Error(kTRUE));

   asimov->Print();
   res->Print();
   TMatrixDSym cov = res->covarianceMatrix();
   cout << "variance = " << (cov.Determinant()) << endl;
   cout << "stdev = " << sqrt(cov.Determinant()) << endl;
   cov.Invert();
   cout << "jeffreys = " << sqrt(cov.Determinant()) << endl;

   w.defineSet("poi", "sigma");
   w.defineSet("obs", "x");

   RooJeffreysPrior pi("jeffreys", "jeffreys", *w.pdf("p"), *w.set("poi"), *w.set("obs"));

   const RooArgSet *temp = w.set("poi");
   pi.getParameters(*temp)->Print();

   RooGenericPdf *test = new RooGenericPdf("test", "Expected = #sqrt{2}/#sigma", "sqrt(2.)/sigma", *w.set("poi"));

   TCanvas *c1 = new TCanvas;
   RooPlot *plot = w.var("sigma")->frame();
   pi.plotOn(plot);
   test->plotOn(plot, LineColor(kRed), LineStyle(kDashDotted));
   plot->Draw();

   auto legend = plot->BuildLegend();
   legend->DrawClone();
}

In [4]:
%%cpp -d
void TestJeffreysGaussMeanAndSigma()
{
   // this one is VERY sensitive
   // if the Gaussian is narrow ~ range(x)/nbins(x) then the peak isn't resolved
   //   and you get really bizarre shapes
   // if the Gaussian is too wide range(x) ~ sigma then PDF gets renormalized
   //   and the PDF falls off too fast at high sigma
   RooWorkspace w("w");
   w.factory("Gaussian::g(x[0,-20,20],mu[0,-5,5],sigma[1,1.,5.])");
   w.factory("n[100,.1,2000]");
   w.factory("ExtendPdf::p(g,n)");

   w.var("n")->setConstant();
   w.var("x")->setBins(301);

   RooDataHist *asimov = w.pdf("p")->generateBinned(*w.var("x"), ExpectedData());

   RooFitResult *res = w.pdf("p")->fitTo(*asimov, Save(), SumW2Error(kTRUE));

   asimov->Print();
   res->Print();
   TMatrixDSym cov = res->covarianceMatrix();
   cout << "variance = " << (cov.Determinant()) << endl;
   cout << "stdev = " << sqrt(cov.Determinant()) << endl;
   cov.Invert();
   cout << "jeffreys = " << sqrt(cov.Determinant()) << endl;

   w.defineSet("poi", "mu,sigma");
   w.defineSet("obs", "x");

   RooJeffreysPrior pi("jeffreys", "jeffreys", *w.pdf("p"), *w.set("poi"), *w.set("obs"));

   const RooArgSet *temp = w.set("poi");
   pi.getParameters(*temp)->Print();
   //  return;

   TCanvas *c1 = new TCanvas;
   TH1 *Jeff2d = pi.createHistogram("2dJeffreys", *w.var("mu"), Binning(10, -5., 5), YVar(*w.var("sigma"), Binning(10, 1., 5.)));
   Jeff2d->Draw("surf");
}
In [5]:
RooWorkspace w("w");
w.factory("Uniform::u(x[0,1])");
w.factory("mu[100,1,200]");
w.factory("ExtendPdf::p(u,mu)");

RooDataHist *asimov = w.pdf("p")->generateBinned(*w.var("x"), ExpectedData());

RooFitResult *res = w.pdf("p")->fitTo(*asimov, Save(), SumW2Error(kTRUE));

asimov->Print();
res->Print();
TMatrixDSym cov = res->covarianceMatrix();
cout << "variance = " << (cov.Determinant()) << endl;
cout << "stdev = " << sqrt(cov.Determinant()) << endl;
cov.Invert();
cout << "jeffreys = " << sqrt(cov.Determinant()) << endl;

w.defineSet("poi", "mu");
w.defineSet("obs", "x");

RooJeffreysPrior pi("jeffreys", "jeffreys", *w.pdf("p"), *w.set("poi"), *w.set("obs"));

RooGenericPdf *test = new RooGenericPdf("Expected", "Expected = 1/#sqrt{#mu}", "1./sqrt(mu)",
   *w.set("poi"));

TCanvas *c1 = new TCanvas;
RooPlot *plot = w.var("mu")->frame();
pi.plotOn(plot);
test->plotOn(plot, LineColor(kRed), LineStyle(kDashDotted));
plot->Draw();

auto legend = plot->BuildLegend();
legend->DrawClone();
[#1] INFO:Minimization -- p.d.f. provides expected number of events, including extended term in likelihood.
[#1] INFO:Minimization -- createConstraintTerm: caching constraint set under name CACHE_CONSTR_OF_PDF_p_FOR_OBS_x with 0 entries
[#1] INFO:Minimization -- RooAbsMinimizerFcn::setOptimizeConst: activating const optimization
[#1] INFO:Minimization --  The following expressions have been identified as constant and will be precalculated and cached: (u)
 **********
 **    1 **SET PRINT           1
 **********
 **********
 **    2 **SET NOGRAD
 **********
 PARAMETER DEFINITIONS:
    NO.   NAME         VALUE      STEP SIZE      LIMITS
     1 mu           1.00000e+02  1.99000e+01    1.00000e+00  2.00000e+02
 **********
 **    3 **SET ERR         0.5
 **********
 **********
 **    4 **SET PRINT           1
 **********
 **********
 **    5 **SET STR           1
 **********
 NOW USING STRATEGY  1: TRY TO BALANCE SPEED AGAINST RELIABILITY
 **********
 **    6 **MIGRAD         500           1
 **********
 FIRST CALL TO USER FUNCTION AT NEW START POINT, WITH IFLAG=4.
 START MIGRAD MINIMIZATION.  STRATEGY  1.  CONVERGENCE WHEN EDM .LT. 1.00e-03
 FCN=-360.517 FROM MIGRAD    STATUS=INITIATE        4 CALLS           5 TOTAL
                     EDM= unknown      STRATEGY= 1      NO ERROR MATRIX       
  EXT PARAMETER               CURRENT GUESS       STEP         FIRST   
  NO.   NAME      VALUE            ERROR          SIZE      DERIVATIVE 
   1  mu           1.00000e+02   1.99000e+01   2.01361e-01  -5.66619e-05
                               ERR DEF= 0.5
 MIGRAD MINIMIZATION HAS CONVERGED.
 MIGRAD WILL VERIFY CONVERGENCE AND ERROR MATRIX.
 COVARIANCE MATRIX CALCULATED SUCCESSFULLY
 FCN=-360.517 FROM MIGRAD    STATUS=CONVERGED      24 CALLS          25 TOTAL
                     EDM=4.72209e-14    STRATEGY= 1      ERROR MATRIX ACCURATE 
  EXT PARAMETER                                   STEP         FIRST   
  NO.   NAME      VALUE            ERROR          SIZE      DERIVATIVE 
   1  mu           1.00000e+02   9.98317e+00   1.31866e-03  -2.16215e-06
                               ERR DEF= 0.5
 EXTERNAL ERROR MATRIX.    NDIM=  25    NPAR=  1    ERR DEF=0.5
  1.000e+02 
 **********
 **    7 **SET ERR         0.5
 **********
 **********
 **    8 **SET PRINT           1
 **********
 **********
 **    9 **HESSE         500
 **********
 COVARIANCE MATRIX CALCULATED SUCCESSFULLY
 FCN=-360.517 FROM HESSE     STATUS=OK              7 CALLS          32 TOTAL
                     EDM=9.50228e-17    STRATEGY= 1      ERROR MATRIX ACCURATE 
  EXT PARAMETER                                INTERNAL      INTERNAL  
  NO.   NAME      VALUE            ERROR       STEP SIZE       VALUE   
   1  mu           1.00000e+02   9.98317e+00   2.63731e-04  -5.02515e-03
                               ERR DEF= 0.5
 EXTERNAL ERROR MATRIX.    NDIM=  25    NPAR=  1    ERR DEF=0.5
  1.000e+02 
[#1] INFO:Fitting -- RooAbsPdf::fitTo(p) Calculating sum-of-weights-squared correction matrix for covariance matrix
 **********
 **   10 **SET ERR         0.5
 **********
 **********
 **   11 **SET PRINT           1
 **********
 **********
 **   12 **HESSE         500
 **********
 COVARIANCE MATRIX CALCULATED SUCCESSFULLY
 FCN=-360.517 FROM HESSE     STATUS=OK              5 CALLS          37 TOTAL
                     EDM=8.98159e-17    STRATEGY= 1      ERROR MATRIX ACCURATE 
  EXT PARAMETER                                INTERNAL      INTERNAL  
  NO.   NAME      VALUE            ERROR       STEP SIZE       VALUE   
   1  mu           1.00000e+02   9.98311e+00   1.05492e-05  -5.02515e-03
                               ERR DEF= 0.5
 EXTERNAL ERROR MATRIX.    NDIM=  25    NPAR=  1    ERR DEF=0.5
  1.000e+02 
[#1] INFO:Minimization -- RooAbsMinimizerFcn::setOptimizeConst: deactivating const optimization
RooDataHist::genData[x] = 100 bins (100 weights)

  RooFitResult: minimized FCN value: -360.517, estimated distance to minimum: 8.98159e-17
                covariance matrix quality: Full, accurate covariance matrix
                Status : MINIMIZE=0 HESSE=0 HESSE=0 

    Floating Parameter    FinalValue +/-  Error   
  --------------------  --------------------------
                    mu    1.0000e+02 +/-  1.00e+01

variance = 100.001
stdev = 10.0001
jeffreys = 0.0999994
[#1] INFO:NumericIntegration -- RooRealIntegral::init(jeffreys_Int[mu]) using numeric integrator RooAdaptiveGaussKronrodIntegrator1D to calculate Int(mu)
[#1] INFO:NumericIntegration -- RooRealIntegral::init(Expected_Int[mu]) using numeric integrator RooIntegrator1D to calculate Int(mu)

Draw all canvases

In [6]:
%jsroot on
gROOT->GetListOfCanvases()->Draw()