Kernel > Restart & Clear output
before adding any changes to git!In this tutorial we will perform template fits with pyroofit
.
pyroofit
is a python wrapper for the RooFit
package. Documentation can be found here:
http://www.desy.de/~swehle/pyroofit/
The project's code can be found at https://github.com/simonUU/PyrooFit
pyroofit
can be installed with pip3 install --user pyroofit
.
import ROOT
from pyroofit.models import Gauss, Chebychev
import numpy as np
import pandas as pd
# Show pictures in this notebook
from IPython.display import Image
%matplotlib inline
# Create some test data
data = pd.DataFrame(
np.concatenate((
np.random.normal(-2, 1, 1000),
np.random.normal(3, 2, 1000),
-5 + 10* np.random.random_sample(2000)
)),
columns=["x"]
)
x, y
data from the tutorial fitting_curves_data
.No? Then cheat and look at the histogram:
data.hist(bins=30)
# Gaussian for signal
pdf_sig = Gauss(('x', -5, 5), mean=(-1, 0, 1), title="Signal")
# Straight line for background
pdf_bkg = Chebychev(('x', -5, 5), n=1, title="Background")
Now we build a compound PDF from the two simple PDFs.
pyroofit
is quite nice in that this has a very pythonic syntax (we overload the +
operator):
pdf = pdf_sig + pdf_bkg
Now we're ready to fit:
pdf.fit(data)
Don't be deterred by the amount of output, but let's look at the results:
pdf.get()
Hint: In order to get the results as a dictionary, use get_parameters()
instead:
pdf.get_parameters()
And plot:
pdf.plot("test.png", legend=True)
Image("test.png")
gauss1 = Gauss(('x', -5, 5), mean=(-5, -3, 0), title="signal", name="gauss1")
gauss2 = Gauss(('x', -5, 5), mean=(0, 3, 5), title="signal", name="gauss2")
pdf_sig = gauss1 + gauss2
pdf_bkg = Chebychev(('x', -5, 5), n=1, title="Background")
pdf = pdf_sig + pdf_bkg
pdf.fit(data)
print("Gauss 1:")
print(gauss1.get())
print("Gauss 2:")
print(gauss2.get())
print("Bkg:")
print(pdf_bkg.get())
pdf.plot("test2.png", legend=True)
Image("test2.png")
data = pd.DataFrame(
np.concatenate((
np.random.normal(-2, 1, 1000),
-5 + 10* np.random.random_sample(2000)
)),
columns=["x"]
)
In the previous examples, we simply "knew" that our signal was shaped like a (two) Gaussian(s) and the background was linear.
Usually however, the situation isn't as simple and we first have to learn how our signal and background looks like by looking at MC data. Remember that in MC we always know signal from background (it's simulated data after all).
Thus, we can first fit our signal and background model to the MC, then fix the parameters. Now we have two PDFs $\mathrm{pdf}_\mathrm{sig}$ and $\mathrm{pdf}_\mathrm{bkg}$ and get the signal and background yields by fitting the data with $\mu_\mathrm{sig}\mathrm{pdf}_\mathrm{sig} + \mu_\mathrm{bkg}\mathrm{pdf}_\mathrm{bkg}$.
mc_signal = pd.DataFrame(
np.random.normal(-2, 1, 1000),
columns=["x"]
)
mc_bkg = pd.DataFrame(
np.concatenate((
np.random.normal(2, 1, 1000),
-5 + 10* np.random.random_sample(2000)
)),
columns=["x"]
)
data = pd.DataFrame(
np.concatenate((
np.random.normal(2, 1, int(1.2*1000)),
-5 + 10* np.random.random_sample(int(1.2*2000)),
np.random.normal(-2, 1, int(0.3*1000))
)),
columns=["x"]
)
mc_signal.hist(bins=30)
mc_bkg.hist(bins=30)
pdf_sig = Gauss(('x', -5, 5), mean=(-4, -2, 0), title="signal", name="gauss1")
pdf_sig.fit(mc_signal)
pdf_sig.plot("mc_signal_fit.png", legend=True)
Image("mc_signal_fit.png")
pdf_sig.fix()
pdf_bkg = (
Gauss(('x', -5, 5), mean=(-5, 3, 5), title="Background", name="gauss2")
+ Chebychev(('x', -5, 5), n=1, title="Background")
)
pdf_bkg.fit(mc_bkg)
pdf_bkg.plot("mc_bkg_fit.png", legend=True)
Image("mc_bkg_fit.png")
pdf_bkg.fix()
pdf_bkg.get()
pdf = pdf_sig + pdf_bkg
pdf.fit(data)
pdf.plot("fit_to_data.png", legend=True)
Image("fit_to_data.png")
pdf.get()
pdf_bkg.get()