This exercise walks through a simple analysis of the Z → 2 lepton final state in the e+e- and μ+μ- channels. Its aim is to explore the kinematics of Z → e+e- and Z → μ+μ- events.
The analysis is based on ATLAS Run 1 open data MC ntuples. The cell below retrieves one such ntuple.
The analysis consists of two parts:
!wget --progress=dot:giga http://opendata.atlas.cern/release/samples/MC/mc_105987.WZ.root
// Get the ROOT file containing the Z -> eemumu background events
(More information on cern.ch/adl)
LHC data analyses are usually performed using complex analysis frameworks written in general-purpose languages like C++ and Python. This approach has a steep learning curve: even the simplest tasks can end up coded in a complicated way, and the code is not easy to understand, change or extend. An emerging alternative decouples the physics content from the technical code and lets analyses be written in a simple, self-describing syntax. Analysis Description Language (ADL) is a HEP-specific analysis language developed for this purpose.
A HEP analysis includes 3 main parts:
ADL consists of blocks separating object, variable and event selection definitions for a clear separation of analysis components. Blocks have a keyword-expression structure. Keywords specify analysis concepts and operations. Syntax includes mathematical and logical operations, comparison and optimization operators, reducers, 4-vector algebra and HEP-specific functions (dφ, dR, …).
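To make the HEP-specific functions mentioned above concrete, here is a minimal Python sketch of the angular separations dφ and dR. The function names `delta_phi` and `delta_r` are hypothetical and chosen for illustration only. The sketch assumes the common convention of wrapping dφ into the range (-π, π]; the conventions used internally by ADL and CutLang may differ.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal separation phi1 - phi2, wrapped into (-pi, pi]."""
    dphi = phi1 - phi2
    while dphi > math.pi:
        dphi -= 2.0 * math.pi
    while dphi <= -math.pi:
        dphi += 2.0 * math.pi
    return dphi


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance dR = sqrt(d_eta^2 + d_phi^2) between two objects."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


# Two objects on opposite sides of the phi = +-pi boundary are close in phi
print(delta_phi(3.0, -3.0))
print(delta_r(0.5, 3.0, 0.0, -3.0))
```

The wrapping step matters because φ is periodic: two objects at φ = 3.0 and φ = -3.0 are separated by about 0.28 rad, not 6 rad.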
ADL is designed to be self-describing, so for simple cases like this example one does not need to read the syntax rules to understand an ADL description. If you are interested, however, the set of syntax rules can be found here.
Once an analysis is written, it needs to be run on events. This is done by CutLang, the runtime interpreter that reads and understands the ADL syntax and runs it on events. CutLang is also a framework that automatically handles many tedious tasks such as reading input events and writing output histograms. CutLang can be run in various environments such as Linux, macOS, conda, Docker and Jupyter.
To learn more about CutLang, please see the CutLang GitHub repository.
Writing the analysis with ADL: In the following cell, part of the analysis is written using the ADL syntax. However there are some parts missing. Please follow the instructions in the comments to complete the missing parts. If you feel adventurous, you could modify the object or event selections, add new variables or new histograms.
Running the analysis with CutLang: Executing the cell will run the analysis on both the signal and background events. The run parameters are given in the first line of the cell:
NOTE: When running jupyter/binder via direct link, if your run does not complete due to memory issues, please reduce the number of events via the "events" parameter.
Analysis output: Running the analysis will produce two outputs:
%%cutlang file=mc_105987.WZ.root filetype=ATLASOD adlfile=ZtoLL events=100000 verbose=10000
# ADL file for Z->ee/mumu analysis
# Object selection
# Take input electrons, labeled "ele", and obtain a set of selected electrons "goodEle"
object goodEle
take ele # start with initial electron set
select pT(ele) > 25 # apply a cut on transverse momentum
select abs(eta(ele)) < 2.5 # apply a cut on pseudorapidity
# Take input muons, labeled "muo", and obtain a set of selected muons "goodMuo"
object goodMuo
take muo # start with initial muon set
select pT(muo) > 25 # apply a cut on transverse momentum
select abs(eta(muo)) < 2.5 # apply a cut on pseudorapidity
object goodLeptons : Union (goodEle, goodMuo)
# Useful definitions
define mLL = m(goodLeptons[0] goodLeptons[1])
define elePDGid = 11
define muoPDGid = 13
# Event selection
algo Zll
select ALL #cut0: count all events
histo hneinp, "number of input electrons", 6, 0, 6, size(ele)
histo hnesel, "number of selected electrons", 6, 0, 6, size(goodEle)
histo hnminp, "number of input muons", 6, 0, 6, size(muo)
histo hnmsel, "number of selected muons", 6, 0, 6, size(goodMuo)
histo hnleps, "number of selected lepts", 6, 0, 6, size(goodLeptons)
histo hnenminp, "number of input electrons vs muons", 6, 0, 6, 6, 0, 6, size(ele), size(muo)
histo hnenmsel, "number of selected electrons vs muons", 6, 0, 6, 6, 0, 6, size(goodEle), size(goodMuo)
select Size(ele) + Size(muo) > 1 #cut1: We just want events with at least two leptons
select Size(goodLeptons) == 2 #cut2: We want, in fact, exactly two good leptons
select q(goodLeptons[0]) * q(goodLeptons[1]) == -1 #cut3: The two selected leptons must have opposite charge
select pdgID(goodLeptons[0])+pdgID(goodLeptons[1])==0 #cut4: The two selected leptons have the same flavor
histo hZllselbc, "Z(->LL,selected) candidate mass (GeV)", 50, 50, 150, mLL
select abs(mLL - 91.18) < 20 #cut5: The absolute value of the difference between the
# dilepton invariant mass (mLL) and the known Z boson mass (91.18 GeV) must be less than 20 GeV
histo hZllselac, "Z(->LL,selected,massWindow) candidate mass (GeV)", 50, 50, 150, mLL
select abs(pdgID(goodLeptons[0])) == elePDGid ? hMZee,"Inv.Mass of Z (Zee)",50,50.0,150.0,mLL : ALL
select abs(pdgID(goodLeptons[0])) == muoPDGid ? hMZmm,"Inv.Mass of Z (Zmm)",50,50.0,150.0,mLL : ALL
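For readers who find it easier to follow the selection in a general-purpose language, the logic of the `Zll` region above can be sketched in plain Python. This is an illustration under stated assumptions, not how CutLang implements it: the `Lepton` class and the functions `is_good`, `invariant_mass` and `passes_zll` are hypothetical names, and the invariant mass uses the massless-lepton approximation m² = 2·pT1·pT2·(cosh Δη − cos Δφ) rather than full four-vectors.

```python
import math
from dataclasses import dataclass

Z_MASS = 91.18  # GeV, the value used in the ADL mass-window cut


@dataclass
class Lepton:
    pt: float     # transverse momentum in GeV
    eta: float    # pseudorapidity
    phi: float    # azimuthal angle in radians
    charge: int   # +1 or -1
    pdg_id: int   # 11 / -11 for e- / e+, 13 / -13 for mu- / mu+


def is_good(lep):
    """Object selection: pT > 25 GeV and |eta| < 2.5."""
    return lep.pt > 25 and abs(lep.eta) < 2.5


def invariant_mass(a, b):
    """Dilepton invariant mass in the massless-lepton approximation."""
    m2 = 2.0 * a.pt * b.pt * (math.cosh(a.eta - b.eta) - math.cos(a.phi - b.phi))
    return math.sqrt(max(m2, 0.0))


def passes_zll(leptons):
    """Apply cut1 to cut5 of the Zll region to one event's input leptons."""
    if len(leptons) < 2:                      # cut1: at least two input leptons
        return False
    good = [lep for lep in leptons if is_good(lep)]
    if len(good) != 2:                        # cut2: exactly two good leptons
        return False
    l0, l1 = good
    if l0.charge * l1.charge != -1:           # cut3: opposite charge
        return False
    if l0.pdg_id + l1.pdg_id != 0:            # cut4: same flavor
        return False
    return abs(invariant_mass(l0, l1) - Z_MASS) < 20  # cut5: mass window


# A back-to-back mu+ mu- pair with pT = 45.59 GeV each
event = [Lepton(45.59, 0.0, 0.0, -1, 13), Lepton(45.59, 0.0, math.pi, +1, -13)]
print(invariant_mass(*event), passes_zll(event))
```

Note that cut4 works because a particle and its antiparticle carry PDG IDs of opposite sign, so the IDs sum to zero only for a same-flavor, opposite-charge pair.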
Now let's make some plots using the ROOT package in python (which is widely used at CERN). Instructions are shown within comments in the following cells.
What to do:
%%python
# Let's start with importing the needed modules
from ROOT import gStyle, TFile, TH1, TH1D, TH2D, TCanvas, TLegend, TColor
# Now let's set some ROOT styling parameters:
# You do not need to know what they mean, but can directly use these settings
gStyle.SetOptStat(0)
gStyle.SetPalette(1)
gStyle.SetTextFont(42)
gStyle.SetTitleStyle(0000)
gStyle.SetTitleBorderSize(0)
gStyle.SetTitleFont(42)
gStyle.SetTitleFontSize(0.055)
gStyle.SetTitleFont(42, "xyz")
gStyle.SetTitleSize(0.05, "xyz")
gStyle.SetLabelFont(42, "xyz")
gStyle.SetLabelSize(0.045, "xyz")
%%python
# Let's open the output file produced by CutLang:
# (If you changed the adlfile option when running cutlang, you will need to change the file names)
f = TFile("histoOut-ZtoLL-mc_105987.root")
# We can see what is inside the signal file:
f.ls()
# There should be a directory (TDirectoryFile) per selection algorithm also known as a region.
%%python
# Let's see what is available:
f.cd("Zll")
f.ls()
%%python
# Get the histograms out of the file
# lepton counts:
hneinp = f.Get("Zll/hneinp")
hnminp = f.Get("Zll/hnminp")
hnesel = f.Get("Zll/hnesel")
hnmsel = f.Get("Zll/hnmsel")
hnenminp = f.Get("Zll/hnenminp")
hnenmsel = f.Get("Zll/hnenmsel")
# Z reconstruction before cut
hZllselbc = f.Get("Zll/hZllselbc")
# Z reconstruction after cut
hZllselac = f.Get("Zll/hZllselac")
# Z from electrons only
hMZee = f.Get("Zll/hMZee")
# Z from muons only
hMZmm = f.Get("Zll/hMZmm")
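The mass histograms retrieved above were booked in the ADL file with `50, 50, 150`, that is, 50 bins between 50 and 150 GeV, so each bin is 2 GeV wide. The following pure-Python sketch shows which bin a given mass value falls into. The helper name `find_bin` is hypothetical; it mirrors the usual ROOT numbering for fixed-width axes, where bin 0 is the underflow, bins 1 to nbins hold the booked range, and bin nbins+1 is the overflow.

```python
def find_bin(x, nbins=50, xmin=50.0, xmax=150.0):
    """Return the bin index for value x on a fixed-width axis.

    Bin 0 is the underflow, bins 1..nbins cover [xmin, xmax),
    and bin nbins + 1 is the overflow.
    """
    if x < xmin:
        return 0
    if x >= xmax:
        return nbins + 1
    return 1 + int(nbins * (x - xmin) / (xmax - xmin))


bin_width = (150.0 - 50.0) / 50   # 2 GeV per bin
print(bin_width, find_bin(91.18))  # the Z mass lands in bin 21 (90 to 92 GeV)
```

With this binning, the mass-window cut |mLL − 91.18| < 20 keeps values between 71.18 and 111.18 GeV, so the two edges of the window fall inside bins rather than on bin boundaries.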
%%python
############ LETS SEE 1D MULTIPLICITIES AND HOW TO MAKE NICE PLOTS
# In order to be able to make many plots, let's define two generic histograms to which we can
# assign any of the histograms above:
h1 = hneinp
h2 = hnminp
# Now we format the histograms: lines, colors, axes titles, etc..
# You do not need to learn the commands here unless you are really curious.
# Otherwise just execute the cell.
# Color numbers can be retrieved from https://root.cern.ch/doc/master/classTColor.html
# (check for color wheel)
h1.SetLineColor(600) # kBlue
h2.SetLineColor(416+2) # kGreen + 2
# Make the x-axis title:
title = h1.GetTitle()
h1.SetTitle("")
h1.GetXaxis().SetTitle(title)
h1.GetXaxis().SetTitleOffset(1.25)
h1.GetXaxis().SetTitleSize(0.05)
h1.GetXaxis().SetLabelSize(0.045)
h1.GetXaxis().SetNdivisions(8, 5, 0)
h1.GetYaxis().SetTitle("number of events")
h1.GetYaxis().SetTitleOffset(1.4)
h1.GetYaxis().SetTitleSize(0.05)
h1.GetYaxis().SetLabelSize(0.045)
# Set the maximum of the y axis:
if h2.GetMaximum() > h1.GetMaximum():
    h1.SetMaximum(h2.GetMaximum() * 1.1)
# Make a generically usable legend
l = TLegend(0.65, 0.75, 0.88, 0.87)
l.SetBorderSize(0)
l.SetFillStyle(0000)
# You can change the legend titles from here based on what you are plotting
l.AddEntry(h1,h1.GetName(), "l")
l.AddEntry(h2,h2.GetName(), "l")
%%python %jsroot on
############ DRAW THE 1D MULTIPLICITIES FORMATTED ABOVE
c = TCanvas("c", "c", 620, 500)
c.SetBottomMargin(0.15)
c.SetLeftMargin(0.15)
c.SetRightMargin(0.15)
h1.Draw()
h2.Draw("same")
l.Draw("same")
c.Draw()
# Don't worry about the error that appears below!
%%python %jsroot on
############ LETS SEE 2D MULTIPLICITIES AND THE EFFECT OF MASS WINDOW CUT
c2 = TCanvas("c2", "c2", 620, 500)
c2.Divide(2,1)
c2.SetBottomMargin(0.15)
c2.SetLeftMargin(0.15)
c2.SetRightMargin(0.15)
c2.cd(1)
hnenmsel.Draw("colz")
hnenmsel.Draw("sametext")
c2.cd(2)
hZllselbc.SetLineColor(2)
hZllselac.SetLineColor(4)
hZllselbc.Draw("e")
hZllselac.Draw("esame")
c2.Draw()
# Don't worry about the error that appears below!
%%python %jsroot on
############ LETS SEE MUON AND ELECTRON CHANNELS ON TOP OF EACH OTHER
c3 = TCanvas("c3", "c3", 620, 500)
c3.SetBottomMargin(0.15)
c3.SetLeftMargin(0.15)
c3.SetRightMargin(0.15)
hMZmm.SetLineColor(2)
hMZmm.SetTitle("compare ee(blue) & mm(red) channels")
hMZmm.GetXaxis().SetTitle("mLL (GeV)")
hMZmm.Draw("e")
hMZee.Draw("esame")
c3.Draw()
# Don't worry about the error that appears below!