Notebook

Black Hole Spectroscopy with PyCBC Inference¶

This notebook provides an overview of how to do black hole spectroscopy with PyCBC Inference. We suggest that you go through the inference tutorial 0 for a better understanding of what's done here. For more details, see the separate tutorial notebooks.

Make sure the software is set up¶

To do a ringdown analysis in PyCBC we need the pykerr package installed. We'll also install dynesty for doing the sampling. These are both optional dependencies in the PyCBC repository, meaning that we need to install them separately if we are installing released versions of PyCBC.

In [ ]:

import sys
!{sys.executable} -m pip install pykerr dynesty pycbc --no-cache-dir

Recap¶

To do parameter estimation, we use the command-line script pycbc_inference. This pulls together the sampler, models, and distributions modules in the pycbc.inference package so that we may perform Bayesian inference. The standard command-line argument for PyCBC inference are:

pycbc_inference --verbose --seed ${SEED} --output-file inference.hdf --config-files config1.ini [config2.ini ...]

The key arguments that pycbc_inference takes are an output file and one or more config-files (run pycbc_inference --help to see all of the arguments it takes). The config files entirely specify the problem that you will be analyzing. In them, you specify the model to use, the variable parameters, and the prior on each of the parameters. If the model involves gravitational-wave data, you also specify what data to load and how to estimate the PSD.

You use the same command-line program regardless of whether you are doing a standard CBC analysis, a BH ringdown analysis, or any other form of testing GR. The config file(s) you provide are what determines what type of analysis you are doing. Below, we will illustrated how to setup a config file to do a BH ringdown analysis on GW190521.

Note: in order to do everything from within the notebook, we'll create the config files by echoing a python string to a file. In a normal situation, you would just use your favorite text editor to create the config files.

The model¶

As illustrated in previous notebooks, there are many different models in PyCBC. The model you choose, combined with the waveform template you set (done via the approximant argument in the [static_params] section), determines the type of analysis you are doing. Let's recall what models are available:

In [ ]:

from pycbc.inference import models

for model_name in models.models:
    print(model_name)

To do a BH ringdown analysis we need to use one of the gated Gaussian models. Note that there are two: the standard gated_gaussian_noise model, and a gated_gaussian_margpol. The latter does the same thing, it just marginalizes over the polarization angle numerically to speed up convergence. We'll use that model here.

In [ ]:

model_config = """
[model]
name = gated_gaussian_margpol
low-frequency-cutoff = 15
"""

In [ ]:

!echo '{model_config}' > model.ini
!cat model.ini

The data¶

Since this model uses data, we also need to create a data section that specifies what data to read for the analysis and the PSD estimation. See the PyCBC Inference docs for more information about what options are available and what they mean.

First we need to download the data from GWOSC (note: if you were doing this on the command line, you can just use wget or curl):

In [ ]:

from astropy.utils.data import download_file

ifos = ['H1', 'L1', 'V1']
data_filenames = {}
for ifo in ifos:
    print("Downloading {} data".format(ifo))
    
    # Download the gravitational wave data for GW190521
    url = "https://www.gw-openscience.org/eventapi/html/O3_Discovery_Papers/GW190521/v2/{}-{}1_GWOSC_4KHZ_R2-1242440920-4096.gwf"
    fname = download_file(url.format(ifo[0], ifo[0]), cache=True)
    data_filenames[ifo] = fname

Now let's set up our data section, pointing to the frame files we just downloaded:

In [ ]:

data_config = """
[data]
instruments = H1 L1 V1
trigger-time = 1242442967.445
analysis-start-time = -4
analysis-end-time = 4
data-conditioning-low-freq = H1:0 L1:0 V1:0
psd-estimation = median-mean
psd-start-time = -144
psd-end-time = 144
psd-inverse-length = 8
psd-segment-length = 8
psd-segment-stride = 4
frame-files = H1:{h1file} L1:{l1file} V1:{v1file}
channel-name = H1:GWOSC-4KHZ_R2_STRAIN L1:GWOSC-4KHZ_R2_STRAIN V1:GWOSC-4KHZ_R2_STRAIN
sample-rate = 2048
strain-high-pass = 10
pad-data = 8
""".format(h1file=data_filenames['H1'],
           l1file=data_filenames['L1'],
           v1file=data_filenames['V1'])

In [ ]:

!echo '{data_config}' > data.ini
!cat data.ini

The prior¶

Now let's specify the prior. We need to provide a section that lists the variable parameters and another that specifies the static parameters. For every variable parameter we have to provide one or more prior section(s) that specifies the prior distribution.

For a ringdown, the key things to note are:

By setting the approximant to TdQNMfromFinalMassSpin we are using a QNM template from the Kerr solution. This means that the frequency and damping times of all the modes we will use are set by the final mass and spin (which we will vary). If we wanted to allow the frequencies and damping times to be completely independent, we would use the TdQNMfromFreqTau approximant. (Note, however, that TdQNMfromFinalMassspin accepts delta_flmn and delta_taulmn parameters that can be used to make the frequency and damping time of an lmn mode to deviate from its GR value. See the documentation TdQNMfromFinalMassSpin for a full list of documented parameters.
We set which harmonics to use via the harmonic argument; here, we are using the spheroidal harmonics (which come from pykerr).
We set which modes are included by setting the lmns argument. Note that the last digit in the lmns is not the overtone number, but the number of overtones to include for the given lm mode. In other words, lmns = 222 means to include the 220 and 221 mode. If we wanted the 220 only, we'd set lmns = 221.
If we wanted to add other harmonics, we'd list them as a space-separated list, e.g., 221 331 321 would mean to include the 220, 330, and 320 modes.
For each mode/overtone we specify with the lmns argument we need to provide an amplitude and phase for that mode (as well as any other optional argument, such as a frequency/damping time deviation and/or a different polarization; see the TdQNMfromFinalMassSpin documentation for details). These additional parameters can either be fixed (by setting it in the static_params) or variable.
The amplitude of the dominant mode (amp220) is in terms of strain. The amplitudes of all other modes are specified with respect to the dominant mode. For example, amp221 means that the amplitude of the 221 mode (in terms of strain) will be amp221 * amp220.

In [ ]:

prior_config = """
[static_params]
approximant = TdQNMfromFinalMassSpin
harmonics = spheroidal
tref = ${data|trigger-time}
ra = 3.5
dec = 0.73
toffset = 0.018
lmns = 222
t_final = 2

[variable_params]
final_mass = 
final_spin = 
inclination = 
logamp220 = 
phi220 =
amp221 =
phi221 =

[waveform_transforms-t_gate_end]
name = custom
inputs = tref
t_gate_end = tref + 0.0

[waveform_transforms-t_gate_start]
name = custom
inputs = t_gate_end
t_gate_start = t_gate_end - 2

[waveform_transforms-tc]
name = custom
inputs = t_gate_end
tc = t_gate_end - 0.001

[prior-final_mass]
name = uniform
min-final_mass = 100
max-final_mass = 500

[prior-final_spin]
name = uniform
min-final_spin = -0.99
max-final_spin = 0.99

[prior-inclination]
name = sin_angle

[prior-logamp220]
name = uniform
min-logamp220 = -24
max-logamp220 = -19

[waveform_transforms-amp220]
name = custom
inputs = logamp220
amp220 = 10**logamp220

[prior-phi220]
name = uniform_angle

[prior-amp221]
name = uniform
min-amp221 = 0
max-amp221 = 5

[prior-phi221]
name = uniform_angle
"""

In [ ]:

!echo '{prior_config}' > prior.ini
!cat prior.ini

The sampler¶

Finally, we need to specify the sampler settings. To do that, we need to provide a sampler section. In this case we'll use the nested sampler dynesty, but you can use any of the samplers that PyCBC has support for. See the PyCBC Inference docs for a list of supported samplers and the optional available for them.

Note: Below we set the number of livepoints to 100, the checkpoint interval to only 30s, and we set a maxcall to 100. We are only doing this so that we can trigger a checkpoint quickly. This is only for demonstration purposes. In a real analysis, you would want to use many more livepoints (4000 is a good number for BH ringdown) and set the checkpoint interval to something ~1800s (or more). You also don't normally need to set maxcall.

In [ ]:

sampler_config = """
[sampler]
name = dynesty
dlogz = 0.1
nlive = 100
checkpoint_time_interval = 30
maxcall = 100
"""

In [ ]:

!echo '{sampler_config}' > sampler.ini
!cat sampler.ini

Run `pycbc_inference`¶

Now that we've set up our config files, we can run pycbc_inference. The BH ringdown problem we set up above would take a long time (several hours) to run in a notebook like this. Instead, start this running, then interrupt the kernel after you see the output message say that it has checkpointed a couple times (should occur within a couple minutes).

In [ ]:

# generate a random int for the seed
import numpy
seed = numpy.random.randint(1,int(2**31))
print("Using seed: {}".format(seed))

In [ ]:

!timeout 300 \
    pycbc_inference --verbose \
    --config-files model.ini data.ini prior.ini sampler.ini \
    --output-file inference.hdf \
    --seed {seed} \
    --nprocesses 8 \
    --force    

We can see that a checkpoint file exists in our directory:

In [ ]:

!ls -lh inference.hdf.checkpoint

Plotting with `pycbc_inference_plot_posterior`¶

The command-line tool pycbc_inference_plot_posterior is used to make corner and posterior plots of our results. We can give it either the direct output of pycbc_inference (here, inference.hdf) or a posterior file created using pycbc_inference_extract_samples. There are several plotting configuration options available (see pycbc_inference_plot_posterior --help for the list of options).

We can also plot the current status of the sampler using the checkpoint file. This isn't too meaningful for nested samplers, but can give you an idea of how far along the sampler is. Let's take a look at the raw samples from our current checkpoint file.

In [ ]:

!pycbc_inference_plot_posterior --verbose \
    --input-file inference.hdf.checkpoint \
    --output-file raw_samples_current.png \
    --plot-scatter --plot-marginal \
    --z-arg loglikelihood \
    --raw-samples

In [ ]:

from IPython.display import Image
Image('raw_samples_current.png', height=480)

Plot a fully completed result¶

The setup we've done above was used in Capano et al. arXiv:2105.05238, the samples for which were released here. We'll download the completed posterior file of the 220+221 result at $t_{\rm ref} - 7\textrm{ms}$ from there to see what a completed result looks like.

In [ ]:

!wget https://github.com/gwastro/BH-Spectroscopy-GW190521/raw/main/posteriors/kerr/220_221/KERR-220_221-M05MS.hdf

In [ ]:

!pycbc_inference_plot_posterior --verbose \
    --input-file KERR-220_221-M05MS.hdf \
    --output-file completed_scatter.png \
    --plot-scatter --plot-marginal \
    --z-arg loglikelihood

In [ ]:

Image('completed_scatter.png', height=480)

Let's make a more interesting plot: we'll use these to make a density plot of the mass, spin, and amplitudes, and give them pretty labels:

In [ ]:

!pycbc_inference_plot_posterior --verbose \
    --input-file KERR-220_221-M05MS.hdf \
    --output-file completed_density-mass_spin_amps.png \
    --plot-density --plot-contours --plot-marginal \
    --density-cmap inferno --contour-colors white \
    --max-kde-samples 5000 \
    --parameters \
        'final_mass:$M_f (M_\odot)$' \
        'final_spin:$\chi_f$' \
        '10**logamp220*1e20:$A_{220} (\times 10^20)$' \
        'amp221:$A_{221}/A_{220}$'

In [ ]:

Image('completed_density-mass_spin_amps.png', height=480)

Loading the model from the posterior file¶

The inference file and/or posterior file contains all the information to reconstruct the model; i.e., the config file, data, and PSDs are all stored in the hdf file. We can use this to load the model in the file, and plot the max likelihood waveforms.

In [ ]:

from pycbc.inference import io, models

In [ ]:

# load the posterior file
fp = io.loadfile('KERR-220_221-M05MS.hdf', 'r')
# get the config, the data, and PSDs from the file
# the config file:
cp = fp.read_config_file()
# the data
data = fp.read_data()
# the psds
psds = fp.read_psds()

In [ ]:

# now let's load the model
model = models.read_from_config(cp, data=data, psds=psds)

In [ ]:

# let's get the maximum likelihood point
samples = fp.read_samples(list(fp['samples'].keys()))
maxlidx = samples['loglikelihood'].argmax()
maxlparams = {p: samples[p][maxlidx] for p in model.variable_params}

In [ ]:

# get the loglikelihood of these points
model.update(**maxlparams)
model.loglikelihood

In [ ]:

# get the matched-filter SNR
print((2*model.loglr)**0.5)

In [ ]:

# get the gated, whitened maxL waveform and plot it against
# the whitened data
gated_wfs = model.get_gated_waveforms()
gated_data = model.get_gated_data()
# whiten them
gated_wfs = model.whiten(gated_wfs, 1)
gated_data = model.whiten(gated_data, 1)
# convert to the time domain
gated_wfs = {ifo: d.to_timeseries() for ifo, d in gated_wfs.items()}
gated_data = {ifo: d.to_timeseries() for ifo, d in gated_data.items()}

In [ ]:

# plot
%matplotlib notebook
from matplotlib import pyplot
fig, axes = pyplot.subplots(nrows=3, figsize=(8,8))
for ii, ifo in enumerate(gated_wfs):
    ax = axes[ii]
    # we'll plot w.r.t. the geocentric end gate time
    t_gate_end = model.current_params['t_gate_end']
    d = gated_data[ifo].time_slice(t_gate_end-0.15, t_gate_end+0.15)
    t = d.sample_times - t_gate_end
    ax.plot(t, d, color='black', label='{} gated data'.format(ifo))
    # plot the waveform
    d = gated_wfs[ifo].time_slice(t_gate_end-0.15, t_gate_end+0.15)
    t = d.sample_times - t_gate_end
    ax.plot(t, d, color='C1', lw=3, label='maxL gated template')
    ax.legend()
    if ii == 2:
        ax.set_xlabel(r'time - $t_{\rm{gate\,end}}$ (s)', fontsize=18)
    if ii == 1:
        ax.set_ylabel('whitened data', fontsize=18)
fig.show()

In [ ]:

Black Hole Spectroscopy with PyCBC Inference¶

Make sure the software is set up¶

Recap¶

The model¶

The data¶

The prior¶

The sampler¶

Run pycbc_inference¶

Plotting with pycbc_inference_plot_posterior¶

Plot a fully completed result¶

Loading the model from the posterior file¶

Run `pycbc_inference`¶

Plotting with `pycbc_inference_plot_posterior`¶