In this notebook, we introduce Monte Carlo sampling methods using an example Thevenin equivalent circuit model. The notebook covers importing experimental data for a Tesla 4680 NCA/Gr cell and identifying the resistance elements of a two-RC Thevenin model. First, we import PyBOP and the required packages:
%pip install --upgrade pip ipywidgets -q
%pip install pybop -q
%pip install pandas -q
import sys
import numpy as np
import pandas as pd
import pybamm
import pybop
pybop.plot.PlotlyManager().pio.renderers.default = "notebook_connected"
parallel = sys.platform != "win32"  # Parallel processing is only supported on macOS/WSL/Linux
Note: you may need to restart the kernel to use updated packages.
Next, we will import and process the experimental data. The data for this notebook was obtained from the supplemental data provided in:
[1] M. Ank et al., ‘Lithium-Ion Cells in Automotive Applications: Tesla 4680 Cylindrical Cell Teardown and Characterization’, doi: 10.1149/1945-7111/ad14d0.
A portion of this data is retained in the PyBOP repository for use in parameterisation and optimisation examples. However, users are directed to the official location https://mediatum.ub.tum.de/1725661 for more information. First, we import the three-electrode pOCV dataset,
ocv_df = pd.read_csv(
"../data/Tesla_4680/T-cell_pOCV_data.txt",
sep="\t",
decimal=",",
)
Next, let's plot the terminal voltage to visualise the protocol.
ocv_df.plot(y="Ewe-Ece/V", x="time/h", kind="line")
<Axes: xlabel='time/h'>
As we are aiming to identify the resistance elements in an equivalent circuit model, we will construct an OCV function from the discharge portion of the experimental data. This is done by filtering the dataframe and adjusting the first voltage point so that the OCV function covers the full operating region (i.e. up to 4.2 V).
sto_measured = (
1
- ocv_df.loc[(ocv_df["I/mA"] <= 0.0), "Capacity/mA.h"]
/ ocv_df.loc[(ocv_df["I/mA"] <= 0.0), "Capacity/mA.h"].iloc[-1]
)
V_measured = ocv_df.loc[(ocv_df["I/mA"] <= 0.0), "Ewe-Ece/V"]
V_measured.iloc[0] = 4.2 # Extend for improved interpolation (a bit of a fudge)
def ocv(sto):
    # Return the measured discharge OCV as an interpolant in stoichiometry
    name = "OCV"
    x = np.flip(sto_measured.to_numpy())  # flip so stoichiometry is ascending
    y = np.flip(V_measured.to_numpy())
    return pybamm.Interpolant(x, y, sto, name)
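As a quick optional sanity check (not part of the original workflow), we can evaluate the same data that backs the interpolant at a few stoichiometries to confirm the curve covers the operating region up to 4.2 V.
# Evaluate the discharge OCV data directly with np.interp (same arrays as the Interpolant)
x_check = np.flip(sto_measured.to_numpy())
y_check = np.flip(V_measured.to_numpy())
for sto in [0.0, 0.5, 1.0]:
    print(f"OCV at sto = {sto:.1f}: {np.interp(sto, x_check, y_check):.3f} V")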
Next, let's construct the parameter set and update it with the known information and placeholder values for this cell.
parameter_set = pybop.ParameterSet.pybamm("ECM_Example")
parameter_set.update(
{
"Cell capacity [A.h]": 22.651, # 083/828 - C/20
"Nominal cell capacity [A.h]": 22.651,
"Current function [A]": 22.651,
"Initial SoC": 1.0,
"Upper voltage cut-off [V]": 4.25, # Extended to avoid hitting event
"Lower voltage cut-off [V]": 2.5,
"Open-circuit voltage [V]": ocv,
"R2 [Ohm]": 1e-4, # placeholder
"C2 [F]": 4e5, # placeholder
"Element-2 initial overpotential [V]": 0,
},
check_already_exists=False,
)
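As an optional check (this assumes the ParameterSet supports dictionary-style access, as used elsewhere in PyBOP), we can confirm that the updates took effect:
# Print a couple of the updated entries
print("Cell capacity [A.h]:", parameter_set["Cell capacity [A.h]"])
print("R2 [Ohm] placeholder:", parameter_set["R2 [Ohm]"])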
Now we import the corresponding constant-current tests for this cell and construct a dataset from a subset of this time-series. This is done both to reduce the computational time for inference and to improve the robustness of the inference task.
cycling_df = pd.read_csv(
"../data/Tesla_4680/601-828_Capacity_03_MB_CB1_subset.txt",
sep="\t",
)
filter_cycling = cycling_df.loc[54811:61000].copy() # Full cycle is [54811:127689]
filter_cycling["time/s"] = filter_cycling["time/s"] - filter_cycling["time/s"].iloc[0]
# Take every 100th point to downsample the time-series
filtered_cycling_df = filter_cycling.iloc[::100]
filtered_cycling_df.plot(x="time/s", y="Ecell/V", kind="line")
<Axes: xlabel='time/s'>
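Optionally, we can confirm the downsampling factor before building the dataset:
# Compare the number of points before and after downsampling
print(f"Points before downsampling: {len(filter_cycling)}")
print(f"Points after downsampling: {len(filtered_cycling_df)}")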
parameters = pybop.Parameters(
pybop.Parameter(
"R0 [Ohm]",
prior=pybop.Gaussian(3e-3, 1e-3),
bounds=[1e-5, 1e-2],
),
pybop.Parameter(
"R1 [Ohm]",
prior=pybop.Gaussian(5e-3, 1e-3),
bounds=[1e-5, 1e-2],
),
pybop.Parameter(
"R2 [Ohm]",
prior=pybop.Gaussian(1e-4, 5e-5),
bounds=[1e-5, 1e-2],
),
)
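To sanity-check the scale of these priors, we can draw a few samples from one of them; this assumes the PyBOP priors expose an rvs method, as in recent releases.
# Draw five samples from the R0 prior to check its order of magnitude
print(pybop.Gaussian(3e-3, 1e-3).rvs(5))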
In this parameter inference task, we use the filtered time-series data to reduce the computation time. This choice can be validated by comparing the inference results to the full time-series inference, which isn't covered in this example.
dataset_hundred = pybop.Dataset(
{
"Time [s]": filtered_cycling_df["time/s"].to_numpy(),
"Current function [A]": -filtered_cycling_df["I/mA"].to_numpy()
/ 1000, # Convert mA to A
"Voltage [V]": filtered_cycling_df["Ecell/V"].to_numpy(),
}
)
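A couple of optional basic checks on the dataset before fitting; the sign flip above assumes discharge currents are recorded as negative in the raw file.
# Check the time vector is strictly increasing and inspect the mean applied current
print("Time strictly increasing:", np.all(np.diff(dataset_hundred["Time [s]"]) > 0))
print("Mean current [A]:", np.mean(dataset_hundred["Current function [A]"]))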
We now construct a Thevenin model with two RC pairs and build it with an initial state set from the first voltage point in the experimental data. The FittingProblem and GaussianLogLikelihood are then constructed, and the posterior is formed from the likelihood via LogPosterior.
model = pybop.empirical.Thevenin(
parameter_set=parameter_set, options={"number of rc elements": 2}
)
model.build(
initial_state={
"Initial open-circuit voltage [V]": dataset_hundred["Voltage [V]"][0]
}
)
# Generate problem, likelihood, and sampler
problem = pybop.FittingProblem(model, parameters, dataset_hundred)
likelihood = pybop.GaussianLogLikelihood(problem)
posterior = pybop.LogPosterior(likelihood)
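Optionally, we can evaluate the log-posterior once before optimising to check that the full pipeline runs. The fourth entry below is the noise standard deviation added by GaussianLogLikelihood; the values are illustrative guesses only.
# Single evaluation of the log-posterior at an illustrative parameter vector
x_check = np.array([3e-3, 5e-3, 1e-4, 1e-3])
print("Log-posterior at the guess:", posterior(x_check))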
Below we identify the parameters using the maximum a posteriori (MAP) estimate and the Covariance Matrix Adaptation Evolution Strategy (CMAES). We will use this estimate later in the Bayesian inference task as a starting position for the chains.
optim = pybop.CMAES(
posterior,
sigma0=[1e-4, 1e-4, 1e-4, 1e-4],
max_iterations=200,
max_unchanged_iterations=40,
parallel=parallel,
)
results_optim = optim.run()
print(results_optim)
pybop.plot.quick(problem, results_optim.x)
pybop.plot.convergence(optim)
pybop.plot.parameters(optim);
OptimisationResult:
  Optimised parameters: [4.15690229e-03 1.64236988e-03 8.78939716e-05 2.78474178e-04]
  Final cost: 431.8611377082031
  Number of iterations: 78
  SciPy result available: No
Below we construct the Monte Carlo sampler, specifically the Haario-Bardenet adaptive covariance MCMC sampler, which samples from the posterior using a random-walk proposal whose covariance is adapted as the chains progress. To minimise the time taken to execute this notebook, we greatly limit the number of samples; however, this should be increased for actual inference tasks. We initialise the chains from the maximum a posteriori point estimate obtained from the CMAES optimisation above, as this should be close to the mode of the posterior.
sampler = pybop.HaarioBardenetACMC(
posterior,
chains=3,
x0=results_optim.x, # Initialise at the MAP estimate
max_iterations=1500, # Increase for accurate posteriors
warm_up=350,
verbose=True,
parallel=parallel, # Only supported for macOS/WSL/Linux
)
chains = sampler.run()
# Create summary statistics
posterior_summary = pybop.PosteriorSummary(chains)
# Create rhat of chains
print(posterior_summary.rhat())
Using Haario-Bardenet adaptive covariance MCMC
Generating 3 chains.
Running in parallel with 12 worker processes.
| Iteration: 1 | Iter/s: 0.00 |
| Iteration: 2 | Iter/s: 28.66 |
| Iteration: 3 | Iter/s: 25.26 |
| Iteration: 4 | Iter/s: 74.38 |
| Iteration: 5 | Iter/s: 28.77 |
| Iteration: 6 | Iter/s: 25.46 |
| Iteration: 7 | Iter/s: 52.50 |
| Iteration: 8 | Iter/s: 82.83 |
| Iteration: 9 | Iter/s: 63.31 |
| Iteration: 10 | Iter/s: 1.61 |
| Iteration: 50 | Iter/s: 15.56 |
| Iteration: 100 | Iter/s: 16.85 |
| Iteration: 150 | Iter/s: 44.17 |
| Iteration: 200 | Iter/s: 125.50 |
| Iteration: 250 | Iter/s: 145.31 |
Initial phase completed.
| Iteration: 300 | Iter/s: 18.41 |
| Iteration: 350 | Iter/s: 124.50 |
| Iteration: 400 | Iter/s: 49.71 |
| Iteration: 450 | Iter/s: 15.42 |
| Iteration: 500 | Iter/s: 12.31 |
| Iteration: 550 | Iter/s: 12.83 |
| Iteration: 600 | Iter/s: 16.69 |
| Iteration: 650 | Iter/s: 18.65 |
| Iteration: 700 | Iter/s: 23.45 |
| Iteration: 750 | Iter/s: 9.57 |
| Iteration: 800 | Iter/s: 24.36 |
| Iteration: 850 | Iter/s: 6.11 |
| Iteration: 900 | Iter/s: 7.26 |
| Iteration: 950 | Iter/s: 12.78 |
| Iteration: 1000 | Iter/s: 9.71 |
| Iteration: 1050 | Iter/s: 10.11 |
| Iteration: 1100 | Iter/s: 10.57 |
| Iteration: 1150 | Iter/s: 9.25 |
| Iteration: 1200 | Iter/s: 14.29 |
| Iteration: 1250 | Iter/s: 12.10 |
| Iteration: 1300 | Iter/s: 37.08 |
| Iteration: 1350 | Iter/s: 11.71 |
| Iteration: 1400 | Iter/s: 30.06 |
| Iteration: 1450 | Iter/s: 48.85 |
| Iteration: 1500 | Iter/s: 9.15 |
Halting: Maximum number of iterations (1500) reached.
[1.00493998 1.00643067 1.00368448 1.01600809]
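We can also compare the per-chain means directly to eyeball between-chain agreement; this assumes the sampler returns the chains as an array of shape (n_chains, n_samples, n_parameters).
# Per-chain posterior means for each parameter
for i, chain in enumerate(chains):
    print(f"Chain {i} means: {np.mean(chain, axis=0)}")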
Next, we plot the parameter traces as well as the combined posterior (across all chains) and each chain individually.
posterior_summary.plot_trace()
posterior_summary.plot_posterior()
posterior_summary.plot_chains()
As expected, these chains haven't fully converged yet, so the values obtained from the posterior will be biased by the initial conditions used for sampling. Increasing the number of iterations for the sampler while also calibrating the warm_up period will provide better results.
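If a longer run is not practical, an additional burn-in can also be applied to the returned chains before summarising. A minimal sketch, assuming chains is a NumPy array of shape (n_chains, n_samples, n_parameters) as above:
# Discard an additional number of early samples from each chain (illustrative value)
extra_burn_in = 200
trimmed_summary = pybop.PosteriorSummary(chains[:, extra_burn_in:, :])
print(trimmed_summary.rhat())
Next, for completeness, we plot the identified time-series model against the experimental data.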
pybop.plot.quick(problem, posterior_summary.mean);
Now, we can compare the identified parameters from each method. As the Bayesian sampler provides us with samples from the posterior, we use the statistical moments of those samples when comparing to the point-based result from the optimiser.
print(f"The XNES result is: {results_optim.x}")
print(f"The posterior means are:{posterior_summary.mean}")
print(
f"The difference between the two methods is: {np.abs(results_optim.x - posterior_summary.mean)}"
)
The CMAES result is: [4.15690229e-03 1.64236988e-03 8.78939716e-05 2.78474178e-04]
The posterior means are: [4.14717807e-03 1.64491239e-03 9.20455193e-05 2.86617819e-04]
The difference between the two methods is: [9.72421663e-06 2.54251113e-06 4.15154776e-06 8.14364117e-06]
The sampled posterior also gives us information about the uncertainty in the inference process. This is captured by the standard deviations as well as the lower and upper confidence intervals, shown below.
posterior_summary.summary_table()
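Finally, as an optional step, we can express the posterior spread as a relative uncertainty computed directly from the pooled chains; this again assumes chains is a NumPy array as above.
# Posterior standard deviations from the pooled chains, relative to the posterior means
pooled = chains.reshape(-1, chains.shape[-1])
print("Posterior std:", np.std(pooled, axis=0))
print("Relative uncertainty [%]:", 100 * np.std(pooled, axis=0) / posterior_summary.mean)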