Parameter estimation¶

In [6]:

using Pkg
Pkg.activate("..")
using Clapeyron, Metaheuristics

  Activating project at `~/Library/CloudStorage/OneDrive-CaliforniaInstituteofTechnology/University/UROP/SAFT_codes/Clapeyron`

In this notebook, we will illustrate how one can perform parameter estimation using Clapeyron.jl. To give the user the most-flexibility possible, we have left the choice of optimizer up to them. For all examples considered, we will be using Metaheuristics.jl. To keep the optimizations short, we'll use a very basic method (actual optimizations should use many more iterations):

In [110]:

method = ECA(;options=Options(iterations=100));

Pure-component parameter estimation in SAFT equations¶

As a first example, we will fit the pure-component PC-SAFT parameters for methane in Clapeyron.jl. Although we use the PC-SAFT equation of state in this example, this procedure could be repeated using any other pure-component equation of state available in Clapeyron.jl.

First we generate the model:

In [89]:

model = PCSAFT(["methane"]);

One can imagine that this model is our 'initial guess' for the parameters of methane. If the user wish to develop parameters for a species not available in Clapeyron.jl, they can introduce their parameters using the userlocations optional argument for the model. The next step is to define which parameters need to be fitted, along with their bounds and initial guesses:

In [90]:

toestimate = [
    Dict(
        :param => :epsilon,
        :lower => 100.,
        :upper => 300.,
        :guess => 250.
    ),
    Dict(
        :param => :sigma,
        :factor => 1e-10,
        :lower => 3.2,
        :upper => 4.0,
        :guess => 3.7
    )
    ,
    Dict(
        :param => :segment,
        :lower => 0.9,
        :upper => 1.1,
        :guess => 1.
    )
];

Note that, in the above, we have specified an additional argument, :factor, for :sigma. This is because, for most optimisers, it is best if all variables have values close to the same magnitude. Within Clapeyron.jl, all $\sigma$ values are in meters, which will be much smaller than all other parameters. As such, at the level of the optimiser, these parameters will be treated in angstroms and will be converted to the correct units internally.

The next step is to define the properties we wish to fit to. While there are many property estimation methods available in Clapeyron.jl, they may not always output the desired values. For example, the saturation_pressure method outputs the saturation pressure, liquid volume and vapour volume. In most cases for SAFT-type parameters, we will want to fit to the saturation pressure and liquid density. As such, we can define two new functions:

In [91]:

function saturation_p(model::EoSModel,T)
    sat = saturation_pressure(model,T)
    return sat[1]
end

function saturation_rhol(model::EoSModel,T)
    sat = saturation_pressure(model,T)
    return 1/sat[2]
end

saturation_rhol (generic function with 1 method)

The last step is the provide the experimental data. Within Clapeyron.jl, we accept our inputs as .csv files with the following format: | Clapeyron Estimator | | |---------------------|-------| | [method=saturation_p] | | | T | out_p | | 45.23 | 11.13 | | 55.29 | 391.8 |

Note that the inputs and outputs of the function named in the second cell is by the prefix out_ in the case of the latter.

Now that each part of the parameter estimation problem has been defined, we can compile it all together:

In [92]:

estimator,objective,initial,upper,lower = Estimation(model,toestimate,["data/saturation_pressure.csv","data/saturation_liquid_density.csv"]);

The estimator object contains all of the information relevant to the parameter estimation problem and objective takes in guesses for the parameters and outputs the value of the objective function (we use the root-mean-squared-relative error). initial, upper and lower are self-explanatory. We can then use our global optimiser to solve for the optimal parameters given a set of experimental data:

In [93]:

params, model = optimize(objective, estimator, method);

where params are the optimized parameters and model is the new model containing the optimized parameters. For easy storage, one can export the model to CSV files using export_model:

In [94]:

export_model(model);

If the user wishes to weight the various properties being fit to differently, this can be achieved by adding the weights when we build the estimator:

In [95]:

estimator,objective,initial,upper,lower = Estimation(model,toestimate,[(2.,"data/saturation_pressure.csv"),(1.,"data/saturation_liquid_density.csv")]);

We can then re-optimise the parameters:

In [96]:

params, model = optimize(objective, estimator, method);

One thing to note above is that, for evaluating the saturation pressure and saturated liquid densities, this is not the most-efficient way of doing so as it involves two calls to the saturation_pressure function. If we instead define a new function which outputs both properties, we can combine the csv spreadsheets into one:

In [97]:

function saturation_p_rhol(model::EoSModel,T)
    sat = saturation_pressure(model,T)
    return sat[1], 1/sat[2]
end

saturation_p_rhol (generic function with 1 method)

Re-building the estimator:

In [98]:

estimator,objective,initial,upper,lower = Estimation(model,toestimate,["data/saturation_pressure_liquid_density.csv"])

params, model = optimize(objective, estimator, method);

Mixture system parameter estimation in Activity Coefficient Models¶

Consider a water+ethanol system modeled using NRTL where we need to fit the cross binary interaction parameters ($A_{ij}$). Again, as a first step, we construct the initial model:

In [99]:

model = NRTL(["water","ethanol"])

NRTL{PR{BasicIdeal, PRAlpha, NoTranslation, vdW1fRule}} with 2 components:
 "water"
 "ethanol"
Contains parameters: a, b, c, Mw

For the sake of simplicity, we are only going to re-fit $a_{12}$, $a_{21}$ and $c_{12}$. As before, we can define the set of parameters we wish to fit:

In [100]:

toestimate = [
    Dict(
        :param => :a,
        :indices => (1,2),
        :symmetric => false,
        :lower => 2.,
        :upper => 5.,
        :guess => 3.
    ),
    Dict(
        :param => :a,
        :indices => (2,1),
        :symmetric => false,
        :lower => -2.,
        :upper => 0.,
        :guess => -1.
    )
    ,
    Dict(
        :param => :c,
        :indices => (1,2),
        :lower => 0.2,
        :upper => 0.5,
        :guess => 0.3
    )
];

One might notice some slight differences in the above example. For one, we have now specified the indices of the parameters we wish to fit (Clapeyron.jl assumes that the indices are always (1,1) unless otherwise specified). If one isn't sure of the indices of the parameters one wants to fit, one can look at the model.params object.

Furthermore, in the case of the a parameters, as they are asymmetric, an additional argument needs to be specified (:symmetric=>false) as Clapeyron.jl assumes that all binary interaction parameters are symmetric. This is why the :symmetric argument for the c parameter did not need to be specified.

Subsequently, we can define the properties we wish to estimate:

In [101]:

function bubble_point(model::EoSModel,T,x)
    bub = bubble_temperature(model,T,[x,1-x])
    return bub[1], bub[4][1]
end

bubble_point (generic function with 1 method)

Building the estimator:

In [102]:

estimator,objective,initial,upper,lower = Estimation(model,toestimate,["data/bubble_point.csv"]);

And estimating:

In [103]:

params, model = optimize(objective, estimator, method);

Mixture system parameter estimation in predictive cubics¶

Now consider the case where we wish to optimize activity coefficient parameters within a $G_E$ mixing rule of a cubic (for example, PSRK):

In [104]:

model = PSRK(["ethanol","water"])

RK{BasicIdeal, SoaveAlpha, PenelouxTranslation, PSRKRule{PSRKUNIFAC{BasicIdeal}}} with 2 components:
 "ethanol"
 "water"
Contains parameters: a, b, Tc, Pc, Mw

Here, the parameters are stored deep within the model struct:

In [105]:

model.mixing.activity.params

Clapeyron.UNIFACParam for ["CH2", "CH3", "OH", "H2O"] with 5 params:
 A::PairParam{Float64}
 B::PairParam{Float64}
 C::PairParam{Float64}
 R::SingleParam{Float64}
 Q::SingleParam{Float64}

Thankfully, Clapeyron will iterate down the struct until it finds parameters with names which match those specified within toestimate. As such, not much needs to change at this level:

In [106]:

toestimate = [
    Dict(
        :param => :A,
        :indices => (3,4),
        :symmetric => false,
        :lower => -200.,
        :upper => 500.,
        :guess => -250.
    ),
    Dict(
        :param => :A,
        :indices => (4,3),
        :symmetric => false,
        :lower => 0.,
        :upper => 1000.,
        :guess => 350.
    )
];

The only thing we have to worry about is if the model contains other submodels whose parameters match those we wish to optimize. To avoid optimizing the wrong parameters, we can specify a submodel to ignore using an additional argument to the Estimation function:

In [107]:

estimator,objective,initial,upper,lower = Estimation(model,toestimate,["data/bubble_point.csv"],[:puremodel]);

After this, optimization can continue as normal:

In [113]:

method = ECA(;options=Options(iterations=100));

params, model = optimize(objective, estimator, method);

Estimating association parameters in SAFT equations of state¶

Sticking to the same ethanol+water system, let us say we want to re-fit the water association parameters. One thing to note in Clapeyron.jl is that the association parameters are compressed:

In [114]:

model = PCSAFT(["water","ethanol"]);

model.params.epsilon_assoc

AssocParam{Float64}["water", "ethanol"]) with 4 values:
("water", "e") >=< ("water", "H"): 2500.7
("water", "e") >=< ("ethanol", "H"): 2577.05
("water", "H") >=< ("ethanol", "e"): 2577.05
("ethanol", "e") >=< ("ethanol", "H"): 2653.4

As such, when specifying which index to fit, we need to specify the index based on the list above. As mentioned before, the index assumed by Clapeyron.jl is always (1,1). As such, for fitting just the water parameters, we don't need to specify the index:

In [115]:

toestimate = [
    Dict(
        :param => :epsilon_assoc,
        :lower => 1000.,
        :upper => 3000.,
        :guess => 2500.
    ),
    Dict(
        :param => :bondvol,
        :lower => 0.02,
        :upper => 0.04,
        :guess => 0.03
    )
];

We can recombine everything as before to build our estimator:

In [116]:

estimator,objective,initial,upper,lower = Estimation(model,toestimate,["data/bubble_point.csv"]);

We can then re-fit the parameters:

In [117]:

method = ECA(;options=Options(iterations=100));

params, model = optimize(objective, estimator, method);

However, more-realistically, we will want to fit the cross-association parameters for the ethanol+water system. In this case, we have two sets of parameters which need to be varied together (ethanol,H-water,e and ethanol,e-water,H). This can be done by specifying the :cross_assoc=>true argument:

In [118]:

toestimate = [
    Dict(
        :param => :epsilon_assoc,
        :indices => 2,
        :cross_assoc => true,
        :lower => 1000.,
        :upper => 3000.,
        :guess => 2500.
    ),
    Dict(
        :param => :bondvol,
        :indices => 2,
        :cross_assoc => true,
        :lower => 0.02,
        :upper => 0.04,
        :guess => 0.03
    )
];

We can now build our estimator and re-fit the parameters:

In [119]:

method = ECA(;options=Options(iterations=100));

params, model = optimize(objective, estimator, method);

Group-contribution parameter estimation¶

The last case we consider is the parameter estimation of group-contribution parameters. We will do this in the context of SAFT-$\gamma$ Mie for the ethane+propane system. As before, let us define our model:

In [120]:

model = SAFTgammaMie(["ethane","propane"]);

In this case, we are going to re-fit the $\epsilon$ parameters for both the CH$_3$ and CH$_2$ groups. However, in doing so, we want the unlike $\epsilon$ parameter to be updated using the Hudson-McCoubrey combining rule:

$$ \epsilon_{kl}=\frac{\sqrt{\sigma_{kk}^3\sigma_{ll}^3}}{\sqrt{\sigma_{kl}^3}}\sqrt{\epsilon_{kk}\epsilon_{ll}}$$

By default, Clapeyron.jl will only vary the parameters specified in the toestimate object. The way to specify that we wish to use the combining rules in Clapeyron.jl is using the :recombine argument in our parameters:

In [121]:

toestimate = [
    Dict(
        :param => :epsilon,
        :indices => (1,1),
        :recombine => true,
        :lower => 200.,
        :upper => 500.,
        :guess => 350.
    ),
    Dict(
        :param => :epsilon,
        :indices => (2,2),
        :recombine => true,
        :lower => 200.,
        :upper => 500.,
        :guess => 350.
    )
];

Once this is done, we can define our property we wish to estimate. In this case, as the CH$_3$ and CH$_2$ groups are involved in both species, and we wish to fit using pure-component saturation pressure, we will need to specify which species is involved in which data set: | Clapeyron Estimator | | |---------------------|-------| | [method=saturation_p_rhol,species=ethane] | | | T | out_p | | 45.23 | 11.13 | | 55.29 | 391.8 |

With all of this set-up, we can build our estimator:

In [122]:

estimator,objective,initial,upper,lower = Estimation(model,toestimate,["data/gc_sat_eth.csv","data/gc_sat_prop.csv"],[:vrmodel]);

Note that, above, we have an additional argument, [:vrmodel]. The difficulty with SAFT-$\gamma$ Mie specifically is that it has a submodel, vrmodel where we have mapped the group-contribution parameters to component-specific parameters, which helps keep the implementation of SAFT-$\gamma$ Mie concise. The disadvantage of this is that there is now a submodel in SAFT-$\gamma$ Mie which has the same parameter names as the ones we want to fit. By default, Clapeyron.jl will look through all submodels to find the parameters with the names specified. As such, to avoid incorrectly varying the submodel parameters, we add this additional argument.

With everything set-up, we can now fit the parameters:

In [123]:

method = ECA(;options=Options(iterations=100));

params, model = optimize(objective, estimator, method);