Before running the tutorial you should have Julia 1.5.x or newer installed and add the JuliaMolSim registry

] registry add https://github.com/JuliaRegistries/General

] registry add https://github.com/JuliaMolSim/MolSim.git

Packages can then be installed using

] add JuLIP, ACE, IPFitting

ACE Fitting and Regularisation

In [1]:
using IPFitting, ACE, JuLIP

Reading in the configurations

In [2]:
cfgs = IPFitting.Data.read_xyz("./Al_LD_MD.xyz", energy_key="energy", force_key="force", virial_key="virial");
Reading in ./Al_LD_MD.xyz ...
Processing data ...
┌ Info: Keys used: E => "energy", F => "force", V => "virial"
└ @ IPFitting.Data /Users/Cas/.julia/dev/IPFitting/src/data.jl:153
┌ Info: Array keys found: "force" [1063], "masses" [999], "momenta" [999], "numbers" [1063], "positions" [1063]
└ @ IPFitting.Data /Users/Cas/.julia/dev/IPFitting/src/data.jl:157
┌ Info: Info keys found: "config_type" [1063], "energy" [1063], "virial" [1063]
└ @ IPFitting.Data /Users/Cas/.julia/dev/IPFitting/src/data.jl:157
Progress: 100%|█████████████████████████████████████████| Time: 0:00:07
┌─────────────────┬───────┬───────┬───────┬───────┬───────┐
│     config_type  #cfgs  #envs     #E     #F     #V │
│          String  Int64  Int64  Int64  Int64  Int64 │
├─────────────────┼───────┼───────┼───────┼───────┼───────┤
│        Al_T4000 │    32 │  2240 │    32 │  6720 │   288 │
│        Al_T1000 │    32 │  2240 │    32 │  6720 │   288 │
│ Al_fcc_LD_1000K │   999 │   999 │   999 │  2997 │  8991 │
├─────────────────┼───────┼───────┼───────┼───────┼───────┤
│           total │  1063 │  5479 │  1063 │ 16437 │  9567 │
│         missing │     0 │     0 │     0 │     0 │     0 │
└─────────────────┴───────┴───────┴───────┴───────┴───────┘

Listing the unique configuration types (taken from Python ASE Atoms objects at.info["config_type"] from .xyz file)

In [3]:
@show unique(configtype.(cfgs));
unique(configtype.(cfgs)) = ["Al_T4000", "Al_T1000", "Al_fcc_LD_1000K"]

ACE Basis

We set up a 2B + ACE basis. We will use the many-body ACE basis for interatomic distances [0.8*r0, 2.0*r0], and the 2B for all distances within a 3.0*r0 cutoff.

Here B2 is the polynomial degree of the 2B part, N is the number of interaction neighbours, polydeg the maximum polynomial degree in the ACE fit

In [4]:
r0 = rnn(:Al)
N = 3
deg_site = 12
deg_pair = 3

# construction of a basic basis for site energies 
Bsite = rpi_basis(species = :Al,  
                  N = N,       # correlation order = body-order - 1
                  maxdeg = deg_site,  # polynomial degree
                  r0 = r0,     # estimate for NN distance
                  rin = 0.8*r0, rcut = 2*r0,   # domain for radial basis (cf documentation)
                  pin = 2)                     # require smooth inner cutoff

# pair potential basis 
Bpair = pair_basis(species = :Al, r0 = r0, maxdeg = deg_pair, 
                   rcut = 3 * r0, rin = 0.0, 
                   pin = 0 )   # pin = 0 means no inner cutoff


B = JuLIP.MLIPs.IPSuperBasis([Bpair, Bsite]);

length(B) returns the size of the ACE Basis. Generally chosen in a 1000 to 10 000 range

In [5]:
length(B)
Out[5]:
214

LsqDB assembly

Use export JULIA_NUM_THREADS=16 to parallelise the LSQDB assembly over 16 cores. Default is assembly in serial. Unfortunately Julia doesn't allow (yet?) controlling the number of threads from within the code.

The assembled least squares system is then stored on disk. For a simple problem such as this, it is not necessary. But for larger problems it can take hours to assemble the lsq matrix versus second to load it from disk. This makes it possible to experiment interactively with different fitting parameters

In [6]:
dbname = "./Al_MD_LD_PH_B$(deg_pair)_N$(N)_$(deg_site)"
@show dbname
dB = LsqDB(dbname, B, cfgs);
dbname = "./Al_MD_LD_PH_B3_N3_12"
Assemble LSQ blocks in serial
Progress: 100%|█████████████████████████████████████████| Time: 0:01:15
┌ Info: Elapsed: 75.8s
└ @ IPFitting.Tools /Users/Cas/.julia/dev/IPFitting/src/tools.jl:68
┌ Info: Writing db to disk...
└ @ IPFitting.DB /Users/Cas/.julia/dev/IPFitting/src/lsq_db.jl:188
┌ Warning: The file ./Al_MD_LD_PH_B3_N3_12_info.json already exists. It will be renamed to ./Al_MD_LD_PH_B3_N3_12_info.json.gyrja to avoid overwriting.
└ @ IPFitting.DB /Users/Cas/.julia/dev/IPFitting/src/lsq_db.jl:76
┌ Warning: The file ./Al_MD_LD_PH_B3_N3_12_kron.h5 already exists. It will be renamed to ./Al_MD_LD_PH_B3_N3_12_kron.h5.zvtwj to avoid overwriting.
└ @ IPFitting.DB /Users/Cas/.julia/dev/IPFitting/src/lsq_db.jl:76
┌ Info: ... done
└ @ IPFitting.DB /Users/Cas/.julia/dev/IPFitting/src/lsq_db.jl:195

Fitting

Load the assembled LsqDB database. This is not strictly necessary in this case since it is already in memory. But the more typical workflow is to have one script that assembles the database and a second script that loads it and produces fits.

In [7]:
dB = LsqDB(dbname);

E0 is the isolated atom energy

Set the weights, default sets the default weights, which can be overwritten for a specific configuration type

In [8]:
E0 = -105.5951

weights = Dict(
        "default" => Dict("E" => 30.0, "F" => 1.0 , "V" => 1.0 ),
        "Al_fcc_LD_1000K" => Dict("E" => 40.0, "F" => 1.0 , "V" => 10.0 ));

Vref = OneBody(:Al => E0)
Out[8]:
OneBody{Float64}(Dict(:Al => -105.5951))

Perform the fit using the lsqfit command. We're using simple ridge regression here.

In [9]:
IP, lsqinfo = lsqfit(dB, solver=(:rid, 1.05), weights = weights, Vref = Vref,
                               asmerrs = true);
┌ Info: assemble lsq system
└ @ IPFitting.Lsq /Users/Cas/.julia/dev/IPFitting/src/lsq.jl:383
┌ Info: Free Memory: ≈ 0.03 GB
└ @ IPFitting.Lsq /Users/Cas/.julia/dev/IPFitting/src/lsq.jl:307
┌ Info: Free Memory: ≈ 0.03 GB
└ @ IPFitting.Lsq /Users/Cas/.julia/dev/IPFitting/src/lsq.jl:307
┌ Info: Free Memory: ≈ 0.13 GB
└ @ IPFitting.Lsq /Users/Cas/.julia/dev/IPFitting/src/lsq.jl:307
┌ Info: solve (23878, 214) LSQ system using Ridge Regression [r = 1.05] 
└ @ IPFitting.Lsq /Users/Cas/.julia/dev/IPFitting/src/lsq.jl:432
┌ Info: `reglsq` : solve regularised least squares
└ @ IPFitting.Lsq /Users/Cas/.julia/dev/IPFitting/src/lsq.jl:1160
┌ Info: found bracket, starting bisection
└ @ IPFitting.Lsq /Users/Cas/.julia/dev/IPFitting/src/lsq.jl:1188
┌ Info: found a solution
└ @ IPFitting.Lsq /Users/Cas/.julia/dev/IPFitting/src/lsq.jl:1211
┌ Info: Free Memory: ≈ 0.09 GB
└ @ IPFitting.Lsq /Users/Cas/.julia/dev/IPFitting/src/lsq.jl:307
┌ Info: Relative RMSE on training set: 1.4690414202516895
└ @ IPFitting.Lsq /Users/Cas/.julia/dev/IPFitting/src/lsq.jl:1024
┌ Info: Assemble errors table
└ @ IPFitting.Lsq /Users/Cas/.julia/dev/IPFitting/src/lsq.jl:1056
┌ Warning: new error implementation... redo this part please 
└ @ IPFitting.Lsq /Users/Cas/.julia/dev/IPFitting/src/lsq.jl:1057
┌ Info: Assemble Information about the fit
└ @ IPFitting.Lsq /Users/Cas/.julia/dev/IPFitting/src/lsq.jl:1070
In [10]:
rmse_table(lsqinfo["errors"])
┌─────────────────┬─────────┬──────────┬────────┬────────┐
│     config type  E [meV]  F [eV/A]   F [%]  V[meV] │
├─────────────────┼─────────┼──────────┼────────┼────────┤
│        Al_T1000 │   3.402 │    0.078 │ 10.306 │ 19.783 │
│        Al_T4000 │   5.238 │    0.296 │  8.816 │ 50.306 │
│ Al_fcc_LD_1000K │   4.092 │    0.000 │    NaN │ 55.919 │
├─────────────────┼─────────┼──────────┼────────┼────────┤
│             set │   4.112 │    0.196 │  8.894 │ 55.014 │
└─────────────────┴─────────┴──────────┴────────┴────────┘
In [11]:
save_dict("ACE_Al.json", Dict("IP" => write_dict(IP), "info" => lsqinfo))
In [ ]:

In [ ]:

In [ ]:

2B Repulsion

Add 2B repulsion for interatomic distances < 0.6*r0. This ensures the potential is repulsive at small interatomic distances where there is no data.

In [12]:
ri = round(0.6 * r0, digits=2)
Vfit = IP.components[2]
Vrep = ACE.PairPotentials.RepulsiveCore(Vfit, ri)

IP.components[2] = Vrep;

Calculating error/force/virial errors with regularised and 2B repulsion fit.

In [13]:
add_fits_serial!(IP, cfgs; fitkey="fit_rep")
rmse_, rmserel_ = rmse(cfgs; fitkey="fit_rep");
In [14]:
rmse_table(rmse_, rmserel_)
┌─────────────────┬─────────┬──────────┬────────┬────────┐
│     config type  E [meV]  F [eV/A]   F [%]  V[meV] │
├─────────────────┼─────────┼──────────┼────────┼────────┤
│        Al_T1000 │   3.402 │    0.078 │ 10.306 │ 19.783 │
│        Al_T4000 │   5.198 │    0.296 │  8.810 │ 50.504 │
│ Al_fcc_LD_1000K │   4.092 │    0.000 │    NaN │ 55.919 │
├─────────────────┼─────────┼──────────┼────────┼────────┤
│             set │   4.111 │    0.196 │  8.888 │ 55.020 │
└─────────────────┴─────────┴──────────┴────────┴────────┘

Saving the fit

In [15]:
lsqinfo["errors"]["rmse"] = rmse_
lsqinfo["errors"]["rmserel"] = rmserel_

potname = "$(dbname)_regfit_$(ri)_rep.json"
save_dict(potname, Dict("IP" => write_dict(IP_reg), "info" => lsqinfo))
UndefVarError: IP_reg not defined

Stacktrace:
 [1] top-level scope at In[15]:5
 [2] include_string(::Function, ::Module, ::String, ::String) at ./loading.jl:1091

Usage

In [16]:
V = read_dict(load_dict(potname)["IP"]);
SystemError: opening file "./Al_MD_LD_PH_B3_N3_12_regfit_1.72_rep.json": No such file or directory

Stacktrace:
 [1] systemerror(::String, ::Int32; extrainfo::Nothing) at ./error.jl:168
 [2] #systemerror#48 at ./error.jl:167 [inlined]
 [3] systemerror at ./error.jl:167 [inlined]
 [4] open(::String; lock::Bool, read::Nothing, write::Nothing, create::Nothing, truncate::Nothing, append::Nothing) at ./iostream.jl:284
 [5] open at ./iostream.jl:273 [inlined]
 [6] open(::JSON.Parser.var"#4#5"{DataType,DataType,Nothing,Bool,Bool,Int64}, ::String; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at ./io.jl:323
 [7] open at ./io.jl:323 [inlined]
 [8] #parsefile#3 at /Users/Cas/.julia/packages/JSON/3rsiS/src/Parser.jl:519 [inlined]
 [9] parsefile at /Users/Cas/.julia/packages/JSON/3rsiS/src/Parser.jl:518 [inlined]
 [10] load_dict(::String) at /Users/Cas/.julia/dev/JuLIP/src/FIO.jl:78
 [11] top-level scope at In[16]:1
 [12] include_string(::Function, ::Module, ::String, ::String) at ./loading.jl:1091
In [17]:
at = bulk(:Al) * (2,2,2);
In [18]:
energy(V, at)
UndefVarError: V not defined

Stacktrace:
 [1] top-level scope at In[18]:1
 [2] include_string(::Function, ::Module, ::String, ::String) at ./loading.jl:1091
In [19]:
forces(V, at)
UndefVarError: V not defined

Stacktrace:
 [1] top-level scope at In[19]:1
 [2] include_string(::Function, ::Module, ::String, ::String) at ./loading.jl:1091
In [ ]:

In [ ]: