Before running the tutorial you should have Julia 1.5.x or newer installed and add the JuliaMolSim registry
] registry add https://github.com/JuliaRegistries/General
] registry add https://github.com/JuliaMolSim/MolSim.git
Packages can then be installed using
] add JuLIP, ACE, IPFitting
using IPFitting, ACE, JuLIP
cfgs = IPFitting.Data.read_xyz("./Al_LD_MD.xyz", energy_key="energy", force_key="force", virial_key="virial");
Reading in ./Al_LD_MD.xyz ... Processing data ...
┌ Info: Keys used: E => "energy", F => "force", V => "virial"
└ @ IPFitting.Data /Users/Cas/.julia/dev/IPFitting/src/data.jl:153
┌ Info: Array keys found: "force" [1063], "masses" [999], "momenta" [999], "numbers" [1063], "positions" [1063]
└ @ IPFitting.Data /Users/Cas/.julia/dev/IPFitting/src/data.jl:157
┌ Info: Info keys found: "config_type" [1063], "energy" [1063], "virial" [1063]
└ @ IPFitting.Data /Users/Cas/.julia/dev/IPFitting/src/data.jl:157
Progress: 100%|█████████████████████████████████████████| Time: 0:00:07
┌─────────────────┬───────┬───────┬───────┬───────┬───────┐ │ config_type │ #cfgs │ #envs │ #E │ #F │ #V │ │ String │ Int64 │ Int64 │ Int64 │ Int64 │ Int64 │ ├─────────────────┼───────┼───────┼───────┼───────┼───────┤ │ Al_T4000 │ 32 │ 2240 │ 32 │ 6720 │ 288 │ │ Al_T1000 │ 32 │ 2240 │ 32 │ 6720 │ 288 │ │ Al_fcc_LD_1000K │ 999 │ 999 │ 999 │ 2997 │ 8991 │ ├─────────────────┼───────┼───────┼───────┼───────┼───────┤ │ total │ 1063 │ 5479 │ 1063 │ 16437 │ 9567 │ │ missing │ 0 │ 0 │ 0 │ 0 │ 0 │ └─────────────────┴───────┴───────┴───────┴───────┴───────┘
Listing the unique configuration types (taken from Python ASE Atoms objects at.info["config_type"]
from .xyz file)
@show unique(configtype.(cfgs));
unique(configtype.(cfgs)) = ["Al_T4000", "Al_T1000", "Al_fcc_LD_1000K"]
We set up a 2B + ACE basis. We will use the many-body ACE basis for interatomic distances [0.8*r0
, 2.0*r0
], and the 2B for all distances within a 3.0*r0
cutoff.
Here B2
is the polynomial degree of the 2B part, N
is the number of interaction neighbours, polydeg
the maximum polynomial degree in the ACE fit
r0 = rnn(:Al)
N = 3
deg_site = 12
deg_pair = 3
# construction of a basic basis for site energies
Bsite = rpi_basis(species = :Al,
N = N, # correlation order = body-order - 1
maxdeg = deg_site, # polynomial degree
r0 = r0, # estimate for NN distance
rin = 0.8*r0, rcut = 2*r0, # domain for radial basis (cf documentation)
pin = 2) # require smooth inner cutoff
# pair potential basis
Bpair = pair_basis(species = :Al, r0 = r0, maxdeg = deg_pair,
rcut = 3 * r0, rin = 0.0,
pin = 0 ) # pin = 0 means no inner cutoff
B = JuLIP.MLIPs.IPSuperBasis([Bpair, Bsite]);
length(B)
returns the size of the ACE Basis. Generally chosen in a 1000 to 10 000 range
length(B)
214
Use export JULIA_NUM_THREADS=16
to parallelise the LSQDB assembly over 16 cores. Default is assembly in serial. Unfortunately Julia doesn't allow (yet?) controlling the number of threads from within the code.
The assembled least squares system is then stored on disk. For a simple problem such as this, it is not necessary. But for larger problems it can take hours to assemble the lsq matrix versus second to load it from disk. This makes it possible to experiment interactively with different fitting parameters
dbname = "./Al_MD_LD_PH_B$(deg_pair)_N$(N)_$(deg_site)"
@show dbname
dB = LsqDB(dbname, B, cfgs);
dbname = "./Al_MD_LD_PH_B3_N3_12" Assemble LSQ blocks in serial
Progress: 100%|█████████████████████████████████████████| Time: 0:01:15
┌ Info: Elapsed: 75.8s
└ @ IPFitting.Tools /Users/Cas/.julia/dev/IPFitting/src/tools.jl:68
┌ Info: Writing db to disk...
└ @ IPFitting.DB /Users/Cas/.julia/dev/IPFitting/src/lsq_db.jl:188
┌ Warning: The file ./Al_MD_LD_PH_B3_N3_12_info.json already exists. It will be renamed to ./Al_MD_LD_PH_B3_N3_12_info.json.gyrja to avoid overwriting.
└ @ IPFitting.DB /Users/Cas/.julia/dev/IPFitting/src/lsq_db.jl:76
┌ Warning: The file ./Al_MD_LD_PH_B3_N3_12_kron.h5 already exists. It will be renamed to ./Al_MD_LD_PH_B3_N3_12_kron.h5.zvtwj to avoid overwriting.
└ @ IPFitting.DB /Users/Cas/.julia/dev/IPFitting/src/lsq_db.jl:76
┌ Info: ... done
└ @ IPFitting.DB /Users/Cas/.julia/dev/IPFitting/src/lsq_db.jl:195
Load the assembled LsqDB database. This is not strictly necessary in this case since it is already in memory. But the more typical workflow is to have one script that assembles the database and a second script that loads it and produces fits.
dB = LsqDB(dbname);
E0
is the isolated atom energy
Set the weights, default
sets the default weights, which can be overwritten for a specific configuration type
E0 = -105.5951
weights = Dict(
"default" => Dict("E" => 30.0, "F" => 1.0 , "V" => 1.0 ),
"Al_fcc_LD_1000K" => Dict("E" => 40.0, "F" => 1.0 , "V" => 10.0 ));
Vref = OneBody(:Al => E0)
OneBody{Float64}(Dict(:Al => -105.5951))
Perform the fit using the lsqfit
command. We're using simple ridge regression here.
IP, lsqinfo = lsqfit(dB, solver=(:rid, 1.05), weights = weights, Vref = Vref,
asmerrs = true);
┌ Info: assemble lsq system └ @ IPFitting.Lsq /Users/Cas/.julia/dev/IPFitting/src/lsq.jl:383 ┌ Info: Free Memory: ≈ 0.03 GB └ @ IPFitting.Lsq /Users/Cas/.julia/dev/IPFitting/src/lsq.jl:307 ┌ Info: Free Memory: ≈ 0.03 GB └ @ IPFitting.Lsq /Users/Cas/.julia/dev/IPFitting/src/lsq.jl:307 ┌ Info: Free Memory: ≈ 0.13 GB └ @ IPFitting.Lsq /Users/Cas/.julia/dev/IPFitting/src/lsq.jl:307 ┌ Info: solve (23878, 214) LSQ system using Ridge Regression [r = 1.05] └ @ IPFitting.Lsq /Users/Cas/.julia/dev/IPFitting/src/lsq.jl:432 ┌ Info: `reglsq` : solve regularised least squares └ @ IPFitting.Lsq /Users/Cas/.julia/dev/IPFitting/src/lsq.jl:1160 ┌ Info: found bracket, starting bisection └ @ IPFitting.Lsq /Users/Cas/.julia/dev/IPFitting/src/lsq.jl:1188 ┌ Info: found a solution └ @ IPFitting.Lsq /Users/Cas/.julia/dev/IPFitting/src/lsq.jl:1211 ┌ Info: Free Memory: ≈ 0.09 GB └ @ IPFitting.Lsq /Users/Cas/.julia/dev/IPFitting/src/lsq.jl:307 ┌ Info: Relative RMSE on training set: 1.4690414202516895 └ @ IPFitting.Lsq /Users/Cas/.julia/dev/IPFitting/src/lsq.jl:1024 ┌ Info: Assemble errors table └ @ IPFitting.Lsq /Users/Cas/.julia/dev/IPFitting/src/lsq.jl:1056 ┌ Warning: new error implementation... redo this part please └ @ IPFitting.Lsq /Users/Cas/.julia/dev/IPFitting/src/lsq.jl:1057 ┌ Info: Assemble Information about the fit └ @ IPFitting.Lsq /Users/Cas/.julia/dev/IPFitting/src/lsq.jl:1070
rmse_table(lsqinfo["errors"])
┌─────────────────┬─────────┬──────────┬────────┬────────┐ │ config type │ E [meV] │ F [eV/A] │ F [%] │ V[meV] │ ├─────────────────┼─────────┼──────────┼────────┼────────┤ │ Al_T1000 │ 3.402 │ 0.078 │ 10.306 │ 19.783 │ │ Al_T4000 │ 5.238 │ 0.296 │ 8.816 │ 50.306 │ │ Al_fcc_LD_1000K │ 4.092 │ 0.000 │ NaN │ 55.919 │ ├─────────────────┼─────────┼──────────┼────────┼────────┤ │ set │ 4.112 │ 0.196 │ 8.894 │ 55.014 │ └─────────────────┴─────────┴──────────┴────────┴────────┘
save_dict("ACE_Al.json", Dict("IP" => write_dict(IP), "info" => lsqinfo))
Add 2B repulsion for interatomic distances < 0.6*r0
. This ensures the potential is repulsive at small interatomic distances where there is no data.
ri = round(0.6 * r0, digits=2)
Vfit = IP.components[2]
Vrep = ACE.PairPotentials.RepulsiveCore(Vfit, ri)
IP.components[2] = Vrep;
Calculating error/force/virial errors with regularised and 2B repulsion fit.
add_fits_serial!(IP, cfgs; fitkey="fit_rep")
rmse_, rmserel_ = rmse(cfgs; fitkey="fit_rep");
rmse_table(rmse_, rmserel_)
┌─────────────────┬─────────┬──────────┬────────┬────────┐ │ config type │ E [meV] │ F [eV/A] │ F [%] │ V[meV] │ ├─────────────────┼─────────┼──────────┼────────┼────────┤ │ Al_T1000 │ 3.402 │ 0.078 │ 10.306 │ 19.783 │ │ Al_T4000 │ 5.198 │ 0.296 │ 8.810 │ 50.504 │ │ Al_fcc_LD_1000K │ 4.092 │ 0.000 │ NaN │ 55.919 │ ├─────────────────┼─────────┼──────────┼────────┼────────┤ │ set │ 4.111 │ 0.196 │ 8.888 │ 55.020 │ └─────────────────┴─────────┴──────────┴────────┴────────┘
Saving the fit
lsqinfo["errors"]["rmse"] = rmse_
lsqinfo["errors"]["rmserel"] = rmserel_
potname = "$(dbname)_regfit_$(ri)_rep.json"
save_dict(potname, Dict("IP" => write_dict(IP_reg), "info" => lsqinfo))
UndefVarError: IP_reg not defined
Stacktrace:
[1] top-level scope at In[15]:5
[2] include_string(::Function, ::Module, ::String, ::String) at ./loading.jl:1091
V = read_dict(load_dict(potname)["IP"]);
SystemError: opening file "./Al_MD_LD_PH_B3_N3_12_regfit_1.72_rep.json": No such file or directory
Stacktrace:
[1] systemerror(::String, ::Int32; extrainfo::Nothing) at ./error.jl:168
[2] #systemerror#48 at ./error.jl:167 [inlined]
[3] systemerror at ./error.jl:167 [inlined]
[4] open(::String; lock::Bool, read::Nothing, write::Nothing, create::Nothing, truncate::Nothing, append::Nothing) at ./iostream.jl:284
[5] open at ./iostream.jl:273 [inlined]
[6] open(::JSON.Parser.var"#4#5"{DataType,DataType,Nothing,Bool,Bool,Int64}, ::String; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at ./io.jl:323
[7] open at ./io.jl:323 [inlined]
[8] #parsefile#3 at /Users/Cas/.julia/packages/JSON/3rsiS/src/Parser.jl:519 [inlined]
[9] parsefile at /Users/Cas/.julia/packages/JSON/3rsiS/src/Parser.jl:518 [inlined]
[10] load_dict(::String) at /Users/Cas/.julia/dev/JuLIP/src/FIO.jl:78
[11] top-level scope at In[16]:1
[12] include_string(::Function, ::Module, ::String, ::String) at ./loading.jl:1091
at = bulk(:Al) * (2,2,2);
energy(V, at)
UndefVarError: V not defined
Stacktrace:
[1] top-level scope at In[18]:1
[2] include_string(::Function, ::Module, ::String, ::String) at ./loading.jl:1091
forces(V, at)
UndefVarError: V not defined
Stacktrace:
[1] top-level scope at In[19]:1
[2] include_string(::Function, ::Module, ::String, ::String) at ./loading.jl:1091