CastepBaseWorkChain
¶aiida-castep
is a plugin to interface CASTEP with AiiDA (Automated Interactive Infrastructure and Database for Computational Science).
This example notebook goes through running calculations though CastepBaseWorkChain
, which wrapps around CastepCalculation
and attempts to correct common errors.
In AiiDA's terminology a Calculation
can be thought as a single transaction with the underlying code,
whereas WorkChain
s are entities that perform workflows for solving specific problems.
%load_ext aiida
from aiida import load_profile, engine, orm, plugins
from aiida.storage.sqlite_temp import SqliteTempBackend
profile = load_profile(
SqliteTempBackend.create_profile(
'myprofile',
sandbox_path='_sandbox',
options={
'warnings.development_version': False,
'runner.poll.interval': 1
},
debug=False
),
allow_switch=True
)
profile
Profile<uuid='79db3e4c3778408da86dfdc8f680f521' name='myprofile'>
Please also take a look at the CastepCalculation
example before you proceed to learn what the input and outputs look like.
As CastepBaseWorkChain
takes similar inputs are for a CastepCalculation
, but it allows certain simplifications:
The main difference is that what originally goes into CastepCalculation
now resides under a PortNameSpace
names calc
. For example, to pass the parameters
to a CastepCalculation
, ones needs to do:
inputs = {
'parameters': Dict(dict={
'PARAM': {
'task': 'singlepoint',
....
},
'CELL': {
'symmetry_generate': True,
.....
}
})
}
submit(CastepCalculation, **inputs)
and the ProcessBuilder
interface follows the same scheme.
Using CastepBaseWorkChain
, this becomes:
inputs = {
'calc': {
'parameters': Dict(dict={
'task': 'singlepoint',
'symmetry_generate': True,
.....
}
})
}
}
where Dict
node no longer needs to have sub fields corresponding to cell
and param
files - all of the keys can be placed at the top level.
The other differences are:
kpoints_spacing
input port for defining the density of kpoints. Note that CASTEP itself supports kpoints_mp_spacing
, but using this would not allow the grid to be recorded in the provenance graph, so using kpoints_spacing
input port is the preferred approach.clean_workdir
is set to Bool(True)
the remote workdir(s) will be cleaned it the workflow is successful.ensure_gamma_centering
is set to Bool(True)
the kpoints mesh will include the necessary offsets to make sure it is Gamma centred.pseudo_family
port.continuation_folder
or reuse_folder
is set, the flag for continuation/reuse will be included automatically for the underlying calculations.Below is a walk-through of the steps to create a single point calculation using the BaseCastepWorkChain
.
import numpy as np
from pprint import pprint
# Load the AiiDA environment
import aiida.orm as orm
from aiida.plugins import WorkflowFactory
This is taken from the online tutorial.
The cell
file contain the crystal structure and related setting.
!cat 'bandstructure/silicon/Si2.cell' | grep -v -e "^!"
%block lattice_cart 2.6954645 2.6954645 0.0 2.6954645 0.0 2.6954645 0.0 2.6954645 2.6954645 %endblock lattice_cart %block positions_frac Si 0.00 0.00 0.00 Si 0.25 0.25 0.25 %endblock positions_frac symmetry_generate %block species_pot Si Si_00.usp %endblock species_pot kpoint_mp_grid 4 4 4 %block bs_kpoint_path 0.5 0.25 0.75 ! W 0.5 0.5 0.5 ! L 0.0 0.0 0.0 ! Gamma 0.5 0.0 0.5 ! X 0.5 0.25 0.75 ! W 0.375 0.375 0.75 ! K %endblock bs_kpoint_path
The param
file contains a list of key-value pairs
!cat 'bandstructure/silicon/Si2.param' | grep -v -e "^!"
task bandstructure ! The TASK keyword instructs CASTEP what to do xc_functional LDA ! Which exchange-correlation functional to use. basis_precision MEDIUM ! Choose high cut-off COARSE/MEDIUM/FINE/PRECISE fix_occupancy true ! Treat the system as an insulator opt_strategy speed ! Choose algorithms for best speed at expense of memory. num_dump_cycles 0 ! Don't write unwanted ".wvfn" files. write_formatted_density TRUE ! Write out a density file that we can view using (e.g.) Jmol.
We setup a similar calculation with Si here. Instead of going for the band structure, we just do a single point run.
StructureData
node¶# Define the structure
from aiida.plugins import DataFactory
StructureData = DataFactory('structure')
silicon = StructureData()
r_unit = 2.6954645
silicon.set_cell(np.array([[1, 1, 0], [1, 0, 1], [0, 1, 1]]) * r_unit)
silicon.append_atom(symbols=["Si"], position=[0, 0, 0])
silicon.append_atom(symbols=["Si"], position=[r_unit * 0.5] * 3)
silicon.label = "Si"
silicon.description = "A silicon structure"
Atomic Simulation Environment (ASE) has a rich set of tools for handling structures, center around the ase.Atoms
class.
They can be converted to StructureData
that AiiDA understands and saves to the provenance graph.
# You can also use ase.Atoms object to create the StructureData
from ase import Atoms
silicon_atoms = Atoms('Si2', cell=silicon.cell, scaled_positions=((0, 0, 0), (0.25, 0.25, 0.25)))
silicon_from_atoms = StructureData(ase=silicon_atoms)
StructureData
can be converted back to Atoms
for complex operation using ase
.
# You can also convent the StructureData back to ase.Atoms
silicon_atoms_2 = silicon_from_atoms.get_ase()
silicon_atoms_2 == silicon_atoms
True
Note that a similar interface also exists to work with pymatgen
.
ProcessBuilder
for setting up the inputs¶The ProcessBuilder
is very useful - it allows the inputs to be defined interactively.
CastepBaseWorkChain = WorkflowFactory('castep.base')
builder = CastepBaseWorkChain.get_builder()
builder.calc.parameters = {
"task": "singlepoint",
"basis_precision": "medium",
"fix_occupancy": True,
"opt_strategy": "speed",
"num_dump_cycles": 0,
"write_formatted_density": True,
"symmetry_generate": True,
"snap_to_symmetry": True,
}
Note that we have used a plain python dict
here - the builder will automatically convert it to a
Dict
.
Try to include a typo in the cell above and see what happens - the builder will validate the keys before the Dict
node is created.
from aiida_castep.data.otfg import upload_otfg_family
upload_otfg_family(['C19'], 'C19', 'C19 potential library')
builder.kpoints_spacing = 0.07
builder.pseudos_family = 'C19'
# Locate a previously set mock code in the Calculation Example
builder.calc.structure = silicon
# Note that the resources needs to go under `calc.metadata.options` instead of `metadata.options`
builder.calc.metadata.options.resources = {'num_machines': 1, 'tot_num_mpiprocs': 2}
builder.calc.metadata.options.max_wallclock_seconds = 600
builder.metadata.description = 'A Example CASTEP calculation for silicon'
builder.metadata.label = 'Si SINGLEPOINT'
# Define a mock code on the localhost computer
comp = orm.Computer('localhost', 'localhost', transport_type='core.local', scheduler_type='core.direct')
comp.store()
comp.set_workdir('/tmp/aiida_run/')
comp.configure()
code_path = !which castep.mock
castep_mock = orm.Code((comp, code_path[0]), input_plugin_name='castep.castep')
builder.calc.code = castep_mock
from aiida.engine import run_get_node
results, work = run_get_node(builder)
05/26/2022 10:31:56 PM <23688> aiida.orm.nodes.process.workflow.workchain.WorkChainNode: [REPORT] [8|CastepBaseWorkChain|validate_inputs]: Direct input of calculations metadata is deprecated - please pass them with `calc_options` input port. 05/26/2022 10:31:56 PM <23688> aiida.orm.nodes.process.workflow.workchain.WorkChainNode: [REPORT] [8|CastepBaseWorkChain|validate_inputs]: Using kpoints: Kpoints mesh: 5x5x5 (+0.0,0.0,0.0) 05/26/2022 10:31:57 PM <23688> aiida.orm.nodes.process.workflow.workchain.WorkChainNode: [REPORT] [8|CastepBaseWorkChain|run_calculation]: launching CastepCalculation<12> iteration #1 05/26/2022 10:31:59 PM <23688> aiida.orm.nodes.process.workflow.workchain.WorkChainNode: [REPORT] [8|CastepBaseWorkChain|inspect_calculation]: CastepCalculation<12> completed successfully 05/26/2022 10:31:59 PM <23688> aiida.orm.nodes.process.workflow.workchain.WorkChainNode: [REPORT] [8|CastepBaseWorkChain|results]: workchain completed after 1 iterations 05/26/2022 10:31:59 PM <23688> aiida.orm.nodes.process.workflow.workchain.WorkChainNode: [REPORT] [8|CastepBaseWorkChain|on_terminated]: remote folders will not be cleaned
Check if the workflow is finished OK
work.exit_status
0
To can the list of possible exit_status
and they meanings, we can use verdi plugin list
. This also shows list of inputs of which the required ones are in bold.
!verdi plugin list aiida.workflows castep.base
Description: A basic workchain for generic CASTEP calculations. We try to handle erros such as walltime exceeded or SCF not converged Inputs: calc: required calc_options: optional Dict Options to be passed to calculations's metadata.options clean_workdir: optional Bool Wether to clean the workdir of the calculations or not, the default is not ... continuation_folder: optional RemoteData Use a remote folder as the parent folder. Useful for restarts. ensure_gamma_centering: optional Bool Ensure the kpoint grid is gamma centred. kpoints_spacing: optional Float Kpoint spacing max_iterations: optional Int Maximum number of restarts metadata: optional options: optional Dict Options specific to the workchain.Avaliable options: queue_wallclock_limit, ... pseudos_family: optional Str Pseudopotential family to be used reuse_folder: optional RemoteData Use a remote folder as the parent folder. Useful for restarts. Outputs: output_bands: required BandsData output_parameters: required Dict remote_folder: required RemoteData output_array: optional ArrayData output_structure: optional StructureData output_trajectory: optional ArrayData Exit codes: 1: The process has failed with an unspecified error. 2: The process failed with legacy failure mode. 10: The process returned an invalid output. 11: The process did not register a required output. 200: The maximum number of iterations has been exceeded 201: The maximum length of the wallclocks has been exceeded 301: CASTEP generated error files and is not recoverable 302: Cannot reach SCF convergence despite restart efforts 400: The stop flag has been put in the .param file to request termination of the calculation. 900: Input validate is failed 901: Completed one iteration but found not calculation returned 1000: Error is not known
In this simple example, it only calls a single calculation calculation. In practice, the workflow will act to mitigate certain errors such as running out of walltime and electronic convergence problems.
from aiida.cmdline.utils.ascii_vis import format_call_graph
print(format_call_graph(work))
CastepBaseWorkChain<8> Finished [0] [4:results] └── CastepCalculation<12> Finished [0]
Results of workflow can be accessed in a similar way compared with CastepCalculation
.
However, here we do not have the .res
interface as for CastepCalculation
.
# A Dict node is created containing some of the parsed results
work.outputs.output_parameters.get_dict()
{'warnings': [], 'castep_version': '20.11', 'unit_length': 'A', 'unit_time': 'ps', 'unit_energy': 'eV', 'unit_force': 'eV/A', 'num_ions': 2, 'n_kpoints': '10', 'point_group': '32: Oh, m-3m, 4/m -3 2/m', 'space_group': '227: Fd-3m, F 4d 2 3 -1d', 'cell_constraints': '1 1 1 0 0 0', 'pseudo_pots': {'Si': '3|1.8|5|6|7|30:31:32'}, 'initialisation_time': 4.49, 'calculation_time': 0.87, 'finalisation_time': 0.01, 'total_time': 5.37, 'parallel_efficiency': 57, 'geom_unconverged': None, 'parser_warnings': [], 'parser_info': 'AiiDA CASTEP basic Parser v1.1.1', 'total_energy': -337.8473203292, 'error_messages': []}
Eigenvalues parsed from .bands
output
work.outputs.output_bands.get_array('bands')
array([[-4.72022699, -2.20565637, 1.06932498, 2.83544156], [-5.63940581, -0.76783726, 1.89878269, 2.54227671], [-4.29355492, -2.00728782, 0.48127062, 1.49935793], [-6.80455106, 1.9917044 , 3.05980487, 3.06015916], [-6.98398755, 1.54800182, 4.00173867, 4.00257351], [-3.50106388, -2.90938714, 0.39021289, 1.95043126], [-7.50759964, 4.60236006, 4.60306892, 4.60501208], [-5.58411173, -1.78292124, 3.44303941, 3.44488053], [-6.13414661, 0.08379311, 1.49548901, 3.73091709], [-4.75110484, -1.62044505, 1.82693538, 1.82725022]])
Forces on the atoms
work.outputs.output_array.get_array('forces')
array([[[ 0., -0., 0.], [-0., 0., 0.]]])
# Alternatively, you can access the parsed results using
work.called[0].res.total_energy # hit tab for completion after res
-337.8473203292
work.outputs.output_parameters['total_energy']
-337.8473203292
work.outputs.output_parameters['parallel_efficiency']
57