arc.scheduler
A module for scheduling ARC jobs. Includes spawning, terminating, checking, and troubleshooting various jobs.
- class arc.scheduler.Scheduler(project: str, ess_settings: dict, species_list: list, project_directory: str, composite_method: Level | None = None, conformer_opt_level: Level | None = None, conformer_sp_level: Level | None = None, opt_level: Level | None = None, freq_level: Level | None = None, sp_level: Level | None = None, scan_level: Level | None = None, ts_guess_level: Level | None = None, irc_level: Level | None = None, orbitals_level: Level | None = None, adaptive_levels: dict | None = None, job_types: dict | None = None, rxn_list: list | None = None, bath_gas: str | None = None, restart_dict: dict | None = None, max_job_time: float | None = None, allow_nonisomorphic_2d: bool | None = False, memory: float | None = None, testing: bool | None = False, dont_gen_confs: list | None = None, n_confs: int | None = 10, e_confs: float | None = 5, fine_only: bool | None = False, trsh_ess_jobs: bool | None = True, trsh_rotors: bool | None = True, kinetics_adapter: str = 'arkane', freq_scale_factor: float = 1.0, ts_adapters: list[str] = None, report_e_elect: bool | None = False, skip_nmd: bool | None = False, output: dict | None = None)[source]
ARC’s Scheduler class. Creates jobs, submits, checks status, troubleshoots. Each species in species_list has to have a unique label.
Dictionary structures:
job_dict = {label_1: {'conf_opt':  {0: Job1, 1: Job2, ...},
                      'conf_sp':   {0: Job1, 1: Job2, ...},
                      'tsg':       {0: Job1, 1: Job2, ...},  # TS guesses
                      'opt':       {job_name1: Job1, job_name2: Job2, ...},
                      'sp':        {job_name1: Job1, job_name2: Job2, ...},
                      'freq':      {job_name1: Job1, job_name2: Job2, ...},
                      'composite': {job_name1: Job1, job_name2: Job2, ...},
                      'scan':      {job_name1: Job1, job_name2: Job2, ...},
                      <job_type>:  {job_name1: Job1, job_name2: Job2, ...},
                      ...
                      }
            label_2: {...},
            }

output = {label_1: {'job_types': {job_type1: <status1>,  # boolean
                                  job_type2: <status2>,
                                  },
                    'paths': {'geo': <path to geometry optimization output file>,
                              'freq': <path to freq output file>,
                              'sp': <path to sp output file>,
                              'composite': <path to composite output file>,
                              'irc': [list of two IRC paths],
                              },
                    'conformers': <comments>,
                    'isomorphism': <comments>,
                    'convergence': <status>,  # bool | None
                    'restart': <comments>,
                    'info': <comments>,
                    'warnings': <comments>,
                    'errors': <comments>,
                    },
          label_2: {...},
          }
Note
The rotor scan dicts are located under Species.rotors_dict
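As an illustration of the output structure above, the following minimal sketch builds one species entry as a plain Python dictionary. The helper name `make_output_entry`, the species label `spc1`, and the choice to start with an empty `paths` dictionary and empty comment strings are assumptions made for this example, not ARC's implementation.

```python
def make_output_entry(job_types):
    """Build an illustrative per-species output entry (hypothetical helper)."""
    return {
        'job_types': {job_type: False for job_type in job_types},  # status per job type
        'paths': {},           # assumed empty here; filled with 'geo', 'freq', 'sp', ... later
        'conformers': '',
        'isomorphism': '',
        'convergence': None,   # bool | None
        'restart': '',
        'info': '',
        'warnings': '',
        'errors': '',
    }


# 'spc1' is a hypothetical species label.
output = {'spc1': make_output_entry(['opt', 'freq', 'sp'])}
output['spc1']['job_types']['opt'] = True  # mark the opt job type as done
```

Keeping the status flags under `job_types` and the file locations under `paths` lets the two be queried independently for each species label.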
- Parameters:
project (str) – The project’s name. Used for naming the working directory.
ess_settings (dict) – A dictionary of available ESS and a corresponding server list.
species_list (list) – Contains input ARCSpecies objects (both wells and TSs).
rxn_list (list) – Contains input ARCReaction objects.
project_directory (str) – Folder path for the project: the input file path or ARC/Projects/project-name.
composite_method (str, optional) – A composite method to use.
conformer_opt_level (str | dict, optional) – The level of theory to use for conformer comparisons.
conformer_sp_level (str | dict, optional) – The level of theory to use for conformer sp jobs.
opt_level (str | dict, optional) – The level of theory to use for geometry optimizations.
freq_level (str | dict, optional) – The level of theory to use for frequency calculations.
sp_level (str | dict, optional) – The level of theory to use for single point energy calculations.
scan_level (str | dict, optional) – The level of theory to use for torsion scans.
ts_guess_level (str | dict, optional) – The level of theory to use for TS guess comparisons.
irc_level (str | dict, optional) – The level of theory to use for IRC calculations.
orbitals_level (str | dict, optional) – The level of theory to use for calculating MOs (for plotting).
adaptive_levels (dict, optional) – A dictionary of levels of theory for ranges of the number of heavy atoms in the species. Keys are tuples of (min_num_atoms, max_num_atoms), values are dictionaries with job type tuples as keys and levels of theory as values. ‘inf’ is accepted in max_num_atoms.
job_types (dict, optional) – A dictionary of job types to execute. Keys are job types, values are boolean.
bath_gas (str, optional) – A bath gas. Currently used in OneDMin to calc L-J parameters. Allowed values are He, Ne, Ar, Kr, H2, N2, O2.
restart_dict (dict, optional) – A restart dictionary parsed from a YAML restart file.
max_job_time (float, optional) – The maximal allowed job time on the server in hours (can be fractional).
allow_nonisomorphic_2d (bool, optional) – Whether to optimize species even if they do not have a 3D conformer that is isomorphic to the 2D graph representation.
memory (float, optional) – The total allocated job memory in GB (14 by default).
testing (bool, optional) – Used for internal ARC testing (generating the object w/o executing it).
dont_gen_confs (list, optional) – A list of species labels for which conformer jobs were loaded from a restart file, or user-requested. Additional conformer generation should be avoided.
n_confs (int, optional) – The number of lowest force field conformers to consider.
e_confs (float, optional) – The energy threshold in kJ/mol above the lowest energy conformer below which force field conformers are considered.
fine_only (bool) – If True, ARC will not run optimization jobs without fine=True.
kinetics_adapter (str, optional) – The statmech software to use for kinetic rate coefficient calculations.
freq_scale_factor (float, optional) – The harmonic frequencies scaling factor.
trsh_ess_jobs (bool, optional) – Whether to attempt troubleshooting failed ESS jobs. Default is True.
trsh_rotors (bool, optional) – Whether to attempt troubleshooting failed rotor scan jobs. Default is True.
ts_adapters (list, optional) – Entries represent different TS adapters.
report_e_elect (bool, optional) – Whether to report electronic energy. Default is False.
skip_nmd (bool, optional) – Whether to skip normal mode displacement check. Default is False.
output (dict, optional) – Output dictionary with status per job type and final QM file paths for all species.
- project
The project’s name. Used for naming the working directory.
- Type:
str
- servers
A list of servers used for the present project.
- Type:
list
- species_list
Contains input ARCSpecies objects (both species and TSs).
- Type:
list
- species_dict
Keys are labels, values are ARCSpecies objects.
- Type:
dict
- rxn_list
Contains input ARCReaction objects.
- Type:
list
- unique_species_labels
A list of species labels (checked for duplicates).
- Type:
list
- job_dict
A dictionary of all scheduled jobs. Keys are species / TS labels, values are dictionaries where keys are job names (corresponding to ‘running_jobs’ if job is running) and values are the Job objects.
- Type:
dict
- running_jobs
A dictionary of currently running jobs (a subset of job_dict). Keys are species/TS label, values are lists of job names (e.g. ‘conformer3’, ‘opt_a123’).
- Type:
dict
- server_job_ids
A list of relevant job IDs currently running on the server.
- Type:
list
- output
Output dictionary with status per job type and final QM file paths for all species.
- Type:
dict
- output_multi_spc
Output dictionary with status per job type of multi-species clusters.
- Type:
dict
- ess_settings
A dictionary of available ESS and a corresponding server list.
- Type:
dict
- restart_dict
A restart dictionary parsed from a YAML restart file.
- Type:
dict
- project_directory
Folder path for the project: the input file path or ARC/Projects/project-name.
- Type:
str
- save_restart
Whether to start saving a restart file. True only after all species are loaded (otherwise saves a partial file and may cause loss of information).
- Type:
bool
- restart_path
Path to the restart.yml file to be saved.
- Type:
str
- max_job_time
The maximal allowed job time on the server in hours (can be fractional).
- Type:
float
- testing
Used for internal ARC testing (generating the object w/o executing it).
- Type:
bool
- allow_nonisomorphic_2d
Whether to optimize species even if they do not have a 3D conformer that is isomorphic to the 2D graph representation.
- Type:
bool
- dont_gen_confs
A list of species labels for which conformer jobs were loaded from a restart file, or user-requested. Additional conformer generation should be avoided for them.
- Type:
list
- memory
The total allocated job memory in GB (14 by default).
- Type:
float
- n_confs
The number of lowest force field conformers to consider.
- Type:
int
- e_confs
The energy threshold in kJ/mol above the lowest energy conformer below which force field conformers are considered.
- Type:
float
- job_types
A dictionary of job types to execute. Keys are job types, values are boolean.
- Type:
dict
- bath_gas
A bath gas. Currently used in OneDMin to calc L-J parameters. Allowed values are He, Ne, Ar, Kr, H2, N2, O2.
- Type:
str
- composite_method
A composite method to use.
- Type:
str
- conformer_opt_level
The level of theory to use for conformer comparisons.
- Type:
dict
- conformer_sp_level
The level of theory to use for conformer sp jobs.
- Type:
dict
- opt_level
The level of theory to use for geometry optimizations.
- Type:
dict
- freq_level
The level of theory to use for frequency calculations.
- Type:
dict
- sp_level
The level of theory to use for single point energy calculations.
- Type:
dict
- scan_level
The level of theory to use for torsion scans.
- Type:
dict
- ts_guess_level
The level of theory to use for TS guess comparisons.
- Type:
dict
- irc_level
The level of theory to use for IRC calculations.
- Type:
dict
- orbitals_level
The level of theory to use for calculating MOs (for plotting).
- Type:
dict
- adaptive_levels
A dictionary of levels of theory for ranges of the number of heavy atoms in the species. Keys are tuples of (min_num_atoms, max_num_atoms), values are dictionaries with job type tuples as keys and levels of theory as values. ‘inf’ is accepted in max_num_atoms.
- Type:
dict
- fine_only
If True, ARC will not run optimization jobs without fine=True.
- Type:
bool
- kinetics_adapter
The statmech software to use for kinetic rate coefficient calculations.
- Type:
str
- freq_scale_factor
The harmonic frequencies scaling factor.
- Type:
float
- trsh_ess_jobs
Whether to attempt troubleshooting failed ESS jobs. Default is True.
- Type:
bool
- trsh_rotors
Whether to attempt troubleshooting failed rotor scan jobs. Default is True.
- Type:
bool
- ts_adapters
Entries represent different TS adapters.
- Type:
list
- report_e_elect
Whether to report electronic energy.
- Type:
bool
- skip_nmd
Whether to skip normal mode displacement check.
- Type:
bool
- add_label_to_unique_species_labels(label: str) str[source]
Adds a label to self.unique_species_labels. Modifies the label if it is not unique.
- Parameters:
label (str) – A species label.
- Returns:
The modified species label
- Return type:
str
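One way such a uniqueness check could work is sketched below. The function name `add_unique_label` and the numeric-suffix scheme are assumptions for illustration; the source does not state how ARC modifies a duplicate label.

```python
def add_unique_label(label, unique_labels):
    """Append `label` to `unique_labels`, adding a numeric suffix if it is already taken.

    Hypothetical stand-in: the suffix format `<label>_<n>` is an assumption.
    """
    candidate, index = label, 0
    while candidate in unique_labels:
        candidate = f'{label}_{index}'
        index += 1
    unique_labels.append(candidate)
    return candidate


labels = ['OH']
first = add_unique_label('OH', labels)    # 'OH' is taken, so a suffixed label is stored
second = add_unique_label('H2O', labels)  # not taken, stored unchanged
```

Returning the stored label lets the caller keep using the modified name wherever the original label would have been used.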
- check_all_done(label: str)[source]
Check that we have all required data for the species/TS.
- Parameters:
label (str) – The species label.
- check_directed_scan(label, pivots, scan, energies)[source]
Checks (QA) whether the directed scan is relatively “smooth”, and whether the optimized geometry indeed represents the minimum energy conformer. Recommends whether or not to use this rotor using the ‘successful_rotors’ and ‘unsuccessful_rotors’ attributes. This method differs from check_directed_scan_job(), since here we consider the entire scan.
- Parameters:
label (str) – The species label.
pivots (list[list[int]]) – The rotor pivots.
scan (list[int]) – The four atoms defining the dihedral.
energies (list[float]) – The rotor scan energies in kJ/mol.
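The two questions this check asks (is the scan smooth, and does the optimized geometry sit at the minimum) can be sketched as follows. The function name `scan_looks_smooth` and both thresholds are hypothetical; the criteria ARC actually applies are not given in this section.

```python
def scan_looks_smooth(energies, max_step=10.0, tolerance=0.5):
    """Return (is_smooth, starts_at_minimum) for a non-empty list of energies in kJ/mol.

    `max_step` and `tolerance` are illustrative values, not ARC settings.
    """
    # Energy change between each pair of consecutive scan points.
    steps = [abs(b - a) for a, b in zip(energies, energies[1:])]
    is_smooth = all(step <= max_step for step in steps)
    # The first point is assumed to correspond to the optimized geometry.
    starts_at_minimum = energies[0] <= min(energies) + tolerance
    return is_smooth, starts_at_minimum
```

A scan that is smooth but does not start at its minimum suggests that a lower-energy conformer was found along the rotor, which is a different failure from a discontinuous scan.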
- check_irc_species(label: str)[source]
Check that the optimized geometries of the two species created from the IRC runs of a TS make sense.
- Parameters:
label (str) – The label of one of the optimized IRC-resulting species.
- check_max_simultaneous_jobs_limit(server: str | None)[source]
Check if the number of running jobs on the server is not above the set server limit.
- Parameters:
server (str) – The server name.
- check_rxn_e0_by_spc(label: str)[source]
Check the E0 (electronic energy + ZPE) of reactions related to a specific species. Requires all opt + freq computations to be converged for all species (and TS) participating in each reaction.
- Parameters:
label (str) – A label representing a species.
- deduce_job_adapter(level: Level, job_type: str) str[source]
Deduce the job adapter (the software) to be used for jobs other than TS searches.
- Parameters:
level (Level) – The level of theory that will be used for the job.
job_type (str) – The job’s type.
- Returns: str
The deduced job adapter.
- delete_all_species_jobs(label: str)[source]
Delete all jobs of a species/TS.
- Parameters:
label (str) – The species label.
- determine_adaptive_level(original_level_of_theory: Level, job_type: str, heavy_atoms: int) Level[source]
Determine the level of theory to be used according to the job type and number of heavy atoms. self.adaptive_levels is a dictionary of levels of theory for ranges of the number of heavy atoms in the species. Keys are tuples of (min_num_atoms, max_num_atoms), values are dictionaries with job type tuples as keys and levels of theory as values. The string ‘inf’ is accepted instead of an integer in max_num_atoms.
- Parameters:
original_level_of_theory (Level) – The level of theory for non-sp/opt/freq job types.
job_type (str) – The job type for which the level of theory is determined.
heavy_atoms (int) – The number of heavy atoms in the species.
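The lookup described above can be sketched with plain dictionaries. In this example the levels are strings rather than `Level` objects, the bounds are treated as inclusive, and the function name `pick_adaptive_level` and the specific levels of theory are all assumptions for illustration.

```python
def pick_adaptive_level(adaptive_levels, job_type, heavy_atoms, default_level):
    """Return the level for `job_type` in the matching heavy-atom range, else `default_level`."""
    for (min_atoms, max_atoms), levels in adaptive_levels.items():
        # The string 'inf' stands for an unbounded upper limit.
        upper = float('inf') if max_atoms == 'inf' else max_atoms
        if min_atoms <= heavy_atoms <= upper:
            for job_types, level in levels.items():
                if job_type in job_types:
                    return level
    return default_level


# Illustrative data only; the levels of theory below are placeholders.
adaptive_levels = {
    (1, 5): {('opt', 'freq'): 'wb97xd/6-311+g(2d,2p)', ('sp',): 'ccsd(t)-f12/aug-cc-pvtz-f12'},
    (6, 15): {('opt', 'freq'): 'b3lyp/cbsb7', ('sp',): 'dlpno-ccsd(t)/def2-tzvp'},
    (16, 'inf'): {('opt', 'freq'): 'b3lyp/6-31g(d,p)', ('sp',): 'wb97xd/6-311+g(2d,2p)'},
}

small_opt = pick_adaptive_level(adaptive_levels, 'opt', 3, 'fallback')
large_sp = pick_adaptive_level(adaptive_levels, 'sp', 40, 'fallback')
scan = pick_adaptive_level(adaptive_levels, 'scan', 8, 'fallback')  # no 'scan' entry
```

Job types that do not appear in any job type tuple fall back to the level passed in, which mirrors the role of `original_level_of_theory`.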
- determine_most_likely_ts_conformer(label: str)[source]
Determine the most likely TS conformer. Save the resulting xyz as the .initial_xyz attribute of the TS Species.
- Parameters:
label (str) – The TS species label.
- determine_most_stable_conformer(label, sp_flag=False)[source]
Determine the most stable conformer for a species (which is not a TS). Also run an isomorphism check. Save the resulting xyz as initial_xyz.
- Parameters:
label (str) – The species label.
sp_flag (bool) – Whether this is a single point calculation job.
- flush_pending_pipe_batches() None[source]
Attempt to submit accumulated deferred pipe batches for SP, freq, IRC, and conf_sp.
- For each family:
Snapshot and clear the pending set.
Ask the planner for the handled subset.
Fall back to per-job submission for the unhandled remainder.
Called once per main-loop iteration, after all newly-ready work has been discovered and before the loop sleeps.
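The per-family steps above follow a snapshot, plan, and fall-back pattern, sketched here in isolation. The names `flush_pending`, `plan_batch`, and `submit_single`, and the use of a dictionary of sets for pending work, are assumptions for this example and not ARC's internal API.

```python
def flush_pending(pending, plan_batch, submit_single):
    """Flush deferred work per family.

    pending:       dict mapping a family name to a set of pending job keys.
    plan_batch:    callable(family, jobs) returning the subset it handled as a batch.
    submit_single: callable(family, job) used for anything the planner did not handle.
    """
    for family in list(pending):
        jobs = set(pending[family])  # snapshot the pending set
        pending[family].clear()      # clear it so new work can accumulate
        if not jobs:
            continue
        handled = plan_batch(family, jobs)
        for job in sorted(jobs - handled):  # unhandled remainder
            submit_single(family, job)


submitted = []
pending = {'sp': {'a', 'b', 'c'}, 'freq': set()}
flush_pending(
    pending,
    plan_batch=lambda family, jobs: {job for job in jobs if job != 'c'},
    submit_single=lambda family, job: submitted.append((family, job)),
)
```

Taking the snapshot before clearing means that jobs discovered while the planner runs are not lost; they stay in the pending set for the next iteration.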
- generate_final_ts_guess_report()[source]
Generate a TS report for this ARC project and save it as a YAML file.
- get_completed_incore_jobs()[source]
Check job status of all incore jobs, get a list of relevant completed job IDs.
Todo: Add tests.
- get_server_job_ids(specific_server: str | None = None)[source]
Check job status on a specific server or on all active servers, get a list of relevant running job IDs.
- Parameters:
specific_server (str, optional) – The server to check. If None, check all active servers.
- initialize_output_dict(label: str | None = None)[source]
Initialize self.output. Do not initialize keys that will contain paths (‘geo’, ‘freq’, ‘sp’, ‘composite’); their existence indicates the job was terminated for restarting purposes. If label is not None, will initialize for a specific species, otherwise will initialize for all species.
- Parameters:
label (str, optional) – A species label.
- make_reaction_labels_info_file()[source]
A helper function for creating the reactions labels.info file.
- post_sp_actions(label: str, sp_path: str, level: Level | None = None)[source]
Perform post-sp actions.
- Parameters:
label (str) – The species label.
sp_path (str) – The path to ‘output.out’ for the single point job.
level (Level, optional) – The level of theory used for the sp job.
- process_conformers(label)[source]
Process the generated conformers and spawn DFT jobs at the conformer_opt_level. If more than one conformer is available, they will be optimized at the DFT conformer_opt_level.
- Parameters:
label (str) – The species label.
- process_directed_scans(label: str, pivots: list[int] | list[list[int]])[source]
Process all directed rotors for a species and check the quality of the scan.
- Parameters:
label (str) – The species label.
pivots (list[int] | list[list[int]]) – The rotor pivots.
- restore_running_jobs()[source]
Make Job objects for jobs which were running in the previous session. Important for the restart feature so long jobs won’t run twice.
- run_composite_job(label: str)[source]
Spawn a composite job (e.g., CBS-QB3) using ‘final_xyz’ for species or TS ‘label’.
- Parameters:
label (str) – The species label.
- run_conformer_jobs(labels: list[str] | None = None)[source]
Select the most stable conformer for each species using molecular dynamics (force fields), and subsequently spawn opt jobs at the conformer level of theory, usually a reasonable yet cheap DFT, e.g., b97d3/6-31+g(d,p). The resulting conformer is saved in a string format xyz in the Species initial_xyz attribute.
- Parameters:
labels (list) – Labels of specific species to run conformer jobs for. If None, conformer jobs will be spawned for all species in self.species_list.
- run_freq_job(label)[source]
Spawn a freq job using ‘final_xyz’ for species or TS ‘label’. If this was originally a composite job, run an appropriate separate freq job outputting the Hessian.
- Parameters:
label (str) – The species label.
- run_irc_job(label, irc_direction='forward')[source]
Spawn an IRC job.
- Parameters:
label (str) – The species label.
irc_direction (str) – The IRC job direction, either ‘forward’ or ‘reverse’.
- run_onedmin_job(label)[source]
Spawn a Lennard-Jones calculation using OneDMin.
- Parameters:
label (str) – The species label.
- run_opt_job(label: str, fine: bool = False)[source]
Spawn a geometry optimization job. The initial guess is taken from the initial_xyz attribute.
- Parameters:
label (str) – The species label.
fine (bool) – Whether a fine grid should be used during optimization.
- run_orbitals_job(label)[source]
Spawn orbitals job used for molecular orbital visualization. Currently supporting QChem for printing the orbitals, the output could be visualized using IQMol.
- Parameters:
label (str) – The species label.
- run_scan_jobs(label: str)[source]
Spawn rotor scan jobs using ‘final_xyz’ for species (or TS).
- Parameters:
label (str) – The species label.
- run_sp_job(label: str, level: Level | None = None, conformer: int | None = None)[source]
Spawn a single point job using ‘final_xyz’ for species or a TS represented by ‘label’. If the method is MRCI, first spawn a simple CCSD(T) job, and use orbital determination to run the MRCI job.
- Parameters:
label (str) – The species label.
level (Level) – An alternative level of theory to run at. If None, self.sp_level will be used.
conformer (int) – The conformer number.
- run_ts_conformer_jobs(label: str)[source]
Spawn opt jobs at the ts_guesses level of theory for the TS guesses.
- Parameters:
label (str) – The TS species label.
- save_e_elect(label: str)[source]
Save the electronic energy of the corresponding species. It will append if the file already exists.
- spawn_directed_scan_jobs(label: str, rotor_index: int, xyz: str | None = None)[source]
Spawn directed scan jobs. Directed scan types could be one of the following: ‘brute_force_sp’, ‘brute_force_opt’, ‘cont_opt’, ‘brute_force_sp_diagonal’, ‘brute_force_opt_diagonal’, or ‘cont_opt_diagonal’. Here we treat cont and brute_force separately, and also consider the diagonal keyword. The differentiation between sp and opt is done in the Job module.
- Parameters:
label (str) – The species label.
rotor_index (int) – The 0-indexed rotor number (key) in the species.rotors_dict dictionary.
xyz (str, optional) – The 3D coordinates for a continuous directed scan.
- Raises:
InputError – If the species directed scan type has an unexpected value, or if xyz wasn’t given for a cont_opt job.
SchedulerError – If the rotor scan resolution as defined in settings.py is illegal.
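The six directed scan types listed above decompose into three independent properties, as the sketch below shows. The helper name `classify_directed_scan_type` is hypothetical, and it raises a built-in `ValueError` where ARC would raise its own `InputError`.

```python
DIRECTED_SCAN_TYPES = {
    'brute_force_sp', 'brute_force_opt', 'cont_opt',
    'brute_force_sp_diagonal', 'brute_force_opt_diagonal', 'cont_opt_diagonal',
}


def classify_directed_scan_type(scan_type):
    """Split a directed scan type into (family, calculation, diagonal).

    Hypothetical helper; ValueError stands in for ARC's InputError.
    """
    if scan_type not in DIRECTED_SCAN_TYPES:
        raise ValueError(f'Unexpected directed scan type: {scan_type}')
    diagonal = scan_type.endswith('_diagonal')
    base = scan_type[:-len('_diagonal')] if diagonal else scan_type
    family = 'cont' if base.startswith('cont') else 'brute_force'
    calculation = base.rsplit('_', 1)[1]  # 'sp' or 'opt'
    return family, calculation, diagonal
```

For example, `classify_directed_scan_type('brute_force_sp_diagonal')` separates the `brute_force` family, the `sp` calculation, and the `diagonal` keyword, matching the way the description treats them as separate concerns.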
- spawn_post_opt_jobs(label: str, job_name: str)[source]
Spawn additional jobs after opt has converged.
- Parameters:
label (str) – The species label.
job_name (str) – The opt job name (used for differentiating between opt and optfreq jobs).
- spawn_ts_jobs()[source]
Check if any new reaction has all of its reactants and products optimized, and if so spawn the respective TSG jobs. Don’t spawn TS jobs if the multiplicity of the reaction could not be determined.
- switch_ts(label: str)[source]
Try the next optimized TS guess in line if a previous TS guess was found to be wrong.
- Parameters:
label (str) – The TS species label.
- troubleshoot_conformer_isomorphism(label: str)[source]
Troubleshoot conformer optimization for a species that failed the isomorphism test in determine_most_stable_conformer.
- Parameters:
label (str) – The species label.
- troubleshoot_opt_jobs(label)[source]
Troubleshoot opt jobs. First check the server status and troubleshoot if needed. Then check the ESS status and troubleshoot if needed. Finally, check whether the last job ran with fine=True, and add it if it did not.
- Parameters:
label (str) – The species label.
- arc.scheduler.species_has_freq(species_output_dict: dict, yml_path: str | None = None) bool[source]
Checks whether a species has valid converged frequencies using its output dict.
- Parameters:
species_output_dict (dict) – The species output dict (i.e., Scheduler.output[label]).
yml_path (str) – The species Arkane YAML file path.
- Returns: bool
Whether a species has valid converged frequencies.
- arc.scheduler.species_has_geo(species_output_dict: dict, yml_path: str | None = None) bool[source]
Checks whether a species has a valid converged geometry using its output dict.
- Parameters:
species_output_dict (dict) – The species output dict (i.e., Scheduler.output[label]).
yml_path (str) – The species Arkane YAML file path.
- Returns: bool
Whether a species has a valid converged geometry.
- arc.scheduler.species_has_sp(species_output_dict: dict, yml_path: str | None = None) bool[source]
Checks whether a species has a valid converged single-point energy using its output dict.
- Parameters:
species_output_dict (dict) – The species output dict (i.e., Scheduler.output[label]).
yml_path (str) – The species Arkane YAML file path.
- Returns: bool
Whether a species has a valid converged single-point energy.
- arc.scheduler.species_has_sp_and_freq(species_output_dict: dict, yml_path: str | None = None) bool[source]
Checks whether a species has a valid converged single-point energy and valid converged frequencies.
- Parameters:
species_output_dict (dict) – The species output dict (i.e., Scheduler.output[label]).
yml_path (str) – The species Arkane YAML file path.
- Returns: bool
Whether a species has a valid converged single-point energy and frequencies.
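A simplified reading of these four checks is that each one looks for a non-empty entry under `paths` in the species output dict. The sketch below follows that reading only; it ignores `yml_path`, the helper names are hypothetical, and ARC's functions may consider additional keys.

```python
def has_path(species_output_dict, key):
    """Return True if a non-empty path is recorded under `key` (hypothetical helper)."""
    return bool(species_output_dict.get('paths', {}).get(key))


def has_sp_and_freq(species_output_dict):
    """Return True only if both the sp and the freq paths are recorded."""
    return has_path(species_output_dict, 'sp') and has_path(species_output_dict, 'freq')


# Illustrative entries; the file names are placeholders.
converged = {'paths': {'geo': 'opt.out', 'freq': 'freq.out', 'sp': 'sp.out'}}
partial = {'paths': {'geo': 'opt.out', 'freq': ''}}
```

Using `.get` with defaults keeps the check safe for entries in which the path keys have not been initialized yet.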