arc.job.job

The ARC Job module

class arc.job.job.Job(project: str, project_directory: str, species_name: str, multiplicity: int, job_type: str, level: Union[arc.level.Level, dict, str], ess_settings: dict, xyz: Optional[dict] = None, charge: int = 0, conformer: int = -1, fine: bool = False, shift: str = '', software: str = None, is_ts: bool = False, scan: Optional[list] = None, pivots: Optional[list] = None, total_job_memory_gb: Optional[int] = None, comments: str = '', args: Optional[Union[Dict[str, Dict[str, str]], str]] = None, scan_trsh: str = '', ess_trsh_methods: Optional[list] = None, bath_gas: Optional[str] = None, job_num: Optional[int] = None, job_server_name: Optional[str] = None, job_name: Optional[str] = None, job_id: Optional[int] = None, job_status: Optional[list] = None, server: Optional[str] = None, server_nodes: Optional[list] = None, initial_time: Optional[Union[datetime.datetime, str]] = None, final_time: Optional[Union[datetime.datetime, str]] = None, occ: Optional[int] = None, max_job_time: Optional[float] = None, scan_res: Optional[int] = None, checkfile: Optional[str] = None, number_of_radicals: Optional[int] = None, radius: Optional[float] = None, directed_scan_type: Optional[str] = None, directed_scans: Optional[list] = None, directed_dihedrals: Optional[list] = None, rotor_index: Optional[int] = None, testing: bool = False, cpu_cores: Optional[int] = None, irc_direction: Optional[str] = None)

ARC’s Job class.

Parameters
  • project (str) – The project’s name. Used for naming the directory.

  • project_directory (str) – The path to the project directory.

  • ess_settings (dict) – A dictionary of available ESS and a corresponding server list.

  • species_name (str) – The species/TS name. Used for naming the directory.

  • xyz (dict) – The xyz geometry. Used for the calculation.

  • job_type (str) – The job’s type.

  • level (Level, dict, str) – The level of theory to use.

  • multiplicity (int) – The species multiplicity.

  • charge (int, optional) – The species net charge. Default is 0.

  • conformer (int, optional) – Conformer number if optimizing conformers.

  • fine (bool, optional) – Whether to use fine geometry optimization parameters.

  • shift (str, optional) – A string representation of the alpha- and beta-spin orbital shifts (Molpro only).

  • software (str, optional) – The electronic structure software to be used.

  • is_ts (bool) – Whether this species represents a transition structure. Default: False.

  • scan (list, optional) – A list representing atom labels for the dihedral scan (e.g., “2 1 3 5” as a string or [2, 1, 3, 5] as a list of integers).

  • pivots (list, optional) – The rotor scan pivots, if the job type is scan. Not used directly in these methods, but used to identify the rotor.

  • total_job_memory_gb (int, optional) – The total job allocated memory in GB (14 by default).

  • comments (str, optional) – Job comments (archived, not used).

  • args (str, dict, optional) – Methods (including troubleshooting) to be used in input files. Keys are either ‘keyword’ or ‘block’; values are dictionaries with values to be used either as keywords or as blocks in the respective software input file. If args is given as a string, it is converted to a dictionary with ‘keyword’ and ‘general’ keys.

  • scan_trsh (str, optional) – A troubleshooting method for rotor scans.

  • ess_trsh_methods (list, optional) – A list of troubleshooting methods already tried out for ESS convergence.

  • bath_gas (str, optional) – A bath gas. Currently used in OneDMin to calculate L-J parameters. Allowed values are He, Ne, Ar, Kr, H2, N2, O2.

  • job_num (int, optional) – Used as the entry number in the database, as well as the job name on the server.

  • job_server_name (str, optional) – Job’s name on the server (e.g., ‘a103’).

  • job_name (str, optional) – Job’s name for internal usage (e.g., ‘opt_a103’).

  • job_id (int, optional) – The job’s ID determined by the server.

  • server (str, optional) – Server’s name.

  • initial_time (datetime, optional) – The date-time this job was initiated.

  • occ (int, optional) – The number of occupied orbitals (core + valence) from a Molpro CCSD single-point calculation.

  • max_job_time (float, optional) – The maximal allowed job time on the server in hours (can be fractional).

  • scan_res (int, optional) – The rotor scan resolution in degrees.

  • checkfile (str, optional) – The path to a previous Gaussian checkfile to be used in the current job.

  • number_of_radicals (int, optional) – The number of radicals (supplied by the user; ARC won’t attempt to determine it). Defaults to None. This matters, e.g., when a species is a biradical singlet, in which case the job should be unrestricted with multiplicity = 1.

  • radius (float, optional) – The species radius in Angstrom.

  • directed_scans (list) – Entries are lists of four-atom dihedral scan indices to constrain during a directed scan.

  • directed_dihedrals (list) – The dihedral angles of a directed scan job corresponding to directed_scans.

  • directed_scan_type (str) – The type of the directed scan.

  • rotor_index (int) – The 0-indexed rotor number (key) in the species.rotors_dict dictionary.

  • testing (bool, optional) – Whether the object is generated for testing purposes, True if it is.

  • cpu_cores (int, optional) – The total number of cpu cores requested for a job.

  • irc_direction (str, optional) – The direction of the IRC job (forward or reverse).

project

The project’s name. Used for naming the directory.

Type

str

ess_settings

A dictionary of available ESS and a corresponding server list.

Type

dict

species_name

The species/TS name. Used for naming the directory.

Type

str

charge

The species net charge. Default is 0.

Type

int

multiplicity

The species multiplicity.

Type

int

number_of_radicals

The number of radicals (supplied by the user; ARC won’t attempt to determine it). Defaults to None. This matters, e.g., when a species is a biradical singlet, in which case the job should be unrestricted with multiplicity = 1.

Type

int

spin

The spin, automatically derived from the multiplicity.

Type

int
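As a rough illustration of how an integer spin can follow from the multiplicity: with multiplicity = 2S + 1, an integer-typed value is consistent with storing 2S (the number of unpaired electrons). That convention is an assumption made here, not something this page states, and the helper name below is hypothetical.

```python
def spin_from_multiplicity(multiplicity: int) -> int:
    """Return 2S, assuming multiplicity = 2S + 1 (hypothetical helper, not ARC's code)."""
    if multiplicity < 1:
        raise ValueError("multiplicity must be a positive integer")
    return multiplicity - 1


print(spin_from_multiplicity(1))  # singlet
print(spin_from_multiplicity(3))  # triplet
```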

xyz

The xyz geometry. Used for the calculation.

Type

dict

radius

The species radius in Angstrom.

Type

float

n_atoms

The number of atoms in self.xyz.

Type

int

conformer

Conformer number if optimizing conformers.

Type

int

is_ts

Whether this species represents a transition structure.

Type

bool

level

The level of theory to use.

Type

Level

job_type

The job’s type.

Type

str

scan

A list representing atom labels for the dihedral scan (e.g., [2, 1, 3, 5]).

Type

list
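The constructor's scan parameter is documented as accepting either a string such as “2 1 3 5” or a list such as [2, 1, 3, 5]. A minimal sketch of coercing both forms into a list of four integers could look like the following; normalize_scan is a hypothetical helper written for illustration, not a function from ARC.

```python
from typing import List, Union


def normalize_scan(scan: Union[str, List[int]]) -> List[int]:
    """Coerce '2 1 3 5' or [2, 1, 3, 5] into a list of four ints (illustrative only)."""
    if isinstance(scan, str):
        atoms = [int(label) for label in scan.split()]
    else:
        atoms = [int(label) for label in scan]
    if len(atoms) != 4:
        raise ValueError(f"A dihedral is defined by four atoms, got {len(atoms)}")
    return atoms


print(normalize_scan("2 1 3 5"))
print(normalize_scan([2, 1, 3, 5]))
```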

pivots

The rotor scan pivots, if the job type is scan. Not used directly in these methods, but used to identify the rotor.

Type

list

scan_res

The rotor scan resolution in degrees.

Type

int

software

The electronic structure software to be used.

Type

str

server_nodes

A list of nodes this job was submitted to (for troubleshooting).

Type

list

cpu_cores

The total number of CPU cores requested for a job. ARC adopts the following naming system to describe the computing hardware hierarchy: node > cpu > cpu_cores > cpu_threads.

Type

int

input_file_memory

The memory ARC writes to job input files, in the format appropriate for each ESS. In software such as Gaussian, this value is the total memory for all CPU cores. In software such as Orca, it is the memory per CPU core.

Type

int

submit_script_memory

The memory ARC writes to the submit script, in the format appropriate for each cluster system. In systems such as Sun Grid Engine, this value is the total memory for all CPU cores. In systems such as Slurm, it is the memory per CPU core. Note that submit_script_memory > input_file_memory, because additional memory is needed to execute a job properly on the server.

Type

int
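The total versus per-core distinction described for input_file_memory and submit_script_memory can be sketched as below. The helper name, the 14 GB / 8 core figures, and the OVERHEAD factor are all assumptions chosen for illustration; this page does not state how much extra memory ARC actually requests, nor the units it writes.

```python
def requested_memory_gb(total_gb: float, cpu_cores: int, per_core: bool) -> float:
    """Return the memory to write, either as a total or split evenly per core."""
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    return total_gb / cpu_cores if per_core else total_gb


OVERHEAD = 1.1  # hypothetical safety factor, for illustration only
total_gb, cores = 14.0, 8

# Input file: total for Gaussian-like software, per core for Orca-like software.
input_total = requested_memory_gb(total_gb, cores, per_core=False)
input_per_core = requested_memory_gb(total_gb, cores, per_core=True)

# Submit script: total for SGE-like systems, per core for Slurm-like systems.
submit_total = requested_memory_gb(total_gb * OVERHEAD, cores, per_core=False)
submit_per_core = requested_memory_gb(total_gb * OVERHEAD, cores, per_core=True)

print(input_total, input_per_core, submit_total, submit_per_core)
```

Whatever the real factor is, the point is that the submit-script request stays above the input-file request in both the total and per-core conventions.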

total_job_memory_gb

The total memory ARC specifies for a job in GB.

Type

int

fine

Whether to use fine geometry optimization parameters.

Type

bool

shift

A string representation of the alpha- and beta-spin orbital shifts (Molpro only).

Type

str

comments

Job comments (archived, not used).

Type

str

initial_time

The date-time this job was initiated.

Type

datetime

final_time

The date-time this job ended.

Type

datetime

run_time

Job execution time.

Type

timedelta

job_status

The job’s server and ESS statuses. The job server status is in job.job_status[0] and can be one of ‘initializing’, ‘running’, ‘errored’, or ‘done’. The job ESS status is in job.job_status[1] and is a dictionary of the form {‘status’: str, ‘keywords’: list, ‘error’: str, ‘line’: str}. The value of ‘status’ can be one of ‘initializing’, ‘running’, ‘errored’, ‘unconverged’, or ‘done’. If the status is ‘errored’, the standardized error keywords, the error description, and the identified error line from the ESS log file are given as well.

Type

list
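To make the two-element layout of job_status concrete, here is a small sketch that validates the structure and checks whether a job finished cleanly. The helper name and the sample keyword, error, and line values are hypothetical; only the status vocabularies and dictionary keys come from the description above.

```python
SERVER_STATUSES = {"initializing", "running", "errored", "done"}
ESS_STATUSES = {"initializing", "running", "errored", "unconverged", "done"}


def job_succeeded(job_status: list) -> bool:
    """Return True only if both the server and the ESS report 'done' (illustrative helper)."""
    server_status, ess_status = job_status
    if server_status not in SERVER_STATUSES:
        raise ValueError(f"Unknown server status: {server_status}")
    if ess_status["status"] not in ESS_STATUSES:
        raise ValueError(f"Unknown ESS status: {ess_status['status']}")
    return server_status == "done" and ess_status["status"] == "done"


finished = ["done", {"status": "done", "keywords": [], "error": "", "line": ""}]
# The keyword, error text, and line below are made-up placeholders.
failed = ["done", {"status": "errored", "keywords": ["SCF"],
                   "error": "placeholder error description",
                   "line": "placeholder log line"}]

print(job_succeeded(finished), job_succeeded(failed))
```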

job_server_name

Job’s name on the server (e.g., ‘a103’).

Type

str

job_name

Job’s name for internal usage (e.g., ‘opt_a103’).

Type

str

job_id

The job’s ID determined by the server.

Type

int

job_num

Used as the entry number in the database, as well as the job name on the server.

Type

int

local_path

Local path to job’s folder.

Type

str

local_path_to_output_file

The local path to the output.out file.

Type

str

local_path_to_orbitals_file

The local path to the orbitals.fchk file (only for orbitals jobs).

Type

str

local_path_to_check_file

The local path to the Gaussian check file of the current job (downloaded).

Type

str

local_path_to_lj_file

The local path to the lennard_jones data file (from OneDMin).

Type

str

local_path_to_xyz

Type

str

checkfile

The path to a previous Gaussian checkfile to be used in the current job.

Type

str

remote_path

Remote path to job’s folder.

Type

str

submit

The submit script. Created automatically.

Type

str

input

The input file. Created automatically.

Type

str

server

Server’s name.

Type

str

args

Methods (including troubleshooting) to be used in input files. Keys are either ‘keyword’ or ‘block’, values are dictionaries with values to be used either as keywords or as blocks in the respective software input file.

Type

dict

ess_trsh_methods

A list of troubleshooting methods already tried out for ESS convergence.

Type

list

scan_trsh

A troubleshooting method for rotor scans.

Type

str

occ

The number of occupied orbitals (core + valence) from a Molpro CCSD single-point calculation.

Type

int

project_directory

The path to the project directory.

Type

str

max_job_time

The maximal allowed job time on the server in hours (can be fractional).

Type

float

bath_gas

A bath gas. Currently used in OneDMin to calculate L-J parameters. Allowed values are He, Ne, Ar, Kr, H2, N2, O2.

Type

str

directed_scans

Entries are lists of four-atom dihedral scan indices to constrain during a directed scan.

Type

list

directed_dihedrals

The dihedral angles of a directed scan job corresponding to directed_scans.

Type

list
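Since directed_dihedrals is described as corresponding to directed_scans, one way to picture the relationship is as two parallel lists that pair up entry by entry. The sketch below assumes one angle per four-atom entry; the helper name and the sample indices and angles are hypothetical.

```python
from typing import List, Tuple


def pair_scans_with_dihedrals(
    directed_scans: List[List[int]],
    directed_dihedrals: List[float],
) -> List[Tuple[Tuple[int, ...], float]]:
    """Pair each four-atom scan with its dihedral angle (illustrative helper)."""
    if len(directed_scans) != len(directed_dihedrals):
        raise ValueError("Each scan needs exactly one dihedral angle")
    for scan in directed_scans:
        if len(scan) != 4:
            raise ValueError(f"Expected four atom indices, got {scan}")
    return [(tuple(scan), float(angle))
            for scan, angle in zip(directed_scans, directed_dihedrals)]


scans = [[1, 2, 3, 4], [2, 3, 4, 5]]  # made-up atom indices
dihedrals = [60.0, 180.0]             # made-up angles in degrees
constraints = pair_scans_with_dihedrals(scans, dihedrals)
print(constraints)
```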

directed_scan_type

The type of the directed scan.

Type

str

rotor_index

The 0-indexed rotor number (key) in the species.rotors_dict dictionary.

Type

int

irc_direction

The direction of the IRC job (forward or reverse).

Type

str

add_to_args(val: str, key1: str = 'keyword', key2: str = 'general', separator: Optional[str] = None, check_val: bool = True)

Add arguments to self.args in a nested dictionary under self.args[key1][key2].

Parameters
  • val (str) – The value to add.

  • key1 (str, optional) – Key1.

  • key2 (str, optional) – Key2.

  • separator (str, optional) – A separator (e.g., ' ' or '\n') to apply between existing values and new values.

  • check_val (bool, optional) – Only append val if it doesn’t exist in the dictionary.
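The nested-dictionary update described for add_to_args can be sketched as a standalone function. This is not ARC's implementation: it takes the dictionary as an explicit argument, assumes a single space when no separator is given, and treats check_val as a substring test. The keyword strings are illustrative values only.

```python
from typing import Dict, Optional


def add_to_args(args: Dict[str, Dict[str, str]],
                val: str,
                key1: str = 'keyword',
                key2: str = 'general',
                separator: Optional[str] = None,
                check_val: bool = True) -> Dict[str, Dict[str, str]]:
    """Append val under args[key1][key2], creating the levels if needed (sketch)."""
    if separator is None:
        separator = ' '  # assumption: the real default may differ
    inner = args.setdefault(key1, {})
    existing = inner.get(key2, '')
    if check_val and val in existing:
        return args  # value already present, nothing to add
    inner[key2] = f"{existing}{separator}{val}" if existing else val
    return args


args: Dict[str, Dict[str, str]] = {}
add_to_args(args, 'scf=xqc')
add_to_args(args, 'scf=xqc')  # skipped because check_val is True
add_to_args(args, 'nosymm')
print(args)
```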

as_dict() → dict

A helper function for dumping this object as a dictionary in a YAML file for restarting ARC.

deduce_software()

Deduce the software to be used.

Returns: str

The deduced software.

delete()

Delete a running Job.

determine_job_status()

Determine the Job’s status. Updates self.job_status.

Raises

IOError – If the output file and any additional server information cannot be found.

determine_run_time()

Determine the run time. Update self.run_time and round to seconds.
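A run time rounded to whole seconds can be derived from two timestamps as sketched below. The function name and the sample timestamps are hypothetical; this illustrates the rounding idea with the standard datetime module rather than reproducing ARC's code.

```python
import datetime


def rounded_run_time(initial_time: datetime.datetime,
                     final_time: datetime.datetime) -> datetime.timedelta:
    """Return final_time - initial_time rounded to whole seconds (illustrative)."""
    delta = final_time - initial_time
    return datetime.timedelta(seconds=round(delta.total_seconds()))


start = datetime.datetime(2020, 1, 1, 12, 0, 0)
end = datetime.datetime(2020, 1, 1, 13, 30, 15, 600000)  # 15.6 s past the minute
run_time = rounded_run_time(start, end)
print(run_time)
```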

format_max_job_time(time_format)

Convert the max_job_time attribute into the format supported by the server submission script.

Parameters

time_format (str) – Either ‘days’ (e.g., 5-0:00:00) or ‘hours’ (e.g., 120:00:00).

Returns: str

The formatted maximum job time string.
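The two formats can be reproduced with a small standalone function. This is a sketch under the assumption that the input is a number of hours, as documented for max_job_time; it takes the hours as an explicit argument rather than reading an attribute, and it is not ARC's implementation.

```python
def format_max_job_time(max_job_time_hours: float, time_format: str) -> str:
    """Format a duration given in hours as 'D-H:MM:SS' or 'H:MM:SS' (illustrative)."""
    total_seconds = round(max_job_time_hours * 3600)
    minutes, seconds = divmod(total_seconds, 60)
    hours, minutes = divmod(minutes, 60)
    if time_format == 'days':
        days, hours = divmod(hours, 24)
        return f"{days}-{hours}:{minutes:02d}:{seconds:02d}"
    if time_format == 'hours':
        return f"{hours}:{minutes:02d}:{seconds:02d}"
    raise ValueError(f"time_format must be 'days' or 'hours', got {time_format!r}")


print(format_max_job_time(120, 'days'))   # 5-0:00:00
print(format_max_job_time(120, 'hours'))  # 120:00:00
```

A fractional value such as 1.5 hours comes out as 0-1:30:00 in the ‘days’ format and 1:30:00 in the ‘hours’ format.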

run()

Execute the Job.

set_cpu_and_mem()

Set the number of CPU cores and the amount of memory based on the ESS and the cluster software.

set_file_paths()

Set local and remote job file paths.

troubleshoot_server()

Troubleshoot server errors.

write_completed_job_to_csv_file()

Write a completed ARCJob into the completed_jobs.csv file.

write_input_file()

Write a software-specific, job-specific input file. Save the file locally and also upload it to the server.

write_submit_script()

Write the Job’s submit script.