arc.common

This module contains functions which are shared across multiple ARC modules. As such, it should not import any other ARC module (specifically ones that use the logger defined here) to avoid circular imports.

VERSION is the full ARC version, using semantic versioning.

arc.common.almost_equal_coords(xyz1: dict, xyz2: dict, rtol: float = 1e-05, atol: float = 1e-08) → bool

A helper function for checking whether two xyz dictionaries are almost equal. Also checks that the atomic symbols are identical.

Parameters:
  • xyz1 (dict) – Cartesian coordinates.

  • xyz2 (dict) – Cartesian coordinates.

  • rtol (float, optional) – The relative tolerance parameter.

  • atol (float, optional) – The absolute tolerance parameter.

Returns: bool

True if they are almost equal, False otherwise.
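
As a rough illustration of the comparison described above, the following standalone sketch does not import ARC. It assumes the xyz dict carries 'symbols' and 'coords' keys and that rtol and atol follow the numpy.isclose convention, |a - b| <= atol + rtol * |b|. The name almost_equal_coords_sketch is hypothetical, and ARC's own implementation may differ.

```python
def almost_equal_coords_sketch(xyz1: dict, xyz2: dict,
                               rtol: float = 1e-05, atol: float = 1e-08) -> bool:
    """Compare two xyz dicts: identical symbols and element-wise close coordinates."""
    # Assumed layout: {'symbols': (...), 'coords': ((x, y, z), ...)}.
    if tuple(xyz1['symbols']) != tuple(xyz2['symbols']):
        return False
    if len(xyz1['coords']) != len(xyz2['coords']):
        return False
    for atom1, atom2 in zip(xyz1['coords'], xyz2['coords']):
        for a, b in zip(atom1, atom2):
            # Assumed tolerance convention, mirroring numpy.isclose.
            if abs(a - b) > atol + rtol * abs(b):
                return False
    return True


water = {'symbols': ('O', 'H', 'H'),
         'coords': ((0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0))}
nudged = {'symbols': ('O', 'H', 'H'),
          'coords': ((0.0, 0.0, 1e-09), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0))}
moved = {'symbols': ('O', 'H', 'H'),
         'coords': ((0.0, 0.0, 0.1), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0))}
print(almost_equal_coords_sketch(water, nudged))  # within the default tolerances
print(almost_equal_coords_sketch(water, moved))   # 0.1 Angstrom shift is not
```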

arc.common.almost_equal_coords_lists(xyz1: Union[List[dict], dict], xyz2: Union[List[dict], dict], rtol: float = 1e-05, atol: float = 1e-08) → bool

A helper function for checking whether two lists of xyzs have at least one entry in each that is almost equal to an entry in the other. Useful for comparing xyzs in unit tests.

Parameters:
  • xyz1 (Union[List[dict], dict]) – Either a dict-format xyz, or a list of them.

  • xyz2 (Union[List[dict], dict]) – Either a dict-format xyz, or a list of them.

  • rtol (float, optional) – The relative tolerance parameter.

  • atol (float, optional) – The absolute tolerance parameter.

Returns: bool

Whether at least one entry in each input is almost equal to an entry in the other input.

arc.common.almost_equal_lists(iter1: Union[list, tuple, ndarray], iter2: Union[list, tuple, ndarray], rtol: float = 1e-05, atol: float = 1e-08) → bool

A helper function for checking whether two iterables are almost equal.

Parameters:
  • iter1 (list, tuple, np.array) – An iterable.

  • iter2 (list, tuple, np.array) – An iterable.

  • rtol (float, optional) – The relative tolerance parameter.

  • atol (float, optional) – The absolute tolerance parameter.

Returns: bool

True if they are almost equal, False otherwise.

arc.common.calc_rmsd(x: Union[list, array], y: Union[list, array]) → float

Compute the root-mean-square deviation between two matrices.

Parameters:
  • x (np.array) – Matrix 1.

  • y (np.array) – Matrix 2.

Returns:

The RMSD score of two matrices.

Return type:

float

arc.common.check_ess_settings(ess_settings: Optional[dict] = None) → dict

A helper function to convert servers in the ess_settings dict to lists. Assists in troubleshooting a job and trying a different server. Also checks the ESS and servers.

Parameters:

ess_settings (dict, optional) – ARC’s ESS settings dictionary.

Returns: dict

An updated ARC ESS dictionary.

arc.common.check_that_all_entries_are_in_list(list_1: Union[list, tuple], list_2: Union[list, tuple]) → bool

Check that all entries from list_2 are in list_1, and that the lists are the same length. Useful for testing that two lists are equal regardless of entry order.

Parameters:
  • list_1 (list, tuple) – Entries are floats or ints (could also be None).

  • list_2 (list, tuple) – Entries could be anything.

Returns: bool

Whether all entries from list_2 are in list_1 and the lists are the same length.
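
A literal reading of the rule above can be sketched as follows. This is a standalone illustration, not ARC's source; the function name is hypothetical.

```python
from typing import Union


def all_entries_in_list_sketch(list_1: Union[list, tuple],
                               list_2: Union[list, tuple]) -> bool:
    """True if the lengths match and every entry of list_2 appears in list_1."""
    if len(list_1) != len(list_2):
        return False
    return all(entry in list_1 for entry in list_2)


print(all_entries_in_list_sketch([1, 2, 3], (3, 1, 2)))
print(all_entries_in_list_sketch([1, 2, 3], [1, 2]))
```

Under this literal reading, [1, 1, 2] and [1, 2, 2] also pass, since only membership and length are checked. Whether ARC treats duplicates the same way is not stated here.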

arc.common.check_torsion_change(torsions: DataFrame, index_1: Union[int, str], index_2: Union[int, str], threshold: Union[float, int] = 20.0, delta: Union[float, int] = 0.0) → DataFrame

Compare two sets of torsions (in DataFrame) and check if any entry has a difference larger than threshold. The output is a DataFrame consisting of True/False, indicating which torsions changed significantly.

Parameters:
  • torsions (pd.DataFrame) – A DataFrame consisting of multiple sets of torsions.

  • index_1 (Union[int, str]) – The index of the first conformer.

  • index_2 (Union[int, str]) – The index of the second conformer.

  • threshold (Union[float, int]) – The threshold used to determine the difference significance.

  • delta (Union[float, int]) – A known difference between torsion pairs, delta = tor[index_1] - tor[index_2]. E.g., for the torsions to be scanned, the differences are equal to the scan resolution.

Returns: pd.DataFrame

A DataFrame consisting of True/False, indicating which torsions changed significantly. True for significant change.

arc.common.convert_list_index_0_to_1(_list: Union[list, tuple], direction: int = 1) → Union[list, tuple]

Convert a list from 0-indexed to 1-indexed, or vice versa. Ensures positive values in the resulting list.

Parameters:
  • _list (list) – The list to be converted.

  • direction (int, optional) – Either 1 or -1 to convert 0-indexed to 1-indexed or vice versa, respectively.

Raises:

ValueError – If the new list contains negative values.

Returns:

The converted indices.

Return type:

Union[list, tuple]
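
The index shift can be sketched in a few lines. This is a minimal standalone illustration with a hypothetical function name. Returning the same container type as the input is an assumption, as is rejecting any direction other than 1 or -1.

```python
from typing import Union


def convert_list_index_sketch(indices: Union[list, tuple],
                              direction: int = 1) -> Union[list, tuple]:
    """Shift every index by +1 (0-indexed to 1-indexed) or -1 (the reverse)."""
    if direction not in (1, -1):
        raise ValueError(f'direction must be 1 or -1, got {direction}')
    shifted = [index + direction for index in indices]
    if any(index < 0 for index in shifted):
        raise ValueError(f'The resulting list has negative values: {shifted}')
    # Assumption: preserve the input container type.
    return tuple(shifted) if isinstance(indices, tuple) else shifted


print(convert_list_index_sketch([0, 1, 2]))                # [1, 2, 3]
print(convert_list_index_sketch((1, 2, 3), direction=-1))  # (0, 1, 2)
```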

arc.common.convert_to_hours(time_str: str) → float

Convert walltime string in format HH:MM:SS to hours.

Parameters:

time_str (str) – A time string in format HH:MM:SS

Returns:

The time in hours

Return type:

float
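
The conversion amounts to hours + minutes / 60 + seconds / 3600. A minimal sketch, assuming exactly three colon-separated integer fields (the function name is hypothetical, not ARC's implementation):

```python
def convert_to_hours_sketch(time_str: str) -> float:
    """Convert an 'HH:MM:SS' walltime string to hours."""
    hours, minutes, seconds = (int(part) for part in time_str.split(':'))
    return hours + minutes / 60 + seconds / 3600


print(convert_to_hours_sketch('01:30:00'))  # 1.5
print(convert_to_hours_sketch('24:00:00'))  # 24.0
```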

arc.common.delete_check_files(project_directory: str)

Delete ESS checkfiles. They usually take up lots of space and are not needed after ARC terminates. Pass True to the keep_checks flag in ARC to avoid deleting check files.

Parameters:

project_directory (str) – The path to the ARC project folder.

arc.common.determine_ess(log_file: str) → str

Determine the ESS to which the log file belongs.

Parameters:

log_file (str) – The ESS log file path.

Returns: str

The ESS log class from Arkane.

arc.common.determine_symmetry(xyz: dict) → Tuple[int, int]

Determine external symmetry and chirality (optical isomers) of the species.

Parameters:

xyz (dict) – The 3D coordinates.

Returns: Tuple[int, int]
  • The external symmetry number.

  • 1 if no chiral centers are present, 2 if chiral centers are present.

arc.common.determine_top_group_indices(mol, atom1, atom2, index=1) → Tuple[list, bool]

Determine the indices of a “top group” in a molecule. The top is defined as all atoms connected to atom2, including atom2, excluding the direction of atom1. Two atom_list_to_explore lists are used so that the list the loop iterates through isn’t changed within the loop.

Parameters:
  • mol (Molecule) – The Molecule object to explore.

  • atom1 (Atom) – The pivotal atom in mol.

  • atom2 (Atom) – The beginning of the top relative to atom1 in mol.

  • index (int, optional) – Whether to return 1-index or 0-index conventions. 1 for 1-index.

Returns: Tuple[list, bool]
  • The indices of the atoms in the top (either 0-index or 1-index, as requested).

  • Whether the top has heavy atoms (is not just a hydrogen atom). True if it has heavy atoms.

arc.common.dfs(mol: Molecule, start: int, sort_result: bool = True) → List[int]

A depth-first search algorithm for graph traversal of a Molecule object instance.

Parameters:
  • mol (Molecule) – The Molecule to search.

  • start (int) – The index of the first atom in the search.

  • sort_result (bool, optional) – Whether to sort the returned visited indices.

Returns:

Indices of all atoms connected to the starting atom.

Return type:

List[int]
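
To illustrate the traversal without depending on RMG, the sketch below runs an iterative depth-first search over a plain adjacency dict that stands in for the Molecule graph. The adjacency representation and the function name are assumptions made for this example only.

```python
from typing import Dict, List


def dfs_sketch(adjacency: Dict[int, List[int]], start: int,
               sort_result: bool = True) -> List[int]:
    """Return the indices of all atoms reachable from the start atom."""
    visited, stack = [], [start]
    while stack:
        atom = stack.pop()
        if atom in visited:
            continue
        visited.append(atom)
        # Queue the neighbors that were not visited yet.
        stack.extend(n for n in adjacency[atom] if n not in visited)
    return sorted(visited) if sort_result else visited


# Two disconnected fragments: atoms 0-1-2 and atoms 3-4.
bonds = {0: [1], 1: [0, 2], 2: [1], 3: [4], 4: [3]}
print(dfs_sketch(bonds, start=0))  # [0, 1, 2]
print(dfs_sketch(bonds, start=4))  # [3, 4]
```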

arc.common.estimate_orca_mem_cpu_requirement(num_heavy_atoms: int, server: str = '', consider_server_limits: bool = False) → Tuple[int, float]

Estimates memory and cpu requirements for an Orca job.

Parameters:
  • num_heavy_atoms (int) – The number of heavy atoms in the species.

  • server (str) – The name of the server where Orca runs.

  • consider_server_limits (bool) – Try to give realistic estimations.

Returns: Tuple[int, float]
  • The amount of total memory (MB)

  • The number of cpu cores required for the Orca job for a given species.

arc.common.extremum_list(lst: list, return_min: bool = True) → Optional[int]

A helper function for finding the extremum (either minimum or maximum) of a list of numbers (int/float) where some entries could be None.

Parameters:
  • lst (list) – The list.

  • return_min (bool, optional) – Whether to return the minimum or the maximum. True for minimum, False for maximum, True by default.

Returns: Optional[int]

The entry with the minimal/maximal value.

arc.common.from_yaml(yaml_str: str) → Union[dict, list]

Read a YAML string and decode to the respective Python object.

Parameters:

yaml_str (str) – The YAML string content.

Returns: Union[dict, list]

The respective Python object.

arc.common.generate_resonance_structures(object_: Union[Species, Molecule], keep_isomorphic: bool = False, filter_structures: bool = True, save_order: bool = True) → Optional[List[Molecule]]

Safely generate resonance structures for either an RMG Molecule or an RMG Species object instances.

Parameters:
  • object_ (Species, Molecule) – The object to generate resonance structures for.

  • keep_isomorphic (bool, optional) – Whether to keep isomorphic isomers.

  • filter_structures (bool, optional) – Whether to filter resonance structures.

  • save_order (bool, optional) – Whether to make sure atom order is preserved.

Returns:

If a Molecule object instance was given, the function returns a list of resonance structures (each is a Molecule object instance). If a Species object instance is given, the resonance structures are stored within the given object (in a .molecule attribute), and the function returns None.

Return type:

Optional[List[Molecule]]

arc.common.get_angle_in_180_range(angle: float, round_to: Optional[int] = 2) → float

Get the corresponding angle in the -180 to +180 degree range.

Parameters:
  • angle (float) – An angle in degrees.

  • round_to (int, optional) – The number of decimal figures to round the result to. None to not round. Default: 2.

Returns:

The corresponding angle in the -180 to +180 degree range.

Return type:

float
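
One common way to wrap an angle into this range uses the modulo operator. The sketch below is a standalone illustration with a hypothetical function name. It maps onto the half-open interval [-180, 180), so an input of exactly 180 comes back as -180; how ARC treats that boundary is an assumption here.

```python
from typing import Optional


def get_angle_in_180_range_sketch(angle: float,
                                  round_to: Optional[int] = 2) -> float:
    """Wrap an angle in degrees into the [-180, 180) range."""
    wrapped = (angle + 180.0) % 360.0 - 180.0
    return wrapped if round_to is None else round(wrapped, round_to)


print(get_angle_in_180_range_sketch(190.0))   # -170.0
print(get_angle_in_180_range_sketch(-190.0))  # 170.0
print(get_angle_in_180_range_sketch(360.0))   # 0.0
```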

arc.common.get_atom_radius(symbol: str) → float

Get the atom covalent radius of an atom in Angstroms.

Parameters:

symbol (str) – The atomic symbol.

Raises:

TypeError – If symbol is of wrong type.

Returns: float

The atomic covalent radius (None if not found).

arc.common.get_bonds_from_dmat(dmat: ndarray, elements: Union[Tuple[str, ...], List[str]], charges: Optional[List[int]] = None, tolerance: float = 1.2, bond_lone_hydrogens: bool = True) → List[Tuple[int, int]]

Guess the connectivity of a molecule from its distance matrix representation.

Parameters:
  • dmat (np.ndarray) – An NxN matrix of atom distances in Angstrom.

  • elements (List[str]) – The corresponding element list in the same atomic order.

  • charges (List[int], optional) – A corresponding list of formal atomic charges.

  • tolerance (float, optional) – A factor by which the single bond length threshold is multiplied for the check.

  • bond_lone_hydrogens (bool, optional) – Whether to assign a bond to hydrogen atoms which were not identified as bonded. If so, the closest atom will be considered.

Returns:

A list of tuple entries, each represents a bond and contains sorted atom indices.

Return type:

List[Tuple[int, int]]

arc.common.get_close_tuple(key_1: Tuple[Union[float, str], ...], keys: List[Tuple[Union[float, str], ...]], tolerance: float = 0.05, raise_error: bool = False) → Optional[Tuple[Union[float, str], ...]]

Get a key from a list of keys close in value to the given key. Even if just one of the items in the key has a close match, use the close value.

Parameters:
  • key_1 (Tuple[Union[float, str], Union[float, str]]) – The key used for the search.

  • keys (List[Tuple[Union[float, str], Union[float, str]]]) – The list of keys to search within.

  • tolerance (float, optional) – The tolerance within which keys are determined to be close.

  • raise_error (bool, optional) – Whether to raise a ValueError if a close key wasn’t found.

Raises:
  • ValueError – If a key in keys has a different length than key_1.

  • ValueError – If a close key was not found and raise_error is True.

Returns:

A key from the keys list close in value to the given key.

Return type:

Optional[Tuple[Union[float, str], …]]

arc.common.get_extremum_index(lst: list, return_min: bool = True, skip_values: Optional[list] = None) → Optional[int]

A helper function for finding the extremum (either minimum or maximum) of a list of numbers (int/float) where some entries could be None.

Parameters:
  • lst (list) – The list.

  • return_min (bool, optional) – Whether to return the minimum or the maximum. True for minimum, False for maximum, True by default.

  • skip_values (list, optional) – Values to skip when checking for extremum, e.g., 0.

Returns: Optional[int]

The index of an entry with the minimal/maximal value.
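
The behavior described above, including None entries and skip_values, can be sketched as follows. This is a standalone illustration with a hypothetical function name. Returning None when no valid entry remains, and returning the first index on ties, are assumptions.

```python
from typing import Optional


def get_extremum_index_sketch(lst: list, return_min: bool = True,
                              skip_values: Optional[list] = None) -> Optional[int]:
    """Return the index of the smallest (or largest) non-None, non-skipped entry."""
    skip_values = skip_values or []
    candidates = [(value, index) for index, value in enumerate(lst)
                  if value is not None and value not in skip_values]
    if not candidates:
        return None  # Assumption: nothing left to compare.
    pick = min if return_min else max
    return pick(candidates, key=lambda pair: pair[0])[1]


energies = [None, 3.2, 0, 1.5]
print(get_extremum_index_sketch(energies))                   # 2
print(get_extremum_index_sketch(energies, skip_values=[0]))  # 3
print(get_extremum_index_sketch(energies, return_min=False)) # 1
```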

arc.common.get_git_branch(path: Optional[str] = None) → str

Get the git branch to be logged.

Parameters:

path (str, optional) – The path to check.

Returns: str

The git branch name.

arc.common.get_git_commit(path: Optional[str] = None) → Tuple[str, str]

Get the most recent git commit to be logged.

Note

Returns empty strings if hash and date cannot be determined.

Parameters:

path (str, optional) – The path to check.

Returns: tuple

The git HEAD commit hash and the git HEAD commit date, each as a string.

arc.common.get_logger()

Get the ARC logger (avoid having multiple entries of the logger).

arc.common.get_number_with_ordinal_indicator(number: int) → str

Returns the number as a string with the ordinal indicator.

Parameters:

number (int) – An integer for which the ordinal indicator will be determined.

Returns: str

The number with the respective ordinal indicator.

arc.common.get_ordered_intersection_of_two_lists(l1: list, l2: list, order_by_first_list: Optional[bool] = True, return_unique: Optional[bool] = True) → list

Find the intersection of two lists by order.

Examples

  • l1 = [1, 2, 3, 3, 5, 6], l2 = [6, 3, 5, 5, 1], order_by_first_list = True, return_unique = True -> [1, 3, 5, 6] unique values in the intersection of l1 and l2, order following value’s first appearance in l1

  • l1 = [1, 2, 3, 3, 5, 6], l2 = [6, 3, 5, 5, 1], order_by_first_list = True, return_unique = False -> [1, 3, 3, 5, 6] all values in the intersection of l1 and l2 (duplicates kept), ordered as they appear in l1

  • l1 = [1, 2, 3, 3, 5, 6], l2 = [6, 3, 5, 5, 1], order_by_first_list = False, return_unique = True -> [6, 3, 5, 1] unique values in the intersection of l1 and l2, order following value’s first appearance in l2

  • l1 = [1, 2, 3, 3, 5, 6], l2 = [6, 3, 5, 5, 1], order_by_first_list = False, return_unique = False -> [6, 3, 5, 5, 1] all values in the intersection of l1 and l2 (duplicates kept), ordered as they appear in l2

Parameters:
  • l1 (list) – The first list.

  • l2 (list) – The second list.

  • order_by_first_list (bool, optional) – Whether to order the output list using the order of the values in the first list.

  • return_unique (bool, optional) – Whether to return only unique values in the intersection of two lists.

Returns: list

An ordered list of the intersection of two input lists.
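
The four documented examples can be reproduced with a short standalone sketch. The function name is hypothetical, and this is one possible way to obtain the listed outputs rather than ARC's source.

```python
def ordered_intersection_sketch(l1: list, l2: list,
                                order_by_first_list: bool = True,
                                return_unique: bool = True) -> list:
    """Intersect two lists, keeping the order of the chosen list."""
    ordered, other = (l1, l2) if order_by_first_list else (l2, l1)
    result = [value for value in ordered if value in other]
    if return_unique:
        # Drop repeats while keeping the first appearance of each value.
        result = [value for i, value in enumerate(result) if value not in result[:i]]
    return result


l1, l2 = [1, 2, 3, 3, 5, 6], [6, 3, 5, 5, 1]
print(ordered_intersection_sketch(l1, l2, True, True))    # [1, 3, 5, 6]
print(ordered_intersection_sketch(l1, l2, True, False))   # [1, 3, 3, 5, 6]
print(ordered_intersection_sketch(l1, l2, False, True))   # [6, 3, 5, 1]
print(ordered_intersection_sketch(l1, l2, False, False))  # [6, 3, 5, 5, 1]
```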

arc.common.get_ordinal_indicator(number: int) → str

Returns the ordinal indicator for an integer.

Parameters:

number (int) – An integer for which the ordinal indicator will be determined.

Returns: str

The integer’s ordinal indicator.
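
The usual English rule is that numbers ending in 1, 2, or 3 take "st", "nd", or "rd", except when the last two digits are 11, 12, or 13, which take "th" like everything else. A minimal sketch of both ordinal helpers under that rule (hypothetical function names, not ARC's source):

```python
def get_ordinal_indicator_sketch(number: int) -> str:
    """Return 'st', 'nd', 'rd', or 'th' for an integer."""
    if 10 <= abs(number) % 100 <= 20:
        return 'th'  # Covers 11th, 12th, and 13th.
    return {1: 'st', 2: 'nd', 3: 'rd'}.get(abs(number) % 10, 'th')


def get_number_with_ordinal_indicator_sketch(number: int) -> str:
    """Return the number followed by its ordinal indicator."""
    return f'{number}{get_ordinal_indicator_sketch(number)}'


print([get_number_with_ordinal_indicator_sketch(n) for n in (1, 2, 3, 4, 11, 22, 113)])
# ['1st', '2nd', '3rd', '4th', '11th', '22nd', '113th']
```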

arc.common.get_single_bond_length(symbol_1: str, symbol_2: str, charge_1: int = 0, charge_2: int = 0) → float

Get an approximation for a single bond length between two elements.

Parameters:
  • symbol_1 (str) – Symbol 1.

  • symbol_2 (str) – Symbol 2.

  • charge_1 (int, optional) – The partial charge of the atom represented by symbol_1.

  • charge_2 (int, optional) – The partial charge of the atom represented by symbol_2.

Returns: float

The estimated single bond length in Angstrom.

arc.common.globalize_path(string: str, project_directory: str) → str

Rebase an absolute file path on the current project path. Useful when restarting an ARC project in a different folder or on a different machine.

Parameters:
  • string (str) – A string containing a path to rebase.

  • project_directory (str) – The current project directory to rebase upon.

Returns: str

A string with the rebased path.

arc.common.globalize_paths(file_path: str, project_directory: str) → str

Rebase all file paths in the contents of the given file on the current project path. Useful when restarting an ARC project in a different folder or on a different machine.

Parameters:
  • file_path (str) – A path to the file to check. The contents of this file will be changed and saved as a different file.

  • project_directory (str) – The current project directory to rebase upon.

Returns: str

A path to the respective file with rebased absolute file paths.

arc.common.initialize_job_types(job_types: Optional[dict] = None, specific_job_type: str = '') → dict

A helper function for initializing job_types. Returns the comprehensive job types dictionary for ARC, with default values filled in for missing job types.

Parameters:
  • job_types (dict, optional) – Keys are job types, values are booleans of whether or not to consider this job type.

  • specific_job_type (str, optional) – Specific job type to execute. Legal strings are job types (keys of job_types dict).

Returns: dict

An updated (comprehensive) job type dictionary.

arc.common.initialize_log(log_file: str, project: str, project_directory: Optional[str] = None, verbose: int = 20) → None

Set up a logger for ARC.

Parameters:
  • log_file (str) – The log file name.

  • project (str) – A name for the project.

  • project_directory (str, optional) – The path to the project directory.

  • verbose (int, optional) – Specify the amount of log text seen.

arc.common.is_angle_linear(angle: float, tolerance: float = 0.9) → bool

Check whether an angle is close to 180 or 0 degrees.

Parameters:
  • angle (float) – The angle in degrees.

  • tolerance (float) – The tolerance to consider.

Returns:

Whether the angle is close to 180 or 0 degrees, True if it is.

Return type:

bool

arc.common.is_notebook() → bool

Check whether ARC was called from an IPython notebook.

Returns: bool

True if ARC was called from a notebook, False otherwise.

arc.common.is_same_pivot(torsion1: Union[list, str], torsion2: Union[list, str]) → Optional[bool]

Check if two torsions have the same pivots.

Parameters:
  • torsion1 (Union[list, str]) – The four atom indices representing the first torsion.

  • torsion2 (Union[list, str]) – The four atom indices representing the second torsion.

Returns: Optional[bool]

True if two torsions share the same pivots.
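
The sketch below illustrates one plausible reading of the pivot comparison. It rests on three assumptions that this page does not state: the pivots are the two inner atom indices of a torsion, their order does not matter, and a string argument is the literal form of a list (such as '[1, 2, 3, 4]'). The function name is hypothetical.

```python
import ast
from typing import Union


def is_same_pivot_sketch(torsion1: Union[list, str],
                         torsion2: Union[list, str]) -> bool:
    """Check whether two four-atom torsions share the same inner atom pair."""
    t1 = ast.literal_eval(torsion1) if isinstance(torsion1, str) else torsion1
    t2 = ast.literal_eval(torsion2) if isinstance(torsion2, str) else torsion2
    if len(t1) != 4 or len(t2) != 4:
        return False
    pivots1, pivots2 = list(t1[1:3]), list(t2[1:3])
    # Assumption: [2, 3] and [3, 2] describe the same rotatable bond.
    return pivots1 == pivots2 or pivots1 == pivots2[::-1]


print(is_same_pivot_sketch([1, 2, 3, 4], [5, 2, 3, 6]))    # True
print(is_same_pivot_sketch([1, 2, 3, 4], '[6, 3, 2, 5]'))  # True
print(is_same_pivot_sketch([1, 2, 3, 4], [1, 2, 4, 3]))    # False
```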

arc.common.is_same_sequence_sublist(child_list: list, parent_list: list) → bool

Check if the parent list has a sublist which is identical to the child list including the sequence.

Examples

  • child_list = [1,2,3], parent_list=[5,1,2,3,9] -> True

  • child_list = [1,2,3], parent_list=[5,6,1,3,9] -> False

Parameters:
  • child_list (list) – The child list (the pattern to search in the parent list).

  • parent_list (list) – The parent list.

Returns: bool

True if the sublist is in the parent list.
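
A sliding-window comparison reproduces the two examples above. This is a minimal standalone sketch with a hypothetical function name. Note that it treats an empty child list as always present, which is an assumption.

```python
def is_same_sequence_sublist_sketch(child_list: list, parent_list: list) -> bool:
    """Check whether child_list appears in parent_list as a contiguous run."""
    size = len(child_list)
    if size > len(parent_list):
        return False
    # Compare the child against every window of the same length in the parent.
    return any(parent_list[start:start + size] == child_list
               for start in range(len(parent_list) - size + 1))


print(is_same_sequence_sublist_sketch([1, 2, 3], [5, 1, 2, 3, 9]))  # True
print(is_same_sequence_sublist_sketch([1, 2, 3], [5, 6, 1, 3, 9]))  # False
```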

arc.common.is_str_float(value: Optional[str]) → bool

Check whether a string can be converted to a floating number.

Parameters:

value (str) – The string to check.

Returns: bool

True if it can, False otherwise.

arc.common.is_str_int(value: Optional[str]) → bool

Check whether a string can be converted to an integer.

Parameters:

value (str) – The string to check.

Returns: bool

True if it can, False otherwise.

arc.common.is_xyz_mol_match(mol: Molecule, xyz: dict) → bool

A helper function that matches an rmgpy.molecule.molecule.Molecule object to an xyz, used in _scissors to match xyz and the cut products. This function only checks the molecular formula.

Parameters:
  • mol (Molecule) – An RMG Molecule object.

  • xyz (dict) – The coordinates of the cut product.

Returns:

True if the xyz and molecule match, False otherwise

Return type:

bool

arc.common.key_by_val(dictionary: dict, value: Any) → Any

A helper function for getting a key from a dictionary corresponding to a certain value. Does not check for value uniqueness.

Parameters:
  • dictionary (dict) – The dictionary.

  • value – The value.

Raises:

ValueError – If the value could not be found in the dictionary.

Returns: Any

The key.
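
A reverse lookup of this kind can be sketched as below. The function name is hypothetical, and returning the first matching key in insertion order is an assumption consistent with the note that uniqueness is not checked.

```python
from typing import Any


def key_by_val_sketch(dictionary: dict, value: Any) -> Any:
    """Return the first key whose value equals the requested value."""
    for key, val in dictionary.items():
        if val == value:
            return key
    raise ValueError(f'Could not find value {value} in the dictionary')


atom_map = {'C': 6, 'N': 7, 'O': 8}
print(key_by_val_sketch(atom_map, 7))  # N
```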

arc.common.log_footer(execution_time: str, level: int = 20) → None

Output a footer for the log.

Parameters:
  • execution_time (str) – The overall execution time for ARC.

  • level (int, optional) – The desired logging level.

arc.common.log_header(project: str, level: int = 20) → None

Output a header containing identifying information about ARC to the log.

Parameters:
  • project (str) – The ARC project name to be logged in the header.

  • level (int, optional) – The desired logging level.

arc.common.read_yaml_file(path: str, project_directory: Optional[str] = None) → Union[dict, list]

Read a YAML file (usually an input / restart file, but also a conformers file) and return the parameters as Python variables.

Parameters:
  • path (str) – The YAML file path to read.

  • project_directory (str, optional) – The current project directory to rebase upon.

Returns: Union[dict, list]

The content read from the file.

arc.common.rmg_mol_from_dict_repr(representation: dict, is_ts: bool = False) → Optional[Molecule]

Generate an RMG Molecule object instance from its dict representation.

Parameters:
  • representation (dict) – A dict representation of an RMG Molecule object instance.

  • is_ts (bool, optional) – Whether the Molecule represents a TS.

Returns:

The corresponding RMG Molecule object instance.

Return type:

Optional[Molecule]

arc.common.rmg_mol_to_dict_repr(mol: Molecule, reset_atom_ids: bool = False, testing: bool = False) → dict

Generate a dict representation of an RMG Molecule object instance.

Parameters:
  • mol (Molecule) – The RMG Molecule object instance.

  • reset_atom_ids (bool, optional) – Whether to reset the atom IDs in the .mol Molecule attribute. Useful when copying the object to avoid duplicate atom IDs between different object instances.

  • testing (bool, optional) – Whether this is called during a test, in which case atom IDs should be deterministic.

Returns:

The corresponding dict representation.

Return type:

dict

arc.common.safe_copy_file(source: str, destination: str, wait: int = 10, max_cycles: int = 15)

Copy a file safely.

Parameters:
  • source (str) – The full path to the file to be copied.

  • destination (str) – The full path to the file destination.

  • wait (int, optional) – The number of seconds to wait between cycles.

  • max_cycles (int, optional) – The maximum number of cycles to try.

arc.common.save_yaml_file(path: str, content: Union[list, dict]) → None

Save a YAML file (usually an input / restart file, but also conformers file).

Parameters:
  • path (str) – The YAML file path to save.

  • content (list, dict) – The content to save.

arc.common.sort_atoms_in_descending_label_order(mol: Molecule) → None

If all atoms in the molecule object have a label, this function reassigns the .atoms attribute of the Molecule with a list of atoms ordered by their labels. For example, if [int(atom.label) for atom in mol.atoms] is [1, 4, 32, 7], the atoms are reordered in place so that the labels read [1, 4, 7, 32].

Parameters:

mol (Molecule) – An RMG Molecule object, with labeled atoms

arc.common.sort_two_lists_by_the_first(list1: List[Optional[Union[float, int]]], list2: List[Optional[Union[float, int]]]) → Tuple[List[Union[int, float]], List[Union[int, float]]]

Sort two lists in increasing order by the values of the first list, ignoring None entries from list1 and their respective entries in list2. The function was written in this format rather than the more Pythonic zip(*sorted(zip(list1, list2))) style to accommodate dictionaries as entries of list2; otherwise a TypeError: '<' not supported between instances of 'dict' and 'dict' error is raised.

Parameters:
  • list1 (list, tuple) – Entries are floats or ints (could also be None).

  • list2 (list, tuple) – Entries could be anything.

Raises:

InputError – If types are wrong, or lists are not the same length.

Returns: Tuple[list, list]
  • Sorted values from list1, ignoring None entries.

  • Respective entries from list2.
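
The point about dictionaries can be seen in a short sketch that sorts pairs by the list1 value only, so entries of list2 are never compared with each other. This is a standalone illustration with a hypothetical function name. It raises ValueError as a stand-in for ARC's InputError, which is not importable here.

```python
from typing import Tuple


def sort_two_lists_sketch(list1: list, list2: list) -> Tuple[list, list]:
    """Sort both lists by list1, dropping pairs whose list1 entry is None."""
    if len(list1) != len(list2):
        # Stand-in for the InputError documented above.
        raise ValueError('Both lists must have the same length')
    pairs = [(a, b) for a, b in zip(list1, list2) if a is not None]
    # The key looks at the list1 value only, so dict entries are never compared.
    pairs.sort(key=lambda pair: pair[0])
    return [a for a, _ in pairs], [b for _, b in pairs]


energies = [3.0, None, 1.0, 2.0]
records = [{'id': 'c'}, {'id': 'x'}, {'id': 'a'}, {'id': 'b'}]
print(sort_two_lists_sketch(energies, records))
# ([1.0, 2.0, 3.0], [{'id': 'a'}, {'id': 'b'}, {'id': 'c'}])
```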

arc.common.string_representer(dumper, data)

Add a custom string representer to use block literals for multiline strings.

arc.common.sum_list_entries(lst: List[Union[int, float]], multipliers: Optional[List[Union[int, float]]] = None) → Optional[float]

Sum all entries in a list. If any entry is None, return None. If multipliers is given, multiply each entry in lst by the respective multiplier entry.

Parameters:
  • lst (list) – The list to process.

  • multipliers (list, optional) – A list of multipliers.

Returns:

The result.

Return type:

Optional[float]
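
A minimal sketch of the None-propagating sum described above. The function name is hypothetical, and the sketch assumes that multipliers, when given, has the same length as lst.

```python
from typing import List, Optional, Union


def sum_list_entries_sketch(lst: List[Optional[Union[int, float]]],
                            multipliers: Optional[List[Union[int, float]]] = None,
                            ) -> Optional[float]:
    """Sum the entries, optionally weighted; return None if any entry is None."""
    if any(entry is None for entry in lst):
        return None
    if multipliers is None:
        return float(sum(lst))
    # Assumption: multipliers and lst have the same length.
    return float(sum(entry * factor for entry, factor in zip(lst, multipliers)))


print(sum_list_entries_sketch([1, 2, 3]))         # 6.0
print(sum_list_entries_sketch([1, None, 3]))      # None
print(sum_list_entries_sketch([1, 2], [10, 0.5])) # 11.0
```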

arc.common.time_lapse(t0) → str

A helper function returning the elapsed time since t0.

Parameters:

t0 (time.pyi) – The initial time the count starts from.

Returns: str

A “D HH:MM:SS” formatted time difference between now and t0.

arc.common.timedelta_from_str(time_str: str)

Get a datetime.timedelta object from its str() representation.

Parameters:

time_str (str) – The string representation of a datetime.timedelta object.

Returns:

The corresponding timedelta object.

Return type:

datetime.timedelta
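
Python's str() of a timedelta looks like '5:03:02', '2 days, 0:00:01', or '1 day, 0:00:00.000005'. One way to reverse it is a regular expression over those forms, as in the standalone sketch below. The function name and the pattern are illustrative and are not taken from ARC's source.

```python
import datetime
import re

# Matches '[D day[s], ]H:MM:SS[.ffffff]', the form produced by str(timedelta).
_PATTERN = re.compile(r'(?:(-?\d+) days?, )?(\d+):(\d{2}):(\d{2})(?:\.(\d{1,6}))?')


def timedelta_from_str_sketch(time_str: str) -> datetime.timedelta:
    """Rebuild a timedelta from its str() representation."""
    match = _PATTERN.fullmatch(time_str.strip())
    if match is None:
        raise ValueError(f'Unrecognized timedelta string: {time_str!r}')
    days, hours, minutes, seconds, fraction = match.groups()
    return datetime.timedelta(days=int(days or 0),
                              hours=int(hours),
                              minutes=int(minutes),
                              seconds=int(seconds),
                              microseconds=int((fraction or '0').ljust(6, '0')))


original = datetime.timedelta(days=2, hours=5, minutes=3, seconds=2)
print(timedelta_from_str_sketch(str(original)) == original)  # True
```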

arc.common.to_yaml(py_content: Union[list, dict]) → str

Convert a Python list or dictionary to a YAML string format.

Parameters:

py_content (list, dict) – The Python content to save.

Returns: str

The corresponding YAML representation.

arc.common.torsions_to_scans(descriptor: Optional[List[List[int]]], direction: int = 1) → Optional[List[List[int]]]

Convert torsions to scans or vice versa. In ARC we define a torsion as a list of four atoms with 0-indices. We define a scan as a list of four atoms with 1-indices. This function converts one format to the other.

Parameters:
  • descriptor (list) – The torsions or scans list.

  • direction (int, optional) – 1: Convert torsions to scans; -1: Convert scans to torsions.

Returns:

The converted indices.

Return type:

Optional[List[List[int]]]