arc.common¶
This module contains functions which are shared across multiple ARC modules. As such, it should not import any other ARC module (specifically ones that use the logger defined here) to avoid circular imports.
VERSION is the full ARC version, using semantic versioning.
- arc.common.almost_equal_coords(xyz1: dict, xyz2: dict, rtol: float = 1e-05, atol: float = 1e-08) bool [source]¶
A helper function for checking whether two xyz’s are almost equal. Also checks equal symbols.
- Parameters:
xyz1 (dict) – Cartesian coordinates.
xyz2 (dict) – Cartesian coordinates.
rtol (float, optional) – The relative tolerance parameter.
atol (float, optional) – The absolute tolerance parameter.
- Returns: bool
True if they are almost equal, False otherwise.
- arc.common.almost_equal_coords_lists(xyz1: Union[List[dict], dict], xyz2: Union[List[dict], dict], rtol: float = 1e-05, atol: float = 1e-08) bool [source]¶
A helper function for checking whether two lists of xyzs share at least one almost-equal entry. Useful for comparing xyzs in unit tests.
- Parameters:
xyz1 (Union[List[dict], dict]) – Either a dict-format xyz, or a list of them.
xyz2 (Union[List[dict], dict]) – Either a dict-format xyz, or a list of them.
rtol (float, optional) – The relative tolerance parameter.
atol (float, optional) – The absolute tolerance parameter.
- Returns: bool
Whether at least one entry in each input xyzs is almost equal to an entry in the other xyz.
- arc.common.almost_equal_lists(iter1: Union[list, tuple, ndarray], iter2: Union[list, tuple, ndarray], rtol: float = 1e-05, atol: float = 1e-08) bool [source]¶
A helper function for checking whether two iterables are almost equal.
- Parameters:
iter1 (list, tuple, np.array) – An iterable.
iter2 (list, tuple, np.array) – An iterable.
rtol (float, optional) – The relative tolerance parameter.
atol (float, optional) – The absolute tolerance parameter.
- Returns: bool
True if they are almost equal, False otherwise.
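A minimal sketch of the comparison described above, assuming the tolerance test follows the NumPy isclose convention |a - b| <= atol + rtol * |b| and that nested sequences are compared recursively. The name almost_equal_lists_sketch is hypothetical, and this is not necessarily how ARC implements the check.

```python
def almost_equal_lists_sketch(iter1, iter2, rtol=1e-05, atol=1e-08):
    """Return True if two (possibly nested) sequences are element-wise close."""
    if len(iter1) != len(iter2):
        return False
    for a, b in zip(iter1, iter2):
        if isinstance(a, (list, tuple)) and isinstance(b, (list, tuple)):
            # Recurse into nested sequences.
            if not almost_equal_lists_sketch(a, b, rtol, atol):
                return False
        elif isinstance(a, (int, float)) and isinstance(b, (int, float)):
            # Assumed tolerance rule, mirroring the NumPy isclose convention.
            if abs(a - b) > atol + rtol * abs(b):
                return False
        elif a != b:
            # Non-numeric entries must match exactly.
            return False
    return True
```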
- arc.common.calc_rmsd(x: Union[list, array], y: Union[list, array]) float [source]¶
Compute the root-mean-square deviation between two matrices.
- Parameters:
x (np.array) – Matrix 1.
y (np.array) – Matrix 2.
- Returns:
The RMSD score of two matrices.
- Return type:
float
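As an illustration, the RMSD of two equally shaped matrices can be sketched with the standard library alone. The normalization by the number of rows (e.g., atoms) is an assumption here, and calc_rmsd_sketch is a hypothetical name rather than ARC's function.

```python
import math


def calc_rmsd_sketch(x, y):
    """RMSD between two equally shaped matrices given as lists of rows."""
    if len(x) != len(y) or any(len(rx) != len(ry) for rx, ry in zip(x, y)):
        raise ValueError("x and y must have the same shape")
    squared_sum = sum((a - b) ** 2
                      for rx, ry in zip(x, y)
                      for a, b in zip(rx, ry))
    # Assumption: normalize by the number of rows, not the number of elements.
    return math.sqrt(squared_sum / len(x))
```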
- arc.common.check_ess_settings(ess_settings: Optional[dict] = None) dict [source]¶
A helper function to convert the servers in the ess_settings dict to lists, which assists in troubleshooting a job by trying a different server. Also checks the ESS and servers.
- Parameters:
ess_settings (dict, optional) – ARC’s ESS settings dictionary.
- Returns: dict
An updated ARC ESS dictionary.
- arc.common.check_that_all_entries_are_in_list(list_1: Union[list, tuple], list_2: Union[list, tuple]) bool [source]¶
Check that all entries from list_2 are in list_1, and that the lists are the same length. Useful for testing that two lists are equal regardless of entry order.
- Parameters:
list_1 (list, tuple) – Entries are floats or ints (could also be None).
list_2 (list, tuple) – Entries could be anything.
- Returns: bool
Whether all entries from list_2 are in list_1 and the lists are the same length.
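The check described above amounts to a length comparison plus a membership test. A minimal sketch, with the hypothetical name all_entries_in_list_sketch:

```python
def all_entries_in_list_sketch(list_1, list_2):
    """True if the lists have equal length and every entry of list_2 is in list_1."""
    if len(list_1) != len(list_2):
        return False
    return all(entry in list_1 for entry in list_2)
```

In this sketch the multiplicity of entries is not compared, so [1, 1, 2] and [1, 2, 2] also pass.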
- arc.common.check_torsion_change(torsions: DataFrame, index_1: Union[int, str], index_2: Union[int, str], threshold: Union[float, int] = 20.0, delta: Union[float, int] = 0.0) DataFrame [source]¶
Compare two sets of torsions (in DataFrame) and check if any entry has a difference larger than threshold. The output is a DataFrame consisting of True/False, indicating which torsions changed significantly.
- Parameters:
torsions (pd.DataFrame) – A DataFrame consisting of multiple sets of torsions.
index_1 (Union[int, str]) – The index of the first conformer.
index_2 (Union[int, str]) – The index of the second conformer.
threshold (Union[float, int]) – The threshold used to determine the difference significance.
delta (Union[float, int]) – A known difference between torsion pairs, delta = tor[index_1] - tor[index_2]. E.g., for the torsions to be scanned, the differences are equal to the scan resolution.
- Returns: pd.DataFrame
A DataFrame consisting of True/False, indicating which torsions changed significantly. True for significant change.
- arc.common.convert_list_index_0_to_1(_list: Union[list, tuple], direction: int = 1) Union[list, tuple] [source]¶
Convert a list from 0-indexed to 1-indexed, or vice versa. Ensures positive values in the resulting list.
- Parameters:
_list (list) – The list to be converted.
direction (int, optional) – Either 1 or -1 to convert 0-indexed to 1-indexed or vice versa, respectively.
- Raises:
ValueError – If the new list contains negative values.
- Returns:
The converted indices.
- Return type:
Union[list, tuple]
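A minimal sketch of the index shift described above, preserving the input container type and raising ValueError on negative results. The name convert_index_sketch is hypothetical.

```python
def convert_index_sketch(_list, direction=1):
    """Shift all indices by +1 (0-indexed to 1-indexed) or -1 (the reverse)."""
    new_list = [item + direction for item in _list]
    if any(value < 0 for value in new_list):
        raise ValueError(f"The resulting list has negative values: {new_list}")
    # Return the same container type that was given.
    return tuple(new_list) if isinstance(_list, tuple) else new_list
```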
- arc.common.convert_to_hours(time_str: str) float [source]¶
Convert walltime string in format HH:MM:SS to hours.
- Parameters:
time_str (str) – A time string in format HH:MM:SS
- Returns:
The time in hours
- Return type:
float
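For example, a walltime string can be converted by splitting on the colons. This is a sketch that assumes the string always has exactly three integer fields; convert_to_hours_sketch is a hypothetical name.

```python
def convert_to_hours_sketch(time_str):
    """Convert an 'HH:MM:SS' walltime string to hours (HH may exceed 24)."""
    hours, minutes, seconds = (int(part) for part in time_str.split(':'))
    return hours + minutes / 60 + seconds / 3600
```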
- arc.common.delete_check_files(project_directory: str)[source]¶
Delete ESS checkfiles. They usually take up lots of space and are not needed after ARC terminates. Pass True to the keep_checks flag in ARC to avoid deleting check files.
- Parameters:
project_directory (str) – The path to the ARC project folder.
- arc.common.determine_ess(log_file: str) str [source]¶
Determine the ESS to which the log file belongs.
- Parameters:
log_file (str) – The ESS log file path.
- Returns: str
The ESS log class from Arkane.
- arc.common.determine_symmetry(xyz: dict) Tuple[int, int] [source]¶
Determine external symmetry and chirality (optical isomers) of the species.
- Parameters:
xyz (dict) – The 3D coordinates.
- Returns: Tuple[int, int]
The external symmetry number.
1 if no chiral centers are present, 2 if chiral centers are present.
- arc.common.determine_top_group_indices(mol, atom1, atom2, index=1) Tuple[list, bool] [source]¶
Determine the indices of a “top group” in a molecule. The top is defined as all atoms connected to atom2, including atom2, excluding the direction of atom1. Two atom_list_to_explore lists are used so the list the loop iterates through isn’t changed within the loop.
- Parameters:
mol (Molecule) – The Molecule object to explore.
atom1 (Atom) – The pivotal atom in mol.
atom2 (Atom) – The beginning of the top relative to atom1 in mol.
index (int, optional) – Whether to return 1-indexed or 0-indexed atom indices. 1 for 1-indexed.
- Returns: Tuple[list, bool]
The indices of the atoms in the top (either 0-index or 1-index, as requested).
Whether the top has heavy atoms (is not just a hydrogen atom). True if it has heavy atoms.
- arc.common.dfs(mol: Molecule, start: int, sort_result: bool = True) List[int] [source]¶
A depth-first search algorithm for graph traversal of a Molecule object instance.
- Parameters:
mol (Molecule) – The Molecule to search.
start (int) – The index of the first atom in the search.
sort_result (bool, optional) – Whether to sort the returned visited indices.
- Returns:
Indices of all atoms connected to the starting atom.
- Return type:
List[int]
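The traversal can be illustrated without RMG by using a plain adjacency mapping as a stand-in for a Molecule. This is a sketch under that assumption: the adjacency dict and the name dfs_sketch are hypothetical, and the actual function operates on atoms and bonds of a Molecule object.

```python
def dfs_sketch(adjacency, start, sort_result=True):
    """Iterative depth-first search; returns indices reachable from start."""
    visited = []
    stack = [start]
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.append(node)
        # Push unvisited neighbors; the last one pushed is explored first.
        stack.extend(n for n in adjacency[node] if n not in visited)
    return sorted(visited) if sort_result else visited


# Two disconnected fragments: 0-1-2 and 3-4.
adjacency = {0: [1], 1: [0, 2], 2: [1], 3: [4], 4: [3]}
```

Starting from any atom of a fragment returns only the indices of that fragment, which is what makes the search useful for splitting a molecule into connected parts.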
- arc.common.estimate_orca_mem_cpu_requirement(num_heavy_atoms: int, server: str = '', consider_server_limits: bool = False) Tuple[int, float] [source]¶
Estimates memory and cpu requirements for an Orca job.
- Parameters:
num_heavy_atoms (int) – The number of heavy atoms in the species.
server (str) – The name of the server where Orca runs.
consider_server_limits (bool) – Try to give realistic estimations.
- Returns: Tuple[int, float]
The amount of total memory (MB)
The number of cpu cores required for the Orca job for a given species.
- arc.common.extremum_list(lst: list, return_min: bool = True) Optional[int] [source]¶
A helper function for finding the extremum (either minimum or maximum) of a list of numbers (int/float) where some entries could be None.
- Parameters:
lst (list) – The list.
return_min (bool, optional) – Whether to return the minimum or the maximum. True for minimum, False for maximum, True by default.
- Returns: Optional[Union[int, None]]
The entry with the minimal/maximal value.
- arc.common.from_yaml(yaml_str: str) Union[dict, list] [source]¶
Read a YAML string and decode to the respective Python object.
- Parameters:
yaml_str (str) – The YAML string content.
- Returns: Union[dict, list]
The respective Python object.
- arc.common.generate_resonance_structures(object_: Union[Species, Molecule], keep_isomorphic: bool = False, filter_structures: bool = True, save_order: bool = True) Optional[List[Molecule]] [source]¶
Safely generate resonance structures for either an RMG Molecule or an RMG Species object instances.
- Parameters:
object_ (Species, Molecule) – The object to generate resonance structures for.
keep_isomorphic (bool, optional) – Whether to keep isomorphic isomers.
filter_structures (bool, optional) – Whether to filter resonance structures.
save_order (bool, optional) – Whether to make sure atom order is preserved.
- Returns:
If a Molecule object instance was given, the function returns a list of resonance structures (each is a Molecule object instance). If a Species object instance is given, the resonance structures are stored within the given object (in a .molecule attribute), and the function returns None.
- Return type:
Optional[List[Molecule]]
- arc.common.get_angle_in_180_range(angle: float, round_to: Optional[int] = 2) float [source]¶
Get the corresponding angle in the -180 to +180 degree range.
- Parameters:
angle (float) – An angle in degrees.
round_to (int, optional) – The number of decimal figures to round the result to. None to not round. Default: 2.
- Returns:
The corresponding angle in the -180 to +180 degree range.
- Return type:
float
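One way to sketch the wrapping is with the modulo operator. The half-open convention [-180, 180) used here is an assumption (so 180 maps to -180), and angle_in_180_range_sketch is a hypothetical name.

```python
def angle_in_180_range_sketch(angle, round_to=2):
    """Wrap an angle in degrees into the [-180, 180) range."""
    wrapped = (float(angle) + 180.0) % 360.0 - 180.0
    # round(x, None) would return an int, so None is handled explicitly.
    return wrapped if round_to is None else round(wrapped, round_to)
```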
- arc.common.get_atom_radius(symbol: str) float [source]¶
Get the covalent radius of an atom in Angstroms.
- Parameters:
symbol (str) – The atomic symbol.
- Raises:
TypeError – If symbol is of wrong type.
- Returns: float
The atomic covalent radius (None if not found).
- arc.common.get_bonds_from_dmat(dmat: ndarray, elements: Union[Tuple[str, ...], List[str]], charges: Optional[List[int]] = None, tolerance: float = 1.2, bond_lone_hydrogens: bool = True) List[Tuple[int, int]] [source]¶
Guess the connectivity of a molecule from its distance matrix representation.
- Parameters:
dmat (np.ndarray) – An NxN matrix of atom distances in Angstrom.
elements (List[str]) – The corresponding element list in the same atomic order.
charges (List[int], optional) – A corresponding list of formal atomic charges.
tolerance (float, optional) – A factor by which the single bond length threshold is multiplied for the check.
bond_lone_hydrogens (bool, optional) – Whether to assign a bond to hydrogen atoms which were not identified as bonded. If so, the closest atom will be considered.
- Returns:
A list of tuple entries, each represents a bond and contains sorted atom indices.
- Return type:
List[Tuple[int, int]]
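The distance-based guess can be sketched as follows. This sketch assumes a single bond length can be approximated by the sum of two covalent radii, uses a small illustrative radii table (COVALENT_RADII is hypothetical, not ARC's data), and ignores the charges and bond_lone_hydrogens options.

```python
# Illustrative covalent radii in Angstrom; not ARC's internal table.
COVALENT_RADII = {'H': 0.31, 'C': 0.76, 'N': 0.71, 'O': 0.66}


def bonds_from_dmat_sketch(dmat, elements, tolerance=1.2):
    """Guess bonds: atoms i and j are bonded if their distance is within
    tolerance times the approximate single bond length."""
    bonds = []
    for i in range(len(elements)):
        for j in range(i + 1, len(elements)):
            single_bond = COVALENT_RADII[elements[i]] + COVALENT_RADII[elements[j]]
            if dmat[i][j] <= tolerance * single_bond:
                bonds.append((i, j))  # i < j, so each tuple is already sorted
    return bonds


# Approximate water geometry: O-H distances of 0.96 and an H-H distance of 1.51.
water_dmat = [[0.00, 0.96, 0.96],
              [0.96, 0.00, 1.51],
              [0.96, 1.51, 0.00]]
```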
- arc.common.get_close_tuple(key_1: Tuple[Union[float, str], ...], keys: List[Tuple[Union[float, str], ...]], tolerance: float = 0.05, raise_error: bool = False) Optional[Tuple[Union[float, str], ...]] [source]¶
Get a key from a list of keys close in value to the given key. Even if just one of the items in the key has a close match, use the close value.
- Parameters:
key_1 (Tuple[Union[float, str], Union[float, str]]) – The key used for the search.
keys (List[Tuple[Union[float, str], Union[float, str]]]) – The list of keys to search within.
tolerance (float, optional) – The tolerance within which keys are determined to be close.
raise_error (bool, optional) – Whether to raise a ValueError if a close key wasn’t found.
- Raises:
ValueError – If a key in keys has a different length than key_1.
ValueError – If a close key was not found and raise_error is True.
- Returns:
A key from the keys list close in value to the given key.
- Return type:
Optional[Tuple[Union[float, str], …]]
- arc.common.get_extremum_index(lst: list, return_min: bool = True, skip_values: Optional[list] = None) Optional[int] [source]¶
A helper function for finding the index of the extremum (either minimum or maximum) of a list of numbers (int/float) where some entries could be None.
- Parameters:
lst (list) – The list.
return_min (bool, optional) – Whether to return the minimum or the maximum. True for minimum, False for maximum, True by default.
skip_values (list, optional) – Values to skip when checking for extremum, e.g., 0.
- Returns: Optional[Union[int, None]]
The index of an entry with the minimal/maximal value.
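A minimal sketch of an extremum search that tolerates None entries and skipped values. Returning None when no valid entry remains, and returning the first index in case of ties, are assumptions; extremum_index_sketch is a hypothetical name.

```python
def extremum_index_sketch(lst, return_min=True, skip_values=None):
    """Index of the minimal/maximal entry, ignoring None and skipped values."""
    skip_values = skip_values or []
    candidates = [(value, index) for index, value in enumerate(lst)
                  if value is not None and value not in skip_values]
    if not candidates:
        return None
    chooser = min if return_min else max
    # Compare on the value only; ties resolve to the first occurrence.
    return chooser(candidates, key=lambda pair: pair[0])[1]
```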
- arc.common.get_git_branch(path: Optional[str] = None) str [source]¶
Get the git branch to be logged.
- Parameters:
path (str, optional) – The path to check.
- Returns: str
The git branch name.
- arc.common.get_git_commit(path: Optional[str] = None) Tuple[str, str] [source]¶
Get the most recent git commit to be logged.
Note
Returns empty strings if hash and date cannot be determined.
- Parameters:
path (str, optional) – The path to check.
- Returns: tuple
The git HEAD commit hash and the git HEAD commit date, each as a string.
- arc.common.get_number_with_ordinal_indicator(number: int) str [source]¶
Returns the number as a string with the ordinal indicator.
- Parameters:
number (int) – An integer for which the ordinal indicator will be determined.
- Returns: str
The number with the respective ordinal indicator.
- arc.common.get_ordered_intersection_of_two_lists(l1: list, l2: list, order_by_first_list: Optional[bool] = True, return_unique: Optional[bool] = True) list [source]¶
Find the intersection of two lists by order.
Examples
l1 = [1, 2, 3, 3, 5, 6], l2 = [6, 3, 5, 5, 1], order_by_first_list = True, return_unique = True -> [1, 3, 5, 6]: unique values in the intersection of l1 and l2, ordered by each value’s first appearance in l1.
l1 = [1, 2, 3, 3, 5, 6], l2 = [6, 3, 5, 5, 1], order_by_first_list = True, return_unique = False -> [1, 3, 3, 5, 6]: all values in the intersection of l1 and l2 (duplicates kept), ordered as they appear in l1.
l1 = [1, 2, 3, 3, 5, 6], l2 = [6, 3, 5, 5, 1], order_by_first_list = False, return_unique = True -> [6, 3, 5, 1]: unique values in the intersection of l1 and l2, ordered by each value’s first appearance in l2.
l1 = [1, 2, 3, 3, 5, 6], l2 = [6, 3, 5, 5, 1], order_by_first_list = False, return_unique = False -> [6, 3, 5, 5, 1]: all values in the intersection of l1 and l2 (duplicates kept), ordered as they appear in l2.
- Parameters:
l1 (list) – The first list.
l2 (list) – The second list.
order_by_first_list (bool, optional) – Whether to order the output list using the order of the values in the first list.
return_unique (bool, optional) – Whether to return only unique values in the intersection of two lists.
- Returns: list
An ordered list of the intersection of two input lists.
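The four documented examples can be reproduced with a short sketch. It assumes hashable entries (a set is used to track seen values), and ordered_intersection_sketch is a hypothetical name.

```python
def ordered_intersection_sketch(l1, l2, order_by_first_list=True, return_unique=True):
    """Intersection of two lists, ordered by l1 or by l2."""
    source, other = (l1, l2) if order_by_first_list else (l2, l1)
    result = [value for value in source if value in other]
    if return_unique:
        seen = set()
        unique = []
        for value in result:
            if value not in seen:  # keep only the first appearance
                seen.add(value)
                unique.append(value)
        result = unique
    return result
```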
- arc.common.get_ordinal_indicator(number: int) str [source]¶
Returns the ordinal indicator for an integer.
- Parameters:
number (int) – An integer for which the ordinal indicator will be determined.
- Returns: str
The integer’s ordinal indicator.
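A sketch of the two ordinal helpers, assuming the usual English rule (11, 12, and 13 take "th"; otherwise the last digit decides). The names are hypothetical, and ARC's handling of edge cases may differ.

```python
def ordinal_indicator_sketch(number):
    """Return 'st', 'nd', 'rd', or 'th' for an integer."""
    if 11 <= abs(number) % 100 <= 13:
        return 'th'
    return {1: 'st', 2: 'nd', 3: 'rd'}.get(abs(number) % 10, 'th')


def number_with_ordinal_sketch(number):
    """Return the number followed by its ordinal indicator, e.g., '22nd'."""
    return f"{number}{ordinal_indicator_sketch(number)}"
```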
- arc.common.get_single_bond_length(symbol_1: str, symbol_2: str, charge_1: int = 0, charge_2: int = 0) float [source]¶
Get an approximate single bond length between two elements.
- Parameters:
symbol_1 (str) – Symbol 1.
symbol_2 (str) – Symbol 2.
charge_1 (int, optional) – The partial charge of the atom represented by symbol_1.
charge_2 (int, optional) – The partial charge of the atom represented by symbol_2.
- Returns: float
The estimated single bond length in Angstrom.
- arc.common.globalize_path(string: str, project_directory: str) str [source]¶
Rebase an absolute file path on the current project path. Useful when restarting an ARC project in a different folder or on a different machine.
- Parameters:
string (str) – A string containing a path to rebase.
project_directory (str) – The current project directory to rebase upon.
- Returns: str
A string with the rebased path.
- arc.common.globalize_paths(file_path: str, project_directory: str) str [source]¶
Rebase all file paths in the contents of the given file on the current project path. Useful when restarting an ARC project in a different folder or on a different machine.
- Parameters:
file_path (str) – A path to the file to check. The contents of this file will be changed and saved as a different file.
project_directory (str) – The current project directory to rebase upon.
- Returns: str
A path to the respective file with rebased absolute file paths.
- arc.common.initialize_job_types(job_types: Optional[dict] = None, specific_job_type: str = '') dict [source]¶
A helper function for initializing job_types. Returns the comprehensive (default values for missing job types) job types for ARC.
- Parameters:
job_types (dict, optional) – Keys are job types, values are booleans of whether or not to consider this job type.
specific_job_type (str, optional) – Specific job type to execute. Legal strings are job types (keys of job_types dict).
- Returns: dict
An updated (comprehensive) job type dictionary.
- arc.common.initialize_log(log_file: str, project: str, project_directory: Optional[str] = None, verbose: int = 20) None [source]¶
Set up a logger for ARC.
- Parameters:
log_file (str) – The log file name.
project (str) – A name for the project.
project_directory (str, optional) – The path to the project directory.
verbose (int, optional) – Specify the amount of log text seen.
- arc.common.is_angle_linear(angle: float, tolerance: float = 0.9) bool [source]¶
Check whether an angle is close to 180 or 0 degrees.
- Parameters:
angle (float) – The angle in degrees.
tolerance (float) – The tolerance to consider.
- Returns:
Whether the angle is close to 180 or 0 degrees, True if it is.
- Return type:
bool
- arc.common.is_notebook() bool [source]¶
Check whether ARC was called from an IPython notebook.
- Returns: bool
True if ARC was called from a notebook, False otherwise.
- arc.common.is_same_pivot(torsion1: Union[list, str], torsion2: Union[list, str]) Optional[bool] [source]¶
Check if two torsions have the same pivots.
- Parameters:
torsion1 (Union[list, str]) – The four atom indices representing the first torsion.
torsion2 (Union[list, str]) – The four atom indices representing the second torsion.
- Returns: Optional[bool]
True if two torsions share the same pivots.
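A sketch of the pivot comparison, taking the pivots to be the two middle atoms of each torsion and treating either order as a match. Parsing string input with ast.literal_eval and returning False for torsions that do not have four atoms are assumptions; is_same_pivot_sketch is a hypothetical name.

```python
import ast


def is_same_pivot_sketch(torsion1, torsion2):
    """True if the middle two atoms of both torsions match in either order."""
    # Assumption: a str input is the literal representation of a list.
    torsion1 = ast.literal_eval(torsion1) if isinstance(torsion1, str) else torsion1
    torsion2 = ast.literal_eval(torsion2) if isinstance(torsion2, str) else torsion2
    if len(torsion1) != 4 or len(torsion2) != 4:
        return False
    pivots1, pivots2 = list(torsion1[1:3]), list(torsion2[1:3])
    return pivots1 == pivots2 or pivots1 == pivots2[::-1]
```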
- arc.common.is_same_sequence_sublist(child_list: list, parent_list: list) bool [source]¶
Check if the parent list has a sublist which is identical to the child list including the sequence.
Examples
child_list = [1,2,3], parent_list=[5,1,2,3,9] -> True
child_list = [1,2,3], parent_list=[5,6,1,3,9] -> False
- Parameters:
child_list (list) – The child list (the pattern to search in the parent list).
parent_list (list) – The parent list.
- Returns: bool
True if the sublist is in the parent list.
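The contiguous-sublist check can be sketched with a sliding window over the parent list. The name is_same_sequence_sublist_sketch is hypothetical.

```python
def is_same_sequence_sublist_sketch(child_list, parent_list):
    """True if child_list appears in parent_list as a contiguous, ordered run."""
    n = len(child_list)
    if n > len(parent_list):
        return False
    # Compare every window of length n against the child list.
    return any(parent_list[i:i + n] == child_list
               for i in range(len(parent_list) - n + 1))
```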
- arc.common.is_str_float(value: Optional[str]) bool [source]¶
Check whether a string can be converted to a floating number.
- Parameters:
value (str) – The string to check.
- Returns: bool
True if it can, False otherwise.
- arc.common.is_str_int(value: Optional[str]) bool [source]¶
Check whether a string can be converted to an integer.
- Parameters:
value (str) – The string to check.
- Returns: bool
True if it can, False otherwise.
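Both string checks (is_str_float and is_str_int) can be sketched by attempting the conversion and catching the failure. Treating None as not convertible is an assumption based on the Optional[str] annotation; the names below are hypothetical.

```python
def is_str_float_sketch(value):
    """True if float(value) succeeds."""
    try:
        float(value)
        return True
    except (ValueError, TypeError):  # TypeError covers None
        return False


def is_str_int_sketch(value):
    """True if int(value) succeeds."""
    try:
        int(value)
        return True
    except (ValueError, TypeError):
        return False
```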
- arc.common.is_xyz_mol_match(mol: Molecule, xyz: dict) bool [source]¶
A helper function that matches rmgpy.molecule.molecule.Molecule object to an xyz, used in _scissors to match xyz and the cut products. This function only checks the molecular formula.
- Parameters:
mol – rmg Molecule object
xyz – coordinates of the cut product
- Returns:
True if the xyz and molecule match, False otherwise.
- Return type:
bool
- arc.common.key_by_val(dictionary: dict, value: Any) Any [source]¶
A helper function for getting a key from a dictionary corresponding to a certain value. Does not check for value uniqueness.
- Parameters:
dictionary (dict) – The dictionary.
value – The value.
- Raises:
ValueError – If the value could not be found in the dictionary.
- Returns: Any
The key.
Output a footer for the log.
- Parameters:
execution_time (str) – The overall execution time for ARC.
level (int, optional) – The desired logging level.
- arc.common.log_header(project: str, level: int = 20) None [source]¶
Output a header containing identifying information about ARC to the log.
- Parameters:
project (str) – The ARC project name to be logged in the header.
level (int, optional) – The desired logging level.
- arc.common.read_yaml_file(path: str, project_directory: Optional[str] = None) Union[dict, list] [source]¶
Read a YAML file (usually an input / restart file, but also conformers file) and return the parameters as python variables.
- Parameters:
path (str) – The YAML file path to read.
project_directory (str, optional) – The current project directory to rebase upon.
- Returns: Union[dict, list]
The content read from the file.
- arc.common.rmg_mol_from_dict_repr(representation: dict, is_ts: bool = False) Optional[Molecule] [source]¶
Generate an RMG Molecule object instance from its dict representation.
- Parameters:
representation (dict) – A dict representation of an RMG Molecule object instance.
is_ts (bool, optional) – Whether the Molecule represents a TS.
- Returns:
The corresponding RMG Molecule object instance.
- Return type:
Molecule
- arc.common.rmg_mol_to_dict_repr(mol: Molecule, reset_atom_ids: bool = False, testing: bool = False) dict [source]¶
Generate a dict representation of an RMG Molecule object instance.
- Parameters:
mol (Molecule) – The RMG Molecule object instance.
reset_atom_ids (bool, optional) – Whether to reset the atom IDs in the .mol Molecule attribute. Useful when copying the object to avoid duplicate atom IDs between different object instances.
testing (bool, optional) – Whether this is called during a test, in which case atom IDs should be deterministic.
- Returns:
The corresponding dict representation.
- Return type:
dict
- arc.common.safe_copy_file(source: str, destination: str, wait: int = 10, max_cycles: int = 15)[source]¶
Copy a file safely.
- Parameters:
source (str) – The full path to the file to be copied.
destination (str) – The full path to the file destination.
wait (int, optional) – The number of seconds to wait between cycles.
max_cycles (int, optional) – The maximum number of cycles to try.
- arc.common.save_yaml_file(path: str, content: Union[list, dict]) None [source]¶
Save a YAML file (usually an input / restart file, but also conformers file).
- Parameters:
path (str) – The YAML file path to save.
content (list, dict) – The content to save.
- arc.common.sort_atoms_in_descending_label_order(mol: Molecule) None [source]¶
If all atoms in the Molecule object have a label, this function reassigns the .atoms attribute of the Molecule with a list of atoms ordered by their labels. For example, if [int(atom.label) for atom in mol.atoms] is [1, 4, 32, 7], the atoms will be reordered as [1, 4, 7, 32].
- Parameters:
mol (Molecule) – An RMG Molecule object, with labeled atoms
- arc.common.sort_two_lists_by_the_first(list1: List[Optional[Union[float, int]]], list2: List[Optional[Union[float, int]]]) Tuple[List[Union[int, float]], List[Union[int, float]]] [source]¶
Sort two lists in increasing order by the values of the first list, ignoring None entries from list1 and their respective entries in list2. The function was written in this format rather than the more Pythonic zip(*sorted(zip(list1, list2))) style to accommodate dictionaries as entries of list2; otherwise a TypeError: '<' not supported between instances of 'dict' and 'dict' error is raised.
- Parameters:
list1 (list, tuple) – Entries are floats or ints (could also be None).
list2 (list, tuple) – Entries could be anything.
- Raises:
InputError – If types are wrong, or lists are not the same length.
- Returns: Tuple[list, list]
Sorted values from list1, ignoring None entries.
Respective entries from list2.
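A sketch of the paired sort described above. Sorting with a key on the first item only is what avoids comparing the list2 entries (e.g., dicts) when list1 has ties. The name sort_two_lists_sketch is hypothetical, and a plain ValueError stands in for ARC's InputError.

```python
def sort_two_lists_sketch(list1, list2):
    """Sort both lists by list1, dropping pairs where the list1 entry is None."""
    if len(list1) != len(list2):
        # Stand-in for ARC's InputError.
        raise ValueError("Both lists must have the same length")
    pairs = [(a, b) for a, b in zip(list1, list2) if a is not None]
    # The key restricts comparisons to list1 values; list2 entries are never compared.
    pairs.sort(key=lambda pair: pair[0])
    return [a for a, _ in pairs], [b for _, b in pairs]
```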
- arc.common.string_representer(dumper, data)[source]¶
Add a custom string representer to use block literals for multiline strings.
- arc.common.sum_list_entries(lst: List[Union[int, float]], multipliers: Optional[List[Union[int, float]]] = None) Optional[float] [source]¶
Sum all entries in a list. If any entry is None, return None. If multipliers is given, multiply each entry in lst by the respective multiplier entry.
- Parameters:
lst (list) – The list to process.
multipliers (list, optional) – A list of multipliers.
- Returns:
The result.
- Return type:
Optional[float]
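A minimal sketch of the weighted sum with None propagation. It assumes lst and multipliers have the same length when both are given; sum_list_entries_sketch is a hypothetical name.

```python
def sum_list_entries_sketch(lst, multipliers=None):
    """Sum the entries (optionally weighted); return None if any entry is None."""
    if any(entry is None for entry in lst):
        return None
    if multipliers is None:
        return float(sum(lst))
    return float(sum(entry * factor for entry, factor in zip(lst, multipliers)))
```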
- arc.common.time_lapse(t0) str [source]¶
A helper function returning the elapsed time since t0.
- Parameters:
t0 (time.pyi) – The initial time the count starts from.
- Returns: str
A “D HH:MM:SS” formatted time difference between now and t0.
- arc.common.timedelta_from_str(time_str: str)[source]¶
Get a datetime.timedelta object from its str() representation
- Parameters:
time_str (str) – The string representation of a datetime.timedelta object.
- Returns:
The corresponding timedelta object.
- Return type:
datetime.timedelta
- arc.common.to_yaml(py_content: Union[list, dict]) str [source]¶
Convert a Python list or dictionary to a YAML string format.
- Parameters:
py_content (list, dict) – The Python content to save.
- Returns: str
The corresponding YAML representation.
- arc.common.torsions_to_scans(descriptor: Optional[List[List[int]]], direction: int = 1) Optional[List[List[int]]] [source]¶
Convert torsions to scans or vice versa. In ARC we define a torsion as a list of four atoms with 0-indices. We define a scan as a list of four atoms with 1-indices. This function converts one format to the other.
- Parameters:
descriptor (list) – The torsions or scans list.
direction (int, optional) – 1: Convert torsions to scans; -1: Convert scans to torsions.
- Returns:
The converted indices.
- Return type:
Optional[List[List[int]]]
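The conversion between 0-indexed torsions and 1-indexed scans is an index shift applied to every entry. A sketch, assuming the descriptor is always a list of lists (or None); torsions_to_scans_sketch is a hypothetical name.

```python
def torsions_to_scans_sketch(descriptor, direction=1):
    """Shift every atom index by +1 (torsions to scans) or -1 (scans to torsions)."""
    if descriptor is None:
        return None
    return [[index + direction for index in entry] for entry in descriptor]
```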