rmgpy.molecule.Group

class rmgpy.molecule.Group

A representation of a molecular substructure group using a graph data type, extending the Graph class. The attributes are:

Attribute

Type

Description

atoms

list

Aliases for the vertices storing GroupAtom

multiplicity

list

Range of multiplicities accepted for the group

props

dict

Dictionary of arbitrary properties/flags classifying state of Group object

metal

list

List of metals accepted for the group

facet

list

List of facets accepted for the group

Corresponding alias methods to Molecule have also been provided.

add_atom(atom)

Add an atom to the graph. The atom is initialized with no bonds.

add_bond(bond)

Add a bond to the graph as an edge connecting the two atoms atom1 and atom2.

add_edge(edge)

Add an edge to the graph. The two vertices in the edge must already exist in the graph, or a ValueError is raised.

add_explicit_ligands()

This function O2d/S2d ligand to CO or CS atomtypes if they are not already there.

Returns a ‘True’ if the group was modified otherwise returns ‘False’

add_implicit_atoms_from_atomtype()

Returns: a modified group with implicit atoms added Add implicit double/triple bonded atoms O, S or R, for which we will use a C

Not designed to work with wildcards

add_implicit_benzene()

Returns: A modified group with any implicit benzene rings added

This method currently does not if there are wildcards in atomtypes or bond orders The current algorithm also requires that all Cb and Cbf are atomtyped

There are other cases where the algorithm doesn’t work. For example whenever there are many dangling Cb or Cbf atoms not in a ring, it is likely fail. In the database test (the only use thus far), we will require that any group with more than 3 Cbfs have complete rings. This is much stricter than this method can handle, but right now this method cannot handle very general cases, so it is better to be conservative.

Note that it also works on other aromatic atomtypes like N5bd etc.

add_vertex(vertex)

Add a vertex to the graph. The vertex is initialized with no edges.

atoms

List of atoms contained in the current molecule.

Renames the inherited vertices attribute of Graph.

classify_benzene_carbons(partners)
Parameters:
  • group – :class:Group with atoms to classify

  • partners – dictionary of partnered up atoms, which must be a cbf atom

Some non-carbon ‘benzene’ type atoms (eg. N3b) are included and classified.

Returns: tuple with lists of each atom classification: cb_atom_list, cbf_atom_list, cbf_atom_list1, cbf_atom_list2, connected_cbfs

clear_labeled_atoms()

Remove the labels from all atoms in the molecular group.

clear_reg_dims()

clear regularization dimensions

contains_labeled_atom(label)

Return True if the group contains an atom with the label label and False otherwise.

contains_surface_site()

Returns True iff the group contains an ‘X’ surface site.

copy(deep)

Create a copy of the current graph. If deep is True, a deep copy is made: copies of the vertices and edges are used in the new graph. If deep is False or not specified, a shallow copy is made: the original vertices and edges are used in the new graph.

copy_and_map()

Create a deep copy of the current graph, and return the dict ‘mapping’. Method was modified from Graph.copy() method

create_and_connect_atom(atomtypes, connecting_atom, bond_orders)

This method creates an non-radical, uncharged, :class:GroupAtom with specified list of atomtypes and connects it to one atom of the group, ‘connecting_atom’. This is useful for making sample atoms.

Parameters:
  • atomtypes – list of atomtype labels (strs)

  • connecting_atom – :class:GroupAtom that is connected to the new benzene atom

  • bond_orders – list of bond Orders connecting new_atom and connecting_atom

Returns: the newly created atom

draw(file_format)

Use pydot to draw a basic graph of the group.

Use format to specify the desired output file_format, eg. ‘png’, ‘svg’, ‘ps’, ‘pdf’, ‘plain’, etc.

find_isomorphism(other, initial_map, save_order, strict)

Returns True if other is isomorphic and False otherwise, and the matching mapping. The initial_map attribute can be used to specify a required mapping from self to other (i.e. the atoms of self are the keys, while the atoms of other are the values). The returned mapping also uses the atoms of self for the keys and the atoms of other for the values. The other parameter must be a Group object, or a TypeError is raised.

find_subgraph_isomorphisms(other, initial_map, save_order)

Returns True if other is subgraph isomorphic and False otherwise. In other words, return True is self is more specific than other. Also returns the lists all of valid mappings. The initial_map attribute can be used to specify a required mapping from self to other (i.e. the atoms of self are the keys, while the atoms of other are the values). The returned mappings also use the atoms of self for the keys and the atoms of other for the values. The other parameter must be a Group object, or a TypeError is raised.

from_adjacency_list(adjlist, check_consistency)

Convert a string adjacency list adjlist to a molecular structure. Skips the first line (assuming it’s a label) unless withLabel is False.

get_all_cycles(starting_vertex)

Given a starting vertex, returns a list of all the cycles containing that vertex.

This function returns a duplicate of each cycle because [0,1,2,3] is counted as separate from [0,3,2,1]

get_all_cycles_of_size(size)

Return a list of the all non-duplicate rings with length ‘size’. The algorithm implements was adapted from a description by Fan, Panaye, Doucet, and Barbu (doi: 10.1021/ci00015a002)

B. T. Fan, A. Panaye, J. P. Doucet, and A. Barbu. “Ring Perception: A New Algorithm for Directly Finding the Smallest Set of Smallest Rings from a Connection Table.” J. Chem. Inf. Comput. Sci. 33, p. 657-662 (1993).

get_all_cyclic_vertices()

Returns all vertices belonging to one or more cycles.

get_all_edges()

Returns a list of all edges in the graph.

get_all_labeled_atoms()

Return the labeled atoms as a dict with the keys being the labels and the values the atoms themselves. If two or more atoms have the same label, the value is converted to a list of these atoms.

get_all_polycyclic_vertices()

Return all vertices belonging to two or more cycles, fused or spirocyclic.

get_all_simple_cycles_of_size(size)

Return a list of all non-duplicate monocyclic rings with length ‘size’.

Naive approach by eliminating polycyclic rings that are returned by getAllCyclicsOfSize.

get_bond(atom1, atom2)

Returns the bond connecting atoms atom1 and atom2.

get_bonds(atom)

Return a list of the bonds involving the specified atom.

get_disparate_cycles()

Get all disjoint monocyclic and polycyclic cycle clusters in the molecule. Takes the RC and recursively merges all cycles which share vertices.

Returns: monocyclic_cycles, polycyclic_cycles

get_edge(vertex1, vertex2)

Returns the edge connecting vertices vertex1 and vertex2.

get_edges(vertex)

Return a dictionary of the edges involving the specified vertex.

get_edges_in_cycle(vertices, sort)

For a given list of atoms comprising a ring, return the set of bonds connecting them, in order around the ring.

If sort=True, then sort the vertices to match their connectivity. Otherwise, assumes that they are already sorted, which is true for cycles returned by get_relevant_cycles or get_smallest_set_of_smallest_rings.

get_element_count()

Returns the element count for the molecule as a dictionary. Wildcards are not counted as any particular element.

get_extensions(r, r_bonds, r_un, basename, atm_ind, atm_ind2, n_splits)

generate all allowed group extensions and their complements note all atomtypes except for elements and r/r!H’s must be removed

get_labeled_atoms(label)

Return the atom in the group that is labeled with the given label. Raises ValueError if no atom in the group has that label.

get_largest_ring(vertex)

returns the largest ring containing vertex. This is typically useful for finding the longest path in a polycyclic ring, since the polycyclic rings returned from get_polycycles are not necessarily in order in the ring structure.

get_max_cycle_overlap()

Return the maximum number of vertices that are shared between any two cycles in the graph. For example, if there are only disparate monocycles or no cycles, the maximum overlap is zero; if there are “spiro” cycles, it is one; if there are “fused” cycles, it is two; and if there are “bridged” cycles, it is three.

get_monocycles()

Return a list of cycles that are monocyclic.

get_net_charge()

Iterate through the atoms in the group and calculate the net charge

get_polycycles()

Return a list of cycles that are polycyclic. In other words, merge the cycles which are fused or spirocyclic into a single polycyclic cycle, and return only those cycles. Cycles which are not polycyclic are not returned.

get_relevant_cycles()

Returns the set of relevant cycles as a list of lists. Uses RingDecomposerLib for ring perception.

Kolodzik, A.; Urbaczek, S.; Rarey, M. Unique Ring Families: A Chemically Meaningful Description of Molecular Ring Topologies. J. Chem. Inf. Model., 2012, 52 (8), pp 2013-2021

Flachsenberg, F.; Andresen, N.; Rarey, M. RingDecomposerLib: An Open-Source Implementation of Unique Ring Families and Other Cycle Bases. J. Chem. Inf. Model., 2017, 57 (2), pp 122-126

get_smallest_set_of_smallest_rings()

Returns the smallest set of smallest rings as a list of lists. Uses RingDecomposerLib for ring perception.

Kolodzik, A.; Urbaczek, S.; Rarey, M. Unique Ring Families: A Chemically Meaningful Description of Molecular Ring Topologies. J. Chem. Inf. Model., 2012, 52 (8), pp 2013-2021

Flachsenberg, F.; Andresen, N.; Rarey, M. RingDecomposerLib: An Open-Source Implementation of Unique Ring Families and Other Cycle Bases. J. Chem. Inf. Model., 2017, 57 (2), pp 122-126

get_surface_sites()

Get a list of surface site GroupAtoms in the group. :returns: A list containing the surface site GroupAtoms in the molecule :rtype: List(GroupAtom)

has_atom(atom)

Returns True if atom is an atom in the graph, or False if not.

has_bond(atom1, atom2)

Returns True if atoms atom1 and atom2 are connected by an bond, or False if not.

has_edge(vertex1, vertex2)

Returns True if vertices vertex1 and vertex2 are connected by an edge, or False if not.

has_vertex(vertex)

Returns True if vertex is a vertex in the graph, or False if not.

has_wildcards()

This function is a Group level wildcards checker.

Returns a ‘True’ if any of the atoms in this group has wildcards.

is_aromatic_ring()

This method returns a boolean telling if the group has a 5 or 6 cyclic with benzene bonds exclusively

is_benzene_explicit()

Returns: ‘True’ if all Cb, Cbf (and other ‘b’ atoms) are in completely explicitly stated benzene rings.

Otherwise return ‘False’

is_cyclic()

Return True if one or more cycles are present in the graph or False otherwise.

is_edge_in_cycle(edge)

Return True if the edge between vertices vertex1 and vertex2 is in one or more cycles in the graph, or False if not.

is_identical(other, save_order)

Returns True if other is identical and False otherwise. The function is_isomorphic respects wildcards, while this function does not, make it more useful for checking groups to groups (as opposed to molecules to groups)

is_isomorphic(other, initial_map, generate_initial_map, save_order, strict)

Returns True if two graphs are isomorphic and False otherwise. The initial_map attribute can be used to specify a required mapping from self to other (i.e. the atoms of self are the keys, while the atoms of other are the values). The other parameter must be a Group object, or a TypeError is raised.

is_mapping_valid(other, mapping, equivalent, strict)

Check that a proposed mapping of vertices from self to other is valid by checking that the vertices and edges involved in the mapping are mutually equivalent. If equivalent is True it checks if atoms and edges are equivalent, if False it checks if they are specific cases of each other. If strict is True, electrons and bond orders are considered, and ignored if False.

is_subgraph_isomorphic(other, initial_map, generate_initial_map, save_order)

Returns True if other is subgraph isomorphic and False otherwise. In other words, return True if self is more specific than other. The initial_map attribute can be used to specify a required mapping from self to other (i.e. the atoms of self are the keys, while the atoms of other are the values). The other parameter must be a Group object, or a TypeError is raised.

is_surface_site()

Returns True iff the group is nothing but a surface site ‘X’.

is_vertex_in_cycle(vertex)

Return True if the given vertex is contained in one or more cycles in the graph, or False if not.

make_sample_molecule()

Returns: A sample class :Molecule: from the group

merge(other)

Merge two groups so as to store them in a single Group object. The merged Group object is returned.

merge_groups(other, keep_identical_labels)

This function takes other :class:Group object and returns a merged :class:Group object based on overlapping labeled atoms between self and other

Currently assumes other can be merged at the closest labelled atom if keep_identical_labels=True merge_groups will not try to merge atoms with the same labels

pick_wildcards()

Returns: the :class:Group object without wildcards in either atomtype or bonding

This function will naively pick the first atomtype for each atom, but will try to pick bond orders that make sense given the selected atomtypes

remove_atom(atom)

Remove atom and all bonds associated with it from the graph. Does not remove atoms that no longer have any bonds as a result of this removal.

remove_bond(bond)

Remove the bond between atoms atom1 and atom2 from the graph. Does not remove atoms that no longer have any bonds as a result of this removal.

remove_edge(edge)

Remove the specified edge from the graph. Does not remove vertices that no longer have any edges as a result of this removal.

remove_van_der_waals_bonds()

Remove all bonds that are definitely only van der Waals bonds.

remove_vertex(vertex)

Remove vertex and all edges associated with it from the graph. Does not remove vertices that no longer have any edges as a result of this removal.

reset_connectivity_values()

Reset any cached connectivity information. Call this method when you have modified the graph.

reset_ring_membership()

Resets ring membership information in the GroupAtom.props attribute.

restore_vertex_order()

reorder the vertices to what they were before sorting if you saved the order

sort_atoms()

Sort the atoms in the graph. This can make certain operations, e.g. the isomorphism functions, much more efficient.

sort_by_connectivity(atom_list)
Parameters:

atom_list – input list of atoms

Returns: a sorted list of atoms where each atom is connected to a previous atom in the list if possible

sort_cyclic_vertices(vertices)

Given a list of vertices comprising a cycle, sort them such that adjacent entries in the list are connected to each other. Warning: Assumes that the cycle is elementary, ie. no bridges.

sort_vertices(save_order)

Sort the vertices in the graph. This can make certain operations, e.g. the isomorphism functions, much more efficient.

specify_atom_extensions(i, basename, r)

generates extensions for specification of the type of atom defined by a given atomtype or set of atomtypes

specify_bond_extensions(i, j, basename, r_bonds)

generates extensions for the specification of bond order for a given bond

specify_external_new_bond_extensions(i, basename, r_bonds)

generates extensions for the creation of a bond (of undefined order) between an atom and a new atom that is not H

specify_internal_new_bond_extensions(i, j, n_splits, basename, r_bonds)

generates extensions for creation of a bond (of undefined order) between two atoms indexed i,j that already exist in the group and are unbonded

specify_ring_extensions(i, basename)

generates extensions for specifying if an atom is in a ring

specify_unpaired_extensions(i, basename, r_un)

generates extensions for specification of the number of electrons on a given atom

split()

Convert a single Group object containing two or more unconnected groups into separate class:Group objects.

standardize_atomtype()

This function changes the atomtypes in a group if the atom must be a specific atomtype based on its bonds and valency.

Currently only standardizes oxygen, carbon and sulfur ATOMTYPES

We also only check when there is exactly one atomtype, one bondType, one radical setting. For any group where there are wildcards or multiple attributes, we cannot apply this check.

In the case where the atomtype is ambiguous based on bonds and valency, this function will not change the type.

Returns a ‘True’ if the group was modified otherwise returns ‘False’

standardize_group()

This function modifies groups to make them have a standard AdjList form.

Currently it makes atomtypes as specific as possible and makes CO/CS atomtypes have explicit O2d/S2d ligands. Other functions can be added as necessary

Returns a ‘True’ if the group was modified otherwise returns ‘False’

to_adjacency_list(label)

Convert the molecular structure to a string adjacency list.

update_charge()

Update the partial charge according to the valence electron, total bond order, lone pairs and radical electrons. This method is used for products of specific families with recipes that modify charges.

update_connectivity_values()

Update the connectivity values for each vertex in the graph. These are used to accelerate the isomorphism checking.

update_fingerprint()

Update the molecular fingerprint used to accelerate the subgraph isomorphism checks.