rmgpy.molecule.Group

class rmgpy.molecule.Group(atoms=None, props=None, multiplicity=None)

A representation of a molecular substructure group using a graph data type, extending the Graph class. The attributes are:

Attribute      Type    Description
atoms          list    Aliases for the vertices storing GroupAtom
multiplicity   list    Range of multiplicities accepted for the group
props          dict    Dictionary of arbitrary properties/flags classifying state of Group object

Corresponding alias methods to Molecule have also been provided.

add_atom(self, GroupAtom atom)

Add an atom to the graph. The atom is initialized with no bonds.

add_bond(self, GroupBond bond)

Add a bond to the graph as an edge connecting the two atoms atom1 and atom2.

add_edge(self, Edge edge) Edge

Add an edge to the graph. The two vertices in the edge must already exist in the graph, or a ValueError is raised.

add_explicit_ligands(self) bool

This function adds an O2d/S2d ligand to CO or CS atomtypes if one is not already there.

Returns True if the group was modified; otherwise returns False.

add_implicit_atoms_from_atomtype(self) Group

Add implicit double/triple bonded atoms O, S or R, for which we will use a C.

Returns: a modified group with implicit atoms added

Not designed to work with wildcards

add_implicit_benzene(self) Group

Returns: A modified group with any implicit benzene rings added

This method currently does not work if there are wildcards in atomtypes or bond orders. The current algorithm also requires that all Cb and Cbf are atomtyped.

There are other cases where the algorithm doesn’t work. For example, whenever there are many dangling Cb or Cbf atoms not in a ring, it is likely to fail. In the database test (the only use thus far), we will require that any group with more than 3 Cbfs have complete rings. This is much stricter than this method can handle, but right now this method cannot handle very general cases, so it is better to be conservative.

add_vertex(self, Vertex vertex) Vertex

Add a vertex to the graph. The vertex is initialized with no edges.

atoms

List of atoms contained in the current molecule.

Renames the inherited vertices attribute of Graph.

classify_benzene_carbons(self, dict partners=None) tuple
Parameters:
  • group – :class:Group with atoms to classify

  • partners – dictionary of partnered up atoms, which must be a cbf atom

Returns: tuple with lists of each atom classification

clear_labeled_atoms(self)

Remove the labels from all atoms in the molecular group.

clear_reg_dims(self)

Clear the regularization dimensions.

contains_labeled_atom(self, unicode label) bool

Return True if the group contains an atom with the label label and False otherwise.

contains_surface_site(self) bool

Returns True iff the group contains an ‘X’ surface site.

copy(self, bool deep=False) Graph

Create a copy of the current graph. If deep is True, a deep copy is made: copies of the vertices and edges are used in the new graph. If deep is False or not specified, a shallow copy is made: the original vertices and edges are used in the new graph.
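The difference between the two modes can be illustrated with a toy graph that is independent of RMG-Py. The ToyGraph class below is hypothetical and only mimics the behavior described above; it is a sketch, not the library's implementation.

```python
import copy


class ToyGraph:
    """Hypothetical stand-in for a graph; not the RMG-Py implementation."""

    def __init__(self, vertices=None):
        self.vertices = vertices if vertices is not None else []

    def copy(self, deep=False):
        if deep:
            # Deep copy: the new graph holds copies of the vertices.
            return ToyGraph(copy.deepcopy(self.vertices))
        # Shallow copy: a new list that holds the same vertex objects.
        return ToyGraph(list(self.vertices))


original = ToyGraph([{"label": "*1"}, {"label": "*2"}])
shallow = original.copy()
deep = original.copy(deep=True)

# The shallow copy shares vertex objects with the original; the deep copy does not.
shallow_shares_vertices = shallow.vertices[0] is original.vertices[0]
deep_shares_vertices = deep.vertices[0] is original.vertices[0]
```

In practice this means that editing a vertex reached through a shallow copy also changes the original graph, while a deep copy can be modified freely.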

copy_and_map(self) dict

Create a deep copy of the current graph, and return the dict ‘mapping’. Method was modified from Graph.copy() method

create_and_connect_atom(self, list atomtypes, GroupAtom connecting_atom, list bond_orders) GroupAtom

This method creates a non-radical, uncharged :class:GroupAtom with the specified list of atomtypes and connects it to one atom of the group, ‘connecting_atom’. This is useful for making sample atoms.

Parameters:
  • atomtypes – list of atomtype labels (strs)

  • connecting_atom – :class:GroupAtom that is connected to the new benzene atom

  • bond_orders – list of bond orders connecting new_atom and connecting_atom

Returns: the newly created atom

draw(self, file_format)

Use pydot to draw a basic graph of the group.

Use file_format to specify the desired output format, e.g. ‘png’, ‘svg’, ‘ps’, ‘pdf’, ‘plain’, etc.

elementCount

Type: dict

find_isomorphism(self, Graph other, dict initial_map=None, bool save_order=False, bool strict=True) list

Returns True if other is isomorphic and False otherwise, and the matching mapping. The initial_map attribute can be used to specify a required mapping from self to other (i.e. the atoms of self are the keys, while the atoms of other are the values). The returned mapping also uses the atoms of self for the keys and the atoms of other for the values. The other parameter must be a Group object, or a TypeError is raised.
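The direction of the mapping (atoms of self as keys, atoms of other as values) and the effect of initial_map can be sketched with a brute-force search over plain adjacency dictionaries. The function and variable names below are hypothetical, atom and bond matching is ignored, and trying every permutation is only workable for tiny graphs; this is an illustration of the contract, not the algorithm RMG-Py uses.

```python
from itertools import permutations


def find_isomorphism(adj_self, adj_other, initial_map=None):
    """Brute-force sketch: return (found, mapping) with self -> other keys."""
    initial_map = initial_map or {}
    keys = list(adj_self)
    if len(keys) != len(adj_other):
        return False, {}
    for perm in permutations(adj_other):
        mapping = dict(zip(keys, perm))
        # Respect the required pairs given in initial_map.
        if any(mapping[k] != v for k, v in initial_map.items()):
            continue
        # Every vertex's neighbors must map onto its image's neighbors.
        if all(
            {mapping[n] for n in adj_self[k]} == set(adj_other[mapping[k]])
            for k in keys
        ):
            return True, mapping
    return False, {}


# Two three-vertex chains: a-b-c and x-y-z.
chain_self = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
chain_other = {"x": ["y"], "y": ["x", "z"], "z": ["y"]}

found, mapping = find_isomorphism(chain_self, chain_other)
found_forced, mapping_forced = find_isomorphism(
    chain_self, chain_other, initial_map={"a": "z"}
)
found_bad, mapping_bad = find_isomorphism(
    chain_self, chain_other, initial_map={"a": "y"}
)
```

Forcing the end atom "a" onto the other end "z" flips the mapping, while forcing it onto the middle atom "y" leaves no valid mapping at all.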

find_subgraph_isomorphisms(self, Graph other, dict initial_map=None, bool save_order=False) list

Returns True if other is subgraph isomorphic and False otherwise. In other words, return True if self is more specific than other. Also returns the list of all valid mappings. The initial_map attribute can be used to specify a required mapping from self to other (i.e. the atoms of self are the keys, while the atoms of other are the values). The returned mappings also use the atoms of self for the keys and the atoms of other for the values. The other parameter must be a Group object, or a TypeError is raised.

from_adjacency_list(self, unicode adjlist)

Convert a string adjacency list adjlist to a molecular structure. Skips the first line (assuming it’s a label) unless withLabel is False.

get_all_cycles(self, Vertex starting_vertex) list

Given a starting vertex, returns a list of all the cycles containing that vertex.

This function returns a duplicate of each cycle because [0,1,2,3] is counted as separate from [0,3,2,1]
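If only one copy of each cycle is wanted, the two traversal directions can be collapsed by reducing each cycle to a direction-independent form. The helper below is a hypothetical sketch that uses integers in place of vertex objects and assumes every cycle is listed starting from the same vertex; it is not part of the Group API.

```python
def canonical_cycle(cycle):
    """Return a direction-independent form of a cycle that starts at cycle[0]."""
    # Keep the starting vertex and walk the remaining vertices backwards.
    reversed_cycle = [cycle[0]] + cycle[:0:-1]
    return min(tuple(cycle), tuple(reversed_cycle))


cycles = [[0, 1, 2, 3], [0, 3, 2, 1]]
unique_cycles = {canonical_cycle(c) for c in cycles}
```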

get_all_cycles_of_size(self, int size) list

Return a list of all non-duplicate rings with length ‘size’. The algorithm implemented was adapted from a description by Fan, Panaye, Doucet, and Barbu (doi: 10.1021/ci00015a002)

B. T. Fan, A. Panaye, J. P. Doucet, and A. Barbu. “Ring Perception: A New Algorithm for Directly Finding the Smallest Set of Smallest Rings from a Connection Table.” J. Chem. Inf. Comput. Sci. 33, p. 657-662 (1993).

get_all_cyclic_vertices(self) list

Returns all vertices belonging to one or more cycles.

get_all_edges(self) list

Returns a list of all edges in the graph.

get_all_labeled_atoms(self) dict

Return the labeled atoms as a dict with the keys being the labels and the values the atoms themselves. If two or more atoms have the same label, the value is converted to a list of these atoms.
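The shape of the returned dictionary, including the switch to a list for repeated labels, can be sketched as follows. The function name and the use of (label, atom) string pairs are assumptions made for illustration; the real method reads the labels from the atoms of the group.

```python
def collect_labeled(atoms):
    """atoms: iterable of (label, atom) pairs; empty labels are skipped."""
    labeled = {}
    for label, atom in atoms:
        if not label:
            continue
        if label not in labeled:
            # First atom with this label: store it directly.
            labeled[label] = atom
        elif isinstance(labeled[label], list):
            labeled[label].append(atom)
        else:
            # Second atom with this label: convert the value to a list.
            labeled[label] = [labeled[label], atom]
    return labeled


labeled_atoms = collect_labeled(
    [("*1", "C_a"), ("", "H_b"), ("*2", "O_c"), ("*1", "C_d")]
)
```

Callers therefore need to handle both a single atom and a list of atoms as values.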

get_all_polycyclic_vertices(self) list

Return all vertices belonging to two or more cycles, fused or spirocyclic.

get_all_simple_cycles_of_size(self, int size) list

Return a list of all non-duplicate monocyclic rings with length ‘size’.

Naive approach: eliminates the polycyclic rings that are returned by get_all_cycles_of_size.

get_bond(self, GroupAtom atom1, GroupAtom atom2) GroupBond

Returns the bond connecting atoms atom1 and atom2.

get_bonds(self, GroupAtom atom) dict

Return a dictionary of the bonds involving the specified atom.

get_disparate_cycles(self) tuple

Get all disjoint monocyclic and polycyclic cycle clusters in the molecule. Takes the relevant cycles (RC) and recursively merges all cycles which share vertices.

Returns: monocyclic_cycles, polycyclic_cycles
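The merging step can be sketched on plain lists of vertex indices. The function below is a hypothetical illustration that uses an iterative merge rather than recursion and returns sets of vertices; the actual return types in the library may differ.

```python
def disparate_cycles(cycles):
    """Group cycles that share vertices; return (monocyclic, polycyclic)."""
    clusters = []  # each entry: (set of vertices, number of cycles merged)
    for cycle in cycles:
        merged, count = set(cycle), 1
        kept = []
        for verts, n in clusters:
            if verts & merged:
                # Shares at least one vertex: absorb the existing cluster.
                merged |= verts
                count += n
            else:
                kept.append((verts, n))
        kept.append((merged, count))
        clusters = kept
    monocyclic = [verts for verts, n in clusters if n == 1]
    polycyclic = [verts for verts, n in clusters if n > 1]
    return monocyclic, polycyclic


cycles = [[0, 1, 2], [2, 3, 4], [10, 11, 12], [4, 5, 6]]
monocyclic_cycles, polycyclic_cycles = disparate_cycles(cycles)
```

Here the first, second, and fourth cycles chain together through vertices 2 and 4, so they end up in one polycyclic cluster, while the third cycle stays monocyclic.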

get_edge(self, Vertex vertex1, Vertex vertex2) Edge

Returns the edge connecting vertices vertex1 and vertex2.

get_edges(self, Vertex vertex) dict

Return a dictionary of the edges involving the specified vertex.

get_edges_in_cycle(self, list vertices, bool sort=False) list

For a given list of atoms comprising a ring, return the set of bonds connecting them, in order around the ring.

If sort=True, then sort the vertices to match their connectivity. Otherwise, assumes that they are already sorted, which is true for cycles returned by get_relevant_cycles or get_smallest_set_of_smallest_rings.
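For vertices that are already sorted around the ring, collecting the edges amounts to pairing each vertex with its successor and closing the ring at the end. The sketch below uses hypothetical names and represents each edge as a tuple of vertex labels; the library returns the edge objects themselves, and the position of the closing edge in its list may differ.

```python
def edges_in_cycle(vertices):
    """Pair each vertex with its successor, wrapping around to close the ring."""
    n = len(vertices)
    return [(vertices[i], vertices[(i + 1) % n]) for i in range(n)]


ring = ["C1", "C2", "C3", "C4"]
ring_edges = edges_in_cycle(ring)
```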

get_element_count(self) dict

Returns the element count for the molecule as a dictionary. Wildcards are not counted as any particular element.

get_extensions(self, r=None, basename='', atm_ind=None, atm_ind2=None, n_splits=None)

Generate all allowed group extensions and their complements. Note: all atomtypes except for elements and r/r!H’s must be removed.

get_labeled_atoms(self, unicode label) list

Return a list of the atoms in the group that are labeled with the given label. Raises ValueError if no atom in the group has that label.

get_largest_ring(self, Vertex vertex) list

Returns the largest ring containing vertex. This is typically useful for finding the longest path in a polycyclic ring, since the polycyclic rings returned from get_polycycles are not necessarily in order in the ring structure.

get_max_cycle_overlap(self) int

Return the maximum number of vertices that are shared between any two cycles in the graph. For example, if there are only disparate monocycles or no cycles, the maximum overlap is zero; if there are “spiro” cycles, it is one; if there are “fused” cycles, it is two; and if there are “bridged” cycles, it is three.
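The classification above can be reproduced on plain lists of vertex indices. This is a minimal sketch with a hypothetical function name; it assumes the cycles are already available as lists and says nothing about how the library perceives them.

```python
from itertools import combinations


def max_cycle_overlap(cycles):
    """Largest number of vertices shared by any two cycles (0 if fewer than two)."""
    return max(
        (len(set(a) & set(b)) for a, b in combinations(cycles, 2)),
        default=0,
    )


spiro = [[0, 1, 2, 3], [3, 4, 5, 6]]            # one shared vertex
fused = [[0, 1, 2, 3], [2, 3, 4, 5]]            # two shared vertices
bridged = [[0, 1, 2, 3, 4], [2, 3, 4, 5, 6]]    # three shared vertices

overlaps = [max_cycle_overlap(c) for c in (spiro, fused, bridged)]
```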

get_monocycles(self) list

Return a list of cycles that are monocyclic.

get_net_charge(self)

Iterate through the atoms in the group and calculate the net charge

get_polycycles(self) list

Return a list of cycles that are polycyclic. In other words, merge the cycles which are fused or spirocyclic into a single polycyclic cycle, and return only those cycles. Cycles which are not polycyclic are not returned.

get_relevant_cycles(self) list

Returns the set of relevant cycles as a list of lists. Uses RingDecomposerLib for ring perception.

Kolodzik, A.; Urbaczek, S.; Rarey, M. Unique Ring Families: A Chemically Meaningful Description of Molecular Ring Topologies. J. Chem. Inf. Model., 2012, 52 (8), pp 2013-2021

Flachsenberg, F.; Andresen, N.; Rarey, M. RingDecomposerLib: An Open-Source Implementation of Unique Ring Families and Other Cycle Bases. J. Chem. Inf. Model., 2017, 57 (2), pp 122-126

get_smallest_set_of_smallest_rings(self) list

Returns the smallest set of smallest rings as a list of lists. Uses RingDecomposerLib for ring perception.

Kolodzik, A.; Urbaczek, S.; Rarey, M. Unique Ring Families: A Chemically Meaningful Description of Molecular Ring Topologies. J. Chem. Inf. Model., 2012, 52 (8), pp 2013-2021

Flachsenberg, F.; Andresen, N.; Rarey, M. RingDecomposerLib: An Open-Source Implementation of Unique Ring Families and Other Cycle Bases. J. Chem. Inf. Model., 2017, 57 (2), pp 122-126

get_surface_sites(self) list

Get a list of surface site GroupAtoms in the group.

Returns: a list containing the surface site GroupAtoms in the molecule

Return type: List(GroupAtom)

has_atom(self, GroupAtom atom) bool

Returns True if atom is an atom in the graph, or False if not.

has_bond(self, GroupAtom atom1, GroupAtom atom2) bool

Returns True if atoms atom1 and atom2 are connected by a bond, or False if not.

has_edge(self, Vertex vertex1, Vertex vertex2) bool

Returns True if vertices vertex1 and vertex2 are connected by an edge, or False if not.

has_vertex(self, Vertex vertex) bool

Returns True if vertex is a vertex in the graph, or False if not.

is_aromatic_ring(self) bool

This method returns a boolean indicating whether the group has a 5- or 6-membered cycle made up exclusively of benzene bonds.

is_benzene_explicit(self) bool

Returns: ‘True’ if all Cb, Cbf atoms are in completely explicitly stated benzene rings.

Otherwise return ‘False’

is_cyclic(self) bool

Return True if one or more cycles are present in the graph or False otherwise.

is_edge_in_cycle(self, Edge edge) bool

Return True if the given edge is in one or more cycles in the graph, or False if not.

is_identical(self, Graph other, bool save_order=False) bool

Returns True if other is identical and False otherwise. The function is_isomorphic respects wildcards, while this function does not, making it more useful for checking groups against groups (as opposed to molecules against groups).

is_isomorphic(self, Graph other, dict initial_map=None, bool generate_initial_map=False, bool save_order=False, bool strict=True) bool

Returns True if two graphs are isomorphic and False otherwise. The initial_map attribute can be used to specify a required mapping from self to other (i.e. the atoms of self are the keys, while the atoms of other are the values). The other parameter must be a Group object, or a TypeError is raised.

is_mapping_valid(self, Graph other, dict mapping, bool equivalent=True, bool strict=True) bool

Check that a proposed mapping of vertices from self to other is valid by checking that the vertices and edges involved in the mapping are mutually equivalent. If equivalent is True it checks if atoms and edges are equivalent, if False it checks if they are specific cases of each other. If strict is True, electrons and bond orders are considered, and ignored if False.

is_subgraph_isomorphic(self, Graph other, dict initial_map=None, bool generate_initial_map=False, bool save_order=False) bool

Returns True if other is subgraph isomorphic and False otherwise. In other words, return True if self is more specific than other. The initial_map attribute can be used to specify a required mapping from self to other (i.e. the atoms of self are the keys, while the atoms of other are the values). The other parameter must be a Group object, or a TypeError is raised.

is_surface_site(self) bool

Returns True iff the group is nothing but a surface site ‘X’.

is_vertex_in_cycle(self, Vertex vertex) bool

Return True if the given vertex is contained in one or more cycles in the graph, or False if not.

make_sample_molecule(self) Molecule

Returns: a sample :class:Molecule from the group

merge(self, Graph other) Graph

Merge two groups so as to store them in a single Group object. The merged Group object is returned.

merge_groups(self, Group other, bool keep_identical_labels=False) Group

This function takes another :class:Group object, other, and returns a merged :class:Group object based on overlapping labeled atoms between self and other.

Currently assumes other can be merged at the closest labeled atom. If keep_identical_labels=True, merge_groups will not try to merge atoms with the same labels.

multiplicity

Type: list

ordered_vertices

Type: list

pick_wildcards(self)

Returns: the :class:Group object without wildcards in either atomtype or bonding

This function will naively pick the first atomtype for each atom, but will try to pick bond orders that make sense given the selected atomtypes

props

Type: dict

radicalCount

Type: short

remove_atom(self, GroupAtom atom)

Remove atom and all bonds associated with it from the graph. Does not remove atoms that no longer have any bonds as a result of this removal.

remove_bond(self, GroupBond bond)

Remove the bond between atoms atom1 and atom2 from the graph. Does not remove atoms that no longer have any bonds as a result of this removal.

remove_edge(self, Edge edge)

Remove the specified edge from the graph. Does not remove vertices that no longer have any edges as a result of this removal.

remove_van_der_waals_bonds(self)

Remove all bonds that are definitely only van der Waals bonds.

remove_vertex(self, Vertex vertex)

Remove vertex and all edges associated with it from the graph. Does not remove vertices that no longer have any edges as a result of this removal.

reset_connectivity_values(self)

Reset any cached connectivity information. Call this method when you have modified the graph.

reset_ring_membership(self)

Resets ring membership information in the GroupAtom.props attribute.

restore_vertex_order(self)

Reorder the vertices to what they were before sorting, if the order was saved.

sort_atoms(self)

Sort the atoms in the graph. This can make certain operations, e.g. the isomorphism functions, much more efficient.

sort_by_connectivity(self, list atom_list) list
Parameters:

atom_list – input list of atoms

Returns: a sorted list of atoms where each atom is connected to a previous atom in the list if possible
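One way to obtain such an ordering is a greedy pass that always prefers an atom bonded to something already placed. The sketch below is an assumption about how this could be done, using string labels and a hypothetical neighbors dictionary in place of the group's bonds; it is not taken from the library source.

```python
def sort_by_connectivity(atom_list, neighbors):
    """Greedy ordering: prefer atoms bonded to an atom that is already placed."""
    remaining = list(atom_list)
    ordered = []
    while remaining:
        # Pick the first remaining atom connected to an already placed atom.
        pick = next(
            (a for a in remaining if any(b in neighbors[a] for b in ordered)),
            remaining[0],  # fall back when nothing is connected
        )
        ordered.append(pick)
        remaining.remove(pick)
    return ordered


# A-B-C is a chain; D is disconnected from the rest.
neighbors = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}, "D": set()}
ordered_atoms = sort_by_connectivity(["A", "C", "B", "D"], neighbors)
```

The disconnected atom "D" ends up last, which is the "if possible" case from the description: it has no previous atom to attach to.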

sort_cyclic_vertices(self, list vertices) list

Given a list of vertices comprising a cycle, sort them such that adjacent entries in the list are connected to each other. Warning: assumes that the cycle is elementary, i.e. has no bridges.
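The sorting can be pictured as a walk around the ring that always steps to an unvisited neighbor. The following sketch uses integer vertices and a hypothetical neighbors dictionary; like the method it illustrates, it assumes an elementary cycle, and it would raise StopIteration if the vertices did not form one.

```python
def sort_cyclic_vertices(vertices, neighbors):
    """Order ring vertices so that consecutive entries are connected."""
    ordered = [vertices[0]]
    remaining = set(vertices[1:])
    while remaining:
        current = ordered[-1]
        # Step to the first unvisited ring neighbor of the current vertex.
        nxt = next(v for v in vertices if v in remaining and v in neighbors[current])
        ordered.append(nxt)
        remaining.remove(nxt)
    return ordered


# Five-membered ring 0-1-2-3-4-0, given in scrambled order.
neighbors = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 0}}
sorted_ring = sort_cyclic_vertices([0, 2, 4, 1, 3], neighbors)
```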

sort_vertices(self, bool save_order=False)

Sort the vertices in the graph. This can make certain operations, e.g. the isomorphism functions, much more efficient.

specify_atom_extensions(self, i, basename, r)

Generates extensions for specification of the type of atom defined by a given atomtype or set of atomtypes.

specify_bond_extensions(self, i, j, basename, r_bonds)

Generates extensions for the specification of bond order for a given bond.

specify_external_new_bond_extensions(self, i, basename, r_bonds)

Generates extensions for the creation of a bond (of undefined order) between an atom and a new atom that is not H.

specify_internal_new_bond_extensions(self, i, j, n_splits, basename, r_bonds)

Generates extensions for creation of a bond (of undefined order) between two atoms indexed i,j that already exist in the group and are unbonded.

specify_ring_extensions(self, i, basename)

Generates extensions for specifying if an atom is in a ring.

specify_unpaired_extensions(self, i, basename, r_un)

Generates extensions for specification of the number of electrons on a given atom.

split(self) list

Convert a single Group object containing two or more unconnected groups into separate :class:Group objects.
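Conceptually, splitting is a search for the connected components of the graph. The sketch below shows that idea on a plain adjacency dictionary with hypothetical atom labels; it returns lists of labels rather than Group objects and is not the library's implementation.

```python
def split_components(adjacency):
    """Return the connected components of an undirected graph as sorted lists."""
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        # Depth-first search from each vertex that has not been visited yet.
        stack, component = [start], []
        seen.add(start)
        while stack:
            vertex = stack.pop()
            component.append(vertex)
            for neighbor in adjacency[vertex]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(sorted(component))
    return components


adjacency = {"C1": ["O2"], "O2": ["C1"], "H3": [], "C4": ["C5"], "C5": ["C4"]}
parts = split_components(adjacency)
```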

standardize_atomtype(self) bool

This function changes the atomtypes in a group if the atom must be a specific atomtype based on its bonds and valency.

Currently only standardizes oxygen, carbon and sulfur atomtypes.

We also only check when there is exactly one atomtype, one bondType, one radical setting. For any group where there are wildcards or multiple attributes, we cannot apply this check.

In the case where the atomtype is ambiguous based on bonds and valency, this function will not change the type.

Returns True if the group was modified; otherwise returns False.

standardize_group(self) bool

This function modifies groups to make them have a standard AdjList form.

Currently it makes atomtypes as specific as possible and makes CO/CS atomtypes have explicit O2d/S2d ligands. Other functions can be added as necessary.

Returns True if the group was modified; otherwise returns False.

to_adjacency_list(self, unicode label=u'')

Convert the molecular structure to a string adjacency list.

update(self)
update_charge(self)

Update the partial charge according to the valence electrons, total bond order, lone pairs and radical electrons. This method is used for products of specific families with recipes that modify charges.

update_connectivity_values(self)

Update the connectivity values for each vertex in the graph. These are used to accelerate the isomorphism checking.

update_fingerprint(self)

Update the molecular fingerprint used to accelerate the subgraph isomorphism checks.

vertices

Type: list