rmgpy.molecule.Group¶
- class rmgpy.molecule.Group¶
A representation of a molecular substructure group using a graph data type, extending the
Graph
class. The attributes are:Attribute
Type
Description
atoms
list
Aliases for the vertices storing
GroupAtom
multiplicity
list
Range of multiplicities accepted for the group
props
dict
Dictionary of arbitrary properties/flags classifying state of Group object
metal
list
List of metals accepted for the group
facet
list
List of facets accepted for the group
Corresponding alias methods to Molecule have also been provided.
- add_atom(atom)¶
Add an atom to the graph. The atom is initialized with no bonds.
- add_bond(bond)¶
Add a bond to the graph as an edge connecting the two atoms atom1 and atom2.
- add_edge(edge)¶
Add an edge to the graph. The two vertices in the edge must already exist in the graph, or a
ValueError
is raised.
- add_explicit_ligands()¶
This function O2d/S2d ligand to CO or CS atomtypes if they are not already there.
Returns a ‘True’ if the group was modified otherwise returns ‘False’
- add_implicit_atoms_from_atomtype()¶
Returns: a modified group with implicit atoms added Add implicit double/triple bonded atoms O, S or R, for which we will use a C
Not designed to work with wildcards
- add_implicit_benzene()¶
Returns: A modified group with any implicit benzene rings added
This method currently does not if there are wildcards in atomtypes or bond orders The current algorithm also requires that all Cb and Cbf are atomtyped
There are other cases where the algorithm doesn’t work. For example whenever there are many dangling Cb or Cbf atoms not in a ring, it is likely fail. In the database test (the only use thus far), we will require that any group with more than 3 Cbfs have complete rings. This is much stricter than this method can handle, but right now this method cannot handle very general cases, so it is better to be conservative.
Note that it also works on other aromatic atomtypes like N5bd etc.
- add_vertex(vertex)¶
Add a vertex to the graph. The vertex is initialized with no edges.
- atoms¶
List of atoms contained in the current molecule.
Renames the inherited vertices attribute of
Graph
.
- classify_benzene_carbons(partners)¶
- Parameters:
group – :class:Group with atoms to classify
partners – dictionary of partnered up atoms, which must be a cbf atom
Some non-carbon ‘benzene’ type atoms (eg. N3b) are included and classified.
Returns: tuple with lists of each atom classification: cb_atom_list, cbf_atom_list, cbf_atom_list1, cbf_atom_list2, connected_cbfs
- clear_labeled_atoms()¶
Remove the labels from all atoms in the molecular group.
- clear_reg_dims()¶
clear regularization dimensions
- contains_labeled_atom(label)¶
Return
True
if the group contains an atom with the label label andFalse
otherwise.
- contains_surface_site()¶
Returns
True
iff the group contains an ‘X’ surface site.
- copy(deep)¶
Create a copy of the current graph. If deep is
True
, a deep copy is made: copies of the vertices and edges are used in the new graph. If deep isFalse
or not specified, a shallow copy is made: the original vertices and edges are used in the new graph.
- copy_and_map()¶
Create a deep copy of the current graph, and return the dict ‘mapping’. Method was modified from Graph.copy() method
- create_and_connect_atom(atomtypes, connecting_atom, bond_orders)¶
This method creates an non-radical, uncharged, :class:GroupAtom with specified list of atomtypes and connects it to one atom of the group, ‘connecting_atom’. This is useful for making sample atoms.
- Parameters:
atomtypes – list of atomtype labels (strs)
connecting_atom – :class:GroupAtom that is connected to the new benzene atom
bond_orders – list of bond Orders connecting new_atom and connecting_atom
Returns: the newly created atom
- draw(file_format)¶
Use pydot to draw a basic graph of the group.
Use format to specify the desired output file_format, eg. ‘png’, ‘svg’, ‘ps’, ‘pdf’, ‘plain’, etc.
- find_isomorphism(other, initial_map, save_order, strict)¶
Returns
True
if other is isomorphic andFalse
otherwise, and the matching mapping. The initial_map attribute can be used to specify a required mapping from self to other (i.e. the atoms of self are the keys, while the atoms of other are the values). The returned mapping also uses the atoms of self for the keys and the atoms of other for the values. The other parameter must be aGroup
object, or aTypeError
is raised.
- find_subgraph_isomorphisms(other, initial_map, save_order)¶
Returns
True
if other is subgraph isomorphic andFalse
otherwise. In other words, returnTrue
is self is more specific than other. Also returns the lists all of valid mappings. The initial_map attribute can be used to specify a required mapping from self to other (i.e. the atoms of self are the keys, while the atoms of other are the values). The returned mappings also use the atoms of self for the keys and the atoms of other for the values. The other parameter must be aGroup
object, or aTypeError
is raised.
- from_adjacency_list(adjlist, check_consistency)¶
Convert a string adjacency list adjlist to a molecular structure. Skips the first line (assuming it’s a label) unless withLabel is
False
.
- get_all_cycles(starting_vertex)¶
Given a starting vertex, returns a list of all the cycles containing that vertex.
This function returns a duplicate of each cycle because [0,1,2,3] is counted as separate from [0,3,2,1]
- get_all_cycles_of_size(size)¶
Return a list of the all non-duplicate rings with length ‘size’. The algorithm implements was adapted from a description by Fan, Panaye, Doucet, and Barbu (doi: 10.1021/ci00015a002)
B. T. Fan, A. Panaye, J. P. Doucet, and A. Barbu. “Ring Perception: A New Algorithm for Directly Finding the Smallest Set of Smallest Rings from a Connection Table.” J. Chem. Inf. Comput. Sci. 33, p. 657-662 (1993).
- get_all_cyclic_vertices()¶
Returns all vertices belonging to one or more cycles.
- get_all_edges()¶
Returns a list of all edges in the graph.
- get_all_labeled_atoms()¶
Return the labeled atoms as a
dict
with the keys being the labels and the values the atoms themselves. If two or more atoms have the same label, the value is converted to a list of these atoms.
- get_all_polycyclic_vertices()¶
Return all vertices belonging to two or more cycles, fused or spirocyclic.
- get_all_simple_cycles_of_size(size)¶
Return a list of all non-duplicate monocyclic rings with length ‘size’.
Naive approach by eliminating polycyclic rings that are returned by
getAllCyclicsOfSize
.
- get_bond(atom1, atom2)¶
Returns the bond connecting atoms atom1 and atom2.
- get_bonds(atom)¶
Return a list of the bonds involving the specified atom.
- get_disparate_cycles()¶
Get all disjoint monocyclic and polycyclic cycle clusters in the molecule. Takes the RC and recursively merges all cycles which share vertices.
Returns: monocyclic_cycles, polycyclic_cycles
- get_edge(vertex1, vertex2)¶
Returns the edge connecting vertices vertex1 and vertex2.
- get_edges(vertex)¶
Return a dictionary of the edges involving the specified vertex.
- get_edges_in_cycle(vertices, sort)¶
For a given list of atoms comprising a ring, return the set of bonds connecting them, in order around the ring.
If sort=True, then sort the vertices to match their connectivity. Otherwise, assumes that they are already sorted, which is true for cycles returned by get_relevant_cycles or get_smallest_set_of_smallest_rings.
- get_element_count()¶
Returns the element count for the molecule as a dictionary. Wildcards are not counted as any particular element.
- get_extensions(r, r_bonds, r_un, basename, atm_ind, atm_ind2, n_splits)¶
generate all allowed group extensions and their complements note all atomtypes except for elements and r/r!H’s must be removed
- get_labeled_atoms(label)¶
Return the atom in the group that is labeled with the given label. Raises
ValueError
if no atom in the group has that label.
- get_largest_ring(vertex)¶
returns the largest ring containing vertex. This is typically useful for finding the longest path in a polycyclic ring, since the polycyclic rings returned from get_polycycles are not necessarily in order in the ring structure.
- get_max_cycle_overlap()¶
Return the maximum number of vertices that are shared between any two cycles in the graph. For example, if there are only disparate monocycles or no cycles, the maximum overlap is zero; if there are “spiro” cycles, it is one; if there are “fused” cycles, it is two; and if there are “bridged” cycles, it is three.
- get_monocycles()¶
Return a list of cycles that are monocyclic.
- get_net_charge()¶
Iterate through the atoms in the group and calculate the net charge
- get_polycycles()¶
Return a list of cycles that are polycyclic. In other words, merge the cycles which are fused or spirocyclic into a single polycyclic cycle, and return only those cycles. Cycles which are not polycyclic are not returned.
- get_relevant_cycles()¶
Returns the set of relevant cycles as a list of lists. Uses RingDecomposerLib for ring perception.
Kolodzik, A.; Urbaczek, S.; Rarey, M. Unique Ring Families: A Chemically Meaningful Description of Molecular Ring Topologies. J. Chem. Inf. Model., 2012, 52 (8), pp 2013-2021
Flachsenberg, F.; Andresen, N.; Rarey, M. RingDecomposerLib: An Open-Source Implementation of Unique Ring Families and Other Cycle Bases. J. Chem. Inf. Model., 2017, 57 (2), pp 122-126
- get_smallest_set_of_smallest_rings()¶
Returns the smallest set of smallest rings as a list of lists. Uses RingDecomposerLib for ring perception.
Kolodzik, A.; Urbaczek, S.; Rarey, M. Unique Ring Families: A Chemically Meaningful Description of Molecular Ring Topologies. J. Chem. Inf. Model., 2012, 52 (8), pp 2013-2021
Flachsenberg, F.; Andresen, N.; Rarey, M. RingDecomposerLib: An Open-Source Implementation of Unique Ring Families and Other Cycle Bases. J. Chem. Inf. Model., 2017, 57 (2), pp 122-126
- get_surface_sites()¶
Get a list of surface site GroupAtoms in the group. :returns: A list containing the surface site GroupAtoms in the molecule :rtype: List(GroupAtom)
- has_atom(atom)¶
Returns
True
if atom is an atom in the graph, orFalse
if not.
- has_bond(atom1, atom2)¶
Returns
True
if atoms atom1 and atom2 are connected by an bond, orFalse
if not.
- has_edge(vertex1, vertex2)¶
Returns
True
if vertices vertex1 and vertex2 are connected by an edge, orFalse
if not.
- has_vertex(vertex)¶
Returns
True
if vertex is a vertex in the graph, orFalse
if not.
- has_wildcards()¶
This function is a Group level wildcards checker.
Returns a ‘True’ if any of the atoms in this group has wildcards.
- is_aromatic_ring()¶
This method returns a boolean telling if the group has a 5 or 6 cyclic with benzene bonds exclusively
- is_benzene_explicit()¶
Returns: ‘True’ if all Cb, Cbf (and other ‘b’ atoms) are in completely explicitly stated benzene rings.
Otherwise return ‘False’
- is_cyclic()¶
Return
True
if one or more cycles are present in the graph orFalse
otherwise.
- is_edge_in_cycle(edge)¶
Return
True
if the edge between vertices vertex1 and vertex2 is in one or more cycles in the graph, orFalse
if not.
- is_identical(other, save_order)¶
Returns
True
if other is identical andFalse
otherwise. The function is_isomorphic respects wildcards, while this function does not, make it more useful for checking groups to groups (as opposed to molecules to groups)
- is_isomorphic(other, initial_map, generate_initial_map, save_order, strict)¶
Returns
True
if two graphs are isomorphic andFalse
otherwise. The initial_map attribute can be used to specify a required mapping from self to other (i.e. the atoms of self are the keys, while the atoms of other are the values). The other parameter must be aGroup
object, or aTypeError
is raised.
- is_mapping_valid(other, mapping, equivalent, strict)¶
Check that a proposed mapping of vertices from self to other is valid by checking that the vertices and edges involved in the mapping are mutually equivalent. If equivalent is
True
it checks if atoms and edges are equivalent, ifFalse
it checks if they are specific cases of each other. If strict isTrue
, electrons and bond orders are considered, and ignored ifFalse
.
- is_subgraph_isomorphic(other, initial_map, generate_initial_map, save_order)¶
Returns
True
if other is subgraph isomorphic andFalse
otherwise. In other words, returnTrue
if self is more specific than other. The initial_map attribute can be used to specify a required mapping from self to other (i.e. the atoms of self are the keys, while the atoms of other are the values). The other parameter must be aGroup
object, or aTypeError
is raised.
- is_surface_site()¶
Returns
True
iff the group is nothing but a surface site ‘X’.
- is_vertex_in_cycle(vertex)¶
Return
True
if the given vertex is contained in one or more cycles in the graph, orFalse
if not.
- make_sample_molecule()¶
Returns: A sample class :Molecule: from the group
- merge(other)¶
Merge two groups so as to store them in a single
Group
object. The mergedGroup
object is returned.
- merge_groups(other, keep_identical_labels)¶
This function takes other :class:Group object and returns a merged :class:Group object based on overlapping labeled atoms between self and other
Currently assumes other can be merged at the closest labelled atom if keep_identical_labels=True merge_groups will not try to merge atoms with the same labels
- pick_wildcards()¶
Returns: the :class:Group object without wildcards in either atomtype or bonding
This function will naively pick the first atomtype for each atom, but will try to pick bond orders that make sense given the selected atomtypes
- remove_atom(atom)¶
Remove atom and all bonds associated with it from the graph. Does not remove atoms that no longer have any bonds as a result of this removal.
- remove_bond(bond)¶
Remove the bond between atoms atom1 and atom2 from the graph. Does not remove atoms that no longer have any bonds as a result of this removal.
- remove_edge(edge)¶
Remove the specified edge from the graph. Does not remove vertices that no longer have any edges as a result of this removal.
- remove_van_der_waals_bonds()¶
Remove all bonds that are definitely only van der Waals bonds.
- remove_vertex(vertex)¶
Remove vertex and all edges associated with it from the graph. Does not remove vertices that no longer have any edges as a result of this removal.
- reset_connectivity_values()¶
Reset any cached connectivity information. Call this method when you have modified the graph.
- reset_ring_membership()¶
Resets ring membership information in the GroupAtom.props attribute.
- restore_vertex_order()¶
reorder the vertices to what they were before sorting if you saved the order
- sort_atoms()¶
Sort the atoms in the graph. This can make certain operations, e.g. the isomorphism functions, much more efficient.
- sort_by_connectivity(atom_list)¶
- Parameters:
atom_list – input list of atoms
Returns: a sorted list of atoms where each atom is connected to a previous atom in the list if possible
- sort_cyclic_vertices(vertices)¶
Given a list of vertices comprising a cycle, sort them such that adjacent entries in the list are connected to each other. Warning: Assumes that the cycle is elementary, ie. no bridges.
- sort_vertices(save_order)¶
Sort the vertices in the graph. This can make certain operations, e.g. the isomorphism functions, much more efficient.
- specify_atom_extensions(i, basename, r)¶
generates extensions for specification of the type of atom defined by a given atomtype or set of atomtypes
- specify_bond_extensions(i, j, basename, r_bonds)¶
generates extensions for the specification of bond order for a given bond
- specify_external_new_bond_extensions(i, basename, r_bonds)¶
generates extensions for the creation of a bond (of undefined order) between an atom and a new atom that is not H
- specify_internal_new_bond_extensions(i, j, n_splits, basename, r_bonds)¶
generates extensions for creation of a bond (of undefined order) between two atoms indexed i,j that already exist in the group and are unbonded
- specify_ring_extensions(i, basename)¶
generates extensions for specifying if an atom is in a ring
- specify_unpaired_extensions(i, basename, r_un)¶
generates extensions for specification of the number of electrons on a given atom
- split()¶
Convert a single
Group
object containing two or more unconnected groups into separate class:Group objects.
- standardize_atomtype()¶
This function changes the atomtypes in a group if the atom must be a specific atomtype based on its bonds and valency.
Currently only standardizes oxygen, carbon and sulfur ATOMTYPES
We also only check when there is exactly one atomtype, one bondType, one radical setting. For any group where there are wildcards or multiple attributes, we cannot apply this check.
In the case where the atomtype is ambiguous based on bonds and valency, this function will not change the type.
Returns a ‘True’ if the group was modified otherwise returns ‘False’
- standardize_group()¶
This function modifies groups to make them have a standard AdjList form.
Currently it makes atomtypes as specific as possible and makes CO/CS atomtypes have explicit O2d/S2d ligands. Other functions can be added as necessary
Returns a ‘True’ if the group was modified otherwise returns ‘False’
- to_adjacency_list(label)¶
Convert the molecular structure to a string adjacency list.
- update_charge()¶
Update the partial charge according to the valence electron, total bond order, lone pairs and radical electrons. This method is used for products of specific families with recipes that modify charges.
- update_connectivity_values()¶
Update the connectivity values for each vertex in the graph. These are used to accelerate the isomorphism checking.
- update_fingerprint()¶
Update the molecular fingerprint used to accelerate the subgraph isomorphism checks.