# rmgpy.molecule.filtration

This module contains functions for filtering a list of Molecules representing a single Species, keeping only the representative structures. It is used to filter out resonance (mesomeric) structures whose contribution is negligible.

The rules this module follows are (by order of importance):

1. Minimum overall deviation from the octet rule (extended to a dectet for sulfur, a third-row element)

2. Additional charge separation is only allowed for radicals if it makes a new radical site in the species

3. If a structure must have charge separation, negative charges will be assigned to more electronegative atoms, whereas positive charges will be assigned to less electronegative atoms (charge stabilization)

4. Opposite charges will be as close as possible to one another, and like charges as far apart as possible (charge stabilization)

rmgpy.molecule.filtration.aromaticity_filtration(mol_list, features)

Returns a filtered list of molecules based on heuristics for determining representative aromatic resonance structures.

For monocyclic aromatics, Kekule structures are removed, with the assumption that an equivalent aromatic structure exists. Non-aromatic structures are maintained if they present new radical sites. Instead of explicitly checking the radical sites, we only check for the SDSDSD bond motif since radical delocalization will disrupt that pattern.

For polycyclic aromatics, structures without any benzene bonds are removed. The idea is that radical delocalization into the aromatic pi system is unfavorable because it disrupts aromaticity. Therefore, structures where the radical is delocalized so far into the molecule that none of the rings remain aromatic are not representative. While this isn’t strictly true, it helps reduce the number of representative structures by focusing on the most important ones.
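As a rough illustration of the SDSDSD check described for monocyclic aromatics, the sketch below tests whether the bonds around a six-membered ring alternate between single and double. It is not RMG's implementation: the function name `has_sdsdsd_motif` and the representation of a ring as a sequence of `"S"`/`"D"` labels (in ring order) are assumptions made for this example.

```python
def has_sdsdsd_motif(ring_bonds):
    """Return True if six ring bonds strictly alternate single/double.

    `ring_bonds` is a hypothetical representation: a string or list of
    "S" (single) and "D" (double) labels, listed in order around the ring.
    """
    if len(ring_bonds) != 6:
        return False
    pattern = "".join(ring_bonds)
    # The starting bond is arbitrary, so accept both phases of the alternation.
    return pattern in ("SDSDSD", "DSDSDS")


# A Kekule-type ring keeps the alternating motif.
print(has_sdsdsd_motif("SDSDSD"))  # True
# A ring in which a radical has been delocalized breaks the alternation.
print(has_sdsdsd_motif("SDSDSS"))  # False
```

Checking the bond pattern avoids comparing radical positions directly, which is the shortcut the description above relies on.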

rmgpy.molecule.filtration.charge_filtration(filtered_list, charge_span_list)

Returns a new filtered_list, filtered based on charge_span_list, electronegativity and proximity considerations. If structures with an additional charge layer introduce reactive sites (i.e., radicals or multiple bonds) they will also be considered. For example:

• Both of NO2’s resonance structures will be kept: [O]N=O <=> O=[N+.][O-]

• NCO will only have two resonance structures, [N.]=C=O <=> N#C[O.], and will lose the third structure, which has the same octet deviation and a charge separation, but whose radical site has already been considered: [N+.]#C[O-]

• CH2NO keeps all three structures, since a new radical site is introduced: [CH2.]N=O <=> C=N[O.] <=> C=[N+.][O-]

• NH2CHO has two structures, one of which is charged since it introduces a multiple bond: NC=O <=> [NH2+]=C[O-]

However, if the species is not a radical, or if the multiple bonds do not change, we only keep the structures with the minimal charge span. For example:

• NSH will only keep the N#S form and not [N-]=[SH+]

• The following species will lose two thirds of its resonance structures (the charged ones): CS(=O)SC <=> CS(=O)#SC <=> C[S+]([O-])SC <=> CS([O-])=[S+]C <=> C[S+]([O-])#SC <=> C[S+](=O)=[S-]C

• Azide is known to have three resonance structures: [NH-][N+]#N <=> N=[N+]=[N-] <=> [NH+]#[N+][N-2]; here we filter the third one out due to its higher charge span, which does not contribute to reactivity in RMG.
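The radical-site part of this logic can be sketched in a few lines. The example below is a simplified illustration, not RMG's code: the function name `filter_by_charge_span`, the dictionary layout, and the atom indices used for radical sites are all assumptions, and the multiple-bond, electronegativity, and proximity criteria are deliberately left out.

```python
def filter_by_charge_span(structures):
    """Keep minimal-charge-span structures, plus charged ones adding a radical site.

    Each structure is a hypothetical dict with keys "smiles", "charge_span",
    and "radical_sites" (a set of atom indices carrying a radical).
    """
    min_span = min(s["charge_span"] for s in structures)
    kept = [s for s in structures if s["charge_span"] == min_span]

    # Radical sites already represented by the minimal-charge-span structures.
    seen_sites = set()
    for s in kept:
        seen_sites |= s["radical_sites"]

    # A structure with a larger charge span survives only if it exposes
    # a radical site that no kept structure has shown yet.
    for s in structures:
        if s["charge_span"] > min_span and not s["radical_sites"] <= seen_sites:
            kept.append(s)
            seen_sites |= s["radical_sites"]
    return kept


# NCO: the charged structure repeats the radical site on N, so it is dropped.
nco = [
    {"smiles": "[N.]=C=O", "charge_span": 0, "radical_sites": {0}},
    {"smiles": "N#C[O.]", "charge_span": 0, "radical_sites": {2}},
    {"smiles": "[N+.]#C[O-]", "charge_span": 1, "radical_sites": {0}},
]
# CH2NO: the charged structure puts the radical on N, a new site, so it stays.
ch2no = [
    {"smiles": "[CH2.]N=O", "charge_span": 0, "radical_sites": {0}},
    {"smiles": "C=N[O.]", "charge_span": 0, "radical_sites": {2}},
    {"smiles": "C=[N+.][O-]", "charge_span": 1, "radical_sites": {1}},
]
print([s["smiles"] for s in filter_by_charge_span(nco)])
print([s["smiles"] for s in filter_by_charge_span(ch2no)])
```

For a non-radical species every `radical_sites` set is empty, so only the structures with the minimal charge span remain, which matches the NSH example.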

rmgpy.molecule.filtration.check_reactive(filtered_list)

Check that there’s at least one reactive structure in the returned list. If not, raise an error. This function does not return anything.
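A minimal sketch of such a guard is shown below. The `Structure` class, the function name `check_reactive_sketch`, and the choice of `ValueError` are assumptions for illustration; the exception type RMG actually raises is not stated here.

```python
from dataclasses import dataclass


@dataclass
class Structure:
    """Hypothetical stand-in for a molecule with a reactive flag."""
    smiles: str
    reactive: bool = True


def check_reactive_sketch(structures):
    """Raise if no structure in the list is marked reactive; return nothing."""
    if not any(s.reactive for s in structures):
        labels = ", ".join(s.smiles for s in structures)
        # ValueError is an assumption made for this sketch.
        raise ValueError(f"No reactive structure among: {labels}")


# Passes silently: at least one structure is reactive.
check_reactive_sketch([Structure("N#S"), Structure("[N-]=[SH+]", reactive=False)])
```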

rmgpy.molecule.filtration.filter_structures(mol_list, mark_unreactive=True, allow_expanded_octet=True, features=None, save_order=False)

We often get too many resonance structures from the combination of all rules, particularly for species containing lone pairs. This function filters them out by minimizing the number of C/N/O/S atoms without a full octet. If save_order is True the atom order is reset after performing atom isomorphism.

A helper function for reactive site discovery in charged species

rmgpy.molecule.filtration.get_charge_span_list(mol_list)

Returns a list of charge spans for the respective list of Molecule objects. This is also calculated in the octet_filtration() function along with the octet filtration process.
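To make the notion of charge span concrete, the sketch below computes it from a list of formal charges. The definition used here (half of the summed absolute formal charges that remain after accounting for the net charge) is an assumption for this example, as are the function names; the charges are read off the azide SMILES strings listed under charge_filtration.

```python
def charge_span(formal_charges):
    """Number of charge separations implied by a list of formal charges.

    Assumed definition: (sum of |q| - |net charge|) / 2.
    """
    net_charge = abs(sum(formal_charges))
    return (sum(abs(q) for q in formal_charges) - net_charge) / 2


def charge_span_list(formal_charge_lists):
    """Charge span for each structure, in the same order as the input."""
    return [charge_span(charges) for charges in formal_charge_lists]


azide = [
    (-1, +1, 0),   # [NH-][N+]#N
    (0, +1, -1),   # N=[N+]=[N-]
    (+1, +1, -2),  # [NH+]#[N+][N-2]
]
print(charge_span_list(azide))  # [1.0, 1.0, 2.0]
```

Under this definition the third azide structure has the larger charge span, consistent with it being the one filtered out.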

rmgpy.molecule.filtration.get_octet_deviation(mol, allow_expanded_octet=True)

Returns the octet deviation for a Molecule object. If allow_expanded_octet is True (the default), the function also considers a dectet for third-row elements (currently sulfur is the only hypervalent third-row element in RMG).
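The sketch below shows one simplified way to compute an octet deviation. It is an illustration under stated assumptions rather than RMG's algorithm: `AtomRecord` and both function names are hypothetical, the valence count is taken as twice the bond orders plus lone pairs, plus radical electrons, and sulfur is simply allowed to match either 8 or 10 electrons when `allow_expanded_octet` is True.

```python
from typing import NamedTuple


class AtomRecord(NamedTuple):
    """Hypothetical per-atom data; bond_order_sum includes bonds to hydrogens."""
    symbol: str
    bond_order_sum: int
    lone_pairs: int
    radical_electrons: int


def atom_octet_deviation(atom, allow_expanded_octet=True):
    """Distance of one atom's valence electron count from its allowed targets."""
    valence = 2 * (atom.bond_order_sum + atom.lone_pairs) + atom.radical_electrons
    # Simplification: sulfur may also satisfy a dectet when expansion is allowed.
    if atom.symbol == "S" and allow_expanded_octet:
        targets = (8, 10)
    else:
        targets = (8,)
    return min(abs(target - valence) for target in targets)


def octet_deviation(atoms, allow_expanded_octet=True):
    """Sum the per-atom deviations over C, N, O and S atoms only."""
    return sum(
        atom_octet_deviation(atom, allow_expanded_octet)
        for atom in atoms
        if atom.symbol in {"C", "N", "O", "S"}
    )


# The two NO2 structures, [O]N=O and O=[N+.][O-], described atom by atom.
no2_neutral = [AtomRecord("O", 1, 2, 1), AtomRecord("N", 3, 1, 0), AtomRecord("O", 2, 2, 0)]
no2_charged = [AtomRecord("O", 2, 2, 0), AtomRecord("N", 3, 0, 1), AtomRecord("O", 1, 3, 0)]
print(octet_deviation(no2_neutral), octet_deviation(no2_charged))  # 1 1
```

Both NO2 structures come out with the same deviation in this model, so an octet criterion alone cannot separate them.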

rmgpy.molecule.filtration.get_octet_deviation_list(mol_list, allow_expanded_octet=True)

Returns a list of octet deviations for the respective list of Molecule objects.

rmgpy.molecule.filtration.mark_unreactive_structures(filtered_list, mol_list, save_order=False)

Mark selected structures in filtered_list with the Molecule.reactive flag set to False (it is True by default). Changes the filtered_list object, and does not return anything. If save_order is True, the atom order is reset after performing atom isomorphism.

rmgpy.molecule.filtration.octet_filtration(mol_list, octet_deviation_list)

Returns a filtered list based on the octet_deviation_list. Also computes and returns a charge_span_list. Filtering using the octet deviation criterion rules out most unrepresentative structures. However, since some charge-strained species are still kept (e.g., [NH]N=S=O <=> [NH+]#[N+][S-][O-]), a charge_span_list is also generated during the same loop to keep track of the charge spans. This is used for further filtering.
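The single-pass idea can be sketched as follows. This is an illustration only: the function names, the `(smiles, formal_charges)` tuples, the charge span formula, and the octet deviation values passed in are assumptions chosen for the example, not output from RMG.

```python
def charge_span(formal_charges):
    """Assumed definition: (sum of |q| - |net charge|) / 2."""
    return (sum(abs(q) for q in formal_charges) - abs(sum(formal_charges))) / 2


def octet_filtration_sketch(structures, octet_deviations):
    """Keep structures with the minimal octet deviation and record their charge spans.

    `structures` is a list of hypothetical (smiles, formal_charges) tuples,
    aligned index by index with `octet_deviations`.
    """
    min_deviation = min(octet_deviations)
    filtered, charge_spans = [], []
    # One loop does both jobs: filter on deviation, collect the charge span.
    for (smiles, charges), deviation in zip(structures, octet_deviations):
        if deviation == min_deviation:
            filtered.append(smiles)
            charge_spans.append(charge_span(charges))
    return filtered, charge_spans


structures = [
    ("[NH]N=S=O", (0, 0, 0, 0)),
    ("[NH+]#[N+][S-][O-]", (+1, +1, -1, -1)),
]
# Illustrative deviations: both structures are assumed to tie on the octet criterion.
kept, spans = octet_filtration_sketch(structures, [1, 1])
print(kept, spans)
```

Because the two structures tie on octet deviation, only the recorded charge spans (0 versus 2 in this sketch) allow a later step to tell them apart.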

rmgpy.molecule.filtration.stabilize_charges_by_electronegativity(mol_list, allow_empty_list=False)

Only keep structures that obey the electronegativity rule. If a structure must have charge separation, negative charges will be assigned to more electronegative atoms, and vice versa. If allow_empty_list is set to False (default), this function will not return an empty list. If it is set to True and all structures in mol_list violate the electronegativity heuristic, the original mol_list is returned (examples: [C-]#[O+], CS, [NH+]#[C-], [OH+]=[N-], [C-][S+]=C violate this heuristic).
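One simple way to express the electronegativity rule is to compare charge-weighted Pauling electronegativities of the positively and negatively charged atoms. The sketch below does that for overall neutral structures; the function names and the `(symbol, formal_charge)` representation are assumptions, the comparison itself is one possible reading of the rule rather than RMG's exact test, and the fallback controlled by allow_empty_list is not modeled.

```python
# Pauling electronegativities for the elements used in the examples.
PAULING = {"H": 2.20, "C": 2.55, "N": 3.04, "O": 3.44, "S": 2.58}


def obeys_electronegativity_rule(charged_atoms):
    """True if negative charge sits on the more electronegative atoms overall.

    `charged_atoms` is a hypothetical list of (symbol, formal_charge) pairs.
    """
    positive = sum(PAULING[sym] * q for sym, q in charged_atoms if q > 0)
    negative = sum(PAULING[sym] * -q for sym, q in charged_atoms if q < 0)
    # Violation: the positively charged atoms are the more electronegative ones.
    return positive <= negative


def keep_electronegativity_stabilized(structures):
    """Filter a {smiles: charged_atoms} mapping down to the obeying SMILES."""
    return [smi for smi, atoms in structures.items() if obeys_electronegativity_rule(atoms)]


examples = {
    "[NH2+]=C[O-]": [("N", +1), ("C", 0), ("O", -1)],  # negative charge on O: obeys
    "[C-]#[O+]": [("C", -1), ("O", +1)],               # positive charge on O: violates
    "[OH+]=[N-]": [("O", +1), ("N", -1)],              # positive charge on O: violates
}
print(keep_electronegativity_stabilized(examples))  # ['[NH2+]=C[O-]']
```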

rmgpy.molecule.filtration.stabilize_charges_by_proximity(mol_list)

Only keep structures that obey the charge proximity rule. Opposite charges will be as close as possible to one another, and like charges as far apart as possible.
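A toy version of the proximity rule can be written with a breadth-first search over the bond graph. Everything in the sketch below is an assumption made for illustration: the function names, the adjacency-dict graph, the `{atom_index: formal_charge}` maps, and the scoring (smallest total distance between opposite charges, ties broken by the largest total distance between like charges).

```python
from collections import deque
from itertools import combinations


def shortest_path_length(adjacency, start, goal):
    """Number of bonds on the shortest path between two atoms (BFS)."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    raise ValueError("atoms are not connected")


def proximity_score(adjacency, charges):
    """Lower is better: (opposite-charge distance, minus like-charge distance)."""
    opposite = like = 0
    for a, b in combinations(sorted(charges), 2):
        dist = shortest_path_length(adjacency, a, b)
        if charges[a] * charges[b] < 0:
            opposite += dist
        else:
            like += dist
    return (opposite, -like)


def keep_best_proximity(adjacency, charge_maps):
    """Keep the charge assignments that share the best proximity score."""
    scores = [proximity_score(adjacency, charges) for charges in charge_maps]
    best = min(scores)
    return [charges for charges, score in zip(charge_maps, scores) if score == best]


# Toy four-atom chain 0-1-2-3 with two hypothetical charge placements.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
adjacent_charges = {0: +1, 1: -1}
distant_charges = {0: +1, 3: -1}
print(keep_best_proximity(chain, [adjacent_charges, distant_charges]))  # [{0: 1, 1: -1}]
```

Only atoms with a nonzero formal charge need to appear in each charge map, which keeps the pairwise loop short.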