12.1. Introduction

This section describes some of the general characteristics of RMG’s databases.

12.1.1. Group Definitions

The main sections in many of RMG’s databases are the ‘group’ definitions. Groups are adjacency lists that describe the structure around the reacting atoms. If an atom is a reacting atom, a starred number is inserted between its index number and its atom type in the adjacency list.
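The line layout described above can be illustrated with a minimal parsing sketch. It assumes the whitespace-separated layout used by the adjacency lists in this section (index, optional starred label, atom type, then further tokens and `{neighbor,bond}` pairs). The names `GroupAtom` and `parse_group_line` are hypothetical and are not part of RMG.

```python
import re
from dataclasses import dataclass, field

# Matches simple bond tokens such as {2,S}; bond lists like {2,[S,D]} are
# not handled by this sketch and end up in `extras`.
BOND = re.compile(r"\{(\d+),([A-Za-z]+)\}")


@dataclass
class GroupAtom:
    index: int
    label: str                      # "*1" for a reacting atom, "" otherwise
    atom_type: str
    extras: list = field(default_factory=list)   # e.g. radical or lone-pair tokens
    bonds: dict = field(default_factory=dict)    # neighbor index -> bond symbol


def parse_group_line(line):
    """Parse one adjacency-list line into a GroupAtom (illustrative only)."""
    tokens = line.split()
    index = int(tokens[0])
    rest = tokens[1:]
    label = ""
    # The starred number, if present, sits between the index and the atom type.
    if rest and rest[0].startswith("*"):
        label = rest.pop(0)
    atom_type = rest.pop(0)
    atom = GroupAtom(index, label, atom_type)
    for token in rest:
        match = BOND.fullmatch(token)
        if match:
            atom.bonds[int(match.group(1))] = match.group(2)
        else:
            atom.extras.append(token)
    return atom


reacting = parse_group_line("2 *4 C 0 {1,S} {3,S} {7,S}")
spectator = parse_group_line("3    C 0 {2,S} {4,S}")
```

Here `reacting.label` holds the starred number, while `spectator.label` is empty because that atom does not take part in the reaction.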

Because groups typically do not describe entire molecules, atoms may appear to be lacking full valency. When this occurs, the omitted bonds are allowed to be anything. An example of a primary carbon group from H-Abstraction is shown below. The adjacency list defined on the left matches any of the three drawn structures on the right (the numbers correspond to the index from the adjacency list).

[Figure: Group.png, showing the adjacency list and the three matching structures]

Atom types describe atoms in group definitions. The table below shows all atom types in RMG.

| Atom Type | Chemical Element | Localized electronic structure |
|-----------|------------------|--------------------------------|
| R | Any | No requirements |
| R!H | Any except hydrogen | No requirements |
| H | Hydrogen | No requirements |
| C | Carbon | No requirements |
| Ca | Carbon | Atomic carbon with two lone pairs and no bonds |
| Cs | Carbon | Up to four single bonds |
| Csc | Carbon | Up to three single bonds, charged +1 |
| Cd | Carbon | One double bond (to any atom other than O or S), up to two single bonds |
| Cdc | Carbon | One double bond, up to one single bond, charged +1 |
| CO | Carbon | One double bond to an oxygen atom, up to two single bonds |
| CS | Carbon | One double bond to a sulfur atom, up to two single bonds |
| Cdd | Carbon | Two double bonds |
| Ct | Carbon | One triple bond, up to one single bond |
| Cb | Carbon | Two benzene bonds, up to one single bond |
| Cbf | Carbon | Three benzene bonds (fused aromatics) |
| C2s | Carbon | One lone pair, up to two single bonds |
| C2sc | Carbon | One lone pair, up to three single bonds, charged -1 |
| C2d | Carbon | One lone pair, one double bond |
| C2dc | Carbon | One lone pair, one double bond, up to one single bond, charged -1 |
| C2tc | Carbon | One lone pair, one triple bond, charged -1 |
| N | Nitrogen | No requirements |
| N0sc | Nitrogen | Three lone pairs, up to one single bond, charged -2 |
| N1s | Nitrogen | Two lone pairs, up to one single bond |
| N1sc | Nitrogen | Two lone pairs, up to two single bonds, charged -1 |
| N1dc | Nitrogen | Two lone pairs, one double bond, charged -1 |
| N3s | Nitrogen | One lone pair, up to three single bonds |
| N3d | Nitrogen | One lone pair, one double bond, up to one single bond |
| N3t | Nitrogen | One lone pair, one triple bond |
| N3b | Nitrogen | One lone pair, two aromatic bonds |
| N5sc | Nitrogen | No lone pairs, up to four single bonds, charged +1 |
| N5dc | Nitrogen | No lone pairs, one double bond, up to two single bonds, charged +1 |
| N5ddc | Nitrogen | No lone pairs, two double bonds, charged +1 |
| N5dddc | Nitrogen | No lone pairs, three double bonds, charged -1 |
| N5tc | Nitrogen | No lone pairs, one triple bond, up to one single bond, charged +1 |
| N5b | Nitrogen | No lone pairs, two aromatic bonds, up to one single bond |
| O | Oxygen | No requirements |
| Oa | Oxygen | Atomic oxygen with three lone pairs and no bonds |
| O0sc | Oxygen | Three lone pairs, up to one single bond, charged -1 |
| O0dc | Oxygen | Three lone pairs, one double bond, charged -2 |
| O2s | Oxygen | Two lone pairs, up to two single bonds |
| O2sc | Oxygen | Two lone pairs, up to one single bond, charged +1 |
| O2d | Oxygen | Two lone pairs, one double bond |
| O4sc | Oxygen | One lone pair, up to three single bonds, charged +1 |
| O4dc | Oxygen | One lone pair, one double bond, up to one single bond, charged +1 |
| O4tc | Oxygen | One lone pair, one triple bond, charged +1 |
| Si | Silicon | No requirements |
| Sis | Silicon | Up to four single bonds |
| Sid | Silicon | One double bond (not to O), up to two single bonds |
| SiO | Silicon | One double bond to an oxygen atom, up to two single bonds |
| Sidd | Silicon | Two double bonds |
| Sit | Silicon | One triple bond, up to one single bond |
| Sib | Silicon | Two benzene bonds, up to one single bond |
| Sibf | Silicon | Three benzene bonds (fused aromatics) |
| P | Phosphorus | No requirements |
| P0sc | Phosphorus | Three lone pairs, up to one single bond, charged -2 |
| P1s | Phosphorus | Two lone pairs, up to one single bond |
| P1sc | Phosphorus | Two lone pairs, up to two single bonds, charged -1 |
| P1dc | Phosphorus | Two lone pairs, one double bond, charged -1 |
| P3s | Phosphorus | One lone pair, up to three single bonds |
| P3d | Phosphorus | One lone pair, one double bond, up to one single bond |
| P3t | Phosphorus | One lone pair, one triple bond |
| P3b | Phosphorus | One lone pair, two aromatic bonds |
| P5s | Phosphorus | No lone pairs, up to five single bonds |
| P5sc | Phosphorus | No lone pairs, up to six single bonds, charged -1/+1/+2 |
| P5d | Phosphorus | No lone pairs, one double bond, up to three single bonds |
| P5dd | Phosphorus | No lone pairs, two double bonds, up to one single bond |
| P5dc | Phosphorus | No lone pairs, one double bond, up to two single bonds, charged +1 |
| P5ddc | Phosphorus | No lone pairs, two double bonds, charged +1 |
| P5t | Phosphorus | No lone pairs, one triple bond, up to two single bonds |
| P5td | Phosphorus | No lone pairs, one triple bond, one double bond |
| P5tc | Phosphorus | No lone pairs, one triple bond, up to one single bond, charged +1 |
| P5b | Phosphorus | No lone pairs, two aromatic bonds, up to one single bond, charged 0/+1 |
| P5bd | Phosphorus | No lone pairs, two aromatic bonds, one double bond |
| S | Sulfur | No requirements |
| Sa | Sulfur | Atomic sulfur with three lone pairs and no bonds |
| S0sc | Sulfur | Three lone pairs, up to one single bond, charged -1 |
| S2s | Sulfur | Two lone pairs, up to two single bonds |
| S2sc | Sulfur | Two lone pairs, up to three single bonds, charged -1/+1 |
| S2d | Sulfur | Two lone pairs, one double bond |
| S2dc | Sulfur | Two lone pairs, one to two double bonds, up to one single bond, charged -1 |
| S2tc | Sulfur | Two lone pairs, one triple bond, charged -1 |
| S4s | Sulfur | One lone pair, up to four single bonds |
| S4sc | Sulfur | One lone pair, up to five single bonds, charged -1/+1 |
| S4d | Sulfur | One lone pair, one double bond, up to two single bonds |
| S4dd | Sulfur | One lone pair, two double bonds |
| S4dc | Sulfur | One lone pair, one to three double bonds, up to three single bonds, charged -1/+1 |
| S4b | Sulfur | One lone pair, two aromatic bonds |
| S4t | Sulfur | One lone pair, one triple bond, up to one single bond |
| S4tdc | Sulfur | One lone pair, one to two triple bonds, up to two double bonds, up to two single bonds, charged -1/+1 |
| S6s | Sulfur | No lone pairs, up to six single bonds |
| S6sc | Sulfur | No lone pairs, up to seven single bonds, charged -1/+1/+2 |
| S6d | Sulfur | No lone pairs, one double bond, up to four single bonds |
| S6dd | Sulfur | No lone pairs, two double bonds, up to two single bonds |
| S6ddd | Sulfur | No lone pairs, up to three double bonds |
| S6dc | Sulfur | No lone pairs, one to three double bonds, up to five single bonds, charged -1/+1/+2 |
| S6t | Sulfur | No lone pairs, one triple bond, up to three single bonds |
| S6td | Sulfur | No lone pairs, one triple bond, one double bond, up to one single bond |
| S6tt | Sulfur | No lone pairs, two triple bonds |
| S6tdc | Sulfur | No lone pairs, one to two triple bonds, up to two double bonds, up to four single bonds, charged -1/+1 |
| Cl | Chlorine | No requirements |
| Cl1s | Chlorine | Three lone pairs, zero to one single bonds |
| Br | Bromine | No requirements |
| Br1s | Bromine | Three lone pairs, zero to one single bonds |
| I | Iodine | No requirements |
| I1s | Iodine | Three lone pairs, zero to one single bonds |
| F | Fluorine | No requirements |
| F1s | Fluorine | Three lone pairs, zero to one single bonds |
| He | Helium | No requirements, nonreactive |
| Ne | Neon | No requirements, nonreactive |
| Ar | Argon | No requirements, nonreactive |
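To make the table easier to read, the sketch below encodes a few of its rows as data and checks which atom types are compatible with a given local structure. It is a simplified illustration, not RMG's atom-type implementation. The names `AtomTypeRule`, `RULES` and `matching_types` are hypothetical. Treating rows that do not mention a charge as neutral, and rows that do not mention lone pairs as unconstrained, are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AtomTypeRule:
    element: str
    max_single: int                    # "up to N single bonds"
    double: int                        # exact number of double bonds
    triple: int                        # exact number of triple bonds
    lone_pairs: Optional[int] = None   # None: the table row states no requirement
    charge: int = 0                    # assumption: unstated charge means neutral


# A handful of rows transcribed from the table; the general types
# (R, R!H, C, O, ...) and most specific types are left out for brevity.
RULES = {
    "Cs": AtomTypeRule("C", max_single=4, double=0, triple=0),
    "Cdd": AtomTypeRule("C", max_single=0, double=2, triple=0),
    "Ct": AtomTypeRule("C", max_single=1, double=0, triple=1),
    "N3s": AtomTypeRule("N", max_single=3, double=0, triple=0, lone_pairs=1),
    "N3d": AtomTypeRule("N", max_single=1, double=1, triple=0, lone_pairs=1),
    "O2s": AtomTypeRule("O", max_single=2, double=0, triple=0, lone_pairs=2),
    "O2d": AtomTypeRule("O", max_single=0, double=1, triple=0, lone_pairs=2),
}


def matching_types(element, single, double, triple, lone_pairs, charge=0):
    """Return the names of the encoded atom types compatible with the inputs."""
    return sorted(
        name
        for name, rule in RULES.items()
        if rule.element == element
        and single <= rule.max_single
        and double == rule.double
        and triple == rule.triple
        and (rule.lone_pairs is None or rule.lone_pairs == lone_pairs)
        and rule.charge == charge
    )


carbonyl_oxygen = matching_types("O", single=0, double=1, triple=0, lone_pairs=2)
partial_carbon = matching_types("C", single=2, double=0, triple=0, lone_pairs=0)
```

The second call describes a carbon with only two single bonds listed. It still matches `Cs`, because "up to four single bonds" leaves the remaining bonds unspecified.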

Groups can also be defined as unions of other groups. For example:

label = "X_H_or_Xrad_H",
group = "OR{X_H, Xrad_H}",
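A union definition only names its member groups, so reading it amounts to extracting the labels between the braces. The helper below is a minimal sketch under the assumption that the union is available as text of the form `OR{label, label, ...}`. The function `parse_union` is hypothetical and is not an RMG function.

```python
import re


def parse_union(group_text):
    """Return the member labels of an OR{...} union, or None if the text is
    not a union (for example, an ordinary adjacency list)."""
    match = re.fullmatch(r"\s*OR\s*\{(.*)\}\s*", group_text, flags=re.DOTALL)
    if match is None:
        return None
    # Split on commas and drop surrounding whitespace and empty pieces.
    return [label.strip() for label in match.group(1).split(",") if label.strip()]


members = parse_union("OR{X_H, Xrad_H}")
not_a_union = parse_union("1 *1 C 0 {2,S}")
```

Each returned label would then be resolved to the group of the same name defined elsewhere in the file.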

12.1.2. Forbidden Groups

Forbidden groups can be defined to ban structures globally in RMG or to ban pathways in a specific kinetic family.

Globally forbidden structures ban every reaction in which a reactant or product matches a forbidden structure. These groups are stored in the file located at RMG-database/input/forbiddenStructures.py.

To ban certain specific pathways in the kinetics families, a forbidden group must be created, like the following group in the intra_H_migration family:

forbidden(
    label = "bridged56_1254",
    group =
"""
1 *1 C 1 {2,S} {6,S}
2 *4 C 0 {1,S} {3,S} {7,S}
3    C 0 {2,S} {4,S}
4 *2 C 0 {3,S} {5,S} {8,S}
5 *5 C 0 {4,S} {6,S} {7,S}
6    C 0 {1,S} {5,S}
7    C 0 {2,S} {5,S}
8 *3 H 0 {4,S}
""",
    shortDesc = u"""""",
    longDesc =
u"""

""",
)

Forbidden groups should be placed inside the groups.py file located inside the specific kinetics family’s folder RMG-database/input/kinetics/family_name/ alongside normal group entries. The starred atoms in the forbidden group ban the specified reaction recipe from occurring in matched products and reactants.

In addition to forbidding groups, there is the option of forbidding specific molecules or species in RMG-database/input/forbiddenStructures.py. Forbidding a molecule will prevent that exact structure from being generated, while forbidding a species will prevent any of its resonance structures from being generated. Note that specific molecules or species can only be forbidden globally and should not be specified in the groups.py file. To specify a forbidden molecule or species, simply replace the group keyword with molecule or species:

# This forbids a molecule
forbidden(
    label = "C_quintet",
    molecule =
"""
multiplicity 5
1 C u4 p0
""",
    shortDesc = u"""""",
    longDesc =
u"""

""",
)

# This forbids a species
forbidden(
    label = "C_quintet",
    species =
"""
multiplicity 5
1 C u4 p0
""",
    shortDesc = u"""""",
    longDesc =
u"""

""",
)
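The entries above are ordinary Python calls, each with exactly one of the keywords group, molecule or species. The sketch below shows one way such entries could be collected and checked. It assumes the file is executed with a `forbidden` callable in scope. The collector `make_forbidden_collector` is a hypothetical stand-in written for this example, not RMG's loader.

```python
def make_forbidden_collector():
    """Return a stand-in `forbidden` function and the list it appends to."""
    entries = []

    def forbidden(label, group=None, molecule=None, species=None,
                  shortDesc="", longDesc=""):
        given = {
            kind: text
            for kind, text in (("group", group), ("molecule", molecule),
                               ("species", species))
            if text is not None
        }
        # Exactly one structure keyword is expected per entry.
        if len(given) != 1:
            raise ValueError(
                f"{label}: give exactly one of group, molecule or species")
        kind, adjlist = next(iter(given.items()))
        entries.append({"label": label, "kind": kind, "adjlist": adjlist.strip()})

    return forbidden, entries


# The molecule entry from the text, held as a string for this illustration.
SOURCE = '''
forbidden(
    label = "C_quintet",
    molecule =
"""
multiplicity 5
1 C u4 p0
""",
    shortDesc = u"""""",
    longDesc =
u"""

""",
)
'''

forbidden, entries = make_forbidden_collector()
exec(SOURCE, {"forbidden": forbidden})
```

After execution, `entries` holds one record whose `kind` is `"molecule"`. Changing the keyword in the entry to `species` would change only that field.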

12.1.3. Hierarchical Trees

Groups are ordered into the nodes of hierarchical trees, which are written at the end of groups.py. The root node of each tree is the most general group with the reacting atoms required for the family. Descending from the root node are more specific groups. Each child node matches a subset of the structures matched by the parent node above it.
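The parent and child relationship can be sketched with a plain dictionary that maps each node to its parent. The tree below is a hypothetical, simplified fragment chosen for illustration. Its labels and shape are assumptions and do not reproduce the actual H-abstraction tree. The helper `ancestors` is likewise not an RMG function.

```python
# child label -> parent label; the root has no parent.
# Hypothetical fragment: X_H is the most general group, C_pri the most specific.
PARENTS = {
    "X_H": None,
    "H2": "X_H",
    "C_H": "X_H",
    "C_pri": "C_H",
    "C_sec": "C_H",
}


def ancestors(label, parents):
    """Return the chain of increasingly general groups above `label`,
    ending at the root of the tree."""
    chain = []
    parent = parents[label]
    while parent is not None:
        chain.append(parent)
        parent = parents[parent]
    return chain


path_to_root = ancestors("C_pri", PARENTS)
```

Because every child is a subset of its parent, any structure matched by `C_pri` in this fragment is also matched by each group in `path_to_root`.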

A simplified example of the trees for H-abstraction is shown below. The indented text shows the syntax in groups.py and a schematic is given underneath.

[Figure: Trees.png, showing the groups.py tree syntax and a schematic of the trees]

Individual groups only describe part of the reaction. To describe an entire reaction we need one group from each tree; such a combination is called a node template, or simply a template. (C_pri, O_pri_rad), (H2, O_sec_rad), and (X_H, Y_rad) are all valid examples of templates. Templates can be filled in with kinetic parameters from the training set or rules.
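Since a template takes one group from each tree, the set of possible templates is the Cartesian product of the trees' nodes. The sketch below illustrates this with the group labels named above. Assigning those labels to two flat lists is an assumption made for the example, and the helper names are hypothetical.

```python
from itertools import product

# Assumed membership of the two trees, using only labels mentioned in the text.
TREE_X = ["X_H", "H2", "C_pri"]
TREE_Y = ["Y_rad", "O_pri_rad", "O_sec_rad"]


def all_templates(*trees):
    """Every combination that takes exactly one group from each tree."""
    return list(product(*trees))


def is_template(candidate, *trees):
    """True if `candidate` has one label per tree, each drawn from that tree."""
    return len(candidate) == len(trees) and all(
        label in tree for label, tree in zip(candidate, trees)
    )


templates = all_templates(TREE_X, TREE_Y)
valid = is_template(("H2", "O_sec_rad"), TREE_X, TREE_Y)
invalid = is_template(("H2", "C_pri"), TREE_X, TREE_Y)
```

The pair `("H2", "C_pri")` is rejected because both labels come from the same tree, so it does not describe the second reactant.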