rdtools.resonance.filtration

Module for filtering resonance structures.

This module contains functions for filtering a list of Molecules representing a single Species, keeping only the representative structures. It is used to filter out resonance (mesomeric) structures that contribute negligibly.

The rules this module follows are (by order of importance):

  1. Minimum overall deviation from the octet rule (extended to a dectet for sulfur, as a third-row element)

  2. Additional charge separation is only allowed for radicals if it makes a new radical site in the species

  3. If a structure must have charge separation, negative charges will be assigned to more electronegative atoms, whereas positive charges will be assigned to less electronegative atoms (charge stabilization)

  4. Opposite charges will be as close as possible to one another, and vice versa (charge stabilization)

(inspired by http://web.archive.org/web/20140310074727/http://www.chem.ucla.edu/~harding/tutorials/resonance/imp_res_str.html which is quite like http://www.chem.ucla.edu/~harding/IGOC/R/resonance_contributor_preference_rules.html)

rdtools.resonance.filtration.aromaticity_filtration(mol_list: list[Mol], is_polycyclic_aromatic: bool = False) -> list[Mol]

Filter molecules by heuristics.

For monocyclic aromatics, Kekule structures are removed, with the assumption that an equivalent aromatic structure exists. Non-aromatic structures are maintained if they present new radical sites. Instead of explicitly checking the radical sites, we only check for the SDSDSD bond motif since radical delocalization will disrupt that pattern.

For polycyclic aromatics, structures without any benzene bonds are removed. The idea is that radical delocalization into the aromatic pi system is unfavorable because it disrupts aromaticity. Therefore, structures in which the radical is delocalized so far into the molecule that none of the rings remain aromatic are not representative. While this isn’t strictly true, it helps reduce the number of representative structures by focusing on the most important ones.
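The SDSDSD check described above can be illustrated without any cheminformatics library. The following is only a minimal sketch: the function name `has_sdsdsd_motif` and the input format (a list of bond orders walked around one six-membered ring) are assumptions made for illustration, not part of rdtools.

```python
def has_sdsdsd_motif(ring_bond_orders: list[int]) -> bool:
    """Return True if a six-membered ring strictly alternates single/double bonds.

    ring_bond_orders is a hypothetical input: the bond orders met when
    walking once around the ring (1 = single, 2 = double).
    """
    if len(ring_bond_orders) != 6:
        return False
    # Only single and double bonds may appear, and every bond must differ
    # from its successor (the ring is cyclic, so index 5 wraps to index 0).
    return set(ring_bond_orders) == {1, 2} and all(
        ring_bond_orders[i] != ring_bond_orders[(i + 1) % 6] for i in range(6)
    )


kekule_ring = [1, 2, 1, 2, 1, 2]     # Kekule-type alternation
disrupted_ring = [1, 1, 2, 1, 2, 1]  # pattern broken, e.g. by a delocalized radical

print(has_sdsdsd_motif(kekule_ring), has_sdsdsd_motif(disrupted_ring))
```

Checking the bond pattern is cheaper than comparing radical positions across structures, which is the motivation given above for using the motif as a proxy.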

Parameters:
  • mol_list (list[Chem.Mol]) – The list of molecules to filter.

  • is_polycyclic_aromatic (bool, optional) – Whether the species is polycyclic aromatic. Default is False.

Returns:

list[Chem.Mol] – The filtered list of molecules.

rdtools.resonance.filtration.charge_filtration(mol_list: list[Mol]) -> list[Mol]

Filter structures based on charge span, electronegativity, and proximity.

If structures with an additional charge layer introduce new reactive sites (i.e., radicals or multiple bonds) they will also be considered. For example:

  • Both of NO2’s resonance structures will be kept: [O]N=O <=> O=[N+.][O-]

  • NCO will only have two resonance structures, [N.]=C=O <=> N#C[O.], and will lose the third structure, which has the same octet deviation and a charge separation, but whose radical site has already been considered: [N+.]#C[O-]

  • CH2NO keeps all three structures, since a new radical site is introduced: [CH2.]N=O <=> C=N[O.] <=> C=[N+.][O-]

  • NH2CHO has two structures, one of which is charged since it introduces a multiple bond: NC=O <=> [NH2+]=C[O-]

However, if the species is not a radical, or multiple bonds do not alter, we only keep the structures with the minimal charge span. For example:

  • NSH will only keep the N#S form and not [N-]=[SH+]

  • The following species will lose two thirds of its resonance structures, which are charged: CS(=O)SC <=> CS(=O)#SC <=> C[S+]([O-])SC <=> CS([O-])=[S+]C <=> C[S+]([O-])#SC <=> C[S+](=O)=[S-]C

  • Azide is known to have three resonance structures: [NH-][N+]#N <=> N=[N+]=[N-] <=> [NH+]#[N+][N-2];
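The keep-or-drop decisions in the examples above can be mimicked with a small, library-free sketch. Everything here is an assumption made for illustration: the `Structure` record, the `sketch_charge_filtration` name, and the simplification that a structure with one extra charge layer is kept only if it adds a radical site or a multiple-bond position not seen among the minimal-charge-span structures. The actual function may apply further checks (electronegativity and proximity).

```python
from dataclasses import dataclass


@dataclass
class Structure:
    """Hypothetical record of the features this sketch needs."""
    smiles: str
    charge_span: int
    radical_sites: frozenset = frozenset()   # atom indices carrying a radical
    multiple_bonds: frozenset = frozenset()  # (i, j) atom-index pairs


def sketch_charge_filtration(structures: list[Structure]) -> list[Structure]:
    min_span = min(s.charge_span for s in structures)
    kept = [s for s in structures if s.charge_span == min_span]
    # Sites already covered by the minimal-charge-span structures.
    seen_radicals = set().union(*(s.radical_sites for s in kept))
    seen_bonds = set().union(*(s.multiple_bonds for s in kept))
    for s in structures:
        if s.charge_span == min_span + 1:
            # Keep a more charged structure only if it adds a new site.
            if s.radical_sites - seen_radicals or s.multiple_bonds - seen_bonds:
                kept.append(s)
    return kept


# NCO, atom indices N=0, C=1, O=2
nco = [
    Structure("[N.]=C=O", 0, frozenset({0}), frozenset({(0, 1), (1, 2)})),
    Structure("N#C[O.]", 0, frozenset({2}), frozenset({(0, 1)})),
    Structure("[N+.]#C[O-]", 1, frozenset({0}), frozenset({(0, 1)})),
]
# NH2CHO, heavy-atom indices N=0, C=1, O=2
formamide = [
    Structure("NC=O", 0, frozenset(), frozenset({(1, 2)})),
    Structure("[NH2+]=C[O-]", 1, frozenset(), frozenset({(0, 1)})),
]

print([s.smiles for s in sketch_charge_filtration(nco)])
print([s.smiles for s in sketch_charge_filtration(formamide)])
```

Under these assumptions the charged NCO structure is dropped because its radical site and multiple bond are already covered, while the charged NH2CHO structure survives because it introduces a new N=C bond.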

Parameters:

mol_list (list[Chem.Mol]) – The list of molecules to filter.

Returns:

list[Chem.Mol] – The filtered list of molecules.

rdtools.resonance.filtration.filter_structures(mol_list: list[Mol], allow_expanded_octet: bool = True, features: dict[str, bool] | None = None, **kwargs: Any) -> list[Mol]

Filter a list of molecules to keep only the representative structures.

This function filters out structures by minimizing the number of C/N/O/S atoms without a full octet and by removing non-preferred charge separations and non-preferred aromatic structures.

Parameters:
  • mol_list (list[Chem.Mol]) – The list of molecules to filter.

  • allow_expanded_octet (bool, optional) – Whether to allow expanded octets for third row elements. Default is True.

  • features (Optional[dict[str, bool]], optional) – A dictionary of features of the species. Default is None.

  • **kwargs (Any) – Additional keyword arguments. They are ignored, but included for compatibility.

Returns:

list[Chem.Mol] – The filtered list of molecules.

Raises:

RuntimeError – If no representative structures are found.
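As a rough picture of how such a filter chain behaves, the sketch below applies a sequence of filters in order and raises RuntimeError when nothing survives. The helper names `run_filters` and `keep_minimal`, and the tuple records used as stand-ins for molecules, are hypothetical; the real function operates on Mol objects and its individual filters are more involved than a plain minimum.

```python
from typing import Callable, Sequence


def keep_minimal(key: Callable) -> Callable[[list], list]:
    """Build a filter that keeps only the items with the smallest key value."""
    def _filter(items: list) -> list:
        if not items:
            return items
        best = min(key(item) for item in items)
        return [item for item in items if key(item) == best]
    return _filter


def run_filters(items: list, filters: Sequence[Callable[[list], list]]) -> list:
    """Apply each filter in order; fail loudly if no item survives."""
    for apply_filter in filters:
        items = apply_filter(items)
    if not items:
        raise RuntimeError("No representative structures were found.")
    return items


# Hypothetical records: (label, octet_deviation, charge_span)
candidates = [("A", 1.0, 0), ("B", 0.0, 1), ("C", 0.0, 0)]
result = run_filters(
    candidates,
    [keep_minimal(lambda s: s[1]), keep_minimal(lambda s: s[2])],
)
print(result)
```

Ordering the filters by importance means a later criterion only breaks ties left by an earlier one.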

rdtools.resonance.filtration.get_charge_distance(mol: Mol) -> tuple[int, int]

Get the cumulative charge distance for like-charge pairs and for opposite-charge pairs.

Parameters:

mol (Chem.Mol) – The molecule to check.

Returns:

tuple[int, int] – The cumulative charge distance for like-charge pairs and for opposite-charge pairs, respectively.
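One way to picture this quantity is to sum shortest-path lengths between every pair of charged atoms, split by whether the two charges share a sign. The sketch below does this on a plain adjacency mapping. The name `sketch_charge_distance`, the input format, and the choice to measure distance in bonds are all assumptions; the library may count path length differently.

```python
from collections import deque
from itertools import combinations


def shortest_path_length(adjacency: dict[int, list[int]], start: int, goal: int) -> int:
    """Breadth-first search returning the number of bonds between two atoms."""
    queue = deque([(start, 0)])
    visited = {start}
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for neighbor in adjacency[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, dist + 1))
    raise ValueError("atoms are not connected")


def sketch_charge_distance(charges: list[int], adjacency: dict[int, list[int]]) -> tuple[int, int]:
    """Return (like-charge distance sum, opposite-charge distance sum)."""
    charged = [i for i, q in enumerate(charges) if q != 0]
    similar = opposite = 0
    for a, b in combinations(charged, 2):
        dist = shortest_path_length(adjacency, a, b)
        if charges[a] * charges[b] > 0:
            similar += dist
        else:
            opposite += dist
    return similar, opposite


# [NH+]#[N+][N-2] as a three-atom chain with formal charges +1, +1, -2
azide_charges = [1, 1, -2]
azide_bonds = {0: [1], 1: [0, 2], 2: [1]}
print(sketch_charge_distance(azide_charges, azide_bonds))
```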

rdtools.resonance.filtration.get_charge_span_list(mol_list: list[Mol]) -> list[float]

Get the list of charge spans for a list of molecules.

This is also calculated in the octet_filtration() function along with the octet filtration process.

Parameters:

mol_list (list[Chem.Mol]) – The list of molecules to get the charge spans for.

Returns:

list[float] – The charge spans for the molecules in mol_list.
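For orientation, a charge span can be computed from formal charges alone. The sketch below assumes a commonly used definition, half of the summed absolute formal charges after subtracting the absolute net charge, which counts the number of charge separations. That definition and the function names are assumptions; this page does not spell out the formula the library uses.

```python
def sketch_charge_span(formal_charges: list[int]) -> float:
    """Number of charge separations, under the assumed definition."""
    total_abs = sum(abs(q) for q in formal_charges)
    net_abs = abs(sum(formal_charges))
    return (total_abs - net_abs) / 2


def sketch_charge_span_list(charge_lists: list[list[int]]) -> list[float]:
    return [sketch_charge_span(charges) for charges in charge_lists]


# Formal charges for the three azide structures quoted earlier:
# [NH-][N+]#N, N=[N+]=[N-], [NH+]#[N+][N-2]
azide_spans = sketch_charge_span_list([[-1, 1, 0], [0, 1, -1], [1, 1, -2]])
print(azide_spans)
```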

rdtools.resonance.filtration.get_octet_deviation(mol: Mol, allow_expanded_octet: bool = True) -> float

Returns the octet deviation for a molecule.

Parameters:
  • mol (Chem.Mol) – The molecule to get the octet deviation for.

  • allow_expanded_octet (bool, optional) – Whether to allow expanded octets for third row elements. If True, the function also considers a dectet for third row elements. Default is True.

Returns:

float – The octet deviation for the molecule.
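A simplified electron count conveys the idea. In the sketch below each atom's valence-shell electrons are taken as twice its total bond order plus its lone-pair and radical electrons, and the deviation is the distance to 8 (or to the nearer of 8 and 10 for sulfur when expanded octets are allowed). The `AtomRecord` type, the function name, and this per-atom formula are assumptions for illustration; the library's element-specific rules may differ.

```python
from dataclasses import dataclass


@dataclass
class AtomRecord:
    """Hypothetical per-atom summary used by this sketch."""
    symbol: str
    bond_order_sum: int        # includes bonds to hydrogens
    lone_pairs: int
    radical_electrons: int = 0


def sketch_octet_deviation(atoms: list[AtomRecord], allow_expanded_octet: bool = True) -> float:
    total = 0.0
    for atom in atoms:
        if atom.symbol not in {"C", "N", "O", "S"}:
            continue  # only C/N/O/S atoms are scored in this sketch
        electrons = 2 * atom.bond_order_sum + 2 * atom.lone_pairs + atom.radical_electrons
        # Sulfur may also target a dectet when expanded octets are allowed.
        targets = (8, 10) if atom.symbol == "S" and allow_expanded_octet else (8,)
        total += min(abs(electrons - target) for target in targets)
    return total


# NO2 as [O]N=O: radical O, N with one lone pair, doubly bonded O
no2_neutral = [AtomRecord("O", 1, 2, 1), AtomRecord("N", 3, 1), AtomRecord("O", 2, 2)]
# NO2 as O=[N+.][O-]: doubly bonded O, radical N+, O- with three lone pairs
no2_charged = [AtomRecord("O", 2, 2), AtomRecord("N", 3, 0, 1), AtomRecord("O", 1, 3)]

print(sketch_octet_deviation(no2_neutral), sketch_octet_deviation(no2_charged))
```

Both NO2 structures score the same under this count, so an octet criterion alone would not separate them.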

rdtools.resonance.filtration.get_octet_deviation_list(mol_list: list[Mol], allow_expanded_octet: bool = True) -> list[float]

Get the octet deviations for a list of molecules.

Parameters:
  • mol_list (list[Chem.Mol]) – The list of molecules to get the octet deviations for.

  • allow_expanded_octet (bool, optional) – Whether to allow expanded octets for third row elements. Default is True.

Returns:

list[float] – The octet deviations for the molecules in mol_list.

rdtools.resonance.filtration.has_unique_sites(mol: Mol, rad_idxs: set[int], mul_bond_idxs: set[tuple[int, int]]) -> bool

Check if a resonance structure has unique sites.

Check if a resonance structure has unique radical and multiple bond sites that are not present in other structures.

Parameters:
  • mol (Chem.Mol) – The molecule to check.

  • rad_idxs (set[int]) – The set of radical sites in the other structures.

  • mul_bond_idxs (set[tuple[int, int]]) – The set of multiple bond sites in the other structures.

Returns:

bool – True if the structure has unique radical and multiple bond sites, False otherwise.

rdtools.resonance.filtration.multiplicity_filtration(mol_list: list[Mol], ref_idx: int = 0) -> list[Mol]

Filter a list of molecules based on their multiplicity.

Returns a filtered list based on the multiplicity of the species. The multiplicity is determined by the number of radical electrons, and only structures with the same multiplicity as the reference structure (the first by default) are kept.

Parameters:
  • mol_list (list[Chem.Mol]) – The list of molecules to filter. Can be either RDKit Mol or RDMC RDKitMol.

  • ref_idx (int, optional) – The index of the reference molecule in mol_list. Default is 0.

Returns:

list[Chem.Mol] – The filtered list of molecules.
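The selection logic can be sketched on radical-electron counts alone. The example assumes the multiplicity is the number of radical electrons plus one. That convention, the function name, and the use of indices in place of molecules are assumptions for illustration.

```python
def sketch_multiplicity_filtration(radical_counts: list[int], ref_idx: int = 0) -> list[int]:
    """Return indices of structures whose multiplicity matches the reference."""
    ref_multiplicity = radical_counts[ref_idx] + 1  # assumed: n_radical + 1
    return [
        idx for idx, n_rad in enumerate(radical_counts)
        if n_rad + 1 == ref_multiplicity
    ]


# Three hypothetical structures carrying 1, 1 and 3 radical electrons
counts = [1, 1, 3]
print(sketch_multiplicity_filtration(counts))
print(sketch_multiplicity_filtration(counts, ref_idx=2))
```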

rdtools.resonance.filtration.octet_filtration(mol_list: list[Mol], allow_expanded_octet: bool = True) -> list[Mol]

Filter out unrepresentative molecules by the octet deviation criterion.

Parameters:
  • mol_list (list[Chem.Mol]) – The list of molecules to filter.

  • allow_expanded_octet (bool, optional) – Whether to allow expanded octets for third row elements. Default is True.

Returns:

list[Chem.Mol] – The filtered list of molecules.

rdtools.resonance.filtration.stabilize_charges_by_electronegativity(mol_list: list[Mol], allow_empty_list: bool = False) -> list[Mol]

Only keep structures that obey the electronegativity rule.

If a structure must have charge separation, negative charges will be assigned to more electronegative atoms, and vice versa.

Parameters:
  • mol_list (list[Chem.Mol]) – The list of molecules to filter.

  • allow_empty_list (bool, optional) – Whether to allow an empty list to be returned. Default is False. If allow_empty_list is set to False, and all structures in mol_list violate the electronegativity heuristic, this function will return the original mol_list. (examples: [C-]#[O+], CS, [NH+]#[C-], [OH+]=[N-], [C-][S+]=C violate this heuristic).

Returns:

list[Chem.Mol] – The filtered list of molecules.
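The rule and the fallback described for allow_empty_list can be sketched with Pauling electronegativities. The scoring used here is an assumption about how the heuristic could be expressed, not the documented implementation: sum the electronegativity-weighted positive charge and negative charge, and accept a structure when the positive sum does not exceed the negative one. The function names and the input format are hypothetical.

```python
# Pauling electronegativities for the elements used in the examples.
PAULING = {"H": 2.20, "C": 2.55, "N": 3.04, "O": 3.44, "S": 2.58}


def obeys_electronegativity_rule(charged_atoms: list[tuple[str, int]]) -> bool:
    """charged_atoms holds (element symbol, formal charge) pairs."""
    x_positive = sum(PAULING[sym] * q for sym, q in charged_atoms if q > 0)
    x_negative = sum(PAULING[sym] * -q for sym, q in charged_atoms if q < 0)
    return x_positive <= x_negative


def sketch_stabilize_by_electronegativity(
    structures: dict[str, list[tuple[str, int]]], allow_empty_list: bool = False
) -> list[str]:
    kept = [name for name, atoms in structures.items() if obeys_electronegativity_rule(atoms)]
    if not kept and not allow_empty_list:
        # Every structure violates the heuristic: fall back to the input.
        return list(structures)
    return kept


carbon_monoxide = {"[C-]#[O+]": [("C", -1), ("O", 1)]}
formamide = {"NC=O": [], "[NH2+]=C[O-]": [("N", 1), ("O", -1)]}

print(sketch_stabilize_by_electronegativity(carbon_monoxide))
print(sketch_stabilize_by_electronegativity(carbon_monoxide, allow_empty_list=True))
print(sketch_stabilize_by_electronegativity(formamide))
```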

rdtools.resonance.filtration.stabilize_charges_by_proximity(mol_list: list[Mol]) -> list[Mol]

Only keep structures that obey the charge proximity rule.

Opposite charges will be as close as possible to one another, and vice versa.

Parameters:

mol_list (list[Chem.Mol]) – The list of molecules to filter.

Returns:

list[Chem.Mol] – The filtered list of molecules.
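Given a (like-charge distance, opposite-charge distance) pair for each structure, the proximity rule can be sketched as a two-step selection. The order used here, first minimizing the opposite-charge distance and then maximizing the like-charge distance among the survivors, is an assumption, as are the function name and the use of indices in place of molecules.

```python
def sketch_stabilize_by_proximity(distances: list[tuple[int, int]]) -> list[int]:
    """Return indices of structures that best satisfy the proximity rule.

    distances[i] is (like-charge distance, opposite-charge distance).
    """
    # Step 1: opposite charges should be as close as possible.
    min_opposite = min(opposite for _, opposite in distances)
    candidates = [i for i, (_, opposite) in enumerate(distances) if opposite == min_opposite]
    # Step 2: like charges should be as far apart as possible.
    max_similar = max(distances[i][0] for i in candidates)
    return [i for i in candidates if distances[i][0] == max_similar]


# Three hypothetical structures
example_distances = [(0, 1), (0, 3), (2, 1)]
print(sketch_stabilize_by_proximity(example_distances))
```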