rdmc.mol#

This module provides a class and methods for working with RDKit RWMol and Mol objects.

class rdmc.mol.RDKitMol(mol: Mol | RWMol, keepAtomMap: bool = True)#

Bases: object

A helpful wrapper for Chem.rdchem.RWMol. The method names follow the camel-case style to be consistent with RDKit. It keeps almost all of the original methods of Chem.rdchem.RWMol but adds a few useful shortcuts so that users don’t need to refer to other RDKit modules.

AddNullConformer(confId: int | None = None, random: bool = True) → None#

Embed a conformer into the current RDKitMol, with atom coordinates set either to random numbers or to the origin for all atoms.

Parameters:

confId (int, optional) – Which ID to set for the conformer (will be added as the last conformer by default).
random (bool, optional) – Whether to set coordinates to random numbers. Otherwise, set to all-zero coordinates. Defaults to True.

AddRedundantBonds(bonds: Iterable) → RDKitMol#

Add redundant bonds (bonds that do not originally exist in the molecule) to facilitate certain molecule operations or analyses. This function only generates a copy of the molecule; no change is made in place.

Parameters:: bonds – A list of length-2 iterables containing the indexes of the end atoms of each bond.

AlignMol() aligns molecules based on a reference molecule and returns the RMSD value for the best alignment. When both prbMol and refMol are left blank, the function aligns the current molecule’s conformers, and prbCid or refCid must be provided.

Parameters:

refMol (Mol) – RDKit molecule as a reference. Should not be provided with prbMol.
prbMol (Mol) – RDKit molecules to align to the current molecule. Should not be provided with refMol.
prbCid (int, optional) – The conformer id to be aligned. Defaults to 0.
refCid (int, optional) – The id of reference conformer. Defaults to 0.
reflect (bool, optional) – Whether to reflect the conformation of the probe molecule. Defaults to False.
atomMap (list, optional) – A vector of pairs of atom IDs (prb AtomId, ref AtomId) used to compute the alignments. If this mapping is not specified, an attempt is made to generate one by substructure matching.
maxIters (int, optional) – Maximum number of iterations used in minimizing the RMSD. Defaults to 1000.

Returns:

float – RMSD value.

AssignStereochemistryFrom3D(confId: int = 0)#

Assign the chirality type to a molecule’s atoms.

Parameters:: confId (int, optional) – The ID of the conformer whose geometry is used to determine the chirality. Defaults to 0.

CalcRMSD(prbMol: RDKitMol, prbCid: int = 0, refCid: int = 0, reflect: bool = False, atomMaps: list | None = None, weights: list = []) → float#

Calculate the RMSD between conformers of two molecules. Note this function will not align conformers, thus molecules’ geometries are not translated or rotated during the calculation. You can expect a larger number compared to the RMSD from AlignMol().

Parameters:

prbMol (RDKitMol) – The other molecule to compare with. It can be set to the current molecule.
prbCid (int, optional) – The conformer ID of the current molecule to calculate RMSD. Defaults to 0.
refCid (int, optional) – The conformer ID of the other molecule to calculate RMSD. Defaults to 0.
reflect (bool, optional) – Whether to reflect the conformation of the prbMol. Defaults to False.
atomMaps (list, optional) – Provide an atom mapping to calculate the RMSD. By default, prbMol and current molecule are assumed to have the same atom order.
weights (list, optional) – Specify a weight for each atom pair. E.g., use atom weights to highlight the importance of heavy atoms. Defaults to [], meaning unity weights.

Returns:

float – RMSD value.
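To make the "no alignment" point concrete, here is a minimal pure-Python sketch of an RMSD computed on coordinates exactly as given. It is not RDMC's implementation; the function name rmsd_no_align and the weighting convention (weighted mean of squared distances) are assumptions made for illustration.

```python
import math


def rmsd_no_align(ref_coords, prb_coords, weights=None):
    """RMSD between two coordinate sets taken as-is (no translation or rotation)."""
    if len(ref_coords) != len(prb_coords):
        raise ValueError("coordinate sets must have the same number of atoms")
    if not weights:
        # Unity weights, mirroring the documented default of an empty list
        weights = [1.0] * len(ref_coords)
    total = 0.0
    for w, p, q in zip(weights, ref_coords, prb_coords):
        # Weighted squared distance between matching atoms
        total += w * sum((a - b) ** 2 for a, b in zip(p, q))
    return math.sqrt(total / sum(weights))


# Two identical geometries, the second translated by 1 Angstrom along z
ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shifted = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd_no_align(ref, shifted))  # 1.0
```

An aligned RMSD for the same pair would be zero, because the translation would be removed first; this is why the unaligned value is expected to be the larger of the two.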

CombineMol(molFrag: RDKitMol | Mol, offset: list | tuple | float | ndarray = 0, c_product: bool = False) → RDKitMol#

Combine the current molecule with the given molFrag (another molecule or fragment). A new object instance will be created and changes are not made to the current molecule.

Parameters:

molFrag (RDKitMol or Mol) – The molecule or fragment to be combined into the current one.
offset –
- (list or tuple): A 3-element vector used to define the offset.
- (float): Distance in Angstrom between the current mol and the molFrag along the x axis.
c_product (bool, optional) –
If True, generate conformers for every possible combination between the current molecule and the molFrag. E.g., (1,1), (1,2), … (1,n), (2,1), …(m,1), … (m,n). \(N(conformer) = m \times n.\)

Defaults to False, meaning conformers are generated only for the pairs (1,1), (2,2), … In that case, if the current molecule has no conformers, conformers are first embedded to the current molecule, and the number of conformers of the combined molecule equals the number of conformers of molFrag. Otherwise, the number of conformers of the combined molecule equals the number of conformers of the current molecule. If the current molecule and molFrag have different numbers of conformers, some coordinates may be filled with zeros.

Returns:

RDKitMol – The combined molecule.
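The two pairing modes described for c_product can be sketched with plain index pairs. This is an illustration only: the helper conformer_pairs is hypothetical, and the diagonal mode below covers just the overlapping range, so it does not model the zero-filled coordinates mentioned above for unequal conformer counts.

```python
from itertools import product


def conformer_pairs(m, n, c_product=False):
    """Return 1-based (current, fragment) conformer index pairs.

    m and n are the numbers of conformers of the current molecule and molFrag.
    """
    if c_product:
        # Every combination: (1,1), (1,2), ..., (1,n), (2,1), ..., (m,n)
        return list(product(range(1, m + 1), range(1, n + 1)))
    # Diagonal pairing: (1,1), (2,2), ... over the overlapping range only
    return [(i, i) for i in range(1, min(m, n) + 1)]


all_pairs = conformer_pairs(2, 3, c_product=True)
print(len(all_pairs))         # 6, i.e. m x n
print(conformer_pairs(2, 3))  # [(1, 1), (2, 2)]
```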

Copy(quickCopy: bool = False, confId: int = -1, copy_attrs: list | None = None) → RDKitMol#

Make a copy of the current RDKitMol.

Parameters:

quickCopy (bool, optional) – Use the quick copy mode without copying conformers. Defaults to False.
confId (int, optional) – The conformer ID to be copied. Defaults to -1, meaning all conformers.
copy_attrs (list, optional) – Copy specific attributes to the new molecule. Defaults to None.

Returns:

RDKitMol – a copied molecule

EmbedConformer(embed_null: bool = True, **kwargs)#

Embed a conformer to the RDKitMol. This will overwrite current conformers. By default, it first tries to embed a 3D conformer; if that fails, it then tries to compute 2D coordinates and uses them for the conformer structure; if both approaches fail and embedding a null conformer is allowed, a conformer with all-zero coordinates is embedded. The last option is helpful when you will set the positions afterward with SetPositions, or when you want to optimize the geometry using force fields.

Parameters:: embed_null (bool) – If embedding 3D and 2D coordinates fails, whether to embed a conformer with all null coordinates, (0, 0, 0), for each atom. Defaults to True.

EmbedMultipleConfs(n: int = 1, embed_null: bool = True, **kwargs)#

Embed multiple conformers to the RDKitMol. This will overwrite current conformers. By default, it first tries to embed a 3D conformer; if that fails, it then tries to compute 2D coordinates and uses them for the conformer structure; if both approaches fail and embedding a null conformer is allowed, a conformer with all-zero coordinates is embedded. The last option is helpful when you will set the positions afterward with SetPositions, or when you want to optimize the geometry using force fields.

Parameters:

n (int) – The number of conformers to be embedded. The default is 1.
embed_null (bool) – If embedding fails, whether to embed null conformers. Defaults to True.

EmbedMultipleNullConfs(n: int = 10, random: bool = True)#

Embed conformers with null or random atom coordinates. This helps in cases where a conformer cannot be successfully embedded. You can choose to generate all-zero coordinates or random coordinates: use all-zero coordinates if you will set the coordinates later, and use random coordinates if you want to optimize this molecule with force fields (RDKit force fields cannot optimize all-zero coordinates).

Parameters:

n (int) – The number of conformers to be embedded. Defaults to 10.
random (bool, optional) – Whether to set coordinates to random numbers. Otherwise, set to all-zero coordinates. Defaults to True.

EmbedNullConformer(random: bool = True)#

Embed a conformer with null or random atom coordinates. This helps in cases where a conformer cannot be successfully embedded. You can choose to generate all-zero coordinates or random coordinates: use all-zero coordinates if you will set the coordinates later, and use random coordinates if you want to optimize this molecule with force fields (RDKit force fields cannot optimize all-zero coordinates).

Parameters:: random (bool, optional) – Whether to set coordinates to random numbers. Otherwise, set to all-zero coordinates. Defaults to True.

classmethod FromFile(path: str, backend: str = 'openbabel', header: bool = True, removeHs: bool = False, sanitize: bool = True, sameMol: bool = False, **kwargs) → RDKitMol#

Read RDKitMol from a file.

Parameters:

path (str) – File path to data.
backend (str, optional) – The backend used to perceive molecule. Defaults to 'openbabel'. Currently, we only support 'openbabel' and 'jensen'.
header (bool, optional) – Whether the lines giving the number of atoms and the title are included. Defaults to True.
removeHs (bool) – Whether or not to remove hydrogens from the input. Defaults to False.
sanitize (bool) – Whether or not to use RDKit’s sanitization algorithm to clean input; helpful to set this to False when reading TS files. Defaults to True.
sameMol (bool) – Whether or not all the conformers in the (sdf) file are for the same mol, in which case we will copy conformers directly to the mol. Defaults to False.

Returns:

RDKitMol – An RDKit molecule object corresponding to the file.

classmethod FromInchi(inchi: str, removeHs: bool = False, addHs: bool = True, sanitize: bool = True)#

Construct an RDKitMol object from an InChI string.

Parameters:

inchi (str) – An InChI string. https://en.wikipedia.org/wiki/International_Chemical_Identifier
removeHs (bool, optional) – Whether to remove hydrogen atoms from the molecule. Due to the RDKit implementation, this is only effective when sanitize is True as well. True to remove.
addHs (bool, optional) – Whether to add explicit hydrogen atoms to the molecule. True to add. Only functioning when removeHs is False.
sanitize (bool, optional) – Whether to sanitize the RDKit molecule, True to sanitize.

Returns:

RDKitMol – An RDKit molecule object corresponding to the InChI.

classmethod FromMol(mol: Mol | RWMol, keepAtomMap: bool = True) → RDKitMol#

Convert an RDKit Chem.rdchem.Mol molecule to an RDKitMol molecule.

Parameters:

mol (Union[Mol, RWMol]) – The RDKit Chem.rdchem.Mol / RWMol molecule to be converted.
keepAtomMap (bool, optional) – Whether to keep the original atom mapping. Defaults to True. If no atom mapping is stored in the molecule, atom mapping will be created based on atom indexes.

Returns:

RDKitMol – RDKitMol molecule converted from the input RDKit Chem.rdchem.Mol molecule.

classmethod FromOBMol(obMol: openbabel.OBMol, removeHs: bool = False, sanitize: bool = True, embed: bool = True) → RDKitMol#

Convert an OpenBabel OBMol to an RDKitMol object.

Parameters:

obMol (Molecule) – An OpenBabel Molecule object for the conversion.
removeHs (bool, optional) – Whether to remove hydrogen atoms from the molecule. Defaults to False.
sanitize (bool, optional) – Whether to sanitize the RDKit molecule. Defaults to True.
embed (bool, optional) – Whether to embed a 3D conformer from the OBMol. Defaults to True.

Returns:

RDKitMol – An RDKit molecule object corresponding to the input OpenBabel Molecule object.

classmethod FromRMGMol(rmgMol: rmgpy.molecule.Molecule, removeHs: bool = False, sanitize: bool = True) → RDKitMol#

Convert an RMG Molecule to an RDKitMol object.

Parameters:

rmgMol ('rmg.molecule.Molecule') – An RMG Molecule instance.
removeHs (bool, optional) – Whether to remove hydrogen atoms from the molecule, True to remove.
sanitize (bool, optional) – Whether to sanitize the RDKit molecule, True to sanitize.

Returns:

RDKitMol – An RDKit molecule object corresponding to the RMG Molecule.

classmethod FromSDF(sdf: str, removeHs: bool = False, sanitize: bool = True) → RDKitMol#

Convert an SDF string to RDKitMol.

Parameters:

sdf (str) – An SDF string.
removeHs (bool) – Whether or not to remove hydrogens from the input. Defaults to False.
sanitize (bool) – Whether or not to use RDKit’s sanitization algorithm to clean input; helpful to set this to False when reading TS files. Defaults to True.

Returns:

RDKitMol – An RDKit molecule object corresponding to the SDF string.

classmethod FromSmarts(smarts: str) → RDKitMol#

Convert a SMARTS to an RDKitMol object.

Parameters:: smarts (str) – A SMARTS string of the molecule
Returns:: RDKitMol – An RDKit molecule object corresponding to the SMARTS.

classmethod FromSmiles(smiles: str, removeHs: bool = False, addHs: bool = True, sanitize: bool = True, allowCXSMILES: bool = True, keepAtomMap: bool = True) → RDKitMol#

Convert a SMILES string to an RDKitMol object.

Parameters:

smiles (str) – A SMILES representation of the molecule.
removeHs (bool, optional) – Whether to remove hydrogen atoms from the molecule, True to remove.
addHs (bool, optional) – Whether to add explicit hydrogen atoms to the molecule. True to add. Only functioning when removeHs is False.
sanitize (bool, optional) – Whether to sanitize the RDKit molecule, True to sanitize.
allowCXSMILES (bool, optional) – Whether to recognize and parse CXSMILES. Defaults to True.
keepAtomMap (bool, optional) – Whether to keep the atom mapping contained in the SMILES. Defaults to True.

Returns:

RDKitMol – An RDKit molecule object corresponding to the SMILES.

classmethod FromXYZ(xyz: str, backend: str = 'openbabel', header: bool = True, sanitize: bool = True, embed_chiral: bool = False, **kwargs)#

Convert xyz string to RDKitMol.

Parameters:

xyz (str) – An XYZ string.
backend (str) – The backend used to perceive molecule. Defaults to 'openbabel'. Currently, we only support 'openbabel' and 'jensen'.
header (bool, optional) – If lines of the number of atoms and title are included. Defaults to True.
sanitize (bool) – Sanitize the RDKit molecule during conversion. Helpful to set it to False when reading in TSs. Defaults to True.
embed_chiral (bool) – True to embed chiral information. Defaults to False.
kwargs (supported) –
jensen:
- charge: The charge of the species. Defaults to 0.
- allow_charged_fragments: True for charged fragment, False for radical. Defaults to False.
- use_graph: True to use the networkx module for acceleration. Defaults to True.
- use_huckel: True to use extended Huckel bond orders to locate bonds. Defaults to False.
- forced_rdmc: Defaults to False. In rare cases, one may want to use a tailored version of the Jensen XYZ parser rather than the one available in RDKit. Set this argument to True to force the use of RDMC’s implementation, which users have some flexibility to modify.

Returns:

RDKitMol – An RDKit molecule object corresponding to the xyz.

GetAdjacencyMatrix()#

Get the adjacency matrix of the molecule.

Returns:: numpy.ndarray – A square adjacency matrix of the molecule, where a 1 indicates that atoms are bonded and a 0 indicates that atoms aren’t bonded.
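As an illustration of the matrix layout described above, the following pure-Python sketch builds an adjacency matrix from a list of bonded atom-index pairs. The helper adjacency_matrix is hypothetical and returns nested lists rather than a numpy.ndarray; it is not RDMC's implementation.

```python
def adjacency_matrix(num_atoms, bonds):
    """Build a square 0/1 matrix from (i, j) pairs of bonded atom indexes."""
    mat = [[0] * num_atoms for _ in range(num_atoms)]
    for i, j in bonds:
        # Bonds are undirected, so the matrix is symmetric
        mat[i][j] = 1
        mat[j][i] = 1
    return mat


# Heavy-atom skeleton of propane: C0-C1-C2
propane = adjacency_matrix(3, [(0, 1), (1, 2)])
print(propane)  # [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
```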

GetAllConformers() → List[RDKitConf]#

Get all of the embedded conformers.

Returns:: List[‘RDKitConf’] – A list of all conformers.

GetAtomMapNumbers() → tuple#

Get the atom mapping.

Returns:: tuple – atom mapping numbers in the sequence of atom index.

GetAtomMasses() → List[float]#

Get the mass of each atom. The order is consistent with the atom indexes.

Returns:: list – A list of atom masses.

GetAtomicNumbers()#

Get the atomic numbers of the molecule. The atomic numbers are sorted by the atom indexes.

Returns:: list – A list of atomic numbers.

GetAtoms() → list#

This is a rewrite of GetAtoms(), based on the findings of an RDKit issue. Although RDKit fixed this issue in version 2023.09, this function is kept for backward compatibility.

GetBestAlign(refMol, prbCid: int = 0, refCid: int = 0, atomMaps: list | None = None, maxIters: int = 1000, keepBestConformer: bool = True)#

This is a wrapper function that calls AlignMol twice, with reflect set to True and False, respectively.

Parameters:

refMol (Mol) – RDKit molecule as a reference.
prbCid (int, optional) – The conformer id to be aligned. Defaults to 0.
refCid (int, optional) – The id of reference conformer. Defaults to 0.
reflect (bool, optional) – Whether to reflect the conformation of the probe molecule. Defaults to False.
atomMap (list, optional) – A vector of pairs of atom IDs (probe AtomId, ref AtomId) used to compute the alignments. If this mapping is not specified, an attempt is made to generate one by substructure matching.
maxIters (int, optional) – maximum number of iterations used in minimizing the RMSD. Defaults to 1000.
keepBestConformer (bool, optional) – Whether to keep the best Conformer structure. Defaults to True. This is less helpful when you are comparing different atom mappings.

Returns:

float – RMSD value.
bool – Whether the reflected conformer gives a better result.

GetBondsAsTuples() → List[tuple]#

Generate a list of length-2 tuples indicating the bonding atoms in the molecule.

Returns:: list – A list of length-2 tuples indicating the bonding atoms.

GetClosedShellMol(cheap: bool = False, sanitize: bool = True) → RDKitMol#

Get a closed shell molecule by removing all radical electrons and adding H atoms to these radical sites. This method currently only works for radicals and will not work properly for singlet radicals.

Parameters:

cheap (bool) – Whether to use a cheap method where H atoms are only implicitly added. Defaults to False. Set it to True only when the molecule is immediately used for generating SMILES/InChI and other representations, and no further manipulation is needed. Otherwise, it may be confusing, as the hydrogen atoms will not appear in the list of atoms, will not be displayed in the 2D graph, etc.
sanitize (bool) – Whether to sanitize the molecule. Defaults to True.

Returns:

RDKitMol – A closed shell molecule.

GetConformer(id: int = 0) → RDKitConf#

Get the embedded conformer according to ID.

Parameters:: id (int) – The ID of the conformer to be obtained. The default is 0.
Raises:: ValueError – Bad id assigned.
Returns:: RDKitConf – A conformer corresponding to the ID.

GetConformers(ids: list | tuple = [0]) → List[RDKitConf]#

Get the embedded conformers according to IDs.

Parameters:: ids (Union[list, tuple]) – The IDs of the conformers to be obtained. The default is [0].
Raises:: ValueError – Bad id assigned.
Returns:: List[RDKitConf] – A list of conformers corresponding to the IDs.

GetDistanceMatrix(id: int = 0) → ndarray#

Get the distance matrix of the molecule.

Parameters:: id (int, optional) – The conformer ID to extract distance matrix from. Defaults to 0.
Returns:: np.ndarray – A square distance matrix of the molecule.

GetElementCounts() → Dict[str, int]#

Get the element counts of the molecule.

Returns:: dict – A dictionary of element counts.

GetElementSymbols() → List[str]#

Get the element symbols of the molecule. The element symbols are sorted by the atom indexes.

Returns:: list – A list of element symbols.

GetFingerprint(fpType: str = 'morgan', numBits: int = 2048, count: bool = False, **kwargs) → ndarray#

Get the fingerprint of the molecule.

Parameters:

fpType (str, optional) – The type of the fingerprint. Defaults to 'morgan'.
numBits (int, optional) – The number of bits of the fingerprint. Defaults to 2048.
count (bool, optional) – Whether to count the number of occurrences of each bit. Defaults to False.

Returns:

np.ndarray – A fingerprint of the molecule.

GetFormalCharge() → int#

Get formal charge of the molecule.

Returns:: int – Formal charge.

GetHeavyAtoms() → list#

Get heavy atoms of the molecule with the order consistent with the atom indexes.

Returns:: list – A list of heavy atoms.

GetInternalCoordinates(nonredundant: bool = True) → list#

Get internal coordinates of the molecule.

Parameters:: nonredundant (bool) – Whether to return nonredundant internal coordinates. Defaults to True.
Returns:: list – A list of internal coordinates.

GetMolFrags(asMols: bool = False, sanitize: bool = True, frags: list | None = None, fragsMolAtomMapping: list | None = None) → tuple#

Finds the disconnected fragments of a molecule. For example, for the molecule “CC(=O)[O-].[NH3+]C”, this function will split the molecule into “CC(=O)[O-]” and “[NH3+]C”. By default, this function returns the atom mapping of each fragment, but options are available for getting mols.

Parameters:

asMols (bool, optional) – Whether the fragments will be returned as molecules instead of atom IDs. Defaults to False.
sanitize (bool, optional) – Whether the fragments molecules will be sanitized before returning them. Defaults to True.
frags (list, optional) – If this is provided as an empty list, the result will be mol.GetNumAtoms() long on return and will contain the fragment assignment for each Atom.
fragsMolAtomMapping (list, optional) – If this is provided as an empty list ([]), the result will be a numFrags long list on return, and each entry will contain the indices of the Atoms in that fragment: [(0, 1, 2, 3), (4, 5)].

Returns:

tuple – a tuple of atom mapping or a tuple of split molecules (RDKitMol).
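The atom-ID form of the result amounts to finding the connected components of the bond graph. A minimal pure-Python sketch follows, assuming bonds are given as index pairs; the helper fragment_atom_indices is hypothetical and is not RDMC's or RDKit's implementation.

```python
def fragment_atom_indices(num_atoms, bonds):
    """Group atom indexes into disconnected fragments (connected components)."""
    neighbors = {i: set() for i in range(num_atoms)}
    for i, j in bonds:
        neighbors[i].add(j)
        neighbors[j].add(i)

    seen = set()
    frags = []
    for start in range(num_atoms):
        if start in seen:
            continue
        # Depth-first traversal collects every atom reachable from `start`
        stack = [start]
        seen.add(start)
        frag = []
        while stack:
            atom = stack.pop()
            frag.append(atom)
            for nbr in neighbors[atom]:
                if nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
        frags.append(tuple(sorted(frag)))
    return tuple(frags)


# Heavy atoms of "CC(=O)[O-].[NH3+]C": C0, C1, O2, O3 and N4, C5
salt_bonds = [(0, 1), (1, 2), (1, 3), (4, 5)]
salt_frags = fragment_atom_indices(6, salt_bonds)
print(salt_frags)  # ((0, 1, 2, 3), (4, 5))
```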

GetPositions(id: int = 0) → ndarray#

Get atom positions of the embedded conformer.

Parameters:: id (int, optional) – The conformer ID to extract atom positions from. Defaults to 0.
Returns:: np.ndarray – An N x 3 matrix containing atom coordinates.

GetSpinMultiplicity() → int#

Get spin multiplicity of a molecule. The spin multiplicity is calculated using Hund’s rule of maximum multiplicity defined as 2S + 1.

Returns:: int – Spin multiplicity.
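Under the maximum-multiplicity assumption, all unpaired electrons have parallel spins, so S is half the number of radical electrons and 2S + 1 reduces to that number plus one. A small sketch, with a hypothetical helper that takes per-atom radical electron counts as input:

```python
def spin_multiplicity(radical_electrons_per_atom):
    """Return 2S + 1, assuming all radical electrons are unpaired and parallel."""
    n_unpaired = sum(radical_electrons_per_atom)
    # S = n_unpaired / 2, hence 2S + 1 = n_unpaired + 1
    return n_unpaired + 1


print(spin_multiplicity([0, 0]))  # 1 (closed shell, singlet)
print(spin_multiplicity([1]))     # 2 (monoradical, doublet)
print(spin_multiplicity([1, 1]))  # 3 (biradical, triplet)
```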

GetSubstructMatch(query: RDKitMol | RWMol | Mol, useChirality: bool = False, useQueryQueryMatches: bool = False) → tuple#

Returns the indices of the molecule’s atoms that match a substructure query.

Parameters:

query (Mol) – An RDKit Molecule.
useChirality (bool, optional) – Enables the use of stereochemistry in the matching. Defaults to False.
useQueryQueryMatches (bool, optional) – Use query-query matching logic. Defaults to False.

Returns:

tuple – A tuple of matched indices.

GetSubstructMatchAndRecipe(mol: RDKitMol) → Tuple[tuple, dict]#

Get the substructure match between two molecules and a recipe to recover the provided mol to the current mol. If the atom indices in mol are swapped according to the recipe, the mol should have the same connectivity as the current molecule. Note that if no match is found, the returned match and recipe will be empty.

Parameters:

mol (RDKitMol) – The molecule to compare with.

Returns:

tuple – The substructure match.
dict – A truncated atom mapping of mol2 to mol1.

GetSubstructMatches(query: RDKitMol | RWMol | Mol, uniquify: bool = True, useChirality: bool = False, useQueryQueryMatches: bool = False, maxMatches: int = 1000) → tuple#

Returns tuples of the indices of the molecule’s atoms that match a substructure query.

Parameters:

query (Mol) – a Molecule.
uniquify (bool, optional) – determines whether or not the matches are uniquified. Defaults to True.
useChirality (bool, optional) – enables the use of stereochemistry in the matching. Defaults to False.
useQueryQueryMatches (bool, optional) – use query-query matching logic. Defaults to False.
maxMatches – The maximum number of matches that will be returned to prevent a combinatorial explosion. Defaults to 1000.

Returns:

tuple – A tuple of tuples of matched indices.

GetSymmSSSR() → tuple#

Get a symmetrized SSSR for a molecule.

Returns:: tuple – A sequence of sequences containing the rings found as atom IDs.

GetTorsionTops(torsion: Iterable, allowNonbondPivots: bool = False) → tuple#

Generate tops for the given torsion. Top atoms are defined as atoms on one side of the torsion. The mol should be a single connected piece when using this function; otherwise, the results will be misleading.

Parameters:

torsion (Iterable) – An iterable with four elements and the 2nd and 3rd are the pivot of the torsion.
allowNonbondPivots (bool, optional) – Allow non-bonding pivots. Defaults to False.

Returns:

tuple – Two fragments: one top of the torsion and the other top of the torsion.
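The notion of a "top" can be sketched as the set of atoms reachable from each pivot atom without crossing the pivot bond. The helper below is a hypothetical pure-Python illustration, not RDMC's implementation; including the pivot atoms in their own tops is an assumption of this sketch.

```python
def torsion_tops(num_atoms, bonds, torsion):
    """Split atoms into the two sides of a torsion's pivot bond."""
    _, pivot1, pivot2, _ = torsion
    neighbors = {i: set() for i in range(num_atoms)}
    for i, j in bonds:
        neighbors[i].add(j)
        neighbors[j].add(i)

    def side(start):
        # Traverse the bond graph from one pivot atom
        seen = {start}
        stack = [start]
        while stack:
            atom = stack.pop()
            for nbr in neighbors[atom]:
                if {atom, nbr} == {pivot1, pivot2}:
                    continue  # never cross the pivot bond itself
                if nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
        return tuple(sorted(seen))

    return side(pivot1), side(pivot2)


# Heavy-atom skeleton of 2-methylbutane: C0-C1(-C4)-C2-C3
skeleton = [(0, 1), (1, 2), (2, 3), (1, 4)]
tops = torsion_tops(5, skeleton, (0, 1, 2, 3))
print(tops)  # ((0, 1, 4), (2, 3))
```

If the pivot bond is part of a ring, both traversals reach the same atoms through the other ring path, so the two tops coincide and the split is not meaningful in this sketch.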

GetTorsionalModes(excludeMethyl: bool = False, includeRings: bool = False) → list#

Get all of the torsional modes (rotors) from the molecule.

Parameters:

excludeMethyl (bool) – Whether to exclude the torsions with methyl groups. Defaults to False.
includeRings (bool) – Whether or not to include ring torsions. Defaults to False.

Returns:

list – A list of four-atom-index lists indicating the torsional modes.

GetVdwMatrix(threshold: float = 0.4) → ndarray | None#

Get the derived Van der Waals matrix, which can be used to analyze the collision of atoms. More information can be found from generate_vdw_mat.

Parameters:: threshold – A float indicating the threshold to use in the vdw matrix. Defaults to 0.4.
Returns:: Optional[np.ndarray] – A 2D array of the derived Van der Waals matrix if the matrix exists, otherwise None.

HasCollidingAtoms(threshold: float = 0.4) → bool#

Check whether the molecule has colliding atoms.

Parameters:: threshold – A float indicating the threshold to use in the vdw matrix. Defaults to 0.4.
Returns:: bool – Whether the molecule has colliding atoms.

HasSameConnectivity(refmol: RDKitMol) → bool#

Check whether the molecule has the same connectivity as the reference molecule.

Parameters:: refmol (RDKitMol) – The reference molecule.
Returns:: bool – Whether the molecule has the same connectivity as the reference molecule.

HasSameConnectivityConformer(confId: int = 0, backend: str = 'openbabel', **kwargs) → bool#

Check whether the conformer of the molecule (defined by its spatial coordinates) has the same connectivity as the molecule.

Parameters:

confId (int, optional) – The conformer ID. Defaults to 0.
backend (str, optional) – The backend to use for the comparison. Defaults to 'openbabel'.
**kwargs – The keyword arguments to pass to the backend.

Returns:

bool – Whether the conformer has the same connectivity as the molecule.

Kekulize(clearAromaticFlags: bool = False)#

Kekulizes the molecule.

Parameters:: clearAromaticFlags (optional) – If True, all atoms and bonds in the molecule will be marked non-aromatic following the kekulization. Defaults to False.

PrepareOutputMol(removeHs: bool = False, sanitize: bool = True) → Mol#

Generate an RDKit Mol instance for output purposes, to ensure that the original molecule is not modified.

Parameters:

removeHs (bool, optional) –
Remove less useful explicit H atoms. E.g., when outputting SMILES, explicitly added H atoms will be included and reduce the readability. Defaults to False. Note that the following Hs are not removed:
1. Hs that aren’t connected to a heavy atom. E.g., [H][H].
2. Labelled H. E.g., atoms with atomic number=1, but isotope > 1.
3. Two coordinate Hs. E.g., central H in C[H-]C.
4. Hs connected to dummy atoms
5. Hs that are part of the definition of double bond Stereochemistry.
6. Hs that are not connected to anything else.
sanitize (bool, optional) – Whether to sanitize the molecule. Defaults to True.

Returns:

Mol – A Mol instance used for output purpose.

Reflect(id: int = 0)#

Reflect the atom coordinates of a molecule, thereby producing its mirror image.

Parameters:: id (int, optional) – The conformer id to reflect. Defaults to 0.

RemoveHs(sanitize: bool = True)#

Remove H atoms. Useful when trying to match heavy atoms.

Parameters:: sanitize (bool, optional) – Whether to sanitize the molecule. Defaults to True.

RenumberAtoms(newOrder: dict | list | None = None, updateAtomMap: bool = True) → RDKitMol#

Return a new copy of RDKitMol with its atoms (indexes) reordered.

Parameters:

newOrder (list or dict, optional) – The new ordering of the atoms (should be numAtoms long).
- If provided as a list, it should be a list of atom indexes. E.g., if newOrder is [3,2,0,1], then atom 3 in the original molecule will be atom 0 in the new one.
- If provided as a dict, it should be a mapping between atoms. E.g., if newOrder is {0: 3, 1: 2, 2: 0, 3: 1}, then atom 0 in the original molecule will be atom 3 in the new one. Unlike the list case, newOrder can be a partial mapping, but one should make sure all the pairs are included. E.g., {0: 3, 3: 0}.
- If no value is provided (default), the molecule will be renumbered based on the current atom map numbers. This is helpful when the sequence of atom map numbers and atom indexes are inconsistent.
updateAtomMap (bool) – Whether to update the atom map number based on the new order. Defaults to True.

Returns:

RDKitMol – Molecule with reordered atoms.
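The list and dict conventions for newOrder read in opposite directions, which is easy to mix up. The sketch below applies both to a plain list of element symbols; reorder is a hypothetical helper written for illustration and does not touch an RDKitMol.

```python
def reorder(symbols, new_order):
    """Apply a newOrder-style list or dict to a list of per-atom items."""
    n = len(symbols)
    if isinstance(new_order, dict):
        # Dict form maps old index -> new index; unlisted atoms stay in place
        order = list(range(n))
        for old, new in new_order.items():
            order[new] = old
    else:
        # List form: position = new index, value = old index
        order = list(new_order)
    if sorted(order) != list(range(n)):
        raise ValueError("newOrder must describe a permutation of the atom indexes")
    return [symbols[old] for old in order]


symbols = ["C", "O", "H", "N"]
print(reorder(symbols, [3, 2, 0, 1]))              # ['N', 'H', 'C', 'O']
print(reorder(symbols, {0: 3, 1: 2, 2: 0, 3: 1}))  # ['H', 'N', 'O', 'C']
print(reorder(symbols, {0: 3, 3: 0}))              # ['N', 'O', 'H', 'C']
```

A partial mapping such as {0: 3} alone leaves index 0 without an atom, which is why all swapped pairs need to be listed; in this sketch that case raises ValueError.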

Sanitize(sanitizeOps: int | SanitizeFlags | None = rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_ALL)#

Sanitize the molecule.

Parameters:: sanitizeOps (int or str, optional) – Sanitize operations to be carried out. Defaults to SanitizeFlags.SANITIZE_ALL. More details can be found at RDKit docs.

SaturateBiradicalSites12(multiplicity: int, verbose: bool = True)#

A helper method to saturate 1,2-biradicals to match the given molecule spin multiplicity. E.g.:

*C - C* => C = C

In the current implementation, no error is raised if the function doesn’t achieve the goal. This function has not been tested on nitrogenates.

Parameters:

multiplicity (int) – The target multiplicity.
verbose (bool) – Whether to print additional information. Defaults to True.

SaturateBiradicalSitesCDB(multiplicity: int, chain_length: int = 8, verbose: bool = True)#

A helper method to saturate biradicals that have conjugated double bonds in between to match the given molecule spin multiplicity. E.g., 1,4-biradicals can be saturated if there is an unsaturated bond between them:

*C - C = C - C* => C = C - C = C

In the current implementation, no error is raised if the function doesn’t achieve the goal. This function has not been tested on nitrogenates.

Parameters:

multiplicity (int) – The target multiplicity.
chain_length (int) – How long the conjugated double bond chain is. A larger value will result in longer computational time. Defaults to 8.
verbose (bool) – Whether to print additional information. Defaults to True.

SaturateCarbene(multiplicity: int, verbose: bool = True)#

A helper method to saturate carbenes and nitrenes to match the given molecule spin multiplicity:

*-C-* (triplet) => C-(**) (singlet)

In the current implementation, no error is raised if the function doesn’t achieve the goal. This function has not been tested on nitrogenates.

Parameters:

multiplicity (int) – The target multiplicity.
verbose (bool) – Whether to print additional information. Defaults to True.

SaturateMol(multiplicity: int, chain_length: int = 8, verbose: bool = False)#

A helper method to saturate the molecule to match the given molecule spin multiplicity. This is just a wrapper that calls SaturateBiradicalSites12(), SaturateBiradicalSitesCDB(), and SaturateCarbene():

*C - C* => C = C
*C - C = C - C* => C = C - C = C
*-C-* (triplet) => C-(**) (singlet)

In the current implementation, no error is raised if the function doesn’t achieve the goal. This function has not been tested on nitrogenates.

Parameters:

multiplicity (int) – The target multiplicity.
chain_length (int) – How long the conjugated double bond chain is. A larger value will result in longer time. Defaults to 8.
verbose (bool) – Whether to print intermediate information. Defaults to False.

SetAtomMapNumbers(atomMap: Sequence[int] | None = None)#

Set the atom mapping numbers. By default, atom indexes are used. It can be helpful when plotting the molecule in a 2D graph.

Parameters:: atomMap (list, tuple, optional) – A sequence of integers for atom mapping.

SetPositions(coords: Sequence | str, id: int = 0, header: bool = False)#

Set the atom positions of one of the conformers.

Parameters:

coords (sequence) – A tuple/list/ndarray containing atom positions, or a string in the typical XYZ format.
id (int, optional) – Conformer ID to assign the positions to. Defaults to 0.
header (bool) – Whether the XYZ string has a header. Defaults to False.

SetVdwMatrix(threshold: float = 0.4, vdw_radii: dict = {1: 1.2, 2: 1.4, 3: 2.2, 4: 1.9, 5: 1.8, 6: 1.7, 7: 1.6, 8: 1.55, 9: 1.5, 10: 1.54, 11: 2.4, 12: 2.2, 13: 2.1, 14: 2.1, 15: 1.95, 16: 1.8, 17: 1.8, 18: 1.88, 19: 2.8, 20: 2.4, 21: 2.3, 22: 2.15, 23: 2.05, 24: 2.05, 25: 2.05, 26: 2.05, 27: 2.0, 28: 2.0, 29: 2.0, 30: 2.1, 31: 2.1, 32: 2.1, 33: 2.05, 34: 1.9, 35: 1.9, 36: 2.02, 37: 2.9, 38: 2.55, 39: 2.4, 40: 2.3, 41: 2.15, 42: 2.1, 43: 2.05, 44: 2.05, 45: 2.0, 46: 2.05, 47: 2.1, 48: 2.2, 49: 2.2, 50: 2.25, 51: 2.2, 52: 2.1, 53: 2.1, 54: 2.16, 55: 3.0, 56: 2.7, 57: 2.5, 58: 2.48, 59: 2.47, 60: 2.45, 61: 2.43, 62: 2.42, 63: 2.4, 64: 2.38, 65: 2.37, 66: 2.35, 67: 2.33, 68: 2.32, 69: 2.3, 70: 2.28, 71: 2.27, 72: 2.25, 73: 2.2, 74: 2.1, 75: 2.05, 76: 2.0, 77: 2.0, 78: 2.05, 79: 2.1, 80: 2.05, 81: 2.2, 82: 2.3, 83: 2.3, 84: 2.0, 85: 2.0, 86: 2.0, 87: 2.0, 88: 2.0, 89: 2.0, 90: 2.4, 91: 2.0, 92: 2.3, 93: 2.0, 94: 2.0, 95: 2.0, 96: 2.0, 97: 2.0, 98: 2.0, 99: 2.0, 100: 2.0, 101: 2.0, 102: 2.0, 103: 2.0, 104: 2.0, 105: 2.0, 106: 2.0, 107: 2.0, 108: 2.0, 109: 2.0, 110: 2.0, 111: 2.0, 112: 2.0, 113: 2.0, 114: 2.0, 115: 2.0, 116: 2.0, 117: 2.0, 118: 2.0})#

Set the derived Van der Waals matrix, which is an upper triangular matrix calculated from a threshold, usually around 0.4, applied to the Van der Waals radii. Its diagonal elements are all zeros. The element (i, j) is calculated as threshold * (R(atom i) + R(atom j)). If two atoms are bonded, the value is set to zero. When threshold = 0.4, the value is close to the covalent bond length.

Parameters:

threshold (float) – The threshold used to calculate the derived Van der Waals matrix. A larger value results in a matrix with larger values; when compared with the distance matrix, it may overestimate the overlap between atoms. The default value is 0.4.
vdw_radii (dict) – A dict storing the Van der Waals radii of different elements.

Raises:

ValueError – Invalid threshold is supplied.
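
The formula above can be sketched in plain Python. This is an illustration under stated assumptions, not the rdmc implementation: derived_vdw_matrix and its arguments (atomic numbers plus a list of bonded index pairs) are hypothetical, the radii are a small subset copied from the default vdw_radii dict, and treating a non-positive threshold as invalid is a guess at what triggers the ValueError.

```python
from typing import Dict, Iterable, List, Tuple

# A small subset of the default radii shown in the signature (H, C, O).
VDW_RADII: Dict[int, float] = {1: 1.2, 6: 1.7, 8: 1.55}


def derived_vdw_matrix(
    atomic_nums: List[int],
    bonds: Iterable[Tuple[int, int]],
    threshold: float = 0.4,
    vdw_radii: Dict[int, float] = VDW_RADII,
) -> List[List[float]]:
    """Hypothetical pure-Python version of the derived vdW matrix."""
    if threshold <= 0:
        # Assumption: non-positive thresholds are treated as invalid here.
        raise ValueError("threshold must be positive")
    n = len(atomic_nums)
    bonded = {tuple(sorted(b)) for b in bonds}
    mat = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):  # upper triangle only; diagonal stays 0
            if (i, j) in bonded:
                continue  # bonded pairs are left at zero
            r_sum = vdw_radii[atomic_nums[i]] + vdw_radii[atomic_nums[j]]
            mat[i][j] = threshold * r_sum
    return mat


# Water: atom 0 is O, atoms 1 and 2 are H, with O-H bonds (0, 1) and (0, 2).
water = derived_vdw_matrix([8, 1, 1], [(0, 1), (0, 2)])
```

For water, only the H...H entry (1, 2) is non-zero, since both O-H pairs are bonded and the lower triangle and diagonal are never filled.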

ToAtoms(confId: int = 0) → Atoms#

Convert RDKitMol to an ase.Atoms object.

Parameters:: confId (int) – The conformer ID to be exported. Defaults to 0.
Returns:: Atoms – The corresponding ase.Atoms object.

ToGraph(keep_bond_order: bool = False) → Graph#

Convert RDKitMol to a networkx graph.

Parameters:: keep_bond_order (bool) – Whether to keep bond order information. Defaults to False, meaning treat all bonds as single bonds.
Returns:: nx.Graph – A networkx graph representing the molecule.

ToInchi(options: str = '') → str#

Convert the RDKitMol to an InChI string using the RDKit built-in converter.

Parameters:: options (str, optional) – The InChI generation options. Options should be prefixed with either a - or a /. Available options are explained in the InChI technical FAQ: https://www.inchi-trust.org/technical-faq/#15.14 and https://www.inchi-trust.org/?s=user+guide. Defaults to “”.

ToMolBlock(confId: int = -1) → str#

Convert RDKitMol to a mol block string.

Parameters:: confId (int) – The conformer ID to be exported.
Returns:: str – The mol block of the molecule.

ToOBMol() → openbabel.OBMol#

Convert RDKitMol to an OBMol.

Returns:: OBMol – The corresponding openbabel OBMol.

ToRWMol() → RWMol#

Convert the RDKitMol molecule back to an RDKit Chem.rdchem.RWMol.

Returns:: RWMol – An RDKit Chem.rdchem.RWMol molecule.

ToSDFFile(path: str, confId: int = -1)#

Write molecule information to an .sdf file.

Parameters:

path (str) – The path to save the .sdf file.
confId (int) – The conformer ID to be exported. Defaults to -1.

ToSmiles(stereo: bool = True, kekule: bool = False, canonical: bool = True, removeAtomMap: bool = True, removeHs: bool = True) → str#

Convert RDKitMol to a SMILES string.

Parameters:

stereo (bool, optional) – Whether to keep stereochemistry information. Defaults to True.
kekule (bool, optional) – Whether to use the Kekule form. Defaults to False.
canonical (bool, optional) – Whether to generate a canonical SMILES. Defaults to True.
removeAtomMap (bool, optional) – Whether to remove atom map ID information from the SMILES. Defaults to True.
removeHs (bool, optional) – Whether to remove H atoms to make the obtained SMILES clean. Defaults to True.

Returns:

str – The SMILES string of the molecule.

ToXYZ(confId: int = -1, header: bool = True, comment: str = '') → str#

Convert RDKitMol to an XYZ string.

Parameters:

confId (int) – The conformer ID to be exported.
header (bool, optional) – Whether to include header (first two lines). Defaults to True.

Returns:

str – The XYZ string of the molecule.
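
To make the effect of header concrete, here is a minimal sketch of an XYZ writer. It is not how rdmc builds the string: format_xyz, its arguments, and the column widths are assumptions chosen for illustration. The only behavior taken from the description above is that the header is the first two lines.

```python
from typing import Sequence


def format_xyz(
    symbols: Sequence[str],
    coords: Sequence[Sequence[float]],
    header: bool = True,
    comment: str = "",
) -> str:
    """Hypothetical formatter producing an XYZ string.

    With ``header=True`` the output starts with two extra lines:
    the atom count and a (possibly empty) comment line.
    """
    lines = []
    if header:
        lines.append(str(len(symbols)))
        lines.append(comment)
    for symbol, (x, y, z) in zip(symbols, coords):
        # Column widths are an arbitrary choice for this sketch.
        lines.append(f"{symbol:<2} {x:14.8f} {y:14.8f} {z:14.8f}")
    return "\n".join(lines) + "\n"


h2_symbols = ["H", "H"]
h2_coords = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.74)]
full = format_xyz(h2_symbols, h2_coords, comment="H2")
body_only = format_xyz(h2_symbols, h2_coords, header=False)
```

Here full has four lines (count, comment, two atoms), while body_only contains just the two atom records.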

rdmc.mol.generate_vdw_mat(rd_mol, threshold: float = 0.4, vdw_radii: dict = {1: 1.2, 2: 1.4, 3: 2.2, 4: 1.9, 5: 1.8, 6: 1.7, 7: 1.6, 8: 1.55, 9: 1.5, 10: 1.54, 11: 2.4, 12: 2.2, 13: 2.1, 14: 2.1, 15: 1.95, 16: 1.8, 17: 1.8, 18: 1.88, 19: 2.8, 20: 2.4, 21: 2.3, 22: 2.15, 23: 2.05, 24: 2.05, 25: 2.05, 26: 2.05, 27: 2.0, 28: 2.0, 29: 2.0, 30: 2.1, 31: 2.1, 32: 2.1, 33: 2.05, 34: 1.9, 35: 1.9, 36: 2.02, 37: 2.9, 38: 2.55, 39: 2.4, 40: 2.3, 41: 2.15, 42: 2.1, 43: 2.05, 44: 2.05, 45: 2.0, 46: 2.05, 47: 2.1, 48: 2.2, 49: 2.2, 50: 2.25, 51: 2.2, 52: 2.1, 53: 2.1, 54: 2.16, 55: 3.0, 56: 2.7, 57: 2.5, 58: 2.48, 59: 2.47, 60: 2.45, 61: 2.43, 62: 2.42, 63: 2.4, 64: 2.38, 65: 2.37, 66: 2.35, 67: 2.33, 68: 2.32, 69: 2.3, 70: 2.28, 71: 2.27, 72: 2.25, 73: 2.2, 74: 2.1, 75: 2.05, 76: 2.0, 77: 2.0, 78: 2.05, 79: 2.1, 80: 2.05, 81: 2.2, 82: 2.3, 83: 2.3, 84: 2.0, 85: 2.0, 86: 2.0, 87: 2.0, 88: 2.0, 89: 2.0, 90: 2.4, 91: 2.0, 92: 2.3, 93: 2.0, 94: 2.0, 95: 2.0, 96: 2.0, 97: 2.0, 98: 2.0, 99: 2.0, 100: 2.0, 101: 2.0, 102: 2.0, 103: 2.0, 104: 2.0, 105: 2.0, 106: 2.0, 107: 2.0, 108: 2.0, 109: 2.0, 110: 2.0, 111: 2.0, 112: 2.0, 113: 2.0, 114: 2.0, 115: 2.0, 116: 2.0, 117: 2.0, 118: 2.0})#

Generate a derived Van der Waals matrix, which is an upper triangular matrix calculated from a threshold, usually around 0.4, applied to the Van der Waals radii. Its diagonal elements are all zeros. The element (i, j) is calculated as threshold * (R(atom i) + R(atom j)). If two atoms are bonded, the value is set to zero. When threshold = 0.4, the value is close to the covalent bond length.

Parameters:

threshold (float) – The threshold used to calculate the derived Van der Waals matrix. A larger value results in a matrix with larger values; when compared with the distance matrix, it may overestimate the overlap between atoms. The default value is 0.4.
vdw_radii (dict) – A dict storing the Van der Waals radii of different elements.

Raises:

ValueError – Invalid threshold is supplied.
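
The note on comparing against the distance matrix suggests one possible use of such a matrix: flagging non-bonded atom pairs that sit closer than their derived threshold distance. The sketch below is an assumption about that usage rather than documented rdmc behavior. find_close_contacts is a hypothetical helper, and the matrix is passed in as a plain nested list.

```python
import math
from typing import List, Sequence, Tuple


def find_close_contacts(
    coords: Sequence[Sequence[float]],
    vdw_mat: Sequence[Sequence[float]],
) -> List[Tuple[int, int]]:
    """Hypothetical check: return (i, j) pairs closer than vdw_mat[i][j].

    Only the upper triangle is inspected. Zero entries (the diagonal and
    bonded pairs) can never be flagged, because no distance is below zero.
    """
    contacts = []
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) < vdw_mat[i][j]:
                contacts.append((i, j))
    return contacts


# Three atoms on a line; atoms 0 and 1 are bonded (matrix entry 0.0).
example_mat = [
    [0.0, 0.0, 1.1],
    [0.0, 0.0, 0.96],
    [0.0, 0.0, 0.0],
]
crowded = find_close_contacts([(0, 0, 0), (0.96, 0, 0), (0.5, 0, 0)], example_mat)
relaxed = find_close_contacts([(0, 0, 0), (0.96, 0, 0), (5.0, 0, 0)], example_mat)
```

A larger threshold raises every non-zero entry, so more pairs are flagged, which is the overestimation the parameter description warns about.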

rdmc.mol.parse_xyz_or_smiles_list(mol_list, with_3d_info: bool = False, **kwargs)#

A helper function to parse a list of XYZ and SMILES strings and, optionally, indicate which entries provide conformational (3D) information.

Parameters:

mol_list (list) – A list of SMILES strings, XYZ strings, or (string, multiplicity) tuples that specify the desired multiplicity, e.g., ['CCC', 'H 0 0 0', ('[CH2]', 1)].
with_3d_info (bool) – Whether to indicate which entries are from 3D representations. Defaults to False.

Returns:

list – A list of RDKitMol objects.
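
The accepted shapes of mol_list entries can be illustrated with a small classifier. This is only a sketch: the documentation does not say how rdmc tells XYZ from SMILES, so looks_like_xyz uses an assumed heuristic (every non-blank line is a symbol followed by three numbers, with no header lines), and classify_entries is a hypothetical helper that does not build any RDKitMol objects.

```python
from typing import List, Optional, Tuple, Union

Entry = Union[str, Tuple[str, int]]


def looks_like_xyz(text: str) -> bool:
    """Assumed heuristic: every non-blank line is 'symbol x y z'."""
    rows = [ln.split() for ln in text.strip().splitlines() if ln.strip()]
    if not rows:
        return False
    for tokens in rows:
        if len(tokens) != 4:
            return False
        try:
            [float(t) for t in tokens[1:]]
        except ValueError:
            return False
    return True


def classify_entries(
    mol_list: List[Entry],
) -> List[Tuple[str, Optional[int], bool]]:
    """Return (string, multiplicity or None, is_3d) for each entry."""
    result = []
    for entry in mol_list:
        if isinstance(entry, tuple):
            text, mult = entry  # (string, multiplicity) form
        else:
            text, mult = entry, None  # bare string, multiplicity unspecified
        result.append((text, mult, looks_like_xyz(text)))
    return result


classified = classify_entries(["CCC", "H 0 0 0", ("[CH2]", 1)])
```

In this sketch, the third field plays the role that with_3d_info describes: it marks which entries came from a 3D representation.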