rdmc.mol#
This module provides a class and methods for working with RDKit RWMol and Mol objects.
- class rdmc.mol.RDKitMol(mol: Mol | RWMol, keepAtomMap: bool = True)#
Bases:
object
A helpful wrapper for Chem.rdchem.RWMol. Method names follow the CamelCase style to be consistent with RDKit. The class keeps almost all of the original methods of Chem.rdchem.RWMol but adds a few useful shortcuts so that users don’t need to refer to other RDKit modules.
- AddNullConformer(confId: int | None = None, random: bool = True) None #
Add a conformer to the current RDKitMol, with atom coordinates set either to random numbers or to the origin.
- Parameters:
confId (int, optional) – Which ID to set for the conformer (it is added as the last conformer by default).
random (bool, optional) – Whether to set coordinates to random numbers; otherwise, all coordinates are set to zero. Defaults to True.
- AddRedundantBonds(bonds: Iterable) RDKitMol #
Add redundant bonds (bonds that do not exist in the original molecule) to facilitate certain molecule operations or analyses. This function returns a modified copy of the molecule; no change is made in place.
- Parameters:
bonds – A list of length-2 iterables containing the indexes of the two atoms at the ends of each bond.
- AlignMol(prbMol: RDKitMol | RWMol | Mol | None = None, refMol: RDKitMol | RWMol | Mol | None = None, prbCid: int = 0, refCid: int = 0, reflect: bool = False, atomMaps: list | None = None, maxIters: int = 1000, weights: list = []) float #
Align molecules based on a reference molecule. This function also returns the RMSD value for the best alignment. When both prbMol and refMol are left blank, the function aligns the current molecule’s conformers, and prbCid or refCid must be provided.
- Parameters:
refMol (Mol) – RDKit molecule used as the reference. Should not be provided together with prbMol.
prbMol (Mol) – RDKit molecule to align to the current molecule. Should not be provided together with refMol.
prbCid (int, optional) – The ID of the conformer to be aligned. Defaults to 0.
refCid (int, optional) – The ID of the reference conformer. Defaults to 0.
reflect (bool, optional) – Whether to reflect the conformation of the probe molecule. Defaults to False.
atomMaps (list, optional) – A vector of pairs of atom IDs (prb AtomId, ref AtomId) used to compute the alignments. If this mapping is not specified, an attempt is made to generate one by substructure matching.
maxIters (int, optional) – Maximum number of iterations used in minimizing the RMSD. Defaults to 1000.
- Returns:
float – RMSD value.
- AssignStereochemistryFrom3D(confId: int = 0)#
Assign the chirality type to a molecule’s atoms.
- Parameters:
confId (int, optional) – The ID of the conformer whose geometry is used to determine the chirality. Defaults to
0
.
- CalcRMSD(prbMol: RDKitMol, prbCid: int = 0, refCid: int = 0, reflect: bool = False, atomMaps: list | None = None, weights: list = []) float #
Calculate the RMSD between conformers of two molecules. Note that this function does not align the conformers, so the molecules’ geometries are not translated or rotated during the calculation. The result is therefore generally larger than the RMSD from AlignMol().
- Parameters:
prbMol (RDKitMol) – The other molecule to compare with. It can be set to the current molecule.
prbCid (int, optional) – The conformer ID of the current molecule to calculate RMSD. Defaults to 0.
refCid (int, optional) – The conformer ID of the other molecule to calculate RMSD. Defaults to 0.
reflect (bool, optional) – Whether to reflect the conformation of the prbMol. Defaults to False.
atomMaps (list, optional) – An atom mapping used to calculate the RMSD. By default, prbMol and the current molecule are assumed to have the same atom order.
weights (list, optional) – Weights for each atom pair. E.g., use atomic weights to emphasize heavy atoms. Defaults to [], which uses unit weights.
- Returns:
float – RMSD value.
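To make the difference from an aligned RMSD concrete, here is a minimal plain-Python sketch of an RMSD computed without any translation or rotation. It is not RDMC's implementation: the function name `rmsd_without_alignment` is hypothetical, and normalizing by the sum of the weights is an assumed convention.

```python
import math

def rmsd_without_alignment(prb_coords, ref_coords, weights=None):
    """RMSD between two coordinate lists in the same atom order, with no alignment."""
    if len(prb_coords) != len(ref_coords):
        raise ValueError("Both conformers must have the same number of atoms.")
    if not weights:  # None or an empty list means unit weights
        weights = [1.0] * len(prb_coords)
    weighted_sq = sum(
        w * sum((p - r) ** 2 for p, r in zip(prb, ref))
        for w, prb, ref in zip(weights, prb_coords, ref_coords)
    )
    # Assumed convention: normalize by the sum of the weights
    return math.sqrt(weighted_sq / sum(weights))

ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
prb = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]  # same shape, shifted by 1 along z
print(rmsd_without_alignment(prb, ref))
```

The two geometries here are identical up to a translation, so an aligned RMSD would vanish while the unaligned value does not.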
- CombineMol(molFrag: RDKitMol | Mol, offset: list | tuple | float | ndarray = 0, c_product: bool = False) RDKitMol #
Combine the current molecule with the given molFrag (another molecule or fragment). A new object instance is created, and the current molecule is not changed.
- Parameters:
molFrag (RDKitMol or Mol) – The molecule or fragment to be combined into the current one.
offset –
(list or tuple): A 3-element vector used to define the offset.
(float): Distance in Angstrom between the current mol and the molFrag along the x axis.
c_product (bool, optional) –
If True, generate conformers for every possible combination between the current molecule and the molFrag. E.g., (1,1), (1,2), … (1,n), (2,1), … (m,1), … (m,n). \(N(conformer) = m \times n.\)
Defaults to False, meaning conformers are only generated according to (1,1), (2,2), … When c_product is set to False and the current molecule has 0 conformers, conformers are first embedded to the current molecule, and the number of conformers of the combined molecule equals the number of conformers of molFrag. Otherwise, the number of conformers of the combined molecule equals the number of conformers of the current molecule. Some coordinates may be filled with 0s if the current molecule and molFrag have different numbers of conformers.
- Returns:
RDKitMol – The combined molecule.
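The two pairing modes described for c_product can be illustrated with conformer indices alone. This is a sketch under assumptions, not RDMC code: `conformer_pairs` is a hypothetical helper, indices are 1-based to match the notation above, and the zero-filling of unmatched conformers is not modelled.

```python
from itertools import product

def conformer_pairs(m, n, c_product=False):
    """Return (current, molFrag) conformer index pairs for m and n conformers."""
    if c_product:
        # Every combination: (1,1), (1,2), ..., (m,n)
        return list(product(range(1, m + 1), range(1, n + 1)))
    # One-to-one pairing: (1,1), (2,2), ...; only pairs where both conformers exist
    return [(i, i) for i in range(1, min(m, n) + 1)]

print(conformer_pairs(2, 3, c_product=True))
print(conformer_pairs(2, 2))
```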
- Copy(quickCopy: bool = False, confId: int = -1, copy_attrs: list | None = None) RDKitMol #
Make a copy of the current RDKitMol.
- Parameters:
quickCopy (bool, optional) – Use the quick copy mode without copying conformers. Defaults to False.
confId (int, optional) – The conformer ID to be copied. Defaults to -1, meaning all conformers.
copy_attrs (list, optional) – Copy specific attributes to the new molecule. Defaults to None.
- Returns:
RDKitMol – a copied molecule
- EmbedConformer(embed_null: bool = True, **kwargs)#
Embed a conformer to the RDKitMol. This will overwrite current conformers. By default, it first tries to embed a 3D conformer; if that fails, it tries to compute 2D coordinates and use them for the conformer structure; if both approaches fail and embedding a null conformer is allowed, a conformer with all-zero coordinates is embedded. The last option is helpful when you plan to set the positions afterward with SetPositions, or if you want to optimize the geometry using force fields.
- Parameters:
embed_null (bool) – If embedding 3D and 2D coordinates fails, whether to embed a conformer with all null coordinates, (0, 0, 0), for each atom. Defaults to True.
- EmbedMultipleConfs(n: int = 1, embed_null: bool = True, **kwargs)#
Embed multiple conformers to the RDKitMol. This will overwrite current conformers. By default, it first tries to embed 3D conformers; if that fails, it tries to compute 2D coordinates and use them for the conformer structure; if both approaches fail and embedding null conformers is allowed, conformers with all-zero coordinates are embedded. The last option is helpful when you plan to set the positions afterward with SetPositions, or if you want to optimize the geometry using force fields.
- Parameters:
n (int) – The number of conformers to be embedded. Defaults to 1.
embed_null (bool) – If embedding fails, whether to embed null conformers. Defaults to True.
- EmbedMultipleNullConfs(n: int = 10, random: bool = True)#
Embed conformers with null or random atom coordinates. This helps in cases where a conformer cannot be successfully embedded. You can choose between all-zero coordinates and random coordinates. Use all-zero coordinates if you will set the coordinates later; use random coordinates if you want to optimize the molecule with force fields (RDKit force fields cannot optimize all-zero coordinates).
- Parameters:
n (int) – The number of conformers to be embedded. Defaults to 10.
random (bool, optional) – Whether to set coordinates to random numbers; otherwise, all coordinates are set to zero. Defaults to True.
- EmbedNullConformer(random: bool = True)#
Embed a conformer with null or random atom coordinates. This helps in cases where a conformer cannot be successfully embedded. You can choose between all-zero coordinates and random coordinates. Use all-zero coordinates if you will set the coordinates later; use random coordinates if you want to optimize the molecule with force fields (RDKit force fields cannot optimize all-zero coordinates).
- Parameters:
random (bool, optional) – Whether to set coordinates to random numbers; otherwise, all coordinates are set to zero. Defaults to True.
- classmethod FromFile(path: str, backend: str = 'openbabel', header: bool = True, removeHs: bool = False, sanitize: bool = True, sameMol: bool = False, **kwargs) RDKitMol #
Read RDKitMol from a file.
- Parameters:
path (str) – File path to data.
backend (str, optional) – The backend used to perceive the molecule. Defaults to 'openbabel'. Currently, only 'openbabel' and 'jensen' are supported.
header (bool, optional) – Whether the lines for the number of atoms and the title are included. Defaults to True.
removeHs (bool) – Whether or not to remove hydrogens from the input. Defaults to False.
sanitize (bool) – Whether or not to use RDKit’s sanitization algorithm to clean the input; it is helpful to set this to False when reading TS files. Defaults to True.
sameMol (bool) – Whether or not all the conformers in the (sdf) file are for the same mol, in which case the conformers are copied directly to the mol. Defaults to False.
- Returns:
RDKitMol – An RDKit molecule object corresponding to the file.
- classmethod FromInchi(inchi: str, removeHs: bool = False, addHs: bool = True, sanitize: bool = True)#
Construct an RDKitMol object from an InChI string.
- Parameters:
inchi (str) – An InChI string. https://en.wikipedia.org/wiki/International_Chemical_Identifier
removeHs (bool, optional) – Whether to remove hydrogen atoms from the molecule; True to remove. Due to the RDKit implementation, this is only effective when sanitize is True as well.
addHs (bool, optional) – Whether to add explicit hydrogen atoms to the molecule; True to add. Only takes effect when removeHs is False.
sanitize (bool, optional) – Whether to sanitize the RDKit molecule; True to sanitize.
- Returns:
RDKitMol – An RDKit molecule object corresponding to the InChI.
- classmethod FromMol(mol: Mol | RWMol, keepAtomMap: bool = True) RDKitMol #
Convert an RDKit Chem.rdchem.Mol molecule to an RDKitMol molecule.
- Parameters:
mol (Union[Mol, RWMol]) – The RDKit Chem.rdchem.Mol / RWMol molecule to be converted.
keepAtomMap (bool, optional) – Whether to keep the original atom mapping. Defaults to True. If no atom mapping is stored in the molecule, atom mapping will be created based on atom indexes.
- Returns:
RDKitMol – RDKitMol molecule converted from the input RDKit Chem.rdchem.Mol molecule.
- classmethod FromOBMol(obMol: openbabel.OBMol, removeHs: bool = False, sanitize: bool = True, embed: bool = True) RDKitMol #
Convert an OpenBabel OBMol to an RDKitMol object.
- Parameters:
obMol (Molecule) – An OpenBabel Molecule object for the conversion.
removeHs (bool, optional) – Whether to remove hydrogen atoms from the molecule. Defaults to False.
sanitize (bool, optional) – Whether to sanitize the RDKit molecule. Defaults to True.
embed (bool, optional) – Whether to embed a 3D conformer from the OBMol. Defaults to True.
- Returns:
RDKitMol – An RDKit molecule object corresponding to the input OpenBabel Molecule object.
- classmethod FromRMGMol(rmgMol: rmgpy.molecule.Molecule, removeHs: bool = False, sanitize: bool = True) RDKitMol #
Convert an RMG Molecule to an RDKitMol object.
- Parameters:
rmgMol ('rmg.molecule.Molecule') – An RMG Molecule instance.
removeHs (bool, optional) – Whether to remove hydrogen atoms from the molecule; True to remove.
sanitize (bool, optional) – Whether to sanitize the RDKit molecule; True to sanitize.
- Returns:
RDKitMol – An RDKit molecule object corresponding to the RMG Molecule.
- classmethod FromSDF(sdf: str, removeHs: bool = False, sanitize: bool = True) RDKitMol #
Convert an SDF string to RDKitMol.
- Parameters:
sdf (str) – An SDF string.
removeHs (bool) – Whether or not to remove hydrogens from the input. Defaults to False.
sanitize (bool) – Whether or not to use RDKit’s sanitization algorithm to clean the input; it is helpful to set this to False when reading TS files. Defaults to True.
- Returns:
RDKitMol – An RDKit molecule object corresponding to the SDF string.
- classmethod FromSmarts(smarts: str) RDKitMol #
Convert a SMARTS string to an RDKitMol object.
- Parameters:
smarts (str) – A SMARTS string of the molecule.
- Returns:
RDKitMol – An RDKit molecule object corresponding to the SMARTS.
- classmethod FromSmiles(smiles: str, removeHs: bool = False, addHs: bool = True, sanitize: bool = True, allowCXSMILES: bool = True, keepAtomMap: bool = True) RDKitMol #
Convert a SMILES string to an RDKitMol object.
- Parameters:
smiles (str) – A SMILES representation of the molecule.
removeHs (bool, optional) – Whether to remove hydrogen atoms from the molecule; True to remove.
addHs (bool, optional) – Whether to add explicit hydrogen atoms to the molecule; True to add. Only takes effect when removeHs is False.
sanitize (bool, optional) – Whether to sanitize the RDKit molecule; True to sanitize.
allowCXSMILES (bool, optional) – Whether to recognize and parse CXSMILES. Defaults to True.
keepAtomMap (bool, optional) – Whether to keep the atom mapping contained in the SMILES. Defaults to True.
- Returns:
RDKitMol – An RDKit molecule object corresponding to the SMILES.
- classmethod FromXYZ(xyz: str, backend: str = 'openbabel', header: bool = True, sanitize: bool = True, embed_chiral: bool = False, **kwargs)#
Convert an XYZ string to an RDKitMol.
- Parameters:
xyz (str) – An XYZ string.
backend (str) – The backend used to perceive the molecule. Defaults to 'openbabel'. Currently, only 'openbabel' and 'jensen' are supported.
header (bool, optional) – Whether the lines for the number of atoms and the title are included. Defaults to True.
sanitize (bool) – Whether to sanitize the RDKit molecule during conversion. It is helpful to set this to False when reading TSs. Defaults to True.
embed_chiral (bool) – True to embed chiral information. Defaults to False.
kwargs (supported) –
- jensen:
charge: The charge of the species. Defaults to 0.
allow_charged_fragments: True for charged fragments, False for radicals. Defaults to False.
use_graph: True to use the networkx module for acceleration. Defaults to True.
use_huckel: True to use extended Huckel bond orders to locate bonds. Defaults to False.
forced_rdmc: Defaults to False. In rare cases, one may want to use a tailored version of the Jensen XYZ parser rather than the one available in RDKit. Set this argument to True to force the use of RDMC’s implementation, which users have some flexibility to modify.
- Returns:
RDKitMol – An RDKit molecule object corresponding to the xyz.
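The header flag distinguishes an XYZ string that starts with an atom-count line and a title line from one that contains only the coordinate lines. The sketch below shows the two layouts with a tiny standalone parser; `parse_xyz` is a hypothetical helper written for illustration and is not the parser used by RDMC or its backends.

```python
def parse_xyz(xyz, header=True):
    """Split an XYZ string into element symbols and Cartesian coordinates."""
    lines = xyz.strip("\n").splitlines()
    if header:
        n_atoms = int(lines[0])      # line 1: number of atoms
        body = lines[2:2 + n_atoms]  # line 2 is the title and is skipped
    else:
        body = [line for line in lines if line.strip()]
    symbols, coords = [], []
    for line in body:
        symbol, x, y, z = line.split()[:4]
        symbols.append(symbol)
        coords.append((float(x), float(y), float(z)))
    return symbols, coords

with_header = """2
hydrogen molecule
H 0.0 0.0 0.0
H 0.0 0.0 0.74
"""
without_header = "H 0.0 0.0 0.0\nH 0.0 0.0 0.74"

print(parse_xyz(with_header, header=True))
print(parse_xyz(without_header, header=False))
```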
- GetAdjacencyMatrix()#
Get the adjacency matrix of the molecule.
- Returns:
numpy.ndarray – A square adjacency matrix of the molecule, where a 1 indicates that atoms are bonded and a 0 indicates that atoms aren’t bonded.
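As an illustration of the 0/1 layout described above, the following plain-Python sketch builds an adjacency matrix from a list of bonded atom-index pairs. It uses nested lists rather than a numpy array, and `adjacency_matrix` is a hypothetical helper, not the RDMC method.

```python
def adjacency_matrix(num_atoms, bonds):
    """Build a symmetric 0/1 matrix from pairs of bonded atom indexes."""
    matrix = [[0] * num_atoms for _ in range(num_atoms)]
    for i, j in bonds:
        matrix[i][j] = 1
        matrix[j][i] = 1
    return matrix

# Heavy-atom skeleton of propane: C0-C1-C2
adj = adjacency_matrix(3, [(0, 1), (1, 2)])
for row in adj:
    print(row)
```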
- GetAllConformers() List[RDKitConf] #
Get all of the embedded conformers.
- Returns:
List[‘RDKitConf’] – A list of all conformers.
- GetAtomMapNumbers() tuple #
Get the atom mapping.
- Returns:
tuple – atom mapping numbers in the sequence of atom index.
- GetAtomMasses() List[float] #
Get the mass of each atom. The order is consistent with the atom indexes.
- Returns:
list – A list of atom masses.
- GetAtomicNumbers()#
Get the atomic numbers of the atoms in the molecule, ordered by atom index.
- Returns:
list – A list of atomic numbers.
- GetAtoms() list #
This is a rewrite of GetAtoms(), based on the findings of RDKit issue. Although RDKit fixed this issue in version 2023.09, we still keep this function for backward compatibility.
- GetBestAlign(refMol, prbCid: int = 0, refCid: int = 0, atomMaps: list | None = None, maxIters: int = 1000, keepBestConformer: bool = True)#
This is a wrapper function that calls AlignMol twice, with reflect set to True and False, respectively.
- Parameters:
refMol (Mol) – RDKit molecule as a reference.
prbCid (int, optional) – The ID of the conformer to be aligned. Defaults to 0.
refCid (int, optional) – The ID of the reference conformer. Defaults to 0.
reflect (bool, optional) – Whether to reflect the conformation of the probe molecule. Defaults to False.
atomMaps (list, optional) – A vector of pairs of atom IDs (probe AtomId, ref AtomId) used to compute the alignments. If this mapping is not specified, an attempt is made to generate one by substructure matching.
maxIters (int, optional) – Maximum number of iterations used in minimizing the RMSD. Defaults to 1000.
keepBestConformer (bool, optional) – Whether to keep the best conformer structure. Defaults to True. This is less helpful when you are comparing different atom mappings.
- Returns:
float – RMSD value.
bool – Whether the reflected conformer gives a better result.
- GetBondsAsTuples() List[tuple] #
Generate a list of length-2 sets indicating the bonding atoms in the molecule.
- Returns:
list – A list of length-2 sets indicating the bonding atoms.
- GetClosedShellMol(cheap: bool = False, sanitize: bool = True) RDKitMol #
Get a closed shell molecule by removing all radical electrons and adding H atoms to these radical sites. This method currently only works for radicals and will not work properly for singlet radicals.
- Parameters:
cheap (bool) – Whether to use a cheap method where H atoms are only implicitly added. Defaults to False. Set it to True only when the molecule is immediately used for generating SMILES/InChI and other representations, and no further manipulation is needed. Otherwise, it may be confusing, as the hydrogen atoms will not appear in the list of atoms, will not be displayed in the 2D graph, etc.
sanitize (bool) – Whether to sanitize the molecule. Defaults to True.
- Returns:
RDKitMol – A closed shell molecule.
- GetConformer(id: int = 0) RDKitConf #
Get the embedded conformer according to ID.
- Parameters:
id (int) – The ID of the conformer to be obtained. Defaults to 0.
- Raises:
ValueError – Bad id assigned.
- Returns:
RDKitConf – A conformer corresponding to the ID.
- GetConformers(ids: list | tuple = [0]) List[RDKitConf] #
Get the embedded conformers according to IDs.
- Parameters:
ids (Union[list, tuple]) – The IDs of the conformers to be obtained. Defaults to [0].
- Raises:
ValueError – Bad id assigned.
- Returns:
List[RDKitConf] – A list of conformers corresponding to the IDs.
- GetDistanceMatrix(id: int = 0) ndarray #
Get the distance matrix of the molecule.
- Parameters:
id (int, optional) – The conformer ID to extract the distance matrix from. Defaults to 0.
- Returns:
np.ndarray – A square distance matrix of the molecule.
- GetElementCounts() Dict[str, int] #
Get the element counts of the molecule.
- Returns:
dict – A dictionary of element counts.
- GetElementSymbols() List[str] #
Get the element symbols of the atoms in the molecule, ordered by atom index.
- Returns:
list – A list of element symbols.
- GetFingerprint(fpType: str = 'morgan', numBits: int = 2048, count: bool = False, **kwargs) ndarray #
Get the fingerprint of the molecule.
- Parameters:
fpType (str, optional) – The type of the fingerprint. Defaults to 'morgan'.
numBits (int, optional) – The number of bits of the fingerprint. Defaults to 2048.
count (bool, optional) – Whether to count the number of occurrences of each bit. Defaults to False.
- Returns:
np.ndarray – A fingerprint of the molecule.
- GetFormalCharge() int #
Get formal charge of the molecule.
- Returns:
int – Formal charge.
- GetHeavyAtoms() list #
Get heavy atoms of the molecule with the order consistent with the atom indexes.
- Returns:
list – A list of heavy atoms.
- GetInternalCoordinates(nonredundant: bool = True) list #
Get internal coordinates of the molecule.
- Parameters:
nonredundant (bool) – Whether to return nonredundant internal coordinates. Defaults to True.
- Returns:
list – A list of internal coordinates.
- GetMolFrags(asMols: bool = False, sanitize: bool = True, frags: list | None = None, fragsMolAtomMapping: list | None = None) tuple #
Find the disconnected fragments of a molecule. For example, for the molecule “CC(=O)[O-].[NH3+]C”, this function splits the molecule into “CC(=O)[O-]” and “[NH3+]C”. By default, this function returns the atom mapping (atom indexes) of each fragment, but options are available for getting mols.
- Parameters:
asMols (bool, optional) – Whether the fragments will be returned as molecules instead of atom IDs. Defaults to False.
sanitize (bool, optional) – Whether the fragment molecules will be sanitized before returning them. Defaults to True.
frags (list, optional) – If this is provided as an empty list, the result will be mol.GetNumAtoms() long on return and will contain the fragment assignment for each atom.
fragsMolAtomMapping (list, optional) – If this is provided as an empty list ([]), the result will be a numFrags long list on return, and each entry will contain the indices of the atoms in that fragment: [(0, 1, 2, 3), (4, 5)].
- Returns:
tuple – a tuple of atom mapping or a tuple of split molecules (RDKitMol).
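Finding disconnected fragments amounts to finding the connected components of the bond graph. The sketch below reproduces the “CC(=O)[O-].[NH3+]C” example on heavy atoms only, using a plain graph traversal; `find_fragments` is a hypothetical helper and the bond list is written by hand, so this shows the idea rather than how RDKit or RDMC implement it.

```python
def find_fragments(num_atoms, bonds):
    """Return a tuple of fragments, each a sorted tuple of atom indexes."""
    neighbors = {i: set() for i in range(num_atoms)}
    for i, j in bonds:
        neighbors[i].add(j)
        neighbors[j].add(i)
    seen, fragments = set(), []
    for start in range(num_atoms):
        if start in seen:
            continue
        stack, fragment = [start], set()
        while stack:  # depth-first traversal of one connected component
            atom = stack.pop()
            if atom in fragment:
                continue
            fragment.add(atom)
            stack.extend(neighbors[atom] - fragment)
        seen |= fragment
        fragments.append(tuple(sorted(fragment)))
    return tuple(fragments)

# Heavy atoms of CC(=O)[O-].[NH3+]C: C0 C1 O2 O3 . N4 C5
bonds = [(0, 1), (1, 2), (1, 3), (4, 5)]
frags = find_fragments(6, bonds)
# Per-atom fragment assignment, analogous to what the frags argument collects
assignment = [next(k for k, frag in enumerate(frags) if atom in frag) for atom in range(6)]
print(frags)
print(assignment)
```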
- GetPositions(id: int = 0) ndarray #
Get atom positions of the embedded conformer.
- Parameters:
id (int, optional) – The conformer ID to extract atom positions from. Defaults to 0.
- Returns:
np.ndarray – An N x 3 matrix containing atom coordinates.
- GetSpinMultiplicity() int #
Get the spin multiplicity of the molecule. The spin multiplicity is defined as 2S + 1 and is calculated following Hund’s rule of maximum multiplicity.
- Returns:
int – Spin multiplicity.
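Under Hund’s rule of maximum multiplicity, all unpaired electrons are taken to have parallel spins, so S is half the number of radical electrons. A minimal worked example, assuming the radical-electron count is already known (the function name is hypothetical):

```python
def spin_multiplicity(num_radical_electrons):
    """Return 2S + 1, with each unpaired electron contributing 1/2 to S."""
    total_spin = num_radical_electrons / 2
    return int(2 * total_spin + 1)

print(spin_multiplicity(0))  # closed shell
print(spin_multiplicity(1))  # one unpaired electron
print(spin_multiplicity(2))  # two unpaired electrons with parallel spins
```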
- GetSubstructMatch(query: RDKitMol | RWMol | Mol, useChirality: bool = False, useQueryQueryMatches: bool = False) tuple #
Returns the indices of the molecule’s atoms that match a substructure query.
- Parameters:
query (Mol) – An RDKit molecule.
useChirality (bool, optional) – Enables the use of stereochemistry in the matching. Defaults to False.
useQueryQueryMatches (bool, optional) – Use query-query matching logic. Defaults to False.
- Returns:
tuple – A tuple of matched indices.
- GetSubstructMatchAndRecipe(mol: RDKitMol) Tuple[tuple, dict] #
Get the substructure match between two molecules and a recipe to convert the provided mol to the current mol. After swapping the atom indices in mol according to the recipe, mol should have the same connectivity as the current molecule. Note that if no match is found, the returned match and recipe will be empty.
- Parameters:
mol (RDKitMol) – The molecule to compare with.
- Returns:
tuple – The substructure match.
dict – A truncated atom mapping of mol2 to mol1.
- GetSubstructMatches(query: RDKitMol | RWMol | Mol, uniquify: bool = True, useChirality: bool = False, useQueryQueryMatches: bool = False, maxMatches: int = 1000) tuple #
Returns tuples of the indices of the molecule’s atoms that match a substructure query.
- Parameters:
query (Mol) – A molecule.
uniquify (bool, optional) – Determines whether or not the matches are uniquified. Defaults to True.
useChirality (bool, optional) – Enables the use of stereochemistry in the matching. Defaults to False.
useQueryQueryMatches (bool, optional) – Use query-query matching logic. Defaults to False.
maxMatches – The maximum number of matches that will be returned, to prevent a combinatorial explosion. Defaults to 1000.
- Returns:
tuple – A tuple of tuples of matched indices.
- GetSymmSSSR() tuple #
Get a symmetrized SSSR for a molecule.
- Returns:
tuple – A sequence of sequences containing the rings found as atom IDs.
- GetTorsionTops(torsion: Iterable, allowNonbondPivots: bool = False) tuple #
Generate tops for the given torsion. Top atoms are defined as the atoms on one side of the torsion. The mol should be a single connected piece when using this function; otherwise, the results will be misleading.
- Parameters:
torsion (Iterable) – An iterable with four elements and the 2nd and 3rd are the pivot of the torsion.
allowNonbondPivots (bool, optional) – Allow non-bonding pivots. Defaults to
False
.
- Returns:
tuple – Two frags, one for the top on each side of the torsion.
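One way to picture the tops is to cut the bond between the two pivot atoms and collect the atoms reachable from each pivot. The sketch below does this on a hand-written bond list. It is an illustration under assumptions: `torsion_tops` is a hypothetical helper, including the pivot atom in its own top is a choice made here, and ring torsions (where cutting the pivot bond does not split the molecule) are not handled.

```python
def torsion_tops(bonds, torsion):
    """Split atoms into the two sides of a torsion by cutting the pivot bond."""
    _, pivot1, pivot2, _ = torsion
    neighbors = {}
    for i, j in bonds:
        if {i, j} == {pivot1, pivot2}:
            continue  # cut the pivot bond
        neighbors.setdefault(i, set()).add(j)
        neighbors.setdefault(j, set()).add(i)

    def collect(start):
        # Gather every atom reachable from start without crossing the pivot bond
        top, stack = set(), [start]
        while stack:
            atom = stack.pop()
            if atom not in top:
                top.add(atom)
                stack.extend(neighbors.get(atom, set()) - top)
        return tuple(sorted(top))

    return collect(pivot1), collect(pivot2)

# Heavy atoms of n-butane: C0-C1-C2-C3, torsion around the C1-C2 bond
tops = torsion_tops([(0, 1), (1, 2), (2, 3)], (0, 1, 2, 3))
print(tops)
```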
- GetTorsionalModes(excludeMethyl: bool = False, includeRings: bool = False) list #
Get all of the torsional modes (rotors) from the molecule.
- Parameters:
excludeMethyl (bool) – Whether to exclude the torsions with methyl groups. Defaults to False.
includeRings (bool) – Whether or not to include ring torsions. Defaults to False.
- Returns:
list – A list of four-atom-index lists indicating the torsional modes.
- GetVdwMatrix(threshold: float = 0.4) ndarray | None #
Get the derived Van der Waals matrix, which can be used to analyze the collision of atoms. More information can be found in generate_vdw_mat.
- Parameters:
threshold – A float indicating the threshold to use in the vdw matrix. Defaults to 0.4.
- Returns:
Optional[np.ndarray] – A 2D array of the derived Van der Waals matrix if the matrix exists, otherwise None.
- HasCollidingAtoms(threshold: float = 0.4) bool #
Check whether the molecule has colliding atoms.
- Parameters:
threshold – A float indicating the threshold to use in the vdw matrix. Defaults to 0.4.
- Returns:
bool – Whether the molecule has colliding atoms.
- HasSameConnectivity(refmol: RDKitMol) bool #
Check whether the molecule has the same connectivity as the reference molecule.
- Parameters:
refmol (RDKitMol) – The reference molecule.
- Returns:
bool – Whether the molecule has the same connectivity as the reference molecule.
- HasSameConnectivityConformer(confId: int = 0, backend: str = 'openbabel', **kwargs) bool #
Check whether the conformer of the molecule (defined by its spatial coordinates) has the same connectivity as the molecule.
- Parameters:
confId (int, optional) – The conformer ID. Defaults to 0.
backend (str, optional) – The backend to use for the comparison. Defaults to 'openbabel'.
**kwargs – The keyword arguments to pass to the backend.
- Returns:
bool – Whether the conformer has the same connectivity as the molecule.
- Kekulize(clearAromaticFlags: bool = False)#
Kekulizes the molecule.
- Parameters:
clearAromaticFlags (optional) – If True, all atoms and bonds in the molecule will be marked non-aromatic following the kekulization. Defaults to False.
- PrepareOutputMol(removeHs: bool = False, sanitize: bool = True) Mol #
Generate an RDKit Mol instance for output purposes, ensuring that the original molecule is not modified.
- Parameters:
removeHs (bool, optional) –
Remove less useful explicit H atoms. E.g., when outputting SMILES, explicitly added H atoms are included and reduce readability. Defaults to False. Note that the following Hs are not removed:
H atoms that aren’t connected to a heavy atom. E.g., [H][H].
Labelled H. E.g., atoms with atomic number=1, but isotope > 1.
Two coordinate Hs. E.g., central H in C[H-]C.
Hs connected to dummy atoms.
Hs that are part of the definition of double bond stereochemistry.
Hs that are not connected to anything else.
sanitize (bool, optional) – Whether to sanitize the molecule. Defaults to True.
- Returns:
Mol – A Mol instance used for output purpose.
- Reflect(id: int = 0)#
Reflect the atom coordinates of a conformer, turning the geometry into its mirror image.
- Parameters:
id (int, optional) – The conformer id to reflect. Defaults to
0
.
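A reflection can be sketched by negating one Cartesian component of every atom. Which mirror plane RDMC uses is not stated above, so negating x here is an assumption made for illustration, and `reflect` is a hypothetical standalone function.

```python
def reflect(coords):
    """Mirror coordinates through the yz-plane by negating x (assumed plane)."""
    return [(-x, y, z) for x, y, z in coords]

coords = [(1.0, 2.0, 3.0), (-0.5, 0.0, 1.5)]
mirrored = reflect(coords)
print(mirrored)
```

Applying the reflection twice restores the original geometry, which is a quick sanity check for any mirror operation.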
- RemoveHs(sanitize: bool = True)#
Remove H atoms. Useful when trying to match heavy atoms.
- Parameters:
sanitize (bool, optional) – Whether to sanitize the molecule. Defaults to
True
.
- RenumberAtoms(newOrder: dict | list | None = None, updateAtomMap: bool = True) RDKitMol #
Return a new copy of the RDKitMol with its atoms (indexes) reordered.
- Parameters:
newOrder (list or dict, optional) – The new ordering of the atoms (should be numAtoms long).
- If provided as a list, it should be a list of atom indexes. E.g., if newOrder is [3,2,0,1], then atom 3 in the original molecule will be atom 0 in the new one.
- If provided as a dict, it should be a mapping between atoms. E.g., if newOrder is {0: 3, 1: 2, 2: 0, 3: 1}, then atom 0 in the original molecule will be atom 3 in the new one. Unlike the list case, newOrder can be a partial mapping, but one should make sure all the pairs are included. E.g., {0: 3, 3: 0}.
- If no value is provided (default), the molecule will be renumbered based on the current atom map numbers. This is helpful when the sequence of atom map numbers and atom indexes are inconsistent.
updateAtomMap (bool) – Whether to update the atom map numbers based on the new order. Defaults to True.
- Returns:
RDKitMol – Molecule with reordered atoms.
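The list and dict forms of newOrder read in opposite directions: a list says which original atom lands at each new position, while a dict maps an original index to its new index. The sketch below converts a (possibly partial) dict into the equivalent list and applies it to a list of element symbols. Both helpers are hypothetical and only mimic the semantics described above; treating unlisted atoms as staying in place is an assumption.

```python
def order_from_dict(mapping, num_atoms):
    """Turn {old_index: new_index} into a list whose entry k is the old index of new atom k."""
    full = {i: i for i in range(num_atoms)}  # assumed: atoms not mentioned stay in place
    full.update(mapping)
    order = [None] * num_atoms
    for old, new in full.items():
        order[new] = old
    return order

def renumber(items, new_order):
    """Reorder per-atom items so that new atom k is original atom new_order[k]."""
    return [items[old] for old in new_order]

symbols = ["C", "N", "O", "F"]
print(renumber(symbols, [3, 2, 0, 1]))
print(order_from_dict({0: 3, 1: 2, 2: 0, 3: 1}, 4))
print(order_from_dict({0: 3, 3: 0}, 4))
```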
- Sanitize(sanitizeOps: int | SanitizeFlags | None = rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_ALL)#
Sanitize the molecule.
- Parameters:
sanitizeOps (int or str, optional) – Sanitize operations to be carried out. Defaults to
SanitizeFlags.SANITIZE_ALL
. More details can be found at RDKit docs.
- SaturateBiradicalSites12(multiplicity: int, verbose: bool = True)#
A method that helps saturate 1,2 biradicals to match the given molecule spin multiplicity. E.g.:
*C - C* => C = C
In the current implementation, no error is raised if the function doesn’t achieve the goal. This function has not been tested on nitrogenates.
- Parameters:
multiplicity (int) – The target multiplicity.
verbose (bool) – Whether to print additional information. Defaults to True.
- SaturateBiradicalSitesCDB(multiplicity: int, chain_length: int = 8, verbose: bool = True)#
A method that helps saturate biradicals that have conjugated double bonds in between, to match the given molecule spin multiplicity. E.g., 1,4 biradicals can be saturated if there is an unsaturated bond between them:
*C - C = C - C* => C = C - C = C
In the current implementation, no error is raised if the function doesn’t achieve the goal. This function has not been tested on nitrogenates.
- Parameters:
multiplicity (int) – The target multiplicity.
chain_length (int) – How long the conjugated double bond chain is. A larger value will result in longer computational time. Defaults to 8.
verbose (bool) – Whether to print additional information. Defaults to True.
- SaturateCarbene(multiplicity: int, verbose: bool = True)#
A method that helps saturate carbenes and nitrenes to match the given molecule spin multiplicity:
*-C-* (triplet) => C-(**) (singlet)
In the current implementation, no error is raised if the function doesn’t achieve the goal. This function has not been tested on nitrogenates.
- Parameters:
multiplicity (int) – The target multiplicity.
verbose (bool) – Whether to print additional information. Defaults to True.
- SaturateMol(multiplicity: int, chain_length: int = 8, verbose: bool = False)#
A method that helps saturate the molecule to match the given molecule spin multiplicity. This is just a wrapper that calls SaturateBiradicalSites12(), SaturateBiradicalSitesCDB(), and SaturateCarbene():
*C - C* => C = C
*C - C = C - C* => C = C - C = C
*-C-* (triplet) => C-(**) (singlet)
In the current implementation, no error is raised if the function doesn’t achieve the goal. This function has not been tested on nitrogenates.
- Parameters:
multiplicity (int) – The target multiplicity.
chain_length (int) – How long the conjugated double bond chain is. A larger value will result in longer computational time. Defaults to 8.
verbose (bool) – Whether to print intermediate information. Defaults to False.
- SetAtomMapNumbers(atomMap: Sequence[int] | None = None)#
Set the atom map numbers. By default, atom indexes are used. This can be helpful when plotting the molecule in a 2D graph.
- Parameters:
atomMap (list, tuple, optional) – A sequence of integers for atom mapping.
- SetPositions(coords: Sequence | str, id: int = 0, header: bool = False)#
Set the atom positions of one of the conformers.
- Parameters:
coords (sequence) – A tuple/list/ndarray containing atom positions, or a string in the typical XYZ format.
id (int, optional) – ID of the conformer to assign the positions to. Defaults to 0.
header (bool) – Whether the XYZ string has a header. Defaults to False.
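To make the role of the header flag concrete, here is a minimal sketch of how an XYZ string could be turned into coordinates. The helper name parse_xyz_coords is hypothetical and this is not how RDMC parses the string internally; it only assumes the common XYZ layout (optional atom-count and comment lines, then "symbol x y z" per atom).

```python
def parse_xyz_coords(xyz, header=False):
    """Return a list of (x, y, z) tuples from an XYZ-formatted string.

    When `header` is True, the first two lines (atom count and comment)
    are skipped before reading the atom records.
    """
    lines = xyz.strip().splitlines()
    if header:
        lines = lines[2:]
    coords = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # tolerate blank lines
        # fields[0] is the element symbol; the next three are the coordinates
        coords.append(tuple(float(v) for v in fields[1:4]))
    return coords


xyz = "2\nhydrogen molecule\nH 0.0 0.0 0.0\nH 0.0 0.0 0.74"
coords = parse_xyz_coords(xyz, header=True)
print(coords)
```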
- SetVdwMatrix(threshold: float = 0.4, vdw_radii: dict = {1: 1.2, 2: 1.4, 3: 2.2, 4: 1.9, 5: 1.8, 6: 1.7, 7: 1.6, 8: 1.55, 9: 1.5, 10: 1.54, 11: 2.4, 12: 2.2, 13: 2.1, 14: 2.1, 15: 1.95, 16: 1.8, 17: 1.8, 18: 1.88, 19: 2.8, 20: 2.4, 21: 2.3, 22: 2.15, 23: 2.05, 24: 2.05, 25: 2.05, 26: 2.05, 27: 2.0, 28: 2.0, 29: 2.0, 30: 2.1, 31: 2.1, 32: 2.1, 33: 2.05, 34: 1.9, 35: 1.9, 36: 2.02, 37: 2.9, 38: 2.55, 39: 2.4, 40: 2.3, 41: 2.15, 42: 2.1, 43: 2.05, 44: 2.05, 45: 2.0, 46: 2.05, 47: 2.1, 48: 2.2, 49: 2.2, 50: 2.25, 51: 2.2, 52: 2.1, 53: 2.1, 54: 2.16, 55: 3.0, 56: 2.7, 57: 2.5, 58: 2.48, 59: 2.47, 60: 2.45, 61: 2.43, 62: 2.42, 63: 2.4, 64: 2.38, 65: 2.37, 66: 2.35, 67: 2.33, 68: 2.32, 69: 2.3, 70: 2.28, 71: 2.27, 72: 2.25, 73: 2.2, 74: 2.1, 75: 2.05, 76: 2.0, 77: 2.0, 78: 2.05, 79: 2.1, 80: 2.05, 81: 2.2, 82: 2.3, 83: 2.3, 84: 2.0, 85: 2.0, 86: 2.0, 87: 2.0, 88: 2.0, 89: 2.0, 90: 2.4, 91: 2.0, 92: 2.3, 93: 2.0, 94: 2.0, 95: 2.0, 96: 2.0, 97: 2.0, 98: 2.0, 99: 2.0, 100: 2.0, 101: 2.0, 102: 2.0, 103: 2.0, 104: 2.0, 105: 2.0, 106: 2.0, 107: 2.0, 108: 2.0, 109: 2.0, 110: 2.0, 111: 2.0, 112: 2.0, 113: 2.0, 114: 2.0, 115: 2.0, 116: 2.0, 117: 2.0, 118: 2.0})#
Set the derived Van der Waals matrix, which is an upper triangular matrix calculated from a threshold, usually around 0.4, of the Van der Waals radii. Its diagonal elements are all zeros. The element (i, j) is calculated as threshold * (R(atom i) + R(atom j)). If two atoms are bonded, the value is set to zero. When threshold = 0.4, the value is close to the covalent bond length.
- Parameters:
threshold (float) – The threshold used to calculate the derived Van der Waals matrix. A larger value results in a matrix with larger values; when compared with the distance matrix, it may overestimate the overlap between atoms. The default value is 0.4.
vdw_radii (dict) – A dict that stores the Van der Waals radii of different elements.
- Raises:
ValueError – Invalid threshold is supplied.
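The matrix definition above can be sketched in plain Python. This is an illustration under stated assumptions, not RDMC's code: the function name derived_vdw_matrix and its list-based inputs are hypothetical, and because the documentation does not say which thresholds count as invalid, the sketch assumes that non-positive values are rejected.

```python
def derived_vdw_matrix(atomic_nums, bonds, threshold=0.4, vdw_radii=None):
    """Upper-triangular matrix with entries threshold * (R_i + R_j).

    atomic_nums: atomic number of each atom, in index order
    bonds:       iterable of (i, j) index pairs for bonded atoms
    """
    if vdw_radii is None:
        # A small subset of the default radii listed in the signature above
        vdw_radii = {1: 1.2, 6: 1.7, 7: 1.6, 8: 1.55}
    if threshold <= 0:
        # Assumption: the docs do not state which thresholds are invalid
        raise ValueError("threshold must be positive")
    bonded = {tuple(sorted(b)) for b in bonds}
    n = len(atomic_nums)
    mat = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):  # strictly upper triangle; diagonal stays zero
            if (i, j) not in bonded:  # bonded pairs keep a zero entry
                r_sum = vdw_radii[atomic_nums[i]] + vdw_radii[atomic_nums[j]]
                mat[i][j] = threshold * r_sum
    return mat


# Water: O(0) is bonded to H(1) and H(2); only the H...H pair is non-bonded.
mat = derived_vdw_matrix([8, 1, 1], bonds=[(0, 1), (0, 2)])
for row in mat:
    print(row)
```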
- ToAtoms(confId: int = 0) Atoms #
Convert RDKitMol to an ase.Atoms object.
- Parameters:
confId (int) – The conformer ID to be exported. Defaults to 0.
- Returns:
Atoms – The corresponding ase.Atoms object.
- ToGraph(keep_bond_order: bool = False) Graph #
Convert RDKitMol to a networkx graph.
- Parameters:
keep_bond_order (bool) – Whether to keep bond order information. Defaults to False, meaning all bonds are treated as single bonds.
- Returns:
nx.Graph – A networkx graph representing the molecule.
- ToInchi(options: str = '') str #
Convert the RDKitMol to an InChI string using RDKit’s built-in converter.
- Parameters:
options (str, optional) – The InChI generation options. Each option should be prefixed with either a - or a /. Available options are explained in the InChI technical FAQ: https://www.inchi-trust.org/technical-faq/#15.14 and https://www.inchi-trust.org/?s=user+guide. Defaults to “”.
- ToMolBlock(confId: int = -1) str #
Convert RDKitMol to a mol block string.
- Parameters:
confId (int) – The conformer ID to be exported.
- Returns:
str – The mol block of the molecule.
- ToOBMol() openbabel.OBMol #
Convert RDKitMol to an OBMol.
- Returns:
OBMol – The corresponding openbabel OBMol.
- ToRWMol() RWMol #
Convert the RDKitMol molecule back to an RDKit Chem.rdchem.RWMol.
- Returns:
RWMol – An RDKit Chem.rdchem.RWMol molecule.
- ToSDFFile(path: str, confId: int = -1)#
Write the molecule information to an .sdf file.
- Parameters:
path (str) – The path to save the .sdf file.
- ToSmiles(stereo: bool = True, kekule: bool = False, canonical: bool = True, removeAtomMap: bool = True, removeHs: bool = True) str #
Convert RDKitMol to a SMILES string.
- Parameters:
stereo (bool, optional) – Whether to keep stereochemistry information. Defaults to True.
kekule (bool, optional) – Whether to use the Kekule form. Defaults to False.
canonical (bool, optional) – Whether to generate a canonical SMILES. Defaults to True.
removeAtomMap (bool, optional) – Whether to remove atom map numbers from the SMILES. Defaults to True.
removeHs (bool, optional) – Whether to remove H atoms to make the resulting SMILES cleaner. Defaults to True.
- Returns:
str – The SMILES string of the molecule.
- ToXYZ(confId: int = -1, header: bool = True, comment: str = '') str #
Convert RDKitMol to an XYZ string.
- Parameters:
confId (int) – The conformer ID to be exported.
header (bool, optional) – Whether to include the header (first two lines). Defaults to True.
- Returns:
str – The XYZ string of the molecule.
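The effect of the header and comment arguments can be sketched with a small stand-alone writer. The function to_xyz below is hypothetical and the column widths are an arbitrary choice; it only assumes the usual XYZ layout, in which the two header lines are the atom count and a free-form comment.

```python
def to_xyz(symbols, coords, header=True, comment=""):
    """Build an XYZ string from element symbols and (x, y, z) coordinates."""
    lines = [
        f"{sym:<3}{x:14.6f}{y:14.6f}{z:14.6f}"
        for sym, (x, y, z) in zip(symbols, coords)
    ]
    if header:
        # The header is the atom count followed by one comment line
        lines = [str(len(symbols)), comment] + lines
    return "\n".join(lines) + "\n"


xyz_text = to_xyz(["H", "H"], [(0.0, 0.0, 0.0), (0.0, 0.0, 0.74)], comment="H2")
print(xyz_text)
```

Passing header=False yields only the atom records, which is the form a reader that does not expect a header would consume.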
- rdmc.mol.generate_vdw_mat(rd_mol, threshold: float = 0.4, vdw_radii: dict = {1: 1.2, 2: 1.4, 3: 2.2, 4: 1.9, 5: 1.8, 6: 1.7, 7: 1.6, 8: 1.55, 9: 1.5, 10: 1.54, 11: 2.4, 12: 2.2, 13: 2.1, 14: 2.1, 15: 1.95, 16: 1.8, 17: 1.8, 18: 1.88, 19: 2.8, 20: 2.4, 21: 2.3, 22: 2.15, 23: 2.05, 24: 2.05, 25: 2.05, 26: 2.05, 27: 2.0, 28: 2.0, 29: 2.0, 30: 2.1, 31: 2.1, 32: 2.1, 33: 2.05, 34: 1.9, 35: 1.9, 36: 2.02, 37: 2.9, 38: 2.55, 39: 2.4, 40: 2.3, 41: 2.15, 42: 2.1, 43: 2.05, 44: 2.05, 45: 2.0, 46: 2.05, 47: 2.1, 48: 2.2, 49: 2.2, 50: 2.25, 51: 2.2, 52: 2.1, 53: 2.1, 54: 2.16, 55: 3.0, 56: 2.7, 57: 2.5, 58: 2.48, 59: 2.47, 60: 2.45, 61: 2.43, 62: 2.42, 63: 2.4, 64: 2.38, 65: 2.37, 66: 2.35, 67: 2.33, 68: 2.32, 69: 2.3, 70: 2.28, 71: 2.27, 72: 2.25, 73: 2.2, 74: 2.1, 75: 2.05, 76: 2.0, 77: 2.0, 78: 2.05, 79: 2.1, 80: 2.05, 81: 2.2, 82: 2.3, 83: 2.3, 84: 2.0, 85: 2.0, 86: 2.0, 87: 2.0, 88: 2.0, 89: 2.0, 90: 2.4, 91: 2.0, 92: 2.3, 93: 2.0, 94: 2.0, 95: 2.0, 96: 2.0, 97: 2.0, 98: 2.0, 99: 2.0, 100: 2.0, 101: 2.0, 102: 2.0, 103: 2.0, 104: 2.0, 105: 2.0, 106: 2.0, 107: 2.0, 108: 2.0, 109: 2.0, 110: 2.0, 111: 2.0, 112: 2.0, 113: 2.0, 114: 2.0, 115: 2.0, 116: 2.0, 117: 2.0, 118: 2.0})#
Generate a derived Van der Waals matrix, which is an upper triangular matrix calculated from a threshold, usually around 0.4, of the Van der Waals radii. Its diagonal elements are all zeros. The element (i, j) is calculated as threshold * (R(atom i) + R(atom j)). If two atoms are bonded, the value is set to zero. When threshold = 0.4, the value is close to the covalent bond length.
- Parameters:
threshold (float) – The threshold used to calculate the derived Van der Waals matrix. A larger value results in a matrix with larger values; when compared with the distance matrix, it may overestimate the overlap between atoms. The default value is 0.4.
vdw_radii (dict) – A dict that stores the Van der Waals radii of different elements.
- Raises:
ValueError – Invalid threshold is supplied.
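One way such a matrix might be used is to compare it with interatomic distances and flag atom pairs that sit too close together. The sketch below assumes that use case; find_close_contacts is a hypothetical helper, not part of RDMC, and the matrix is written out by hand rather than produced by generate_vdw_mat.

```python
import math


def find_close_contacts(coords, vdw_mat):
    """Return (i, j, distance) for pairs closer than the derived vdW entry."""
    contacts = []
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            # Bonded pairs have a zero entry, so they are never flagged
            if d < vdw_mat[i][j]:
                contacts.append((i, j, d))
    return contacts


# Three non-bonded H atoms; 0.96 = 0.4 * (1.2 + 1.2)
coords = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.5), (0.0, 0.0, 3.0)]
vdw_mat = [
    [0.0, 0.96, 0.96],
    [0.0, 0.0, 0.96],
    [0.0, 0.0, 0.0],
]
contacts = find_close_contacts(coords, vdw_mat)
print(contacts)
```

A larger threshold enlarges every non-zero entry, so more pairs are flagged; this is the overestimation the parameter description warns about.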
- rdmc.mol.parse_xyz_or_smiles_list(mol_list, with_3d_info: bool = False, **kwargs)#
A helper function to parse a list of XYZ and SMILES strings and, optionally, report which entries provide conformational (3D) information.
- Parameters:
mol_list (list) – A list of SMILES strings, XYZ strings, or (string, multiplicity) tuples that specify the desired multiplicity. E.g., ['CCC', 'H 0 0 0', ('[CH2]', 1)].
with_3d_info (bool) – Whether to indicate which entries are from 3D representations. Defaults to False.
- Returns:
list – A list of RDKitMol objects.
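The accepted input shapes can be illustrated with a small pre-processing sketch. Both helpers below are hypothetical, and the XYZ check is a simple heuristic chosen for illustration; RDMC may distinguish XYZ from SMILES differently. No RDKitMol objects are built here.

```python
def split_entry(entry):
    """Return (string, multiplicity or None) for one list entry."""
    if isinstance(entry, (tuple, list)):
        string, mult = entry
        return string, mult
    return entry, None


def looks_like_xyz(string):
    """Heuristic: every non-empty line has the form `symbol x y z`."""
    records = [ln.split() for ln in string.strip().splitlines() if ln.strip()]
    if not records:
        return False
    for fields in records:
        if len(fields) != 4:
            return False
        try:
            [float(v) for v in fields[1:]]
        except ValueError:
            return False
    return True


mol_list = ['CCC', 'H 0 0 0', ('[CH2]', 1)]
parsed = []
for entry in mol_list:
    string, mult = split_entry(entry)
    parsed.append((string, mult, looks_like_xyz(string)))
print(parsed)
```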