Utils

Utilities for conformer generation modules.

rdmc.conformer_generation.utils.cluster_confs(mol: RDKitMol, cutoff: float = 1.0) → RDKitMol

Cluster conformers of a molecule based on RMSD.

Parameters:
  • mol ('RDKitMol') – An RDKitMol object.

  • cutoff (float, optional) – The cutoff for clustering. Defaults to 1.0.

Returns:

mol (‘RDKitMol’) – An RDKitMol object with clustered conformers.
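
The role of the cutoff can be illustrated with a small, self-contained sketch. This is not the rdmc implementation: it assumes the conformers are already aligned, uses a plain RMSD without alignment, and applies a greedy scheme in which a conformer is kept only if it lies farther than the cutoff from every conformer already kept. The names rmsd and greedy_cluster and the toy coordinates are hypothetical.

```python
import math


def rmsd(a, b):
    """RMSD between two equally sized lists of (x, y, z) tuples, without alignment."""
    squared = sum(
        (p - q) ** 2
        for atom_a, atom_b in zip(a, b)
        for p, q in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(a))


def greedy_cluster(conformers, cutoff=1.0):
    """Return the indices of one representative conformer per cluster."""
    kept = []
    for i, conf in enumerate(conformers):
        # Keep a conformer only if it differs from every representative so far.
        if all(rmsd(conf, conformers[j]) > cutoff for j in kept):
            kept.append(i)
    return kept


# Three toy two-atom "conformers" (hypothetical coordinates).
confs = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0)],  # nearly identical to the first
    [(0.0, 0.0, 0.0), (0.0, 3.0, 0.0)],  # clearly different
]
representatives = greedy_cluster(confs, cutoff=1.0)
```

In this sketch a larger cutoff merges more conformers into the same cluster, so fewer representatives survive.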

rdmc.conformer_generation.utils.convert_log_to_mol(log_path: str, amplitude: float = 1.0, num_frames: int = 10, weights: bool | array = False) → None | RDKitMol

Convert a TS optimization log file to an RDKitMol object with conformers.

Parameters:
  • log_path (str) – The path to the log file.

  • amplitude (float) – The amplitude of the motion. If a single value is provided, the guess will be unique (if available). Alternatively, a list can be provided, in which case all possible results are returned. Defaults to 1.0.

  • num_frames (int) – The number of frames in each direction (forward and reverse). Defaults to 10.

  • weights (bool or np.array) – If True, use sqrt(atom mass) as a scaling factor for the displacement. If False, use identity weights. If an N x 1 np.array is provided, it is used as the weights. The concern is that light atoms (e.g., H) tend to have larger motions than heavier atoms. Defaults to False.

Returns:

mol (‘RDKitMol’) – An RDKitMol object.

rdmc.conformer_generation.utils.dict_to_mol(mol_data: List[dict], conf_copy_attrs: list | None = None) → RDKitMol

Convert mol data (a list of dictionaries storing the conformer objects, atom coordinates, and conformer-level attributes) to an RDKitMol. The method assumes that the first conformer’s owning mol contains the conformer-level attributes, which are extracted through the Copy function. This should be the case if the data was generated with the mol_to_dict function.

Parameters:
  • mol_data (list) – A list containing dictionaries of data entries for each conformer.

  • conf_copy_attrs (list, optional) – Conformer-level attributes to copy to the mol. Defaults to None, which means no attributes will be copied.

Returns:

mol (‘RDKitMol’) – An RDKitMol object.

rdmc.conformer_generation.utils.get_conf_failure_mode(rxn_dir: str, pruner: bool = True) → dict

Parse a reaction directory for a TS generation run and extract failure modes (which conformer failed the full workflow and for what reason).

Parameters:
  • rxn_dir (str) – The path to the reaction directory.

  • pruner (bool, optional) – Whether or not a pruner was used during the workflow. Defaults to True.

Returns:

failure_dict (‘dict’) – Dictionary of conformer ids mapped to the corresponding failure mode. The failure mode can be one of the following: opt, prune, freq, irc, workflow, none.
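
A returned dictionary of this shape can be summarized with the standard library. The sketch below uses a hand-written stand-in for the return value, so the conformer ids (integers here) and the particular modes are hypothetical. It also assumes that none marks a conformer that passed every step.

```python
from collections import Counter

# Hypothetical stand-in for the dictionary returned by get_conf_failure_mode.
failure_dict = {0: "none", 1: "opt", 2: "freq", 3: "none", 4: "irc"}

# Count how many conformers ended in each failure mode.
mode_counts = Counter(failure_dict.values())

# Conformers assumed to have completed the full workflow.
succeeded = sorted(cid for cid, mode in failure_dict.items() if mode == "none")
```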

rdmc.conformer_generation.utils.get_frames_from_freq(log: GaussianLog, amplitude: float = 1.0, num_frames: int = 10, weights: bool | array = False) → Tuple[ndarray, ndarray]

Get the reaction mode as frames from a TS optimization log file.

Parameters:
  • log (GaussianLog) – A Gaussian log object with vibrational frequencies calculated.

  • amplitude (float) – The amplitude of the motion. If a single value is provided, the guess will be unique (if available). Alternatively, a list can be provided, in which case all possible results are returned. Defaults to 1.0.

  • num_frames (int) – The number of frames in each direction (forward and reverse). Defaults to 10.

  • weights (bool or np.array) – If True, use sqrt(atom mass) as a scaling factor for the displacement. If False, use identity weights. If an N x 1 np.array is provided, it is used as the weights. The concern is that light atoms (e.g., H) tend to have larger motions than heavier atoms. Defaults to False.

Returns:
  • np.array – The atomic numbers as a 1D array.

  • np.array – The 3D geometries at each frame as a 3D array with shape (2 x number of frames + 1, number of atoms, 3).
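
How amplitude, num_frames, and weights combine can be sketched without any log file. The snippet below is an illustration under stated assumptions rather than the rdmc code: it takes an equilibrium geometry and one displacement vector per atom as plain Python lists, scales the displacement linearly from -amplitude to +amplitude, and multiplies each atom's displacement by its weight. The function name displaced_frames, the sign convention, and the toy H2-like data are hypothetical.

```python
import math


def displaced_frames(coords, mode, amplitude=1.0, num_frames=10, weights=None):
    """Return 2 * num_frames + 1 geometries displaced along one vibrational mode."""
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    if weights is None:
        weights = [1.0] * len(coords)  # identity weights
    frames = []
    for k in range(2 * num_frames + 1):
        # The factor runs from -amplitude to +amplitude and is exactly 0 in the middle.
        factor = amplitude * (k - num_frames) / num_frames
        frames.append([
            tuple(x + factor * w * d for x, d in zip(atom, disp))
            for atom, disp, w in zip(coords, mode, weights)
        ])
    return frames


# Toy diatomic: two atoms on the z axis moving in opposite directions.
coords = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.74)]
mode = [(0.0, 0.0, -1.0), (0.0, 0.0, 1.0)]
masses = [1.008, 1.008]

plain = displaced_frames(coords, mode, amplitude=0.5, num_frames=2)
weighted = displaced_frames(
    coords, mode, amplitude=0.5, num_frames=2,
    weights=[math.sqrt(m) for m in masses],
)
```

The middle frame reproduces the input geometry, and the first and last frames are the two extremes of the motion.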

rdmc.conformer_generation.utils.mol_to_dict(mol: RDKitMol, copy: bool = True, iter: int | None = None, conf_copy_attrs: list | None = None) → List[dict]

Convert a molecule to a dictionary that stores its conformers object, atom coordinates, and iteration numbers for a certain calculation (optional).

Parameters:
  • mol ('RDKitMol') – An RDKitMol object.

  • copy (bool, optional) – Use a copy of the molecule to process data. Defaults to True.

  • iter (int, optional) – Number of iterations. Defaults to None.

  • conf_copy_attrs (list, optional) – Conformer-level attributes to copy to the dictionary. Defaults to None, which means no attributes will be copied.

Returns:

list – mol data as a list of dict; each dict corresponds to a conformer.

rdmc.conformer_generation.utils.subprocess_runner(command: list, log_path: str, work_dir: str | None = None, env: dict | None = None)

Run the Gaussian task with the subprocess module.

Parameters:
  • command (list) – The command to run, for example [binary_path, input_path].

  • work_dir (str, optional) – The working directory. Defaults to None, the current working directory will be used.
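
A command, log path, working directory, and environment of this kind map naturally onto subprocess.run. The following is a minimal sketch and not the rdmc implementation: the helper name run_and_log is hypothetical, and the assumption that standard output and standard error are written to log_path is mine. A short Python command stands in for the Gaussian binary so the example can run anywhere.

```python
import os
import subprocess
import sys
import tempfile


def run_and_log(command, log_path, work_dir=None, env=None):
    """Run command, write stdout and stderr to log_path, and return the exit code."""
    with open(log_path, "w") as log_file:
        completed = subprocess.run(
            command,
            stdout=log_file,
            stderr=subprocess.STDOUT,  # merge stderr into the same log
            cwd=work_dir,              # None means the current working directory
            env=env,                   # None means inherit the parent environment
        )
    return completed.returncode


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "job.log")
    # Stand-in command; a real call would pass the binary and input file instead.
    exit_code = run_and_log([sys.executable, "-c", "print('done')"], log_path, work_dir=tmp)
    with open(log_path) as handle:
        log_text = handle.read()
```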