Directed conformer generator

Making sure you have explored all dihedral angles of a molecule is tricky. Ideally, a library would approximate the local dihedral energy distribution to correctly guess the minima and generate those conformers for you. Unfortunately, Molassembler is short on the information needed for such approximations to be correct. For one, Molassembler avoids requiring full correctness of graph bond orders to keep errors in floating-point bond order discretization from propagating. Additionally, Molassembler doesn’t ask you to specify the overall charge of your molecule, since the phenomenological approach to local molecular shape doesn’t require it. Without knowing the overall charge or having a guarantee that the formal bond orders of the graph are correct, it is impossible to assign formal charges purely based on local molecular shapes.

So what can Molassembler offer instead? Since Molassembler explicitly states that its generated conformers are only guesses at local energy minima on the potential energy surface, what it can try to ensure is that energy minimizations of the generated conformers reach all local minima that exist.

To that end, we suggest you carefully read the details of Alignment and consider deduplicating energy minimized conformer guesses with the Relabeler.

class scine_molassembler.DirectedConformerGenerator

Helper type for directed conformer generation.

Generates new combinations of BondStereopermutator assignments and provides helper functions for the generation of conformers using these combinations and the reverse, finding the combinations from conformers.

It is important that you lower your expectations for the modeling of dihedral energy minima, however. Molassembler neither requires you to supply a correct graph, nor detects or kekulizes aromatic systems, nor asks you to supply an overall charge for a molecule, so the manner in which it decides where dihedral energy minima are is necessarily somewhat underpowered. The manner in which shape vertices are aligned in stereopermutation enumeration isn’t even strictly based on a physical principle. We suggest the following to make the most of what the library can do for you:

  • Read the documentation for the various alignments. Consider using not just the default Staggered alignment, but either EclipsedAndStaggered or BetweenEclipsedAndStaggered to improve your chances of capturing all rotational minima. This will likely generate more conformers than strictly required, but should capture all minima.

  • Energy minimize all generated conformers with a suitable method and then deduplicate.

  • Consider using the Relabeler to do a final deduplication step.

>>> from scine_molassembler import io, DirectedConformerGenerator
>>> butane = io.experimental.from_smiles("CCCC")
>>> generator = DirectedConformerGenerator(butane)
>>> assert generator.bond_list
>>> conformers = []
>>> while generator.decision_list_set_size() < generator.ideal_ensemble_size:
...     conformers.append(
...       generator.generate_random_conformation(
...         generator.generate_decision_list()
...       )
...     )
>>> assert len(conformers) == generator.ideal_ensemble_size
__init__(self: scine_molassembler.DirectedConformerGenerator, molecule: scine_molassembler.Molecule, alignment: scine_molassembler.BondStereopermutator.Alignment = Alignment.Staggered, bonds_to_consider: List[scine_molassembler.BondIndex] = []) → None

Construct a generator for a particular molecule.

Parameters
  • molecule – For which molecule to construct a generator

  • alignment – Alignment with which to generate BondStereopermutator instances on considered bonds

  • bonds_to_consider – List of bonds that should be considered for directed conformer generation. Bonds for which consider_bond yields an IgnoreReason will still be ignored.

class EnumerationSettings

Settings for conformer enumeration

property configuration

Configuration for conformer generation scheme

property dihedral_retries

Number of attempts to generate the dihedral decision

property fitting

Mode for fitting dihedral assignments

class IgnoreReason

Reason why a bond is ignored for directed conformer generation

Members:

AtomStereopermutatorPreconditionsUnmet : There is not an assigned stereopermutator on both ends of the bond

HasAssignedBondStereopermutator : There is already an assigned bond stereopermutator on the bond

HasTerminalConstitutingAtom : At least one constituting atom is terminal

InCycle : The bond is in a cycle (see C++ documentation for details why cycle bonds are excluded)

IsEtaBond : The bond is an eta bond

RotationIsIsotropic : Rotation around this bond is isotropic (at least one side’s rotating substituents all have the same ranking)

AtomStereopermutatorPreconditionsUnmet = IgnoreReason.AtomStereopermutatorPreconditionsUnmet
HasAssignedBondStereopermutator = IgnoreReason.HasAssignedBondStereopermutator
HasTerminalConstitutingAtom = IgnoreReason.HasTerminalConstitutingAtom
InCycle = IgnoreReason.InCycle
IsEtaBond = IgnoreReason.IsEtaBond
RotationIsIsotropic = IgnoreReason.RotationIsIsotropic
property name

name(self: handle) -> str

class Relabeler

Functionality for relabeling decision lists of minimized structures

Determines dihedral bins from true dihedral distributions of minimized structures and generates bin membership lists for all processed structures.

add(self: scine_molassembler.DirectedConformerGenerator.Relabeler, positions: numpy.ndarray[float64[m, 3]]) → None

Add a particular position to the set to relabel

bin_indices(self: scine_molassembler.DirectedConformerGenerator.Relabeler, bins: List[List[Tuple[float, float]]]) → List[List[int]]

Determine relabeling for all added positions

Returns a list of bin membership indices for each added structure in sequence.

Parameters

bins – Bin intervals for all observed bonds (see bins function)
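
Bin membership has to account for wrapped intervals, whose start is larger than their end. The following is a minimal sketch of such a lookup in plain Python, not the library's implementation. The helper names `in_bin` and `bin_index` are hypothetical, and dihedral values are assumed to be given in radians.

```python
def in_bin(value, interval):
    """Check whether a dihedral value lies inside a (start, end) interval."""
    start, end = interval
    if start <= end:
        return start <= value <= end
    # Inverted boundaries indicate an interval wrapping across the periodic boundary
    return value >= start or value <= end


def bin_index(value, bins):
    """Return the index of the first bin that contains the value."""
    for index, interval in enumerate(bins):
        if in_bin(value, interval):
            return index
    raise ValueError("value is not contained in any bin")


# The first bin wraps, the second does not
example_bins = [(3.1, -3.1), (0.1, 0.2)]
memberships = [bin_index(v, example_bins) for v in (3.14, -3.12, 0.15)]
```

Applying such a lookup per considered bond to each added structure would give one list of bin indices per structure, which is the shape of the documented return value.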

bin_midpoint_integers(self: scine_molassembler.DirectedConformerGenerator.Relabeler, bin_indices: List[List[int]], bins: List[List[Tuple[float, float]]]) → List[List[int]]

Relabel bin indices into the rounded dihedral value of their bin midpoint

Parameters
  • bin_indices – All structures’ bin indices (see bin_indices)

  • bins – Bin intervals for all observed bonds (see bins function)
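
As an illustration of what relabeling into a rounded bin midpoint could look like, here is a small sketch for a single interval. It assumes that intervals are given in radians and that the rounded value is expressed in degrees; neither the unit nor the helper name `bin_midpoint_degrees` is confirmed by this documentation.

```python
import math


def bin_midpoint_degrees(interval):
    """Rounded midpoint of a (start, end) dihedral interval, in degrees.

    Assumption: boundaries are in radians, and inverted boundaries mean
    that the interval wraps across the periodic boundary.
    """
    start, end = interval
    if start > end:
        # Unwrap the interval so that the plain average is meaningful
        end += 2 * math.pi
    midpoint = (start + end) / 2
    if midpoint > math.pi:
        # Map the midpoint back into the usual dihedral range
        midpoint -= 2 * math.pi
    return round(math.degrees(midpoint))


plain = bin_midpoint_degrees((0.1, 0.2))
wrapped = bin_midpoint_degrees((3.0, -2.8))
```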

bins(self: scine_molassembler.DirectedConformerGenerator.Relabeler, delta: float = 0.5235987755982988) → List[List[Tuple[float, float]]]

Generate bins for all observed dihedrals

Parameters

delta – Maximum dihedral distance between dihedral values to include in the same bin in radians

static density_bins(dihedrals: List[float], delta: float, symmetry_order: int = 1) → List[Tuple[float, float]]

Simplest density-based binning function

Generates bins for a set of dihedral values by sorting the dihedral values and then considering any values within the delta as part of the same bin.

Returns a list of pairs representing bin intervals. It is not guaranteed that the start of the interval is smaller than the end of the interval. This is because of dihedral angle periodicity. The boundaries of the first interval of the bins can have inverted order to indicate wrapping.

Parameters
  • dihedrals – List of dihedral values to bin

  • delta – Maximum dihedral distance between dihedral values to include in the same bin in radians

Raises

RuntimeError – If the passed list of dihedrals is empty

>>> density_bins = DirectedConformerGenerator.Relabeler.density_bins
>>> density_bins([0.1, 0.2], 0.1)
[(0.1, 0.2)]
>>> density_bins([0.1, 0.2, 0.4], 0.1)
[(0.1, 0.2), (0.4, 0.4)]
>>> density_bins([0.1, 0.2, 3.1, -3.1], 0.1)  # Inverted boundaries with wrap
[(3.1, -3.1), (0.1, 0.2)]
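
To make the binning idea concrete, the following pure-Python sketch sorts the values, groups neighbors that lie within delta of each other, and merges the last and first bin if they are close across the periodic boundary. It is an illustration under stated assumptions, not the library's code: the function name `density_bins_sketch` is hypothetical, the symmetry order is taken to be 1, and a small tolerance is added to absorb floating-point noise.

```python
import math


def density_bins_sketch(dihedrals, delta, tolerance=1e-12):
    """Group sorted dihedral values (radians) into (start, end) intervals."""
    if not dihedrals:
        raise RuntimeError("cannot bin an empty list of dihedrals")

    values = sorted(dihedrals)
    bins = []
    start = previous = values[0]
    for value in values[1:]:
        if value - previous <= delta + tolerance:
            # Close enough to the previous value: extend the current bin
            previous = value
        else:
            bins.append((start, previous))
            start = previous = value
    bins.append((start, previous))

    if len(bins) > 1:
        first_start, first_end = bins[0]
        last_start, last_end = bins[-1]
        # Distance from the largest value to the smallest one across the boundary
        if first_start + 2 * math.pi - last_end <= delta + tolerance:
            # Merge into one wrapped bin with inverted boundaries, placed first
            bins = [(last_start, first_end)] + bins[1:-1]
    return bins


wrapped_bins = density_bins_sketch([0.1, 0.2, 3.1, -3.1], 0.1)
```

The sketch does not handle the case in which a single bin spans the whole circle.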
property dihedrals

Observed dihedrals at each bond in added structures

property sequences

Dominant index sequences at each considered bond

UNKNOWN_DECISION = 255
bin_midpoint_integers(self: scine_molassembler.DirectedConformerGenerator, arg0: List[int]) → List[int]

Relabels a decision list into bin midpoint integers

property bond_list

Get a list of considered bond indices. These are the bonds for which no ignore reason was found at construction-time.

conformation_molecule(self: scine_molassembler.DirectedConformerGenerator, decision_list: List[int]) → scine_molassembler.Molecule

Yields a molecule whose bond stereopermutators are set for a particular decision list.

Parameters

decision_list – List of assignments for the considered bonds of the generator.

static consider_bond(bond_index: scine_molassembler.BondIndex, molecule: scine_molassembler.Molecule, alignment: scine_molassembler.BondStereopermutator.Alignment = Alignment.Staggered) → Union[scine_molassembler.DirectedConformerGenerator.IgnoreReason, scine_molassembler.BondStereopermutator]

Decide whether to consider a bond’s dihedral for directed conformer generation or not. Returns either an IgnoreReason or an unowned stereopermutator instance.

Parameters
  • bond_index – Bond index to consider

  • molecule – The molecule in which bond_index is valid

  • alignment – Alignment to generate BondStereopermutator instances with. Affects stereopermutation counts.

contains(self: scine_molassembler.DirectedConformerGenerator, decision_list: List[int]) → bool

Checks whether a particular decision list is part of the underlying set

Parameters

decision_list – Decision list to check for in the underlying data structure

decision_list_set_size(self: scine_molassembler.DirectedConformerGenerator) → int

The number of conformer decision lists stored in the underlying set-like data structure

static distance(decision_list_a: List[int], decision_list_b: List[int], bounds: List[int]) → int

Calculates a distance metric between two decision lists for dihedral permutations given bounds on the values at each position of the decision lists.

Parameters
  • decision_list_a – The first decision list

  • decision_list_b – The second decision list

  • bounds – Value bounds on each entry in the decision lists
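
The description above does not spell out the metric. One plausible reading, assumed here purely for illustration, is a sum of per-position cyclic distances, since dihedral arrangement indices wrap around. In this sketch each bound is taken to be the number of distinct values at that position, and the function name `cyclic_distance` is hypothetical.

```python
def cyclic_distance(decision_list_a, decision_list_b, bounds):
    """Sum of per-position cyclic distances between two decision lists."""
    if not (len(decision_list_a) == len(decision_list_b) == len(bounds)):
        raise ValueError("all three lists must have the same length")

    total = 0
    for a, b, count in zip(decision_list_a, decision_list_b, bounds):
        difference = abs(a - b) % count
        # Going around the other way may be shorter
        total += min(difference, count - difference)
    return total


example_distance = cyclic_distance([0, 2], [2, 0], [3, 3])
```

With three arrangements per bond, index 0 and index 2 are neighbors on the circle, so each position contributes a distance of one.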

enumerate(self: scine_molassembler.DirectedConformerGenerator, callback: Callable[[List[int], numpy.ndarray[float64[m, 3]]], None], seed: int, settings: scine_molassembler.DirectedConformerGenerator.EnumerationSettings = (dihedral_retries=3, fitting=FittingMode.Nearest, configuration=(partiality=Partiality.FourAtom, refinement_step_limit=10000, refinement_gradient_target=1e-05, spatial_model_loosening=1.0, fixed_positions=[]))) → None

Enumerate all conformers of the captured molecule

Clears the stored set of decision lists, then enumerates all conformers of the molecule in parallel.

Note

This function is parallelized and will utilize OMP_NUM_THREADS threads. Callback invocations are unsequenced but the arguments are reproducible.

Parameters
  • callback – Function called with decision list and conformer positions for each successfully generated pair.

  • seed – Randomness initiator for decision list and conformer generation

  • settings – Further parameters for enumeration algorithms
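
Because callback invocations are unsequenced, a callback that collects results should not rely on arrival order. The sketch below shows one way to write such a collector: it stores results under a lock, keyed by decision list, and sorts them afterwards to obtain a deterministic order. The `fake_enumerate` driver is a hypothetical stand-in that merely calls the callback from several threads; it is not part of the library.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ConformerCollector:
    """Callable that gathers (decision list, positions) pairs thread-safely."""

    def __init__(self):
        self._lock = threading.Lock()
        self._results = {}

    def __call__(self, decision_list, positions):
        with self._lock:
            self._results[tuple(decision_list)] = positions

    def sorted_results(self):
        # Sorting by decision list removes any dependence on call order
        with self._lock:
            return sorted(self._results.items())


def fake_enumerate(callback, decision_lists):
    """Hypothetical stand-in: invokes the callback from worker threads."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        for decision_list in decision_lists:
            dummy_positions = [[float(sum(decision_list)), 0.0, 0.0]]
            pool.submit(callback, list(decision_list), dummy_positions)


collector = ConformerCollector()
fake_enumerate(collector, [(2, 0), (0, 1), (1, 1)])
ordered = collector.sorted_results()
```

An instance of such a collector should be usable as the callback argument, given that the signature only asks for a callable taking a decision list and a positions array.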

enumerate_random(self: scine_molassembler.DirectedConformerGenerator, callback: Callable[[List[int], numpy.ndarray[float64[m, 3]]], None], settings: scine_molassembler.DirectedConformerGenerator.EnumerationSettings = (dihedral_retries=3, fitting=FittingMode.Nearest, configuration=(partiality=Partiality.FourAtom, refinement_step_limit=10000, refinement_gradient_target=1e-05, spatial_model_loosening=1.0, fixed_positions=[]))) → None

Enumerate all conformers of the captured molecule

Clears the stored set of decision lists, then enumerates all conformers of the molecule in parallel.

Note

This function is parallelized and will utilize OMP_NUM_THREADS threads. Callback invocations are unsequenced but the arguments are reproducible given the same global PRNG state.

Note

This function advances molassembler’s global PRNG state.

Parameters
  • callback – Function called with decision list and conformer positions for each successfully generated pair.

  • settings – Further parameters for enumeration algorithms

generate_conformation(self: scine_molassembler.DirectedConformerGenerator, decision_list: List[int], seed: int, configuration: scine_molassembler.dg.Configuration = (partiality=Partiality.FourAtom, refinement_step_limit=10000, refinement_gradient_target=1e-05, spatial_model_loosening=1.0, fixed_positions=[])) → Union[numpy.ndarray[float64[m, 3]], scine_molassembler.dg.Error]

Try to generate a conformer for a particular decision list.

Parameters
  • decision_list – Decision list to use in conformer generation

  • seed – Seed to initialize a PRNG with for use in conformer generation.

  • configuration – Distance geometry configurations object. Defaults are usually fine.

generate_decision_list(self: scine_molassembler.DirectedConformerGenerator) → List[int]

Generate a new list of discrete dihedral arrangement choices. Guarantees that the new list is not yet part of the underlying set. Inserts the generated list into the underlying set. Will not generate the same decision list twice.

Note

This function advances molassembler’s global PRNG state.
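
The guarantee that no decision list is produced twice can be pictured as random sampling backed by a set. The following sketch illustrates that idea in plain Python under stated assumptions: the class `DecisionListSet` and its members are hypothetical, every position is assumed to take values from zero up to a per-bond count, and raising once the set is exhausted is a choice made for this sketch rather than documented behavior.

```python
import random
from math import prod


class DecisionListSet:
    """Generates random decision lists without repeating any of them."""

    def __init__(self, bounds, seed=0):
        self.bounds = list(bounds)  # number of arrangements per considered bond
        self.seen = set()
        self.rng = random.Random(seed)

    @property
    def ideal_size(self):
        return prod(self.bounds)

    def generate(self):
        if len(self.seen) >= self.ideal_size:
            raise RuntimeError("all decision lists have been generated")
        while True:
            candidate = tuple(self.rng.randrange(b) for b in self.bounds)
            if candidate not in self.seen:
                # Record the new list so that it is never returned again
                self.seen.add(candidate)
                return list(candidate)


lists = DecisionListSet(bounds=[3, 2], seed=42)
generated = [lists.generate() for _ in range(lists.ideal_size)]
```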

generate_random_conformation(self: scine_molassembler.DirectedConformerGenerator, decision_list: List[int], configuration: scine_molassembler.dg.Configuration = (partiality=Partiality.FourAtom, refinement_step_limit=10000, refinement_gradient_target=1e-05, spatial_model_loosening=1.0, fixed_positions=[])) → Union[numpy.ndarray[float64[m, 3]], scine_molassembler.dg.Error]

Try to generate a conformer for a particular decision list.

Parameters
  • decision_list – Decision list to use in conformer generation

  • configuration – Distance geometry configurations object. Defaults are usually fine.

Note

This function advances molassembler’s global PRNG state.

get_decision_list(*args, **kwargs)

Overloaded function.

  1. get_decision_list(self: scine_molassembler.DirectedConformerGenerator, atom_collection: Scine::Utils::AtomCollection, fitting_mode: scine_molassembler.BondStereopermutator.FittingMode = FittingMode.Nearest) -> List[int]

    Infer a decision list for the relevant bonds from positions.

    For all bonds considered relevant (i.e. all bonds in bond_list()), fits supplied positions to possible stereopermutations and returns the result. Entries have a value equal to UNKNOWN_DECISION if no permutation could be recovered. The usual BondStereopermutator fitting tolerances apply.

    Assumes several things about your supplied positions:

      • There have only been dihedral changes

      • No atom stereopermutator assignment changes

      • No constitutional rearrangements

    This variant of get_decision_list checks that the element type sequence matches that of the underlying molecule, which holds for conformers generated using the underlying molecule.

    Parameters
      • atom_collection – Positions from which to interpret the decision list.

      • fitting_mode – Mode altering how decisions are fitted.

  2. get_decision_list(self: scine_molassembler.DirectedConformerGenerator, positions: numpy.ndarray[float64[m, 3]], fitting_mode: scine_molassembler.BondStereopermutator.FittingMode = FittingMode.Nearest) -> List[int]

    Infer a decision list for the relevant bonds from positions.

    For all bonds considered relevant (i.e. all bonds in bond_list()), fits supplied positions to possible stereopermutations and returns the result. Entries have a value equal to UNKNOWN_DECISION if no permutation could be recovered. The usual BondStereopermutator fitting tolerances apply.

    Assumes several things about your supplied positions:

      • There have only been dihedral changes

      • No atom stereopermutator assignment changes

      • No constitutional rearrangements

    Parameters
      • positions – Positions from which to interpret the decision list.

      • fitting_mode – Mode altering how decisions are fitted.
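
Decision lists inferred from minimized structures lend themselves to a simple deduplication step. The sketch below keeps the lowest-energy entry per decision list and sets aside any list containing UNKNOWN_DECISION. It works on plain (decision list, energy) pairs; the function name `deduplicate` and the use of an energy as the tie-breaker are assumptions made for illustration.

```python
UNKNOWN_DECISION = 255  # value documented for DirectedConformerGenerator.UNKNOWN_DECISION


def deduplicate(labeled_structures):
    """Keep the lowest energy per decision list.

    labeled_structures: iterable of (decision_list, energy) pairs.
    Returns (best, unresolved): a dict from decision-list tuple to energy,
    and a list of pairs whose decision list has an unrecovered entry.
    """
    best = {}
    unresolved = []
    for decision_list, energy in labeled_structures:
        if UNKNOWN_DECISION in decision_list:
            # No permutation could be recovered for at least one bond
            unresolved.append((decision_list, energy))
            continue
        key = tuple(decision_list)
        if key not in best or energy < best[key]:
            best[key] = energy
    return best, unresolved


best, unresolved = deduplicate([
    ([0, 1], -1.50),
    ([0, 1], -1.75),
    ([2, 1], -1.20),
    ([255, 1], -0.90),
])
```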

property ideal_ensemble_size

Returns the number of conformers needed for a full ensemble
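
A natural way to think about this number, assumed here rather than stated by the documentation, is as the product of the number of dihedral arrangements over all considered bonds. The sketch below computes that product and enumerates the corresponding decision lists; both function names are hypothetical.

```python
from itertools import product
from math import prod


def ideal_ensemble_size_sketch(arrangement_counts):
    """Number of distinct decision lists for the given per-bond counts."""
    return prod(arrangement_counts)


def all_decision_lists(arrangement_counts):
    """Every combination of per-bond arrangement indices."""
    index_ranges = (range(count) for count in arrangement_counts)
    return [list(combination) for combination in product(*index_ranges)]


# Two considered bonds with three arrangements each
size = ideal_ensemble_size_sketch([3, 3])
decision_lists = all_decision_lists([3, 3])
```

Under this assumption the size grows multiplicatively with every additional considered bond, which is worth keeping in mind when choosing an alignment that yields more arrangements per bond.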

insert(self: scine_molassembler.DirectedConformerGenerator, decision_list: List[int]) → bool

Add a decision list to the underlying set-like data structure.

Parameters

decision_list – Decision list to insert into the underlying data structure.

relabeler(self: scine_molassembler.DirectedConformerGenerator) → Scine::Molassembler::DirectedConformerGenerator::Relabeler

Generate a Relabeler for the underlying molecule and bonds