Directed conformer generator¶
Making sure you have explored all dihedral angles of a molecule is tricky. Ideally, a library would approximate the local dihedral energy distribution, guess its minima correctly, and generate those conformers for you. Unfortunately, Molassembler lacks the information needed for such approximations to be correct. For one, Molassembler avoids requiring full correctness of graph bond orders so that errors in discretizing floating-point bond orders do not propagate. Additionally, Molassembler doesn’t ask you to specify the overall charge of your molecule, because the phenomenological approach to local molecular shape doesn’t require it. Without knowing the overall charge, or a guarantee that the formal bond orders of the graph are correct, it is impossible to assign formal charges purely from local molecular shapes.
So what can Molassembler offer instead? Molassembler states explicitly that its generated conformers are guesses at local energy minima on the potential energy surface. What it can try to ensure is that energy minimizations of the generated conformers reach every local minimum that exists.
To that end, we suggest you carefully read the details of Alignment and consider deduplicating energy-minimized conformer guesses with the Relabeler.
-
class
scine_molassembler.
DirectedConformerGenerator
¶ Helper type for directed conformer generation.
Generates new combinations of BondStereopermutator assignments and provides helper functions for the generation of conformers using these combinations and the reverse, finding the combinations from conformers.
It is important that you lower your expectations for the modeling of dihedral energy minima, however. Molassembler neither requires you to supply a correct graph, nor detects or kekulizes aromatic systems, nor asks you to supply an overall charge for a molecule, so the manner in which it decides where dihedral energy minima lie is somewhat underpowered. The manner in which shape vertices are aligned in stereopermutation enumeration isn’t even strictly based on a physical principle. We suggest the following to make the most of what the library can do for you:
- Read the documentation for the various alignments. Consider using not just the default Staggered alignment, but either EclipsedAndStaggered or BetweenEclipsedAndStaggered to improve your chances of capturing all rotational minima. This will likely generate more conformers than strictly required, but should capture all minima.
- Energy minimize all generated conformers with a suitable method and then deduplicate.
- Consider using the Relabeler to do a final deduplication step.
>>> from scine_molassembler import io, DirectedConformerGenerator
>>> butane = io.experimental.from_smiles("CCCC")
>>> generator = DirectedConformerGenerator(butane)
>>> assert len(generator.bond_list) > 0
>>> conformers = []
>>> while generator.decision_list_set_size() < generator.ideal_ensemble_size:
...     conformers.append(
...         generator.generate_random_conformation(
...             generator.generate_decision_list()
...         )
...     )
>>> assert len(conformers) == generator.ideal_ensemble_size
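Since generate_random_conformation is documented to return either positions or a dg.Error, a loop like the one above may collect failures alongside conformers. The following is a minimal sketch of separating the two. It is written against a stand-in object so that it runs without Molassembler installed: `collect_conformers`, `FakeGenerator` and the `is_error` predicate are hypothetical names, and only the method and property names on the generator are taken from this page.

```python
def collect_conformers(generator, is_error):
    """Generate one conformer per new decision list, splitting successes and failures."""
    successes, failures = [], []
    while generator.decision_list_set_size() < generator.ideal_ensemble_size:
        decision_list = generator.generate_decision_list()
        result = generator.generate_random_conformation(decision_list)
        if is_error(result):
            failures.append(decision_list)
        else:
            successes.append((decision_list, result))
    return successes, failures


class FakeGenerator:
    """Stand-in using the documented member names; not part of Molassembler."""

    def __init__(self, decision_lists):
        self._pending = list(decision_lists)
        self._seen = []
        self.ideal_ensemble_size = len(decision_lists)

    def decision_list_set_size(self):
        return len(self._seen)

    def generate_decision_list(self):
        decision_list = self._pending.pop(0)
        self._seen.append(decision_list)
        return decision_list

    def generate_random_conformation(self, decision_list):
        # Pretend that decision lists containing a 2 fail to embed.
        return None if 2 in decision_list else [[0.0, 0.0, 0.0]]


fake = FakeGenerator([[0], [1], [2]])
ok, failed = collect_conformers(fake, lambda result: result is None)
```

With the real bindings, the predicate would presumably be something like `lambda r: isinstance(r, scine_molassembler.dg.Error)`, based on the return annotation shown for generate_random_conformation.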
-
class
EnumerationSettings
¶ Settings for conformer enumeration
-
__init__
(self: scine_molassembler.DirectedConformerGenerator.EnumerationSettings) → None¶ Default-initialize enumeration settings
-
property
configuration
¶ Configuration for conformer generation scheme
-
property
dihedral_retries
¶ Number of attempts to generate the dihedral decision
-
property
fitting
¶ Mode for fitting dihedral assignments
-
-
class
IgnoreReason
¶ Reason why a bond is ignored for directed conformer generation
Members:
AtomStereopermutatorPreconditionsUnmet : There is not an assigned stereopermutator on both ends of the bond
HasAssignedBondStereopermutator : There is already an assigned bond stereopermutator on the bond
HasTerminalConstitutingAtom : At least one constituting atom is terminal
InCycle : The bond is in a cycle (see C++ documentation for details why cycle bonds are excluded)
IsEtaBond : The bond is an eta bond
RotationIsIsotropic : Rotation around this bond is isotropic (at least one side’s rotating substituents all have the same ranking)
-
AtomStereopermutatorPreconditionsUnmet
= <IgnoreReason.AtomStereopermutatorPreconditionsUnmet: 0>¶
-
HasAssignedBondStereopermutator
= <IgnoreReason.HasAssignedBondStereopermutator: 1>¶
-
HasTerminalConstitutingAtom
= <IgnoreReason.HasTerminalConstitutingAtom: 2>¶
-
InCycle
= <IgnoreReason.InCycle: 3>¶
-
IsEtaBond
= <IgnoreReason.IsEtaBond: 4>¶
-
RotationIsIsotropic
= <IgnoreReason.RotationIsIsotropic: 5>¶
-
__init__
(self: scine_molassembler.DirectedConformerGenerator.IgnoreReason, value: int) → None¶
-
property
name
¶
-
property
value
¶
-
-
class
Relabeler
¶ Functionality for relabeling decision lists of minimized structures
Determines dihedral bins from true dihedral distributions of minimized structures and generates bin membership lists for all processed structures.
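As a rough illustration of how bin membership lists could drive deduplication, the pure-Python sketch below keeps only the lowest-energy structure for each distinct membership list. It does not call Molassembler: `deduplicate_by_bins` and the `(bin_indices, energy, label)` tuples are hypothetical, and the energies are assumed to come from whatever minimization method you use.

```python
def deduplicate_by_bins(entries):
    """Keep the lowest-energy entry for each distinct bin membership list.

    entries: iterable of (bin_indices, energy, label) tuples, where
    bin_indices is a list of ints such as a relabeling step might produce.
    """
    best = {}
    for bin_indices, energy, label in entries:
        key = tuple(bin_indices)  # lists are unhashable, tuples are not
        if key not in best or energy < best[key][0]:
            best[key] = (energy, label)
    return {key: label for key, (energy, label) in best.items()}


# Hypothetical minimized structures: two share the same bin membership list.
minimized = [
    ([0, 1], -10.2, "conf_a"),
    ([0, 1], -10.5, "conf_b"),
    ([2, 1], -9.8, "conf_c"),
]
unique = deduplicate_by_bins(minimized)
```

Keeping the lowest-energy representative is one possible policy; keeping the first structure seen would work equally well if energies are not available.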
-
class
DihedralInfo
¶ -
__init__
(*args, **kwargs)¶ Initialize self. See help(type(self)) for accurate signature.
-
property
i_set
¶ First atom index set for the dihedral
-
property
j
¶ Second atom index of the dihedral
-
property
k
¶ Third atom index of the dihedral
-
property
l_set
¶ Fourth atom index set for the dihedral
-
property
symmetry_order
¶ Rotational symmetry order of the dihedral
-
-
__init__
(*args, **kwargs)¶ Initialize self. See help(type(self)) for accurate signature.
-
add
(self: scine_molassembler.DirectedConformerGenerator.Relabeler, positions: numpy.ndarray[numpy.float64[m, 3]]) → List[float]¶ Add a particular position to the set to relabel
-
bin_bounds
(self: scine_molassembler.DirectedConformerGenerator.Relabeler, bin_indices: List[List[int]], bins: List[List[Tuple[float, float]]]) → List[List[Tuple[int, int]]]¶ Relabel bin indices into integer bounds on their bins
- Parameters
bin_indices – All structures’ bin indices (see bin_indices)
bins – Bin intervals for all observed bonds (see bins function)
-
bin_indices
(self: scine_molassembler.DirectedConformerGenerator.Relabeler, bins: List[List[Tuple[float, float]]]) → List[List[int]]¶ Determine relabeling for all added positions
Returns a list of bin membership indices for each added structure in sequence.
- Parameters
bins – Bin intervals for all observed bonds (see bins function)
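To make the idea of bin membership concrete, here is a small pure-Python sketch that finds which interval a dihedral value falls into, including intervals with inverted boundaries that wrap around ±π. `bin_index` is a hypothetical helper and an assumption about how membership could be decided, not the library's implementation.

```python
def bin_index(dihedral, bins):
    """Return the index of the first interval containing dihedral, else None.

    bins: list of (start, end) pairs in radians. A pair with start > end
    is treated as wrapping around the periodic boundary.
    """
    for index, (start, end) in enumerate(bins):
        if start <= end:
            inside = start <= dihedral <= end
        else:
            # Wrapped interval: covers [start, pi) and [-pi, end]
            inside = dihedral >= start or dihedral <= end
        if inside:
            return index
    return None


example_bins = [(3.1, -3.1), (0.1, 0.2)]
```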
-
bin_midpoint_integers
(self: scine_molassembler.DirectedConformerGenerator.Relabeler, bin_indices: List[List[int]], bins: List[List[Tuple[float, float]]]) → List[List[int]]¶ Relabel bin indices into the rounded dihedral value of their bin midpoint
- Parameters
bin_indices – All structures’ bin indices (see bin_indices)
bins – Bin intervals for all observed bonds (see bins function)
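The rounded midpoint of a bin can be sketched as follows, again in plain Python and under the assumption that intervals are given in radians and may wrap. `bin_midpoint_degrees` is a hypothetical name, and the rounding rule used by the library may differ.

```python
import math


def bin_midpoint_degrees(start, end):
    """Rounded midpoint in degrees of a possibly wrapped interval in radians."""
    # Width measured going forward from start to end across the periodic boundary
    width = (end - start) % (2 * math.pi)
    midpoint = start + width / 2
    # Map the midpoint back into [-pi, pi)
    midpoint = (midpoint + math.pi) % (2 * math.pi) - math.pi
    return round(math.degrees(midpoint))
```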
-
bins
(self: scine_molassembler.DirectedConformerGenerator.Relabeler, delta: float = 0.5235987755982988) → List[List[Tuple[float, float]]]¶ Generate bins for all observed dihedrals
- Parameters
delta – Maximum dihedral distance between dihedral values to include in the same bin in radians
-
static
density_bins
(dihedrals: List[float], delta: float, symmetry_order: int = 1) → List[Tuple[float, float]]¶ Simplest density-based binning function
Generates bins for a set of dihedral values by sorting the dihedral values and then considering any values within the delta as part of the same bin.
Returns a list of pairs representing bin intervals. It is not guaranteed that the start of the interval is smaller than the end of the interval. This is because of dihedral angle periodicity. The boundaries of the first interval of the bins can have inverted order to indicate wrapping.
- Parameters
dihedrals – List of dihedral values to bin
delta – Maximum dihedral distance between dihedral values to include in the same bin in radians
- Raises
RuntimeError – If the passed list of dihedrals is empty
>>> bins = DirectedConformerGenerator.Relabeler.density_bins
>>> bins([0.1, 0.2], 0.1)
[(0.1, 0.2)]
>>> bins([0.1, 0.2, 0.4], 0.1)
[(0.1, 0.2), (0.4, 0.4)]
>>> bins([0.1, 0.2, 3.1, -3.1], 0.1)  # Inverted boundaries with wrap
[(3.1, -3.1), (0.1, 0.2)]
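The binning described above can be approximated in a few lines of plain Python. This is a sketch under stated assumptions, not the library's code: `sketch_density_bins` is a hypothetical name, values are assumed to lie in [-π, π), and the symmetry_order parameter is ignored entirely.

```python
import math


def sketch_density_bins(dihedrals, delta):
    """Group sorted dihedral values into bins, merging across the wrap."""
    if not dihedrals:
        raise RuntimeError("need at least one dihedral value")
    values = sorted(dihedrals)
    bins = [[values[0], values[0]]]
    for value in values[1:]:
        if value - bins[-1][1] <= delta:
            bins[-1][1] = value  # close enough: extend the current bin
        else:
            bins.append([value, value])  # gap too large: start a new bin
    if len(bins) > 1:
        # Distance from the largest value forward across the boundary to the smallest
        wrap_gap = (bins[0][0] + 2 * math.pi) - bins[-1][1]
        if wrap_gap <= delta:
            last = bins.pop()
            bins[0][0] = last[0]  # first bin now has inverted boundaries
    return [tuple(b) for b in bins]
```

The merge step is what produces an interval whose start is larger than its end, and it only ever affects the first interval in the result.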
-
property
dihedrals
¶ Observed dihedral values at each bond in added structures
-
static
integer_bounds
(floating_bounds: Tuple[float, float]) → Tuple[int, int]¶ Converts dihedral bounds in radians into integer degree bounds. Rounds the lower bound down and rounds the upper bound up.
- Parameters
floating_bounds – Pair of dihedral values in radians
- Returns
Pair of integer dihedral values in degrees
>>> int_bounds = DirectedConformerGenerator.Relabeler.integer_bounds
>>> int_bounds((-0.1, 0.1))
(-6, 6)
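The rounding rule stated above (lower bound down, upper bound up) corresponds to a floor and a ceiling after converting to degrees. A minimal pure-Python equivalent, with the hypothetical name `integer_degree_bounds`, might look like this:

```python
import math


def integer_degree_bounds(floating_bounds):
    """Convert (lower, upper) in radians to integer degrees, rounding outward."""
    lower, upper = floating_bounds
    return (math.floor(math.degrees(lower)), math.ceil(math.degrees(upper)))
```

Rounding outward means the integer interval always contains the original floating-point interval.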
-
static
make_bounds
(dihedral: float, tolerance: float) → Tuple[float, float]¶ Generates [-pi, pi) wrapped bounds on a dihedral value in radians with a tolerance.
>>> DirectedConformerGenerator.Relabeler.make_bounds(0, 0.1)
(-0.1, 0.1)
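Wrapping into [-π, π) can be sketched with a modulo operation. The helpers `wrap_to_pi` and `wrapped_bounds` below are hypothetical and only illustrate the arithmetic; near the periodic boundary the lower bound can end up larger than the upper bound.

```python
import math


def wrap_to_pi(angle):
    """Map an angle in radians into the half-open interval [-pi, pi)."""
    return (angle + math.pi) % (2 * math.pi) - math.pi


def wrapped_bounds(dihedral, tolerance):
    """Bounds dihedral ± tolerance, each wrapped into [-pi, pi)."""
    return (wrap_to_pi(dihedral - tolerance), wrap_to_pi(dihedral + tolerance))


near_zero = wrapped_bounds(0.0, 0.1)
near_boundary = wrapped_bounds(3.1, 0.1)
```

Because of floating-point rounding in the modulo step, results of this sketch should be compared with a tolerance rather than for exact equality.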
-
property
sequences
¶ Dominant dihedral index sequences at each considered bond
-
-
UNKNOWN_DECISION
= 255¶
-
__init__
(self: scine_molassembler.DirectedConformerGenerator, molecule: scine_molassembler.Molecule, alignment: scine_molassembler.BondStereopermutator.Alignment = scine_molassembler.BondStereopermutator.Alignment.Staggered, bonds_to_consider: List[scine_molassembler.BondIndex] = []) → None¶ Construct a generator for a particular molecule.
- Parameters
molecule – For which molecule to construct a generator
alignment – Alignment with which to generate BondStereopermutator instances on considered bonds
bonds_to_consider – List of bonds that should be considered for directed conformer generation. Bonds for which consider_bond yields an IgnoreReason will still be ignored.
-
bin_bounds
(self: scine_molassembler.DirectedConformerGenerator, arg0: List[int]) → List[Tuple[int, int]]¶ Relabels a decision list into integer bounds of its stereopermutation bin
-
bin_midpoint_integers
(self: scine_molassembler.DirectedConformerGenerator, arg0: List[int]) → List[int]¶ Relabels a decision list into bin midpoint integers
-
property
bond_list
¶ Get a list of considered bond indices. These are the bonds for which no ignore reason was found at construction-time.
-
conformation_molecule
(self: scine_molassembler.DirectedConformerGenerator, decision_list: List[int]) → scine_molassembler.Molecule¶ Yields a molecule whose bond stereopermutators are set for a particular decision list.
- Parameters
decision_list – List of assignments for the considered bonds of the generator.
-
static
consider_bond
(bond_index: scine_molassembler.BondIndex, molecule: scine_molassembler.Molecule, alignment: scine_molassembler.BondStereopermutator.Alignment = scine_molassembler.BondStereopermutator.Alignment.Staggered) → Union[scine_molassembler.DirectedConformerGenerator.IgnoreReason, scine_molassembler.BondStereopermutator]¶ Decide whether to consider a bond’s dihedral for directed conformer generation or not. Returns either an IgnoreReason or an unowned stereopermutator instance.
- Parameters
bond_index – Bond index to consider
molecule – The molecule in which bond_index is valid
alignment – Alignment to generate BondStereopermutator instances with. Affects stereopermutation counts.
-
contains
(self: scine_molassembler.DirectedConformerGenerator, decision_list: List[int]) → bool¶ Checks whether a particular decision list is part of the underlying set
- Parameters
decision_list – Decision list to check for in the underlying data structure
-
decision_list_set_size
(self: scine_molassembler.DirectedConformerGenerator) → int¶ The number of conformer decision lists stored in the underlying set-like data structure
-
static
distance
(decision_list_a: List[int], decision_list_b: List[int], bounds: List[int]) → int¶ Calculates a distance metric between two decision lists for dihedral permutations given bounds on the values at each position of the decision lists.
- Parameters
decision_list_a – The first decision list
decision_list_b – The second decision list
bounds – Value bounds on each entry in the decision lists
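The exact metric is not spelled out here. One plausible reading, given that dihedral choices are periodic, is a sum of per-position cyclic distances. The sketch below implements that assumption in plain Python; `cyclic_distance` is a hypothetical name, each bound is assumed to be a positive count of choices at that position, and the library may well compute something different.

```python
def cyclic_distance(decision_list_a, decision_list_b, bounds):
    """Sum of shortest cyclic steps between entries, position by position."""
    if not (len(decision_list_a) == len(decision_list_b) == len(bounds)):
        raise ValueError("all three lists must have the same length")
    total = 0
    for a, b, bound in zip(decision_list_a, decision_list_b, bounds):
        forward = (a - b) % bound   # steps going one way around the cycle
        backward = (b - a) % bound  # steps going the other way
        total += min(forward, backward)
    return total
```

Under this reading, choices 0 and 2 at a position with three options are one step apart rather than two, since the options form a ring.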
-
enumerate
(self: scine_molassembler.DirectedConformerGenerator, callback: Callable[[List[int], numpy.ndarray[numpy.float64[m, 3]]], None], seed: int, settings: scine_molassembler.DirectedConformerGenerator.EnumerationSettings = ...) → None¶ Enumerate all conformers of the captured molecule
Clears the stored set of decision lists, then enumerates all conformers of the molecule in parallel.
Note
This function is parallelized and will utilize OMP_NUM_THREADS threads. Callback invocations are unsequenced but the arguments are reproducible.
- Parameters
callback – Function called with decision list and conformer positions for each successfully generated pair.
seed – Randomness initiator for decision list and conformer generation
settings – Further parameters for enumeration algorithms
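Because callback invocations are unsequenced, a callback that simply appends to a list will see results in a run-dependent order. One way to get a stable ordering afterwards is to sort by decision list, as in this pure-Python sketch. `ConformerCollector` is a hypothetical helper; it only assumes the callback receives a decision list and positions, as stated above.

```python
class ConformerCollector:
    """Callable that stores (decision_list, positions) pairs for later sorting."""

    def __init__(self):
        self._results = []

    def __call__(self, decision_list, positions):
        # Store the decision list as a tuple so it can serve as a sort key
        self._results.append((tuple(decision_list), positions))

    def sorted_results(self):
        """Results ordered by decision list, independent of arrival order."""
        return sorted(self._results, key=lambda pair: pair[0])


collector = ConformerCollector()
# Simulate callbacks arriving out of order
collector([2, 0], "positions_c")
collector([0, 1], "positions_a")
collector([1, 1], "positions_b")
ordered = collector.sorted_results()
```

With the real generator, the collector would be passed as the callback argument, for example `generator.enumerate(callback=collector, seed=42)`, following the signature shown above.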
-
enumerate_random
(self: scine_molassembler.DirectedConformerGenerator, callback: Callable[[List[int], numpy.ndarray[numpy.float64[m, 3]]], None], settings: scine_molassembler.DirectedConformerGenerator.EnumerationSettings = ...) → None¶ Enumerate all conformers of the captured molecule
Clears the stored set of decision lists, then enumerates all conformers of the molecule in parallel.
Note
This function is parallelized and will utilize OMP_NUM_THREADS threads. Callback invocations are unsequenced but the arguments are reproducible given the same global PRNG state.
Note
This function advances molassembler’s global PRNG state.
- Parameters
callback – Function called with decision list and conformer positions for each successfully generated pair.
settings – Further parameters for enumeration algorithms
-
generate_conformation
(self: scine_molassembler.DirectedConformerGenerator, decision_list: List[int], seed: int, configuration: scine_molassembler.dg.Configuration = ...) → Union[numpy.ndarray[numpy.float64[m, 3]], scine_molassembler.dg.Error]¶ Try to generate a conformer for a particular decision list.
- Parameters
decision_list – Decision list to use in conformer generation
seed – Seed to initialize a PRNG with for use in conformer generation.
configuration – Distance geometry configurations object. Defaults are usually fine.
-
generate_decision_list
(self: scine_molassembler.DirectedConformerGenerator) → List[int]¶ Generate a new list of discrete dihedral arrangement choices. Guarantees that the new list is not yet part of the underlying set. Inserts the generated list into the underlying set. Will not generate the same decision list twice.
Note
This function advances molassembler’s global PRNG state.
-
generate_random_conformation
(self: scine_molassembler.DirectedConformerGenerator, decision_list: List[int], configuration: scine_molassembler.dg.Configuration = ...) → Union[numpy.ndarray[numpy.float64[m, 3]], scine_molassembler.dg.Error]¶ Try to generate a conformer for a particular decision list.
- Parameters
decision_list – Decision list to use in conformer generation
configuration – Distance geometry configurations object. Defaults are usually fine.
Note
This function advances molassembler’s global PRNG state.
-
get_decision_list
(*args, **kwargs)¶ Overloaded function.
get_decision_list(self: scine_molassembler.DirectedConformerGenerator, atom_collection: Scine::Utils::AtomCollection, fitting_mode: scine_molassembler.BondStereopermutator.FittingMode = scine_molassembler.BondStereopermutator.FittingMode.Nearest) -> List[int]
Infer a decision list for the relevant bonds from positions.
For all bonds considered relevant (i.e. all bonds in bond_list), fits supplied positions to possible stereopermutations and returns the result. Entries have a value equal to UNKNOWN_DECISION if no permutation could be recovered. The usual BondStereopermutator fitting tolerances apply.
Assumes several things about your supplied positions:
- There have only been dihedral changes
- No atom stereopermutator assignment changes
- No constitutional rearrangements
This variant of get_decision_list checks that the element type sequence matches that of the underlying molecule, which holds for conformers generated using the underlying molecule.
- param atom_collection
Positions from which to interpret the decision list.
- param fitting_mode
Mode altering how decisions are fitted.
get_decision_list(self: scine_molassembler.DirectedConformerGenerator, positions: numpy.ndarray[numpy.float64[m, 3]], fitting_mode: scine_molassembler.BondStereopermutator.FittingMode = scine_molassembler.BondStereopermutator.FittingMode.Nearest) -> List[int]
Infer a decision list for the relevant bonds from positions.
For all bonds considered relevant (i.e. all bonds in bond_list), fits supplied positions to possible stereopermutations and returns the result. Entries have a value equal to UNKNOWN_DECISION if no permutation could be recovered. The usual BondStereopermutator fitting tolerances apply.
Assumes several things about your supplied positions:
- There have only been dihedral changes
- No atom stereopermutator assignment changes
- No constitutional rearrangements
- param positions
Positions from which to interpret the decision list.
- param fitting_mode
Mode altering how decisions are fitted.
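Since entries equal to UNKNOWN_DECISION mark bonds where no permutation could be recovered, it can be useful to check an inferred decision list before relying on it. The sketch below is plain Python and works on any list of integers; `unknown_positions` is a hypothetical helper, and the constant is copied from the value documented for UNKNOWN_DECISION on this page.

```python
UNKNOWN_DECISION = 255  # value documented for DirectedConformerGenerator.UNKNOWN_DECISION


def unknown_positions(decision_list, unknown=UNKNOWN_DECISION):
    """Indices in the decision list where no permutation was recovered."""
    return [i for i, decision in enumerate(decision_list) if decision == unknown]
```

A non-empty result suggests that the supplied positions violate one of the listed assumptions, or that the fitting tolerances were not met for those bonds.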
-
property
ideal_ensemble_size
¶ Returns the number of conformers needed for a full ensemble
-
insert
(self: scine_molassembler.DirectedConformerGenerator, decision_list: List[int]) → bool¶ Add a decision list to the underlying set-like data structure.
- Parameters
decision_list – Decision list to insert into the underlying data structure.
-
relabeler
(self: scine_molassembler.DirectedConformerGenerator) → scine_molassembler.DirectedConformerGenerator.Relabeler¶ Generate a Relabeler for the underlying molecule and bonds