How to use AutoCAS

Quickstart

After installation, AutoCAS can be started from the command line. To list all available options, run:

python3 -m scine_autocas -h

For example, passing a valid XYZ file starts AutoCAS and runs all calculations with their default settings:

python3 -m scine_autocas -x <molecule.xyz>

To set a basis set, select a different interface, or enable the creation of entanglement diagrams, the following options can be passed:

python3 -m scine_autocas --xyz_file <molecule.xyz> --basis_set cc-pvtz --plot --interface Molcas
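
When AutoCAS is launched from another Python program, the same command line can be assembled as an argument list. The following is a minimal sketch that uses only the options shown above (--xyz_file, --basis_set, --plot, --interface); the helper name build_autocas_command is hypothetical and not part of AutoCAS.

```python
import shlex
from typing import List, Optional


def build_autocas_command(
    xyz_file: str,
    basis_set: Optional[str] = None,
    interface: Optional[str] = None,
    plot: bool = False,
) -> List[str]:
    """Assemble the AutoCAS command line as a list for subprocess.run.

    Hypothetical helper: it only uses options shown in this guide.
    """
    command = ["python3", "-m", "scine_autocas", "--xyz_file", xyz_file]
    if basis_set is not None:
        command += ["--basis_set", basis_set]
    if plot:
        command.append("--plot")
    if interface is not None:
        command += ["--interface", interface]
    return command


def as_shell_string(command: List[str]) -> str:
    """Quote the argument list so it can be pasted into a shell."""
    return shlex.join(command)
```

The resulting list can be handed to subprocess.run(command, check=True); passing a list instead of a single string avoids shell quoting problems with paths that contain spaces.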

However, we strongly recommend providing a .yml-input file, which makes calculations reproducible and allows finer customization of AutoCAS.

Advanced Usage

YAML Input

Instead of relying on command-line options, AutoCAS can be fully controlled by passing a .yml-input file to it:

python3 -m scine_autocas -y <input.yml>

A minimal .yml-input file requires at least a molecule section that contains the location of the corresponding XYZ file.

molecule:
  xyz_file: "/path/to/molecule.xyz"
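
Such a minimal input can also be generated programmatically, for example when many molecules are processed in a batch. The sketch below uses only the Python standard library and writes exactly the two lines shown above; the function name write_minimal_input is an assumption made for illustration and is not part of AutoCAS.

```python
from pathlib import Path


def write_minimal_input(xyz_file: str, yaml_path: str) -> Path:
    """Write a minimal AutoCAS .yml-input that points to an XYZ file.

    Hypothetical helper: only the molecule section with the xyz_file
    key is written; all other settings keep their defaults.
    """
    xyz = Path(xyz_file).resolve()
    if not xyz.is_file():
        raise FileNotFoundError(f"XYZ file not found: {xyz}")
    target = Path(yaml_path)
    # Forward slashes avoid escape sequences inside the quoted YAML string.
    target.write_text(
        f'molecule:\n  xyz_file: "{xyz.as_posix()}"\n', encoding="utf-8"
    )
    return target
```

The written file can then be passed to AutoCAS with the -y option described above.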

Most options in the .yml-input file are passed through the whole AutoCAS framework, meaning every variable can be set there; the input can also be used to start from any point in an AutoCAS calculation. A comprehensive .yml-input file could look like:

---
# Enable the large active space protocol.
large_cas: False
# Plots the entanglement diagram from the initial CAS calculation.
entanglement_diagram: "/path/to/entanglement_diagram.pdf"
# The threshold diagram shows all thresholds from the provided
# single orbital entropies.
threshold_diagram: "/path/to/threshold_diagram.pdf"
# In some large active space calculations, the active space can become
# so large that an ordinary entanglement diagram is too complex to read.
# This diagram shows the entanglement of all orbitals in the background
# and the entanglement of the selected orbitals in an inner circle.
full_entanglement_diagram: "/path/to/full_entanglement_diagram.pdf"
# Molecular system related settings.
molecule:
  # Total charge.
  charge: 0
  # Total spin multiplicity.
  spin_multiplicity: 1
  # For 3d transition metals, additionally include 4d orbitals in the initial
  # active space.
  double_d_shell: True
  # The xyz-file of the molecule.
  xyz_file: "/path/to/molecule.xyz"
# General autocas settings.
autocas:
  # Required number of consecutive threshold steps to form a plateau to
  # determine the active space.
  plateau_values: 10
  # One threshold step corresponds to a fraction of the maximal single orbital
  # entropy.
  threshold_step: 0.01
  # Setting to define if a cas is necessary.
  diagnostics:
    # Any orbital with an s1 value below this threshold is directly excluded from the CAS.
    weak_correlation_threshold: 0.02
    # If the maximum of s1 is below this threshold, a single-reference method might be better.
    single_reference_threshold: 0.14
  # Settings for large active space protocol. Only used if large_cas is enabled.
  large_spaces:
    # Maximum number of orbitals in a sub-cas
    max_orbitals: 30
# Interface related settings.
interface:
  # Which interface to use. (Currently only molcas is implemented)
  interface: "molcas"
  # If interface should write output and all related files.
  dump: True
  # Name of the project, will be also used in some file names.
  project_name: "autocas_project"
  # Set molcas environment variables. All variables here respect already set global variables in
  # the current session.
  environment:
    # Directory to store integrals and other molcas temporary files.
    molcas_scratch_dir: "/path/to/molcas_scratch"
  # Method related settings.
  settings:
    # DMRG bond dimension for the final DMRG run. To set a different bond dimension for
    # the initial DMRG run, please see a provided example script. (autoCAS/scripts)
    dmrg_bond_dimension: 1000
    # Number of DMRG sweeps for the final DMRG run. To set a different number of sweeps for
    # the initial DMRG run, please see a provided example script. (autoCAS/scripts)
    dmrg_sweeps: 10
    # The basis set.
    basis_set: "cc-pvdz"
    # Method to evaluate the final active space.
    method: "dmrg-scf"
    # Method for dynamic correlation in the final calculation.
    post_cas_method: "caspt2"
    # Directory which contains the folder hierarchy created by autocas.
    work_dir: "/path/to/autocas_project"
    # The xyz-file, required for molcas input. Make sure this file is the same as
    # defined above, or at least contains the same atoms.
    xyz_file: "/path/to/molecule.xyz"
    # Point group of the system.
    point_group: "C1"
    # IPEA shift for caspt2.
    ipea: 0.0
    # Enable Cholesky decomposition for integrals.
    cholesky: True
    # Enable unrestricted HF. If the provided system is open-shell according to its
    # charge and/or spin multiplicity, UHF is enabled automatically.
    uhf: False
    # Enable fiedler ordering for DMRG.
    fiedler: True
    # Number of states. 0 means that this option is disabled. Hence,
    # 0 and 1 have the same meaning: only the ground state is evaluated.
    n_excited_states: 0

For more information on keywords, take a look at the API section.
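
The plateau_values and threshold_step comments above describe the active space selection only briefly. As a rough illustration of the idea, and explicitly not the code AutoCAS runs, the sketch below raises a threshold in steps of threshold_step times the maximal single orbital entropy and returns the first set of orbitals that stays unchanged for plateau_values consecutive steps. The function name select_by_plateau and the exact plateau rule are assumptions made for this example.

```python
from typing import List, Optional, Sequence


def select_by_plateau(
    s1: Sequence[float],
    plateau_values: int = 10,
    threshold_step: float = 0.01,
) -> Optional[List[int]]:
    """Return orbital indices of the first plateau, or None if none is found.

    Simplified illustration only; AutoCAS may treat edge cases differently.
    """
    max_s1 = max(s1)
    n_steps = int(round(1.0 / threshold_step))
    previous: Optional[List[int]] = None
    run_length = 0
    for step in range(n_steps + 1):
        # The cutoff is a fraction of the maximal single orbital entropy.
        cutoff = step * threshold_step * max_s1
        selected = [i for i, value in enumerate(s1) if value >= cutoff]
        if selected == previous:
            run_length += 1
        else:
            previous = selected
            run_length = 1
        # A plateau is a selection that survives enough consecutive steps.
        if run_length >= plateau_values:
            return selected
    return None
```

With entropies such as [0.8, 0.75, 0.7, 0.05, 0.03, 0.01], the three weakly entangled orbitals drop out within the first few threshold steps, and the remaining three form the first plateau. A larger plateau_values demands a wider gap in the entropies before a selection is accepted.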

Custom Scripts

Everything discussed so far is based on the front end of AutoCAS, implemented in __main__.py in scine_autocas/. However, AutoCAS is also a Python3 library that can be used on its own to write custom workflows built on the AutoCAS framework. A basic script to set up an AutoCAS-based calculation could look like:

"""Basic example script.

This script is a modified version from the ground state cas calculation
in scine_autocas.main_functions
"""
# -*- coding: utf-8 -*-
__copyright__ = """This file is part of SCINE AutoCAS.
This code is licensed under the 3-clause BSD license.
Copyright ETH Zurich, Department of Chemistry and Applied Biosciences, Reiher Group.
See LICENSE.txt for details
"""

import os

from scine_autocas import Autocas
from scine_autocas.autocas_utils.molecule import Molecule
from scine_autocas.interfaces.molcas import Molcas
from scine_autocas.main_functions import MainFunctions
from scine_autocas.plots.entanglement_plot import EntanglementPlot


def standard_autocas_procedure(path: str, xyz_file: str):
    """Do ground state cas"""
    # create a molecule
    molecule = Molecule(xyz_file)

    # initialize autoCAS and Molcas interface
    autocas = Autocas(molecule)
    molcas = Molcas([molecule])

    # setup interface
    molcas.project_name = "example"
    molcas.settings.work_dir = path + "/../test/example"
    molcas.environment.molcas_scratch_dir = path + "/../test/scratch"
    molcas.settings.xyz_file = xyz_file

    # case and hyphens do not matter for method names
    molcas.settings.method = "DMRGCI"

    # manually set DMRG sweeps and bond dimension to low values for the initial run
    molcas.settings.dmrg_bond_dimension = 250
    molcas.settings.dmrg_sweeps = 5

    # make initial active space and evaluate initial DMRG calculation
    occ_initial, index_initial = autocas.make_initial_active_space()

    # no input means HF calculation
    molcas.calculate()

    # do cas calculation
    cas_results = molcas.calculate(occ_initial, index_initial)

    # energy = cas_results[0]
    s1_entropy = cas_results[1]
    # s2_entropy = cas_results[2]
    mut_inf = cas_results[3]

    # plot entanglement diagram
    plot = EntanglementPlot()
    plt = plot.plot(s1_entropy, mut_inf)  # type: ignore
    plt.savefig(molcas.settings.work_dir + "/entang.pdf")  # type: ignore

    # make active space based on single orbital entropies
    cas_occ, cas_index = autocas.get_active_space(
        occ_initial, s1_entropy   # type: ignore
    )

    # case and hyphens do not matter for method names
    molcas.settings.method = "dmrg-scf"

    # manually set DMRG sweeps and bond dimension to higher values for the final run
    molcas.settings.dmrg_bond_dimension = 2000
    molcas.settings.dmrg_sweeps = 20

    # Do a calculation with this CAS
    final_energy, final_s1, final_s2, final_mut_inf = molcas.calculate(cas_occ, cas_index)

    # use results
    n_electrons = sum(cas_occ)
    n_orbitals = len(cas_occ)
    print(f"final energy:      {final_energy}")
    print(f"final CAS(e, o):  ({n_electrons}, {n_orbitals})")
    print(f"final cas indices: {cas_index}")
    print(f"final occupation:  {cas_occ}")
    print(f"final s1:          {final_s1}")
    print(f"final s2: \n{final_s2}")
    print(f"final mut_inf: \n{final_mut_inf}")
    return cas_occ, cas_index, final_energy


if __name__ == "__main__":
    path_to_this_file = os.path.dirname(os.path.abspath(__file__))
    xyz = path_to_this_file + "/../scine_autocas/tests/files/n2.xyz"
    occupation, orbitals, energy = standard_autocas_procedure(path_to_this_file, xyz)
    print(f"n2 \ncas: {occupation} \norbs: {orbitals} \nenergy: {energy}")

    print("\n\n")
    print("only verification run")
    # to verify that everything is still valid
    main_functions = MainFunctions()
    test_molecule = Molecule(xyz)
    test_autocas = Autocas(test_molecule)
    test_interface = Molcas([test_molecule])

    test_interface.project_name = "verify_example"
    test_interface.settings.work_dir = path_to_this_file + "/../test/verify_example"
    test_interface.settings.xyz_file = xyz
    test_interface.environment.molcas_scratch_dir = path_to_this_file + "/../test/scratch_tmp"

    test_interface.settings.method = "dmrg-ci"
    test_interface.settings.dmrg_bond_dimension = 250
    test_interface.settings.dmrg_sweeps = 5
    test_cas, test_indices = main_functions.conventional(test_autocas, test_interface)

    test_interface.settings.method = "dmrg-scf"
    test_interface.settings.dmrg_bond_dimension = 2000
    test_interface.settings.dmrg_sweeps = 20
    test_results = test_interface.calculate(test_cas, test_indices)

    assert abs(test_results[0] - energy) < 1e-9
    assert test_cas == occupation
    assert test_indices == orbitals

More example scripts can be found in /path/to/autoCAS/scripts.

Analysing Completed Projects

AutoCAS provides a script to analyze calculations that have already finished. The analysis can be run with:

python3 /path/to/autoCAS/scripts/analyse_only.py

In order to make this task more convenient, we suggest adding an alias to your .bashrc, for example:

echo -e "alias entanglement='python3 /path/to/autoCAS/scripts/analyse_only.py'" >> ~/.bashrc
source ~/.bashrc

The analysis script requires a QCMaquis output file (using the alias defined above):

entanglement qcmaquis_output.results_state.0.h5

It can optionally save the entanglement diagram, select which state to analyze, or take the molecule in order to get an active space suggestion from AutoCAS. Instead of analyzing a single output directly, the script can also be applied to the root of a project, e.g. a folder containing initial/, dmrg/, and final/. Use the following command to save the entanglement diagram and analyze the ground state for a molecule:

entanglement -s entanglement_diagram.pdf -e 0 -m <molecule.xyz> <autocas_dir>

Note: ensure that the project contains either a single dmrg/ folder or a set of folders dmrg_1/, dmrg_2/, etc. from a large active space calculation, but not both. For more information, run:

python3 /path/to/autoCAS/scripts/analyse_only.py -h
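
The folder rule from the note above can be checked before the analysis is started. The following sketch uses only the Python standard library; the function name find_dmrg_folders is hypothetical, and the check reflects only the rule stated in this guide, not the internal logic of analyse_only.py.

```python
import re
from pathlib import Path
from typing import List


def find_dmrg_folders(project_root: str) -> List[Path]:
    """Return the DMRG folders of a project, rejecting a mixed layout.

    Hypothetical helper: returns [dmrg/] for a conventional calculation,
    the numerically sorted dmrg_N/ folders for a large active space
    calculation, or an empty list if neither is present.
    """
    root = Path(project_root)
    single = root / "dmrg"
    numbered = sorted(
        (
            entry
            for entry in root.iterdir()
            if entry.is_dir() and re.fullmatch(r"dmrg_\d+", entry.name)
        ),
        # Sort by the numeric suffix so dmrg_10 follows dmrg_2.
        key=lambda entry: int(entry.name.split("_")[1]),
    )
    if single.is_dir() and numbered:
        raise ValueError(
            f"{root} contains both dmrg/ and numbered dmrg_N/ folders"
        )
    if single.is_dir():
        return [single]
    return numbered
```

Raising an error for the mixed case makes the ambiguity visible early instead of silently analyzing only one of the two layouts.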