API¶
Module: bootstrap¶
-
scine_puffin.bootstrap.
bootstrap
(config)[source]¶ Sets up all required and all additionally requested programs/packages for use with Puffin. Generates a puffin.sh to be sourced before running the actual Puffin.
- Parameters
- config :: puffin.config.Configuration
The current configuration of the Puffin.
Module: config¶
-
class
scine_puffin.config.
Configuration
[source]¶ The Puffin configuration. All values are defaulted. The main sections of the configuration are:
- daemon
The settings pertaining to the execution of Puffin and its daemon process.
- database
All information about the database the Puffin will be working on.
- resources
The information about the hardware the Puffin is running on and is allowed to use for calculations and the execution of jobs.
- programs
The settings for all the programs and packages Puffin relies on when executing jobs. Each program/package has its own entry with the option of program-specific settings. See the documentation for each individual program (found at
scine_puffin.programs
) for more details about the individual settings.
Note that the config is sensitive to environment variables when it is initialized/loaded. Each setting in the config can be set via a corresponding environment variable. The settings are given as
PUFFIN_<key1>_<key2>=<value>
where the keys are the chain of uppercase keys to the final setting. As an example, PUFFIN_DAEMON_MODE=debug
is equivalent to config['daemon']['mode'] = 'debug'.
In detail, the options in the configuration are:
- daemon
- mode :: str
The mode to run the Puffin in; the options are release and debug. The release mode will fork the main process and run in a daemonized mode, while the debug mode will run in the current shell, reporting any output and errors to stdout and stderr.
- job_dir :: str
The path to the directory containing the currently running job.
- software_dir :: str
The path to the directory containing the software bootstrapped from sources. The Puffin will generate and fill this directory upon bootstrapping.
- error_dir :: str
If existent, the Puffin instance will archive all failed jobs into this directory.
- archive_dir :: str
If existent, the Puffin instance will archive all correctly completed jobs into this directory.
- uuid :: str
A unique name for the Puffin instance. This can be set by the user, if not, a unique ID will automatically be generated.
- pid :: str
The path to the file identifying the PID of the Puffin instance.
- log :: str
The path to the logfile of the Puffin instance.
- stop :: str
The path to a file that if existent will prompt the Puffin instance to stop taking new jobs and shut down instead. The instance will finish any running job though.
- cycle_time_in_s :: float
The time in between scans of the database for new jobs that can be run.
- timeout_in_h :: float
The number of hours the Puffin instance should stay alive. Once this limit is reached, the Puffin is shut down and its running job will be killed and re-flagged as new.
- idle_timeout_in_h :: float
The number of hours the Puffin instance should stay alive after receiving its last job. Once this limit is reached, the Puffin is shut down. Any accepted job will reset the timer. A negative value disables this feature and makes the Puffin run until timeout_in_h is reached, even if it is idle the entire time.
- touch_time_in_s :: float
The time in seconds between attempts of the Puffin to touch a calculation it is running in the database. In practice, each Puffin will search the database for jobs that are set as running but are not touched and reset them, as they indicate that the executing Puffin has crashed. See job_reset_time_in_s for more information.
- job_reset_time_in_s :: float
The time in seconds that may have passed since the last touch on pending jobs before they are considered dead and are reset to be worked on by another Puffin. Note: this value has to be larger than the touch_time_in_s of all Puffins working on the same database for the reset mechanism to work.
- repeated_failure_stop :: int
The number of consecutive failed jobs that are allowed before the Puffin stops in order to avoid failing all jobs in a DB due to e.g. hardware issues. Failed jobs will be reset to new and rerun by other Puffins. Should always be greater than 1.
- max_number_of_jobs :: int
The maximum number of jobs a single Puffin will carry out (complete or failed), before gracefully exiting. Any negative number or zero disables this setting; by default it is disabled.
- enforce_memory_limit :: bool
Whether the given memory limit should be enforced (i.e., a job is killed as soon as it reaches it). The Puffin continues to work on other calculations either way.
- database
- ip :: str
The IP at which the database server to connect with is found.
- port :: int
The port at which the database server to connect with is found.
- name :: str
The name of the database on the database server to connect with. Multiple databases (with multiple names) can be given as a comma-separated list: name_one,name_two,name_three. The databases will be used in descending order of priority, meaning that at any given time all jobs of the first named database have to be completed before any job of the second one is considered by the Puffin instance.
- resources
- cores :: int
The number of threads the executed jobs are allowed to use. Note that only jobs that are below this value in their requirements will be accepted by the Puffin instance.
- memory :: float
The total amount of memory the Puffin and its jobs are allowed to use. Given in GB. Note that only jobs that are below this value in their requirements will be accepted by the Puffin instance.
- disk :: float
The total amount of disk space the Puffin and its jobs are allowed to use. Given in GB. Note that only jobs that are below this value in their requirements will be accepted by the Puffin instance.
- ssh_keys :: List[str]
Any SSH keys needed by the Puffin in order to connect to the database or to bootstrap programs.
- programs
The specific details for each program are given in their respective documentation. However, common options are:
- available :: bool
The switch whether the program shall be available to Puffin. Any programs set to be unavailable will not be bootstrapped.
- source :: str
The link to the source location of the given program, usually an HTTPS link to a git repository.
- root :: str
The folder in which the program is already installed. This will request a non-source-based bootstrapping of the program.
- version :: str
The version of the program to use. Can also be a git tag or commit SHA.
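Several of the options above carry constraints that are easy to check before starting a Puffin. The following is a minimal sketch on a plain nested dict, not the Puffin implementation: the helper names (`database_names`, `timing_problems`, `job_fits`) and the example values are hypothetical, and "below this value" is read as "not exceeding", which is an assumption.

```python
from typing import Dict, List


def database_names(config: Dict) -> List[str]:
    """Split the comma-separated database names, highest priority first."""
    raw = config["database"]["name"]
    return [name.strip() for name in raw.split(",") if name.strip()]


def timing_problems(config: Dict) -> List[str]:
    """Collect violations of the constraints stated in the option texts."""
    daemon = config["daemon"]
    problems = []
    if daemon["job_reset_time_in_s"] <= daemon["touch_time_in_s"]:
        problems.append("job_reset_time_in_s must be larger than touch_time_in_s")
    if daemon["repeated_failure_stop"] <= 1:
        problems.append("repeated_failure_stop should be greater than 1")
    return problems


def job_fits(config: Dict, cores: int, memory: float, disk: float) -> bool:
    """Assumption: a job is acceptable if it does not exceed any limit."""
    res = config["resources"]
    return (cores <= res["cores"]
            and memory <= res["memory"]
            and disk <= res["disk"])


# Illustrative values only; these are not the Puffin defaults.
example = {
    "daemon": {
        "touch_time_in_s": 1500.0,
        "job_reset_time_in_s": 7200.0,
        "repeated_failure_stop": 100,
    },
    "database": {"name": "name_one,name_two,name_three"},
    "resources": {"cores": 4, "memory": 8.0, "disk": 20.0},
}
```

With these values, `timing_problems(example)` returns an empty list, and a job asking for more than 4 cores would be rejected by `job_fits`.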
The default version of a configuration file can be generated using
python3 -m puffin configure
(if no environment variables are set).
-
daemon
()[source]¶ Grants direct access to the
daemon
part of the configuration.
- Returns
- settings :: dict
A sub-dict of the total configuration.
- Return type
dict
-
database
()[source]¶ Grants direct access to the
database
part of the configuration.
- Returns
- settings :: dict
A sub-dict of the total configuration.
- Return type
dict
-
dump
(path)[source]¶ Dumps the current configuration into a .yaml file.
- Parameters
- path :: str
The file to dump the configuration into.
-
load
(path=None)[source]¶ Loads the configuration. The configuration is initialized using the default values, then all settings given in the file (if there is one) are applied. Finally all settings given as environment variables are applied.
Each setting in the config can be set via a corresponding environment variable. The settings are given as
PUFFIN_<key1>_<key2>=<value>
where the keys are the chain of uppercase keys to the final setting. As an example, PUFFIN_DAEMON_MODE=debug
is equivalent to config['daemon']['mode'] = 'debug'.
- The exact load order is (with later entries overriding earlier ones):
defaults
file path
environment variables
- Parameters
- path :: str
The file to read the configuration from. Default:
None
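The load order described above (defaults, then file, then environment variables) can be sketched on plain dicts. This is an illustration under assumptions, not the code behind `load`: the function names are hypothetical, the file contents are passed in as an already parsed dict, and environment values are kept as strings without any type conversion.

```python
import copy
import os
from typing import Dict, Mapping, Optional


def merge(base: Dict, override: Dict) -> None:
    """Recursively copy the settings in override into base."""
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            merge(base[key], value)
        else:
            base[key] = value


def apply_environment(config: Dict, environ: Mapping[str, str],
                      prefix: str = "PUFFIN") -> None:
    """Override settings from variables named PUFFIN_<KEY1>_<KEY2>."""
    for key, value in config.items():
        name = f"{prefix}_{key.upper()}"
        if isinstance(value, dict):
            apply_environment(value, environ, name)
        elif name in environ:
            # Assumption: the raw string is stored as-is.
            config[key] = environ[name]


def layered_config(defaults: Dict, file_settings: Optional[Dict] = None,
                   environ: Optional[Mapping[str, str]] = None) -> Dict:
    """Apply defaults, then file settings, then environment variables."""
    config = copy.deepcopy(defaults)
    merge(config, file_settings or {})
    apply_environment(config, os.environ if environ is None else environ)
    return config
```

For example, with a default of `{'daemon': {'mode': 'release'}}`, a file setting the mode to `release`, and `PUFFIN_DAEMON_MODE=debug` in the environment, the resulting mode is `debug`, because the environment is applied last.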
-
scine_puffin.config.
dict_generator
(indict, pre=None)[source]¶ A small helper function/generator recursively generating all chains of keys for a given dictionary.
- Parameters
- indict :: dict
The dictionary to traverse.
- pre :: dict
The parent dictionary (used for recursion).
- Yields
- key_chain :: List[str]
A list of keys from top level to bottom level for each end in the tree of possible value fields in the given dictionary.
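A recursive generator of key chains, as described for this helper, might look like the sketch below. The name `key_chains` is hypothetical, and the sketch passes the chain of parent keys as a list in `pre`; it is not taken from the Puffin sources.

```python
from typing import Dict, Iterator, List, Optional


def key_chains(indict: Dict, pre: Optional[List[str]] = None) -> Iterator[List[str]]:
    """Yield one list of keys per leaf value, from top level to bottom level."""
    chain = list(pre) if pre else []
    for key, value in indict.items():
        if isinstance(value, dict) and value:
            # Descend into nested sections, remembering the path so far.
            yield from key_chains(value, chain + [key])
        else:
            yield chain + [key]


settings = {"daemon": {"mode": "debug"}, "database": {"ip": "127.0.0.1", "port": 27017}}
chains = list(key_chains(settings))
```

Each chain can be joined with underscores and uppercased to obtain the matching environment variable suffix, for example `DAEMON_MODE`.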
Module: daemon¶
-
scine_puffin.daemon.
shutdown
(signum, frame)[source]¶ A small helper function triggering the stop of the process.
- Parameters
- signum :: int
Dummy variable to match the signal dependent function signature.
- frame
Dummy variable to match the signal dependent function signature.
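The `(signum, frame)` signature is the one Python's `signal` module expects of a handler. A minimal sketch of how such a handler is typically registered follows; the names `request_stop`, `stop_event`, and `install_handlers` are hypothetical, and which signals Puffin actually listens to is not stated here.

```python
import signal
import threading

# Flag that a main loop could poll to know when to stop.
stop_event = threading.Event()


def request_stop(signum, frame):
    """Signal handler; both arguments are required by the signal module."""
    stop_event.set()


def install_handlers():
    """Register the handler; must be called from the main thread."""
    signal.signal(signal.SIGTERM, request_stop)
    signal.signal(signal.SIGINT, request_stop)
```

After `install_handlers()` has run, sending SIGTERM to the process sets `stop_event` instead of terminating it immediately.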
Module: jobloop¶
-
scine_puffin.jobloop.
check_setup
(config)[source]¶ Checks if all the programs are correctly installed or reachable.
- Parameters
- config :: scine_puffin.config.Configuration
The current configuration of the Puffin.
- Return type
Dict[str, str]
-
scine_puffin.jobloop.
kill_daemon
(config)[source]¶ Kills the Puffin instantaneously without any possibility of a graceful exit.
- Parameters
- config :: scine_puffin.config.Configuration
The current configuration of the Puffin.
- Return type
None
-
scine_puffin.jobloop.
loop
(config, available_jobs)[source]¶ The outer loop function. This function controls the forked actual loop function, which is implemented in _loop_impl(). The loop has an added timeout, and a ping every 15 minutes shows that the runner is still alive.
- Parameters
- config :: scine_puffin.config.Configuration
The current configuration of the Puffin.
- available_jobs :: dict
The dictionary of available jobs, given the current config and runtime environment.
- Return type
None
-
scine_puffin.jobloop.
slow_connect
(manager, config)[source]¶ Connects the given Manager to the database referenced in the Configuration. This version of connecting tries 30 times to connect to the database. Each attempt is followed by a wait time of 1.0 + random([0.0, 1.0]) seconds in order to stagger connection attempts of multiple Puffin instances.
- Parameters
- manager :: scine_database.Manager
The database manager/connection.
- config :: scine_puffin.config.Configuration
The current configuration of the Puffin.
- Return type
None
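The retry scheme described for slow_connect (30 attempts, each followed by a wait of 1.0 plus a random fraction of a second) can be sketched independently of the database. Here `connect` is a hypothetical callable that returns True on success, and the boolean return value is an assumption made for illustration; the documented function returns None.

```python
import random
import time
from typing import Callable


def connect_with_retries(connect: Callable[[], bool],
                         attempts: int = 30,
                         sleep: Callable[[float], None] = time.sleep) -> bool:
    """Try to connect up to `attempts` times with a staggered wait."""
    for _ in range(attempts):
        if connect():
            return True
        # The random part spreads out attempts of Puffins started together.
        sleep(1.0 + random.random())
    return False
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting.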
Module: jobs.templates.job¶
-
class
scine_puffin.jobs.templates.job.
Job
[source]¶ A common interface for all jobs in/carried out by a Puffin
-
archive
(archive)[source]¶ Archives all files existent in the job’s directory into a tarball named with the job’s ID. The tarball is then moved to the given destination.
- Parameters
- archive :: str
The path to move the resulting tarball to.
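The archiving step can be illustrated with the standard library alone. This is a sketch under assumptions, not the method's actual body: the function name `archive_job_dir`, the gzip compression, and the `.tar.gz` suffix are all hypothetical choices.

```python
import os
import shutil
import tarfile
import tempfile


def archive_job_dir(job_dir: str, job_id: str, destination: str) -> str:
    """Pack job_dir into <job_id>.tar.gz and move it into destination."""
    os.makedirs(destination, exist_ok=True)
    with tempfile.TemporaryDirectory() as tmp:
        tar_path = os.path.join(tmp, f"{job_id}.tar.gz")
        with tarfile.open(tar_path, "w:gz") as tar:
            # Store the files under a top-level folder named after the job.
            tar.add(job_dir, arcname=job_id)
        # Returns the final path of the tarball inside destination.
        return shutil.move(tar_path, destination)
```

Building the tarball in a temporary directory first avoids leaving a partially written archive in the destination if packing fails.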
-
capture_raw_output
()[source]¶ Tries to capture the raw output of the calculation context and save it in the raw_output field of the configured calculation. This should never throw.
Notes
Requires run configuration
-
check_duplicate_property
(structure, properties, property_name, model)[source]¶ Checks for a property that is an exact match for the one queried here. Exact match meaning that key and model both are matches.
- Parameters
- properties :: db.Collection (Scine::Database::Collection)
The collection housing all properties.
- property_name :: str
The name (key) of the queried property, e.g. electronic_energy.
- model :: db.Model (Scine::Database::Model)
The model used in the calculation that resulted in this property.
- structure :: db.Structure (Scine::Database::Structure)
The structure to be checked in. The structure has to be linked to its collection.
- Returns
- ID :: db.ID (Scine::Database::ID)
Returns False if there is no existing property like the one queried; otherwise returns the ID of the first duplicate.
- Return type
object
-
complete_job
()[source]¶ Saves the executing Puffin, changes status to db.Status.COMPLETE.
- Return type
None
-
configure_run
(manager, calculation, config)[source]¶ Configures a job for a given Calculation to do tasks in the run function
- Parameters
- manager :: db.Manager (Scine::Database::Manager)
The manager of the database holding all collections
- calculation :: db.Calculation (Scine::Database::Calculation)
The calculation to be performed
- config :: Configuration
The configuration of the Puffin doing the job
-
fail_job
()[source]¶ Saves the executing Puffin, changes status to db.Status.FAILED.
- Return type
None
-
failed_file
()[source]¶ Returns the path to the file indicating a failed calculation, or None if the job has not been prepared.
-
get_collections
(self, manager)[source]¶ Saves Scine Database collections as class variables
- Parameters
- manager :: db.Manager (Scine::Database::Manager)
The manager of the database holding all collections
-
postprocess_calculation_context
()[source]¶ Postprocesses a calculation context, pushing all errors and comments.
- Returns
- True if the job succeeded, False otherwise.
- Return type
bool
-
prepare
(job_dir, id)[source]¶ Prepares the actual job. This function has to be implemented by any job that shall be added to Puffin's job portfolio.
- Parameters
- job_dir :: str
The path to the directory in which all jobs are executed.
- id :: db.ID (Scine::Database::ID)
The calculation that triggered the execution of this job.
-
static
required_programs
()[source]¶ This function has to be implemented by any job that shall be added to Puffin's job portfolio.
-
run
(manager, calculation, config)[source]¶ Runs the actual job. This function has to be implemented by any job that shall be added to Puffin's job portfolio.
- Parameters
- manager :: db.Manager (Scine::Database::Manager)
The manager/database to work on/with.
- calculation :: db.Calculation (Scine::Database::Calculation)
The calculation that triggered the execution of this job.
- config :: scine_puffin.config.Configuration
The configuration of Puffin.
- Return type
bool
-
set_calculation
(self, calculation)[source]¶ Sets the current Calculation for this job and ensures connection
- Parameters
- calculation :: db.Calculation (Scine::Database::Calculation)
The calculation to be carried out
-
store_property
(properties, property_name, property_type, data, model, calculation, structure, replace=True)[source]¶ Adds a single property into the database, connecting it with a given structure and calculation (its results section) and also
- Parameters
- properties :: db.Collection (Scine::Database::Collection)
The collection housing all properties.
- property_name :: str
The name (key) of the new property, e.g. electronic_energy.
- property_type :: str
The type of property to be added, e.g. NumberProperty.
- data :: object (According to ‘property_type’)
The data to be stored in the property; the type of this object depends on the type of property requested. A NumberProperty will require a float, a VectorProperty will require a List[float], etc.
- model :: db.Model (Scine::Database::Model)
The model used in the calculation that resulted in this property.
- calculation :: db.Calculation (Scine::Database::Calculation)
The calculation that resulted in this property. The calculation has to be linked to its collection.
- structure :: db.Structure (Scine::Database::Structure)
The structure for which the property is to be added. The properties field of the structure will receive an additional entry, or have an entry replaced, based on the options given to this function. The structure has to be linked to its collection.
- replace :: bool
If true, replaces an existing property (identical name and model) with the new one. This option is true by default. If false, does nothing when such a property already exists and returns None.
- Returns
- property :: Derived of db.Property (Scine::Database::Property)
The property, a derived class of db.Property, linked to the properties’ collection, or
None
if no property was generated due to duplication.
- Return type
object
-
-
class
scine_puffin.jobs.templates.job.
TurbomoleJob
[source]¶ A common interface for all jobs in Puffin that use Turbomole.
-
class
scine_puffin.jobs.templates.job.
breakable
(value)[source]¶ Helper to allow breaking out of the context manager early.
> with breakable(open(path)) as f:
>     print('before condition')
>     if condition:
>         raise breakable.Break
>     print('after condition')
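One common way to build such a helper is to wrap another context manager and swallow a dedicated exception type on exit. The class below is a sketch of that recipe under the hypothetical name `Breakable`; it is not claimed to be the code behind `breakable`.

```python
import io


class Breakable:
    """Wrap a context manager so its block can be left early."""

    class Break(Exception):
        """Raise inside the with-block to leave it without an error."""

    def __init__(self, value):
        self.value = value

    def __enter__(self):
        return self.value.__enter__()

    def __exit__(self, etype, value, traceback):
        # Always let the wrapped context manager clean up first.
        suppressed = self.value.__exit__(etype, value, traceback)
        if etype is self.Break:
            return True  # swallow the Break, the block simply ends
        return suppressed


log = []
with Breakable(io.StringIO("data")) as handle:
    log.append("before condition")
    if handle.read() == "data":
        raise Breakable.Break
    log.append("after condition")
```

Here only the first message is recorded, and the wrapped stream is closed when the block is left.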
-
scine_puffin.jobs.templates.job.
calculation_context
(job, stdout_name='output', stderr_name='errors', debug=None)[source]¶ A context manager for types of calculations that are run externally and may fail, dump large amounts of files or do other nasty things.
The executed code will be run in the working directory of the given job. The first exception will be caught and appended to the error output; the context will then close and leave behind a file called failed in the scratch directory. If no exceptions are thrown, a file called success will be generated in the scratch directory.
The output redirector part has been adapted from here [access date Jun 25th, 2019]
- Parameters
- job :: Job
The job holding the working directory and receiving the output and error paths
- stdout_name :: str
Name of the file that the stdout stream should be redirected to. The file will be generated in the given scratch directory.
- stderr_name :: str
Name of the file that the stderr stream should be redirected to. The file will be generated in the given scratch directory.
- debug :: bool
If not given, will be taken from the Job Configuration (config[‘daemon’][‘mode’]). If true, runs in debug mode, disabling all redirections.
- Returns
- The context generates three files in the job.work_dir beyond any other ones generated by the executed code. The first two are the redirected output streams stderr and stdout (the names of these files are set by the context’s arguments); the third file is called either failed or success, depending on whether an exception occurred in the executed code.
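The marker-file behaviour can be approximated with `contextlib`. The sketch below is a simplified stand-in under the hypothetical name `marked_run`: it takes a directory path instead of a Job, redirects only Python-level output via `contextlib.redirect_stdout` and `redirect_stderr` (so output of external processes is not captured), and does not change the working directory.

```python
import contextlib
import os
import tempfile
import traceback


@contextlib.contextmanager
def marked_run(work_dir, stdout_name="output", stderr_name="errors"):
    """Redirect output into work_dir and leave a failed/success marker."""
    out_path = os.path.join(work_dir, stdout_name)
    err_path = os.path.join(work_dir, stderr_name)
    with open(out_path, "w") as out, open(err_path, "w") as err:
        with contextlib.redirect_stdout(out), contextlib.redirect_stderr(err):
            try:
                yield
            except Exception:
                # Append the error to the error output and swallow it.
                err.write(traceback.format_exc())
                marker = "failed"
            else:
                marker = "success"
    # Create the empty marker file once the streams are closed.
    open(os.path.join(work_dir, marker), "w").close()


work_dir = tempfile.mkdtemp()
with marked_run(work_dir):
    print("starting")
    raise RuntimeError("calculation crashed")
```

After this block, `work_dir` contains the three files `output`, `errors`, and `failed`, and the exception does not propagate to the caller.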
Module: programs.program¶
-
class
scine_puffin.programs.program.
Program
(settings)[source]¶ A common interface for all programs and their setups
- Parameters
- settings :: dict
The settings for the particular program. This dictionary should be the given program’s block in the
Configuration
.
-
available_models
()[source]¶ A small function returning the single point models available now with the given program loaded/installed.
-
check_install
(self)[source]¶ A small function checking if the program was installed/located correctly and provides the expected features.
-
install
(repo_dir, install_dir, ncores)[source]¶ Installs or loads the given program. After the install, the check_install function should run through without exceptions. The choice of installation/compilation or loading of the program is based on the settings given in the constructor.
- Parameters
- repo_dir :: str
The folder for all repositories, if a clone or download is required for the installation, this folder will be used.
- install_dir :: str
If the program is actually installed and not just loaded, this folder will be used as target directory for the install process.
- ncores :: int
The number of cores/threads to be used when compiling/installing the program.
-
setup_environment
(config, env_paths, env_vars)[source]¶ Appends the program specific environment variables to the given dictionaries.
- Parameters
- config :: scine_puffin.config.Configuration
The current global configuration.
- env_paths :: dict
A dictionary for all the environment paths, such as PATH and LD_LIBRARY_PATH. The added settings will be appended to the existing paths, using export PATH=$PATH:...
- env_vars :: dict
A dictionary for all fixed environment variables. All settings will replace existing variables, such as
export OMP_NUM_THREADS=1
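The difference between the two dictionaries shows up when they are turned into shell lines: paths are appended to the existing value, fixed variables are overwritten. The helper `export_lines` below is hypothetical and only illustrates that distinction; how Puffin actually writes these lines is not shown here.

```python
from typing import Dict, List


def export_lines(env_paths: Dict[str, str], env_vars: Dict[str, str]) -> List[str]:
    """Turn the two dictionaries into shell export statements."""
    lines = []
    for name, path in env_paths.items():
        # Paths extend whatever is already set in the shell.
        lines.append(f"export {name}=${name}:{path}")
    for name, value in env_vars.items():
        # Fixed variables replace any existing value.
        lines.append(f"export {name}={value}")
    return lines


lines = export_lines({"PATH": "/opt/program/bin"}, {"OMP_NUM_THREADS": "1"})
```

The path `/opt/program/bin` is a made-up example value.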