fluiddyn.clusters.slurm

Slurm clusters (fluiddyn.clusters.slurm)

Provides:

class fluiddyn.clusters.slurm.ClusterSlurm(check_scheduler=True, **kwargs)[source]

Bases: ClusterLocal

Base class for clusters with SLURM job scheduler.

name_cluster = ''

Name of cluster used in check_name_cluster()

nb_cores_per_node: int | None = 32

Number of cores per node

default_project = None

Default project allocation

cmd_run = 'srun'

Command to launch executable

cmd_run_interactive = None

Interactive command to launch executable

cmd_launch = 'sbatch'

Command to submit job script

max_walltime = '23:59:59'

Maximum walltime allowed per job

partition = None

Partition on the cluster

dependency = None

Dependency option

mem = None

Minimum amount of real memory allocation for the job

account = None

Name of the project for job submission (mandatory on some clusters)

exclusive = False

Reserve nodes when submitting jobs

check_slurm()[source]

Check if this script is run on a front-end (login) node with SLURM installed.
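As a rough illustration of such a check, the sketch below tests whether the submission command can be found on the PATH. It is not the fluiddyn implementation, which may work differently; the function name `slurm_available` is hypothetical, and the default `"sbatch"` mirrors the `cmd_launch` attribute above.

```python
import shutil


def slurm_available(cmd_launch="sbatch"):
    """Return True if the submission command is found on the PATH.

    Hypothetical helper, not part of fluiddyn.
    """
    return shutil.which(cmd_launch) is not None
```

A check of this kind only shows that the command exists on the current machine, not that the scheduler is reachable or accepting jobs.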

check_name_cluster(env='HOSTNAME')[source]

Check if name_cluster matches the environment variable.
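A minimal sketch of this kind of check, assuming that "matches" means the cluster name appears as a substring of the environment variable. That assumption, the function name `name_cluster_matches`, and the `environ` argument are all hypothetical and not taken from fluiddyn.

```python
import os


def name_cluster_matches(name_cluster, env="HOSTNAME", environ=None):
    """Return True if name_cluster occurs in the value of the variable env.

    Hypothetical helper: the real method may compare differently or
    raise an error instead of returning a boolean.
    """
    environ = os.environ if environ is None else environ
    value = environ.get(env, "")  # treat an unset variable as empty
    return name_cluster in value
```

In this sketch, an empty `name_cluster` (the class default) always matches, because the empty string is a substring of any string.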

submit_command(command, name_run='fluiddyn', nb_nodes=1, nb_cores_per_node=None, nb_tasks=None, nb_tasks_per_node=None, nb_cpus_per_task=None, walltime='23:59:58', project=None, nb_mpi_processes=None, omp_num_threads=1, nb_runs=1, path_launching_script=None, path_resume=None, retain_script=True, jobid=None, requeue=False, nb_switches=None, max_waittime=None, ask=True, bash=True, email=None, interactive=False, signal_num=12, signal_time=300, flexible_walltime=False, partition=None, dependency=None, mem=None, account=None, exclusive=False, **kwargs)[source]

Submit a command.

Parameters:
command: string

Command which executes the run

name_run: string

Name of the run to be displayed in SLURM queue

nb_nodes: integer

Number of nodes

nb_cores_per_node: integer

Defaults to the maximum fixed for the cluster, as set by self.nb_cores_per_node. Set to 1 for a serial job. Set to 0 to spread jobs across nodes (the job may start sooner but run slower).

nb_tasks: integer

Number of tasks. If not specified, computed as nb_nodes * nb_cores_per_node.

nb_tasks_per_node: integer

Number of tasks per node. If not specified, computed as nb_cores_per_node.

nb_cpus_per_task: integer

Number of CPUs requested per task. Only set if the --cpus-per-task option is specified.

walltime: string

Minimum walltime for the job

project: string

Sets the allocation to run the job under

nb_mpi_processes: integer

Number of MPI processes. Defaults to None (no MPI). If "auto", computed as nb_cores_per_node * nb_nodes.

omp_num_threads: integer

Number of OpenMP threads

nb_runs: integer

Number of times to submit jobs (launch once using command and resume thereafter with path_resume script / command).

path_launching_script: string

Path of the SLURM jobscript

path_resume: string

Path of the script to resume a job, which takes one argument: the path_run parsed from the output.

retain_script: boolean

Retain or delete the script after launching the job

jobid: integer

Run under already allocated job

requeue: boolean

If set True, permit the job to be requeued.

nb_switches: integer

Max / Optimum switches

max_waittime: string

Max time to wait for optimum

ask: boolean

Ask for user input to submit the jobscript or not

bash: boolean

Submit jobscript via fluiddyn.io.query.call_bash() function

email: string

Email address to notify if the job fails

interactive: boolean

Use cmd_run_interactive instead of cmd_run inside the jobscript

signal_num: int or False
signal_time: int

Send the signal signal_num signal_time seconds before the end of the job.

flexible_walltime: bool

If true, submit a job as:

sbatch --time-min=<walltime> --time=<max_walltime> ...

where walltime is a parameter of this method and max_walltime is a class attribute. This would allow SLURM to provide an optimum walltime in the range requested.

Note that if signal_num is provided, flexible_walltime is not practical and will be forced to False.

partition: str

Request a specific partition for the resource allocation. Default None.

dependency: str

Job dependencies are used to defer the start of a job until the specified dependencies have been satisfied. They are specified with the --dependency option of sbatch.

mem: str

Minimum amount of real memory allocation for the job

account: str

Name of the project to which hours are allocated

exclusive: boolean

Reserve nodes when submitting jobs
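The defaults for nb_tasks and nb_tasks_per_node and the interaction between flexible_walltime and signal_num described above can be sketched in a few lines. This is an illustration of the documented rules only, not the code fluiddyn runs: the function names `resolve_task_counts` and `time_options` are hypothetical, the special value nb_cores_per_node=0 is not handled, and the plain `--time=<walltime>` form used in the non-flexible case is an assumption based on the standard sbatch option.

```python
def resolve_task_counts(nb_nodes, nb_cores_per_node,
                        nb_tasks=None, nb_tasks_per_node=None):
    """Apply the documented defaults (hypothetical helper)."""
    if nb_tasks is None:
        nb_tasks = nb_nodes * nb_cores_per_node
    if nb_tasks_per_node is None:
        nb_tasks_per_node = nb_cores_per_node
    return nb_tasks, nb_tasks_per_node


def time_options(walltime, max_walltime="23:59:59",
                 flexible_walltime=False, signal_num=12):
    """Build the sbatch time options (hypothetical helper)."""
    # A signal sent before the end of the job needs a known end time,
    # so a truthy signal_num forces flexible_walltime off.
    if signal_num:
        flexible_walltime = False
    if flexible_walltime:
        return [f"--time-min={walltime}", f"--time={max_walltime}"]
    # Assumption: a fixed walltime is passed with the plain --time option.
    return [f"--time={walltime}"]
```

With two nodes of 32 cores, `resolve_task_counts(2, 32)` gives 64 tasks in total and 32 tasks per node. A flexible walltime is only produced when signal_num is False.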

launch_more_dependant_jobs(job_id, nb_jobs_added, path_launcher=None, job_status='afterok')[source]

Launch dependent jobs using the sbatch --dependency=... command.

Parameters:
job_id: int

First running job id to depend on.

nb_jobs_added: int

Total number of dependent jobs to be added.

path_launcher: str

Path to launcher script

job_status: str

Job status of preceding job. Typical values are afterok, afternotok, afterany.
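For orientation, the form of the sbatch option involved can be sketched as follows. The helper `dependent_sbatch_command` is hypothetical and only builds the argument list for a single dependent job; how the method chains several jobs internally is not described here.

```python
def dependent_sbatch_command(job_id, path_launcher, job_status="afterok"):
    """Return the argument list for one job depending on job_id.

    Hypothetical helper, not part of fluiddyn.
    """
    return ["sbatch", f"--dependency={job_status}:{job_id}", path_launcher]
```

For example, `dependent_sbatch_command(12345, "launcher.sh")` yields the arguments for `sbatch --dependency=afterok:12345 launcher.sh`, where the job id and script name are placeholders.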

Classes

ClusterSlurm([check_scheduler])

Base class for clusters with SLURM job scheduler.