
SGE to SLURM Migration Guide
Additional information is available on the official SLURM documentation site; there, see:
Tutorials
Documentation
FAQ
Man pages exist for all SLURM daemons, commands, and API functions.
Example: man squeue
The command option "--help" also provides a brief summary of options.
Example: squeue --help
Command options are all case sensitive.
Some common commands and flags in SGE and SLURM with their respective equivalents

| | SGE | SLURM |
|---|---|---|
| **User Commands** | | |
| Interactive login | qlogin | srun --pty bash |
| Job submission | qsub [script_file] | sbatch [script_file] |
| Job deletion | qdel [job_id] | scancel [job_id] |
| Job status by job | qstat -u \* [-j job_id] | squeue -j [job_id] |
| Job status by user | qstat [-u user_name] | squeue -u [user_name] |
| Job hold | qhold [job_id] | scontrol hold [job_id] |
| Job release | qrls [job_id] | scontrol release [job_id] |
| List enqueued jobs | qconf -sql | squeue |
| List nodes | qhost | sinfo -N OR scontrol show nodes |
| Cluster status | qhost -q | sinfo |
| **Environment Variables** | | |
| Job ID | $JOB_ID | $SLURM_JOBID |
| Submit directory | $SGE_O_WORKDIR | $SLURM_SUBMIT_DIR |
| Submit host | $SGE_O_HOST | $SLURM_SUBMIT_HOST |
| Node list | $PE_HOSTFILE | $SLURM_JOB_NODELIST |
| Job array index | $SGE_TASK_ID | $SLURM_ARRAY_TASK_ID |
| **Job Specification** | | |
| Script directive | #$ | #SBATCH |
| Queue (called partition in SLURM) | -q [queue] | -p [partition] |
| Count of nodes | N/A | -N [min[-max]] |
| CPU count | -pe [PE] [count] | -n [count] |
| Wall clock limit | -l h_rt=[seconds] | -t [min] OR -t [days-hh:mm:ss] |
| Standard output file | -o [file_name] | -o [file_name] |
| Standard error file | -e [file_name] | -e [file_name] |
| Combine STDOUT and STDERR files | -j yes | use "-o" without "-e" |
| Copy environment | -V | --export=[ALL \| NONE \| variables] |
| Event notification | -m abe | --mail-type=[events] |
| Send notification email | -M [address] | --mail-user=[address] |
| Job name | -N [name] | --job-name=[name] |
| Restart job | -r [yes\|no] | --requeue OR --no-requeue (NOTE: configurable default) |
| Set working directory | -wd [directory] | --workdir=[dir_name] |
| Resource sharing | -l exclusive | --exclusive OR --shared |
| Memory size | -l mem_free=[memory][K\|M\|G] | --mem=[mem][M\|G\|T] OR --mem-per-cpu=[mem][M\|G\|T] |
| Charge to an account | -A [account] | --account=[account] |
| Tasks per node | (Fixed allocation_rule in PE) | --ntasks-per-node=[count] |
| CPUs per task | | --cpus-per-task=[count] |
| Job dependency | -hold_jid [job_id \| job_name] | --depend=[state:job_id] |
| Job project | -P [name] | --wckey=[name] |
| Job host preference | -q [queue]@[node] OR -q [queue]@@[hostgroup] | --nodelist=[nodes] AND/OR --exclude=[nodes] |
| Quality of service | | --qos=[name] |
| Job arrays | -t [array_spec] | --array=[array_spec] (SLURM version 2.6+) |
| Generic resources | -l [resource]=[value] | --gres=[resource_spec] |
| Licenses | -l [license]=[count] | --licenses=[license_spec] |
| Begin time | -a [YYMMDDhhmm] | --begin=YYYY-MM-DD[THH:MM[:SS]] |
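
As a quick illustration of the command mappings above, here is a minimal, hypothetical session in each scheduler; the script name myjob.sh and the job ID 123456 are placeholders:

# SGE
qsub myjob.sh        # submit the job script
qstat -u $USER       # list your pending and running jobs
qdel 123456          # delete the job by its job ID

# SLURM
sbatch myjob.sh      # submit the job script
squeue -u $USER      # list your pending and running jobs
scancel 123456       # cancel the job by its job ID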
Detailed qstat/squeue and qsub/sbatch comparison in SGE and SLURM

| SGE | SLURM |
|---|---|
| qstat | squeue |
| qstat -u username | squeue -u username |
| qstat -f | squeue -al |
| qsub | sbatch |
| qsub -N jobname | sbatch -J jobname |
| qsub -q datasci | sbatch -p datasci |
| qsub -m beas | sbatch --mail-type=ALL |
| qsub -M ucid@njit.edu | sbatch --mail-user=ucid@njit.edu |
| qsub -l h_rt=24:00:00 | sbatch -t 24:00:00 |
| qsub -pe dmp4 16 | sbatch -n 16 |
| qsub -l mem=4G | sbatch --mem=4G |
| qsub -P projectname | sbatch -A projectname |
| qsub -o filename | sbatch -o filename |
| qsub -e filename | sbatch -e filename |
| qsub -l scratch_free=20G | sbatch --tmp=20480 |
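
Combining several of the flags above, a typical SGE submission line and a roughly equivalent SLURM one might look as follows; the job name, run time, memory, and script name are placeholders:

qsub -N myjob -q datasci -l h_rt=24:00:00 -l mem=4G -M ucid@njit.edu -m beas myjob.sh
sbatch -J myjob -p datasci -t 24:00:00 --mem=4G --mail-user=ucid@njit.edu --mail-type=ALL myjob.sh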
Comparison between job scripts in SGE and SLURM
SGE for a single-core application:

#!/bin/bash
#
#$ -N testjobname
#$ -q datasci
#$ -j y
#$ -o test.output
#$ -cwd
#$ -M ucid@njit.edu
#$ -m bea
# Request 5 hours run time
#$ -l h_rt=5:0:0
#$ -P your_project_ID
#
#$ -l mem=4G
#
your_application

SLURM for a single-core application:

#!/bin/bash -l
# NOTE the -l (login) flag!
#
#SBATCH -J testjobname
#SBATCH -p datasci
#SBATCH -o test.output
#SBATCH -e test.output
# Default in slurm
#SBATCH --mail-user ucid@njit.edu
#SBATCH --mail-type=ALL
# Request 5 hours run time
#SBATCH -t 5:0:0
#SBATCH -A your_project_ID
#SBATCH --mem=4G

your_application

SGE for an MPI application:

#!/bin/bash
#
#$ -N testjobname
#$ -q datasci
#$ -j y
#$ -o test.output
#$ -cwd
#$ -M ucid@njit.edu
#$ -m bea
# Request 5 hours run time
#$ -l h_rt=5:0:0
#$ -P your_project_id
#$ -R y
#$ -pe dmp4 16
#$ -l mem=2G
# memory is counted per process on node

module load module1 module2 ...
mpirun your_application

SLURM for an MPI application:

#!/bin/bash -l
# NOTE the -l (login) flag!
#
#SBATCH -J testjobname
#SBATCH -p datasci
#SBATCH -o test.output
#SBATCH -e test.output
# Default in slurm
#SBATCH --mail-user ucid@njit.edu
#SBATCH --mail-type=ALL
# Request 5 hours run time
#SBATCH -t 5:0:0
#SBATCH -A your_project_id
#
#SBATCH --mem-per-cpu 2G
#SBATCH -n 16
#
module load module1 module2 ...
mpirun your_application
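
The same translation applies to array jobs: the SGE "-t" range and $SGE_TASK_ID become --array and $SLURM_ARRAY_TASK_ID in SLURM. A minimal sketch, assuming input files named input.1 through input.10 and the datasci partition:

#!/bin/bash -l
#SBATCH -J arraytest
#SBATCH -p datasci
#SBATCH -t 1:0:0
#SBATCH --array=1-10
# Each array task processes the input file matching its index
your_application input.$SLURM_ARRAY_TASK_ID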
Comparison of some parallel environment variables set by SGE and SLURM

| SGE | SLURM |
|---|---|
| $JOB_ID | $SLURM_JOB_ID |
| $NSLOTS | $SLURM_NPROCS |
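
Inside a batch script these variables are used just like their SGE counterparts; for example, a hypothetical MPI launch that logs the job ID and uses the allocated core count:

echo "Running job $SLURM_JOB_ID on $SLURM_NPROCS cores"
mpirun -np $SLURM_NPROCS your_application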