

Difference between pages "UpsalaMPI" and "UseOfHeadnodes"

From NJIT-ARCS HPC Wiki

<p>A comparison between job scripts in SGE and Slurm</p>

<table>
<tr>
<th>SGE for an MPI application</th>
<th>SLURM for an MPI application</th>
</tr>
<tr>
<td>
<div>
<pre>
#!/bin/bash
#
#$ -N test
#$ -j y
#$ -o test.output
#$ -cwd
#$ -M ucid@njit.edu
#$ -m bea
# Request 5 hours run time
#$ -l h_rt=5:0:0
#$ -P your_project_id
#$ -R y
#$ -pe dmp4 16
#$ -l mem=2G
# memory is counted per process on node
module load module1 module2 ...
mpirun your_application
</pre>
</div>
</td>
<td>
<div>
<pre>
#!/bin/bash -l
# NOTE the -l (login) flag!
#
#SBATCH -J test
#SBATCH -o test.output
#SBATCH -e test.output
# Default in slurm
#SBATCH --mail-user=ucid@njit.edu
#SBATCH --mail-type=ALL
# Request 5 hours run time
#SBATCH -t 5:0:0
#SBATCH -A your_project_id
#
#SBATCH -p node -n 16
#
module load module1 module2 ...
mpirun your_application
</pre>
</div>
</td>
</tr>
</table>
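
Assuming each script above is saved to a file (the filenames below are placeholders, not part of the original comparison), submission and a quick status check look like this under each scheduler:

<pre>
# SGE: submit the batch script, then list your queued/running jobs
qsub test_mpi.sge
qstat -u $USER

# SLURM: submit the equivalent script, then list your queued/running jobs
sbatch test_mpi.slurm
squeue -u $USER
</pre>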

UseOfHeadnodes (latest revision as of 16:36, 5 October 2020)

Cluster headnodes are a limited shared resource, and one researcher's activities can interfere with the work of all other users of the cluster, including jobs running on the compute nodes. This makes it necessary to enforce a strict policy limiting how much CPU time a single researcher can consume on the headnode. Longer-running jobs should be submitted to the scheduler via "qsub" or "qlogin". On Kong and Stheno the limit is one CPU minute per process.
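
For example, a test expected to need more than one CPU minute can be moved off the headnode either interactively or as a batch job (a minimal sketch; the script name, run-time limit, and program name are illustrative placeholders, not values prescribed by this policy):

<pre>
# Interactive session on a compute node instead of the headnode
qlogin

# Or wrap the long-running test in a small script and submit it
cat > test_run.sh <<'EOF'
#!/bin/bash
#$ -N test_run
#$ -cwd
#$ -l h_rt=1:0:0
./your_long_running_program
EOF
qsub test_run.sh
</pre>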

If a user exceeds the limit, an email is sent to the user and the system administrators describing the terminated job, reiterating this policy, and asking that qsub or qlogin be used. If the user is a student researcher, the email is also copied to their sponsoring faculty member.

If a user repeatedly exceeds the limit, ARCS will temporarily disable the user's login.

If one CPU minute is not adequate for testing a process you intend to submit via qsub, please email ARCS@njit.edu to request an exception.