Latest revision as of 22:59, 14 September 2021

ForPHPCMajorExpansion 14Sep21-18:58

NEW.1 Newark College of Engineering needs "Chemical Engineering"

NEW.1 YWCC needs "Data Science"

1. Preamble

There will be a major, multi-million-dollar expansion to NJIT's high performance computing (HPC) infrastructure, scheduled to be online in Spring 2022.

This expansion will include:

  • A significant increase in the number of public-access CPU cores
  • A significant increase in the number of public-access GPU cores
  • High-speed interconnects (InfiniBand) for all new nodes
  • A parallel file system with a capacity of at least a petabyte
  • Cluster management software
  • Support for the SLURM scheduler/resource manager (a sample job-submission sketch follows this list)
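
To give a concrete sense of what job submission under SLURM looks like, the short sketch below is a minimal, hypothetical Python example, not an actual cluster configuration: it writes a batch script and hands it to sbatch. The submit_job helper, the resource values, and the my_app command are illustrative placeholders.

  # Minimal sketch: generate and submit a SLURM batch script via sbatch.
  # The helper name, resource values, and "my_app" are placeholders, not
  # settings for any actual NJIT cluster.
  import subprocess
  from pathlib import Path

  def submit_job(name, command, nodes=1, ntasks=8, mem_per_cpu_gb=4,
                 hours=24, gpus=0):
      """Write a batch script for `command` and submit it with sbatch."""
      directives = [
          "#!/bin/bash",
          f"#SBATCH --job-name={name}",
          f"#SBATCH --nodes={nodes}",
          f"#SBATCH --ntasks={ntasks}",
          f"#SBATCH --mem-per-cpu={mem_per_cpu_gb}G",
          f"#SBATCH --time={hours}:00:00",
      ]
      if gpus:
          directives.append(f"#SBATCH --gres=gpu:{gpus}")  # GPU jobs only
      directives.append(command)

      script = Path(f"{name}.sbatch")
      script.write_text("\n".join(directives) + "\n")

      # sbatch replies with a line like "Submitted batch job 12345"
      result = subprocess.run(["sbatch", str(script)],
                              capture_output=True, text=True, check=True)
      return result.stdout.strip()

  # Example: an 8-core CPU job with 4 GB per core and a 24-hour limit
  # print(submit_job("demo", "srun ./my_app input.dat"))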

The purpose of this assessment is to obtain information from researchers that will be used to determine the hardware specifications of this expansion.

By providing input, you will influence the final specifications for this expansion.

Please be as informative as possible in your written responses.

Please complete this assessment by Wednesday 22 September 2021 - assessments submitted after that date will not be included in the results.

Definitions

  • The current IST-managed high performance computing (HPC) clusters referred to in this assessment are:
    • Lochness
      • Public-access and privately-owned nodes, both CPU and GPU
    • Stheno
      • CPU and GPU nodes, owned by the Dept. of Mathematical Sciences
  • The expansion will include a parallel file system (PFS); currently NJIT's HPC infrastructure does not have a PFS.
    • A PFS provides cluster nodes shared access to data in parallel. It enables concurrent access to storage by multiple tasks of a parallel application, facilitating high performance through simultaneous, coordinated input/output operations between compute nodes and storage (a minimal parallel-I/O sketch follows these definitions).
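
To make the definition concrete, the short sketch below is a minimal, hypothetical Python example assuming mpi4py and an MPI launcher such as srun are available; the file name and block size are arbitrary. It shows the access pattern a PFS is built to serve: every task of a parallel application writes its own region of a single shared file at the same time.

  # Minimal sketch: each MPI rank writes its own non-overlapping block of one
  # shared file concurrently, i.e. coordinated parallel I/O.
  # Requires mpi4py; "shared.dat" and the block size are placeholders.
  import numpy as np
  from mpi4py import MPI

  comm = MPI.COMM_WORLD
  rank = comm.Get_rank()

  block = np.full(1024, rank, dtype=np.int32)   # this rank's data
  offset = rank * block.nbytes                  # distinct region per rank

  fh = MPI.File.Open(comm, "shared.dat",
                     MPI.MODE_CREATE | MPI.MODE_WRONLY)
  fh.Write_at_all(offset, block)                # collective, simultaneous write
  fh.Close()

Launched with an MPI launcher (for example, srun -n 16), all sixteen tasks write at once; a PFS stripes those writes across many storage servers rather than funneling them through a single file server.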

2. Demographics

Your NJIT position and computational research areas.

2.1 What is your NJIT position? {button}

  • Faculty
    • Tenured
    • Tenure-track
    • Non-tenure-track
  • Academic research staff {text box}
  • Postdoc

2.1.1 What is your department {dd menu}

Newark College of Engineering

  • Biomedical Engineering
  • Biological and Pharmaceutical Engineering
  • Civil and Environmental Engineering
  • Electrical and Computer Engineering
  • Engineering Technology
  • Mechanical and Industrial Engineering
  • Other {text box}

College of Science and Liberal Arts

  • Aerospace Studies (AFROTC)
  • Chemistry and Environmental Science
  • Humanities
  • Mathematical Sciences
  • Physics
  • Federated Department of Biological Sciences
  • Federated Department of History
  • Rutgers/NJIT Theatre Arts Program
  • Other {text box}

Ying Wu College of Computing

  • Computer Science
  • Informatics
  • Other {text box}

Martin Tuchman School of Management

College of Architecture and Design

  • NJ School of Architecture
  • School of Art and Design
  • Other {text box}

2.2 For approximately how long have you and/or your research group been using IST-managed high performance computing (HPC) resources? {dd menu}

  • Less than 6 months
  • 6+ to 12 months
  • 1+ to 2 years
  • 2+ to 5 years
  • 5+ years
  • Don't know

2.3 What is the general classification of computations for which you and/or your research group use IST-managed HPC {check all that apply}

  • Bioinformatics
  • Biophysics
  • Computational PDE
  • Computational biophysics
  • Computational chemistry
  • Computational fluid dynamics
  • Computational physics and chemistry
  • Condensed matter physics
  • Electromagnetism, wave propagation
  • Granular science
  • Image forensics
  • Materials research
  • Monte Carlo
  • Neural networks, genetic algorithms
  • Software verification, static analysis
  • Statistical analysis
  • Steganalysis and image forensics
  • Transportation data analysis
  • Other {text box}

2.4 Please provide a brief, specific description of the computational work for which you and/or your research group use IST-managed HPC {text box} (goes in 2.3)

3. Pre-Main

3.1 What applications, including your own code, do you run on the Lochness and/or Stheno clusters

3.1.1 Specify an application that you run on the clusters. If the application is your own ....

3.1.2 Importance of application

  • Minimally
  • Slightly
  • Moderately
  • Very
  • Extremely

3. Main

3.2 How often do you submit jobs to be run on the Lochness and/or Stheno clusters

  • Several times a day
  • Once daily
  • Every few days
  • Weekly
  • Monthly

3.3 Do you compile, or re-compile, your applications prior to processing

  • Yes
    • What compilers do you use
  • No

3.4 NEW.1 What are the maximum resources you typically request for your CPU applications

  • Number of cores
  • Memory, GB/core
  • Storage, GB

3.4.1 NEW.1 Given sufficient resources, what is the maximum amount you would request for your CPU applications

  • Number of cores
  • Memory, GB/core
  • Storage, GB

3.4.2 NEW.1 Do your application(s) make use of GPUs

  • Yes
  • No
  • NEW.1 Don't know

3.4.3 NEW.1 What are the maximum resources you typically request for your GPU applications

  • Number of cores
  • NEW.2 Number of GPUs
  • Memory, GB/core
  • Storage, GB

3.4.4 NEW.1 Given sufficient resources, what is the maximum amount you would request for your GPU applications

  • Number of cores
  • Memory, GB/core
  • Storage, GB

3.4.5 NEW.1 Do any of your applications require more than 1 TB of RAM per node

  • Yes
  • No
  • Don't know

3.4.6 NEW.1 What is the maximum number of jobs you simultaneously submit to a cluster

3.4.7 NEW.1 Given sufficient resources, what is the maximum number of jobs you would be likely to simultaneously submit to a cluster

3.5 Do you require multiple nodes to run your application(s)

  • Yes
  • No
  • NEW.1 Don't know

3.6 Do your applications mainly depend on

  • Number of cores available
  • Amount of RAM available
  • NEW.1 Disk I/O
  • Mixed - depends on the application
  • Don't know

3.6.1 NEW.1 What is your preference for CPU processor type

  • Intel
  • AMD
  • No preference
  • NEW.2 Unsure

3.7 Moved (now 3.4.2)

3.8 Do your application(s) make use of parallel processing

  • Yes
  • No
  • NEW.1 Don't know

3.9 NEW.1 Would your application(s) significantly benefit from using a parallel file system (PFS)

  • Yes
  • No
  • Don't know

3.10 NEW.1 Do your application(s) require a high-speed, low-latency compute node interconnect (e.g., InfiniBand) for minimally adequate performance

  • Yes
  • No
  • Don't know

3.11 What is the typical maximum time for your most compute-intensive runs to complete

  • Several minutes
  • Several hours
  • About a day
  • Several days
  • About a week
  • Several weeks
  • NEW.2 More than several weeks - please approximate
  • Is the maximum time to completion that you specified above acceptable
    • Yes
    • No
      • What is the maximum time to completion that would be acceptable

3.12 What is the maximum amount of data you need to store per run, or series of runs, for post-processing

  • Less than a few GBs
  • Between a few GBs and a TB
  • Between a TB and a PB
  • More than a PB
  • NEW.1 Don't know

3.13 What type(s) of data do you need to store - choose all that apply

  • Numerical
  • Text
  • Images
  • Video
  • Other - specify

3.14 How frequently does the data that you store need to be accessed

  • Several times a day
  • Once a day
  • Every few days
  • Once a week
  • Once every few weeks
  • Once a month
  • Every few months
  • About once a year
  • Other - specify

3.15 How long does this data need to be retained

  • A few days
  • A few weeks
  • A few months
  • A year
  • Several years
  • Other - specify

3.16 Other than yourself, how many individuals require access to this data

  • None
  • Between 1 and 5
  • Between 6 and 20
  • Other - specify

4. Please provide comments on how this major HPC expansion is likely to affect your research