ForPHPCMajorExpansion

NEW.1 Newark College of Engineering needs "Chemical Engineering"

NEW.1 YWCC needs "Data Science"

1. Preamble

There will be a major, multi-million-dollar expansion of NJIT's high performance computing (HPC) infrastructure, scheduled to be online in Spring 2022.

This expansion will include:

  • A significant increase in the number of public-access CPU cores
  • A significant increase in the number of public-access GPU cores
  • High-speed interconnects (InfiniBand) for all new nodes
  • A parallel file system with a capacity of at least a petabyte
  • Cluster management software
  • Support for the SLURM scheduler/resource manager (an illustrative job submission follows this list)
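
For orientation: the quantities asked about in Section 3 (cores, memory per core, GPUs, run time) are what a SLURM job script requests. The following is a minimal, hypothetical sketch in Python, assuming a login node where the sbatch command is available; every flag value is a placeholder, not a recommendation.

  import subprocess
  import textwrap

  # A hypothetical SLURM job script; all values are placeholders.
  # Directives map onto the quantities this assessment asks about:
  #   --ntasks       CPU cores,   --mem-per-cpu  memory per core,
  #   --gres=gpu:N   GPUs,        --time         wall-clock limit.
  job_script = textwrap.dedent("""\
      #!/bin/bash
      #SBATCH --job-name=example
      #SBATCH --ntasks=64
      #SBATCH --mem-per-cpu=4G
      #SBATCH --gres=gpu:2
      #SBATCH --time=1-00:00:00
      srun ./my_application
      """)

  # sbatch accepts a job script on standard input; this assumes SLURM
  # is installed on the node where the script runs.
  result = subprocess.run(["sbatch"], input=job_script, text=True,
                          capture_output=True, check=True)
  print(result.stdout.strip())   # e.g., "Submitted batch job 12345"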

The purpose of this assessment is to obtain information from researchers that will be used to determine the hardware specifications of this expansion.

By providing input, you will influence the final specifications for this expansion.

Please be as informative as possible in your written responses.

Please complete this assessment by Wednesday 22 September 2021 - assessments submitted after that date will not be included in the results.

Definitions

  • The current IST-managed high performance computing (HPC) clusters referred to in this assessment are:
    • Lochness
      • Public-access and privately-owned nodes, both CPU and GPU
    • Stheno
      • CPU and GPU nodes, owned by the Dept. of Mathematical Sciences
  • The expansion will include a parallel file system (PFS); currently NJIT's HPC infrastructure does not have a PFS.
    • A PFS gives cluster nodes shared, parallel access to data: multiple tasks of a parallel application can access storage concurrently, which yields high performance through simultaneous, coordinated input/output operations between compute nodes and storage (a minimal illustration follows this list).
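
To make the PFS definition concrete, the sketch below shows several MPI ranks writing disjoint blocks of one shared file at the same time, which is the access pattern a PFS is built to serve. It is a minimal illustration assuming the mpi4py and NumPy packages; the file name and block size are arbitrary.

  from mpi4py import MPI
  import numpy as np

  comm = MPI.COMM_WORLD
  rank = comm.Get_rank()

  # Each rank produces its own block of data (1 million float64 values).
  block = np.full(1_000_000, rank, dtype=np.float64)

  # All ranks open the same file; on a PFS the writes below can proceed
  # in parallel across storage servers instead of serializing on one.
  fh = MPI.File.Open(comm, "shared_output.dat",
                     MPI.MODE_WRONLY | MPI.MODE_CREATE)

  # Collective write: rank k writes at byte offset k * block size.
  fh.Write_at_all(rank * block.nbytes, block)
  fh.Close()

Launched with, for example, mpiexec -n 64 python pfs_demo.py (the script name is hypothetical), all 64 ranks write their blocks concurrently; on single-server storage the same pattern bottlenecks on one file server.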

2. Demographics

Your NJIT position and computational research areas.

2.1 What is your NJIT position? {button}

  • Faculty
    • Tenured
    • Tenure-track
    • Non-tenure-track
  • Academic research staff {text box}
  • Postdoc

2.1.1 What is your department? {dd menu}

Newark College of Engineering

  • Biomedical Engineering
  • Biological and Pharmaceutical Engineering
  • NEW.1 Chemical Engineering
  • Civil and Environmental Engineering
  • Electrical and Computer Engineering
  • Engineering Technology
  • Mechanical and Industrial Engineering
  • Other {text box}

College of Science and Liberal Arts

  • Aerospace Studies (AFROTC)
  • Chemistry and Environmental Science
  • Humanities
  • Mathematical Sciences
  • Physics
  • Federated Department of Biological Sciences
  • Federated Department of History
  • Rutgers/NJIT Theatre Arts Program
  • Other {text box}

Ying Wu College of Computing

  • Computer Science
  • NEW.1 Data Science
  • Informatics
  • Other {text box}

Martin Tuchman School of Management

College of Architecture and Design

  • NJ School of Architecture
  • School of Art and Design
  • Other {text box}

2.2 For approximately how long have you and/or your research group been using IST-managed high performance computing (HPC) resources? {dd menu}

  • Less than 6 months
  • 6+ to 12 months
  • 1+ to 2 years
  • 2+ to 5 years
  • 5+ years
  • Don't know

2.3 What is the general classification of computations for which you and/or your research group use IST-managed HPC? {check all that apply}

  • Bioinformatics
  • Biophysics
  • Computational PDE
  • Computational biophysics
  • Computational chemistry
  • Computational fluid dynamics
  • Computational physics and chemistry
  • Condensed matter physics
  • Electromagnetism, wave propagation
  • Granular science
  • Image forensics
  • Materials research
  • Monte Carlo
  • Neural networks, genetic algorithms
  • Software verification, static analysis
  • Statistical analysis
  • Steganalysis and image forensics
  • Transportation data analysis
  • Other {text box}

2.4 Please provide a brief, specific description of the computational work for which you and/or your research group use IST-managed HPC {text box} (goes in 2.3)

3. Main

3.1 What applications, including your own code, do you run on the Lochness and/or Stheno clusters

3.1.1 Specify an application that you run on the clusters. If the application is your own ....

3.1.2 How important is this application to your research

  • Minimally
  • Slightly
  • Moderately
  • Very
  • Extremely

3.2 How often do you submit jobs to be run on the Lochness and/or Stheno clusters

  • Several times a day
  • Once daily
  • Every few days
  • Weekly
  • Monthly

3.3 Do you compile, or re-compile, your applications before running them

  • Yes
    • What compilers do you use
  • No

3.4 NEW.1 What are the maximum resources you typically request for your CPU applications

  • Number of cores
  • Memory, GB/core
  • Storage, GB

3.4.1 NEW.1 If the resources were available, what are the maximum resources you would request for your CPU applications

  • Number of cores
  • Memory, GB/core
  • Storage, GB

3.4.2 Do your application(s) make use of GPUs

  • Yes
    • What is the maximum number of GPUs your application(s) use in a single job
  • No
  • NEW.1 Don't know

3.4.3 NEW.1 What are the maximum resources you typically request for your GPU applications

  • Number of cores
  • Memory, GB/core
  • Storage, GB

3.4.4 NEW.1 If the resources were available, what are the maximum resources you would request for your GPU applications

  • Number of cores
  • Memory, GB/core
  • Storage, GB

3.4.5 NEW.1 Do your applications require a very large amount of RAM (more than 1 TB/node)

3.4.6 NEW.1 What is the maximum number of jobs you typically submit

3.4.7 NEW.1 If the resources were available, what is the maximum number of jobs you would typically submit

3.5 Do you require multiple nodes to run your application(s)

  • Yes
  • No
  • NEW.1 Don't know

3.6 Do your applications mainly depend on

  • Number of cores available
  • Amount of RAM available
  • NEW.1 Disk I/O
  • Mixed - depends on the application
  • Don't know

3.6.1 NEW.1 What is your preference for CPU processor type

  • Intel
  • AMD
  • No preference
  • Not knowledgeable enough to decide


3.8 Do your application(s) make use of parallel processing

  • Yes
  • No
  • NEW.1 Don't know

3.9 NEW.1 Would your application(s) significantly benefit from using a parallel file system (PFS)

  • Yes
  • No
  • Don't know

3.10 NEW.1 Do your application(s) require a high-speed, low-latency compute node interconnect (e.g., InfiniBand) for minimally adequate performance

  • Yes
  • No
  • Don't know

3.11 What is the typical maximum time for your most compute-intensive runs to complete

  • Several minutes
  • Several hours
  • About a day
  • Several days
  • About a week
  • Several weeks
  • Other - specify

3.11.1 Is the maximum time to completion that you specified above acceptable

  • Yes
  • No
    • What is the maximum time to completion that would be acceptable

3.12 What is the maximum amount of data you need to store per run, or series of runs, for post-processing

  • Less than a few GBs
  • Between a few GBs and a TB
  • Between a TB and a PB
  • More than a PB
  • NEW.1 Don't know

3.13 What type(s) of data do you need to store - choose all that apply

  • Numerical
  • Text
  • Images
  • Video
  • Other - specify

3.14 How frequently does the data that you store need to be accessed

  • Several times a day
  • Once a day
  • Every few days
  • Once a week
  • Once every few weeks
  • Once a month
  • Every few months
  • About once a year
  • Other - specify

3.15 How long does this data need to be retained

  • A few days
  • A few weeks
  • A few months
  • A year
  • Several years
  • Other - specify

3.16 Other than yourself, how many individuals require access to this data

  • None
  • Between 1 and 5
  • Between 6 and 20
  • Other - specify

4. Please provide comments on how this major HPC expansion is likely to affect your research