Difference between pages "ForPHPCMajorExpansion" and "ForPSurveyResearchIT"

From NJIT-ARCS HPC Wiki
(Difference between pages)
(Importing text file)
 
(Importing text file)
 
Line 1: Line 1:
 
<div class="noautonum">__TOC__</div>
 
<div class="noautonum">__TOC__</div>
  
==ForPHPCMajorExpansion 12Sep21-19:50==
+
== ForP.survey.research.it ==
  
 
== 1. Preamble ==
 
== 1. Preamble ==
There will be a major, multi-million dollar expansion to NJIT's high performance
 
computing (HPC) infrastructure, scheduled to be on-line in Spring 2022.
 
 
This expansion will include:
 
<ul>
 
<li>A significant increase in the number of public-access CPU cores</li>
 
<li>A significant increase in the number of public-access GPU cores</li>
 
<li>High-speed interconnects (InfiniBand) for all new nodes</li>
 
<li>A parallel file system with a capacity of at least a petabyte</li>
 
<li>Cluster management software</li>
 
<li>Support for the SLURM scheduler/resource manager</li>
 
</ul>
 
 
The purpose of this assessment is to obtain information from researchers that will be used to
 
determine the hardware specifications of this expansion.
 
 
By providing input, you will influence the final specifications for this expansion.
 
 
Please be as informative as possible in your written responses.
 
 
Please complete this assessment by <strong>Wednesday 22 September 2021</strong> - assessments
 
submitted after that date will <strong>not</strong> be included in the results.
 
 
=== Defs ===
 
<strong>Definitions</strong>
 
<ul>
 
<li>The current IST-managed high performance computing (HPC) clusters referred to in this assessment are</li>
 
<ul>
 
<li><em>Lochness</em></li>
 
<ul>
 
<li>Public-access and privately-owned nodes, both CPU and GPU</li>
 
</ul>
 
 
<li><em>Stheno</em></li>
 
<ul>
 
<li>CPU and GPU nodes, owned by the Dept. of Mathematical Sciences</li>
 
</ul>
 
</ul>
 
 
<li>The expansion will include a parallel file system (PFS); current HPC resources do not include a PFS.</li>
 
<ul>
 
<li>A PFS provides cluster nodes shared access to data in parallel.
 
It enables concurrent access to storage by multiple tasks of a parallel application, to facilitate
 
high performance through simultaneous, coordinated input/output operations
 
between compute nodes and storage.</li>
 
</ul>
 
</ul>
 
  
 
== 2. Demographics ==
 
== 2. Demographics ==
Your NJIT position and computational research areas.
 
 
2.1 What is your NJIT position? {button}
 
<ul>
 
<li>Faculty</li>
 
<ul>
 
<li>Tenured</li>
 
<li>Tenure-track</li>
 
<li>Non-tenure-track</li>
 
</ul>
 
<li>Academic research staff {text box}</li>
 
<li>Postdoc</li>
 
</ul>
 
 
2.1.1 What is your department {dd menu}
 
 
Newark College of Engineering
 
<ul>
 
<li>Biomedical Engineering</li>
 
<li>Biological and Pharmaceutical Engineering</li>
 
<li>Civil and Environmental Engineering</li>
 
<li>Electrical and Computer Engineering</li>
 
<li>Engineering Technology</li>
 
<li>Mechanical and Industrial Engineering </li>
 
<li>Other {text box}</li>
 
</ul>
 
 
College of Science and Liberal Arts
 
<ul>
 
<li>Aerospace Studies (AFROTC)</li>
 
<li>Chemistry and Environmental Science</li>
 
<li>Humanities</li>
 
<li>Mathematical Sciences</li>
 
<li>Physics</li>
 
<li>Federated Department of Biological Sciences</li>
 
<li>Federated Department of History</li>
 
<li>Rutgers/NJIT Theatre Arts Program</li>
 
<li>Other {text box}</li>
 
</ul>
 
 
Ying Wu College of Computing
 
<ul>
 
<li>Computer Science</li>
 
<li>Informatics</li>
 
<li>Other {text box}</li>
 
</ul>
 
 
Martin Tuchman School of Management
 
 
College of Architecture and Design
 
<ul>
 
<li>NJ School of Architecture</li>
 
<li>School of Art and Design</li>
 
<li>Other {text box}</li>
 
</ul>
 
 
2.2 For approximately how long have you and/or your research group been using IST-managed high performance computing (HPC) resources? {dd menu}
 
<ul>
 
<li>Less than 6 months</li>
 
<li>6+ to 12 months</li>
 
<li>1+ to 2 years</li>
 
<li>2+ to 5 years</li>
 
<li>5+ years</li>
 
<li>Don't know</li>
 
</ul>
 
  
2.3 What is the <strong>general</strong> classification of computations for which you and/or your research
+
== 3. Research area ==
group use IST-managed HPC {check all that apply}
+
2.1 What is the <strong>general</strong> classification of your research {select all that apply}
 
<ul>
 
<ul>
<li>Bioinfomatics </li>
 
 
<li>Bioinformatics </li>
 
<li>Bioinformatics </li>
 
<li>Biophysics </li>
 
<li>Biophysics </li>
Line 131: Line 18:
 
<li>Computational physics and chemistry </li>
 
<li>Computational physics and chemistry </li>
 
<li>Condensed matter physics </li>
 
<li>Condensed matter physics </li>
 +
<li>Data Science</li>
 
<li>Electromagnetism, Wave propagation </li>
 
<li>Electromagnetism, Wave propagation </li>
 
<li>Granular science </li>
 
<li>Granular science </li>
Line 143: Line 31:
 
<li>Other {text box} </li>
 
<li>Other {text box} </li>
 
</ul>
 
</ul>
 +
2.2 Please provide a <strong>brief, specific</strong> description(s) of your research {text box}
  
2.4 Please provide a <strong>brief, specific</strong> description(s) of the computational work for which you and/or your research group use IST-managed HPC {text box} (goes in 2.3)
+
== IT Needs ==
 
+
=== 3. Computational ===
== 3. Main ==
+
3.1 Does your research require significant computation : Y/N
 
+
3.1 What applications, including your own code, do you run on the Lochness and/or Stheno clusters
+
 
+
3.1.1 Specify an application that you run on the clusters. If the application is your own ....
+
 
+
3.1.2 Importance of application
+
 
<ul>
 
<ul>
<li>Minimally</li>
+
<li>If Y, continue with this section</li>
<li>Slightly</li>
+
<li>If N, go to Comments</li>
<li>Moderately</li>
+
<li>Very</li>
+
<li>Extremely</li>
+
 
</ul>
 
</ul>
 
+
3.2 Which of the following types of computational hardware do you use, and how important is each
3.2 How often do you submit jobs to be run on the Lochness and/or Stheno clusters
+
 
<ul>
 
<ul>
<li>Several times a day</li>
+
<li>3.2.1 <em>High performance computing (HPC)</em></li>
<li>Once daily</li>
+
<ul>
<li>Every few days</li>
+
<li>My research currently uses HPC : Y/N</li>
<li>Weekly</li>
+
<li>If Y :
<li>Monthly</li>
+
<ul>
</ul>
+
<li>The importance of HPC to my research is : High, medium, low</li>
 +
</ul>
 +
</li>
 +
</ul>
 +
<li>3.2.2 <em>Lab workstations</em></li>
 +
<ul>
 +
<li>My research currently uses lab workstations : Y/N</li>
 +
<li>If Y :
 +
<ul>
 +
<li>Type of lab workstations used</li>
 +
<ul>
 +
<li>CPU-only</li>
 +
<li>GPU</li>
 +
</ul>
 +
</li>
 +
</ul>
 +
<ul>
 +
<li>The importance of lab workstations to my research is : High, medium, low</li>
 +
<li>My lab workstations are managed by : </li>
 +
<ul>
 +
<li>Academic and Research Computing Systems (ARCS)</li>
 +
<li>IST other than ARCS</li>
 +
<li>Internally by a research group member</li>
 +
<li>Other</li>
 +
<li>Not managed</li>
 +
<ul>
 +
<li>I would like to have my lab workstation(s) managed by IST : Y/N</li>
 +
</ul>
 +
</ul>
 +
<li>What operating system does your group <strong>predominantly</strong> use for your workstations :
 +
  Linux (CentOS, Red Hat, Ubuntu, etc.), MacOS, Windows</li>
 +
<li>If your group <strong>could exclusively</strong> use one operating system for your workstations, which would it be :
 +
  Linux (CentOS, Red Hat, Ubuntu, etc.), MacOS, Windows, One OS is not an option</li>
 +
</ul>
 +
</ul>
 +
<li><em>3.2.3 Virtual Desktop Infrastructure (VDI)</em></li>
 +
<ul>
 +
<li>Would you consider an IST-managed VDI environment (with and
 +
without GPU) utilizing bring-your-own-device (BYOD), instead of Linux/MacOS/Windows workstations : Y/N/Maybe</li>  
 +
 +
<li>If Y or Maybe :</li>
 +
<ul>
 +
<li>If you would use, or might use, CPU-only VDI</li>
 +
<ul>
 +
<li>What would be the expected daily use of a CPU-only VDI in hours per day : &lt;2, 2-6, 7-12, &gt;12</li>
 +
<li>What is the maximum annual cost you are willing to pay for VDIs (CPU only) : &lt;$100, $100-500,
 +
$501-1000, &gt;$1000</li>
 +
</ul>
  
3.3 Do you compile, or re-compile, your applications prior to processing
+
<li>If you would use, or might use, GPU VDI</li>
<ul>
+
<ul>
<li>Yes</li>
+
<li>What would be the expected daily use of a GPU VDI in hours per day : &lt;2, 2-6, 7-12, &gt;12</li>
 +
<li>What is the maximum annual cost you are willing to pay for GPU VDIs : &lt;$500, $500-1000,
 +
$1001-2000, &gt;$2000</li>
 +
<li>In your estimation, what percentage of your VDI usage will require GPUs : 0%, 1-25%, 26-50%, 51-75%, &gt;75%</li>
 +
</ul>
 +
</ul>
 +
</ul>
 +
<li><em>3.2.4 Laptops</em></li>
 
<ul>
 
<ul>
<li>What compilers do you use</li>
+
<li>My research currently uses laptops : Y/N</li>
 +
<li>If Y :
 +
<ul>
 +
<li>The importance of laptops to my research is : High, medium, low</li>
 +
</ul>
 +
</li>
 
</ul>
 
</ul>
<li>No</li>
 
</ul>
 
  
3.4 What is the maximum amount of resources you typically request to run your applications
 
<ul>
 
<li>Number of cores</li>
 
<li>Memory, GB</li>
 
<li>Storage, GB</li>
 
 
</ul>
 
</ul>
  
3.5 Do you require multiple nodes to run your application(s)
+
3.4 Comments on computational needs (text box)
<ul>
+
<li>Yes</li>
+
<li>No</li>
+
</ul>
+
  
3.6 Do your applications mainly depend on
+
=== 4. Storage ===
<ul>
+
<ul>
<li>Number of cores available</li>
+
<li>How much readily available storage capacity do you need, now and over the next 5 years :  &lt;1TB, 1-5TB, 6-50TB, 51-100TB, &gt;100TB</li>  
<li>Amount of RAM available</li>
+
<li>How much archival storage capacity do you need, now and over the next 5 years : &lt;1TB, 1-5TB, 6-50TB, 51-100TB, &gt;100TB</li>
<li>Mixed - depends on the application</li>
+
<li>Where do you currently store your data : Hard drive on local workstations,  External drives, USB flash drives, External disk,
<li>Don't know</li>
+
AFS, Google Drive, Dropbox, Other (select all that apply)</li>
</ul>
+
<li>Are your data backed up : Y/N/Don't know</li>
 
+
<li>If Y :
3.7 Do your application(s) make use of GPUs
+
<ul>
+
<li>Yes</li>
+
 
<ul>
 
<ul>
<li>What is the maximum number of GPUs your application(s) use in a single job</li>
+
<li>What is the backup mechanism : NJIT enterprise, local to lab, commercial cloud, other</li>
</ul>
+
</ul>
<li>No</li>
+
</li>
</ul>
+
<li>What is the maximum annual cost you are willing to pay for storage (no backups) per TB : &lt;$10, $10-25, $26-50, &gt;$50</li>
3.8 Do your application(s) make use of parallel processing
+
<li>What is the maximum annual cost you are willing to pay for backed-up storage (or backup services) per TB : &lt;$25, $25-50, $51-75, $76-100, &gt;$100</li>
<ul>
+
</ul>
<li>Yes</li>
+
Comments on storage needs (text box)
<li>No</li>
+
</ul>
+
  
3.9 Do your application(s) require a parallel file system (PFS) for optimal processing
+
=== 5. Software ===
 +
List the top three software applications you use in your research
 
<ul>
 
<ul>
<li>Yes</li>
+
<li>Name [ Open source | Commercial | Own code ]</li>
<li>No</li>
+
<li>Don't know</li>
+
 
</ul>
 
</ul>
 +
Comments on software (text box)
  
3.10 Do your application(s) require a high-speed, low-latency compute node interconnect (e.g., InfiniBand)
+
=== 6. Technical Assistance ===
<ul>
+
To facilitate your research, assistance in certain areas may be beneficial -
<li>Yes</li>
+
e.g., consultation or tutorials.
<li>No</li>
+
<li>Don't know</li>
+
</ul>
+
  
3.11 What is the typical maximum time for your most compute-intensive runs to complete
+
Select the areas in which you think such assistance would be of use
 
<ul>
 
<ul>
<li>Several minutes</li>
+
<li>Programming</li>
<li>Several hours</li>
+
<li>About a day</li>
+
<li>Several days</li>
+
<li>About a week</li>
+
<li>Several weeks</li>
+
<li>Other - specify</li>
+
</ul>
+
<ul>
+
<li>Is the maximum time to completion that you specified above acceptable</li>
+
 
<ul>
 
<ul>
<li>Yes</li>
+
<li>C/C++, Fortran, Python, Shell, PHP</li>
<li>No</li>
+
<li>Parallelization of code</li>
<ul>
+
<li>Optimization of code</li>
<li>What is the maximum time to completion that would be acceptable</li>
+
<li>Other</li>
</ul>
+
 
</ul>
 
</ul>
 +
<li>Building software from source code</li>
 +
<li>Backing up data</li>
 +
<li>Sharing data, code, documentation</li>
 +
<li>Security</li>
 +
<li>Fundamentals of working in a Linux environment</li>
 +
<li>Other</li>
 +
<li>Don't know</li>
 
</ul>
 
</ul>
  
3.12 What is the maximum amount of data you need to store per run, or series of runs, for post-processing
+
Comments on technical assistance needs (text box)
<ul>
+
<li>Less than a few GBs</li>
+
<li>Between a few GBs and a TB</li>
+
<li>Between a TB and a PB</li>
+
<li>More than a PB</li>
+
</ul>
+
  
3.13 What type(s) of data do you need to store - choose all that apply
+
=== 7. Additional Comments ===
<ul>
+
Please provide any additional comments relevant to this survey.
<li>Numerical</li>
+
<li>Text</li>
+
<li>Images</li>
+
<li>Video</li>
+
<li>Other - specify</li>
+
</ul>
+
  
3.14 How frequently does the data that you store need to be accessed
+
=== 8. Follow-up ===
<ul>
+
<ul>
<li>Several times a day</li>
+
<li>Can IST contact you for a follow-up to this survey : Y/N</li>
<li>Once a day</li>
+
<li>If Y :
<li>Every few days</li>
+
<ul>
<li>Once a week</li>
+
<li>Do you explicitly want to be contacted : Y/N</li>
<li>Once every few weeks</li>
+
</ul>
<li>Once a month</li>
+
</li>
<li>Every few months</li>
+
</ul>
<li>About once a year</li>
+
<li>Other - specify</li>
+
</ul>
+
 
+
3.15 How long does this data need to be retained
+
<ul>
+
<li>A few days</li>
+
<li>A few weeks</li>
+
<li>A few months</li>
+
<li>A year</li>
+
<li>Several years</li>
+
<li>Other - specify</li>
+
</ul>
+
 
+
3.16 Other than yourself, how many individuals require access to this data
+
 
+
<ul>
+
<li>None</li>
+
<li>Between 1 and 5</li>
+
<li>Between 6 and 20</li>
+
<li>Other - specify</li>
+
</ul>
+

Revision as of 16:14, 10 January 2023

ForP.survey.research.it

1. Preamble

2. Demographics

3. Research area

2.1 What is the general classification of your research {select all that apply}

  • Bioinformatics
  • Biophysics
  • Computational PDE
  • Computational biophysics
  • Computational chemistry
  • Computational fluid dynamics
  • Computational physics and chemistry
  • Condensed matter physics
  • Data Science
  • Electromagnetism, Wave propagation
  • Granular science
  • Image forensics
  • Materials research
  • Monte Carlo
  • Neural networks, genetic algorithms
  • Software verification, static analysis
  • Statistical analysis
  • Steganalysis and image forensics
  • Transportation data analysis
  • Other {text box}

2.2 Please provide a brief, specific description(s) of your research {text box}

IT Needs

3. Computational

3.1 Does your research require significant computation : Y/N

  • If Y, continue with this section
  • If N, go to Comments

3.2 Which of the following types of computational hardware do you use, and how important is each

  • 3.2.1 High performance computing (HPC)
    • My research currently uses HPC : Y/N
    • If Y :
      • The importance of HPC to my research is : High, medium, low
  • 3.2.2 Lab workstations
    • My research currently uses lab workstations : Y/N
    • If Y :
      • Type of lab workstations used
        • CPU-only
        • GPU
      • The importance of lab workstations to my research is : High, medium, low
      • My lab workstations are managed by :
        • Academic and Research Computing Systems (ARCS)
        • IST other than ARCS
        • Internally by a research group member
        • Other
        • Not managed
          • I would like to have my lab workstation(s) managed by IST : Y/N
      • What operating system does your group predominantly use for your workstations : Linux (CentOS, Red Hat, Ubuntu, etc.), MacOS, Windows
      • If your group could exclusively use one operating system for your workstations, which would it be : Linux (CentOS, Red Hat, Ubuntu, etc.), MacOS, Windows, One OS is not an option
  • 3.2.3 Virtual Desktop Infrastructure (VDI)
    • Would you consider an IST-managed VDI environment (with and without GPU) utilizing bring-your-own-device (BYOD), instead of Linux/MacOS/Windows workstations : Y/N/Maybe
    • If Y or Maybe :
      • If you would use, or might use, CPU-only VDI
        • What would be the expected daily use of a CPU-only VDI in hours per day : <2, 2-6, 7-12, >12
        • What is the maximum annual cost you are willing to pay for VDIs (CPU only) : <$100, $100-500, $501-1000, >$1000
      • If you would use, or might use, GPU VDI
        • What would be the expected daily use of a GPU VDI in hours per day : <2, 2-6, 7-12, >12
        • What is the maximum annual cost you are willing to pay for GPU VDIs : <$500, $500-1000, $1001-2000, >$2000
        • In your estimation, what percentage of your VDI usage will require GPUs : 0%, 1-25%, 26-50%, 51-75%, >75%
  • 3.2.4 Laptops
    • My research currently uses laptops : Y/N
    • If Y :
      • The importance of laptops to my research is : High, medium, low

3.4 Comments on computational needs (text box)

4. Storage

  • How much readily available storage capacity do you need, now and over the next 5 years : <1TB, 1-5TB, 6-50TB, 51-100TB, >100TB
  • How much archival storage capacity do you need, now and over the next 5 years : <1TB, 1-5TB, 6-50TB, 51-100TB, >100TB
  • Where do you currently store your data : Hard drive on local workstations, External drives, USB flash drives, External disk, AFS, Google Drive, Dropbox, Other (select all that apply)
  • Are your data backed up : Y/N/Don't know
  • If Y :
    • What is the backup mechanism : NJIT enterprise, local to lab, commercial cloud, other
  • What is the maximum annual cost you are willing to pay for storage (no backups) per TB : <$10, $10-25, $26-50, >$50
  • What is the maximum annual cost you are willing to pay for backed-up storage (or backup services) per TB : <$25, $25-50, $51-75, $76-100, >$100

Comments on storage needs (text box)

5. Software

List the top three software applications you use in your research

  • Name [ Open source | Commercial | Own code ]

Comments on software (text box)

6. Technical Assistance

To facilitate your research, assistance in certain areas may be beneficial - e.g., consultation or tutorials.

Select the areas in which you think such assistance would be of use

  • Programming
    • C/C++, Fortran, Python, Shell, PHP
    • Parallelization of code
    • Optimization of code
    • Other
  • Building software from source code
  • Backing up data
  • Sharing data, code, documentation
  • Security
  • Fundamentals of working in a Linux environment
  • Other
  • Don't know

Comments on technical assistance needs (text box)

7. Additional Comments

Please provide any additional comments relevant to this survey.

8. Follow-up

  • Can IST contact you for a follow-up to this survey : Y/N
  • If Y :
    • Do you explicitly want to be contacted : Y/N