This site is deprecated and will be decommissioned shortly. For current information regarding HPC visit our new site: hpc.njit.edu

From NJIT-ARCS HPC Wiki

Public-Access HPC Cluster Nodes Retirement

Public-Access High Performance Computing (HPC) History

NJIT has historically supported public-access HPC, i.e., HPC computational resources made freely available for research and instruction.

  • 1996 : first public-access HPC installation
  • 1998 : second public-access HPC installation
  • 2001 : third public-access HPC installation
  • 2004 : first public-access HPC cluster installation
  • 2013 : addition of over 300 nodes to the cluster, via a donation. The donated nodes were five years old at the time of the donation.

Public-access nodes end-of-life (EOL)

  • The legacy operating system used by the public-access nodes reaches vendor end-of-life on November 30, 2020.
  • Successor OS versions are incompatible with the public-access nodes' network interconnects. These nodes cannot be upgraded to remedy this incompatibility.
  • The public-access nodes do not support many of the critical software applications needed by researchers. This problem is severe, and will only get worse.
  • The public-access nodes cannot be integrated into OpenHPC, the new cluster framework already being deployed.
  • For the reasons above, all public-access nodes will be taken out of service by November 30, 2020.

Impact on Research

The public-access nodes support research from various departments in YWCC, NCE, and CSLA. Most of the researchers using the public-access nodes do not have funding to support the HPC needed for their research. They use the public-access nodes for research that they intend to leverage to secure funding. Once the nodes are retired, that research will effectively stop.

Ph.D. and Master's candidates who depend on the public-access nodes will not be able to complete their projects unless their research advisors can provide computational resources elsewhere.

Because research funding depends on preliminary research that demonstrates feasibility, these researchers are expected to find external funding more difficult to obtain without public-access resources.

See also: Researchers Open Letter

Non-public-access nodes also affected

Several researchers have exclusive use of groups of the donated nodes that will be retired. Those researchers should be provided with equivalent resources when the donated nodes are retired.

Impact on Instruction

Several courses use the public-access nodes for teaching and assignments. Dr. Usman Roshan, CS, performed an analysis comparing cloud costs with on-premises costs. In the cloud, a single one-semester data science course for 77 students would cost about $100,000. Using the cloud for instructional purposes would become very expensive very quickly.
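As a rough check on the scale of that figure, the per-student cost follows directly from the two numbers quoted above. This is only a sketch: the variable and function names are illustrative, and the $100,000 figure is assumed to be the cloud cost of the course, which is how the paragraph reads.

```python
# Figures quoted in the text above; all names here are illustrative only.
COURSE_COST_USD = 100_000  # approximate cloud cost of one course for one semester (assumed reading)
STUDENTS = 77              # enrollment quoted for that course


def cost_per_student(total_cost_usd: float, enrolled: int) -> float:
    """Return the average cost per enrolled student."""
    if enrolled <= 0:
        raise ValueError("enrolled must be positive")
    return total_cost_usd / enrolled


per_student = cost_per_student(COURSE_COST_USD, STUDENTS)
print(f"Approximate cost per student per semester: ${per_student:,.0f}")
```

Under these assumptions the result is roughly $1,300 per student for a single course in a single semester.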

Plan for Continued Public-Access Node Provisioning

To continue providing public-access HPC resources, the following two-phase plan is proposed.

Phase 1 : FY2021

  • Phase 1 provides a basic computational, node interconnect, and storage infrastructure that is adequate for many researchers to largely continue their work at its present level, and that can be built upon as additional funding becomes available.
  • Phase 1 should be in place by November 30, 2020, and sooner if possible. The cost of Phase 1 is approximately $1.5M. See: Phase 1 cost estimate
  • It is assumed that Phase 1 equipment will be housed in the GITC 5302 data center, pending the possible construction of an HPC data center elsewhere on campus.
  • Providing such resources via commercial cloud providers would be three to six times as expensive, and access would end when funds are exhausted.
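The cloud comparison in the last item can be put in dollar terms with a short calculation. This sketch assumes that the "three to six times" multiplier applies to the approximate $1.5M Phase 1 cost quoted above; the names are illustrative and the result is an order-of-magnitude range, not a quote.

```python
# Figures quoted in the Phase 1 list above; all names here are illustrative only.
PHASE1_COST_USD = 1_500_000  # approximate on-premises Phase 1 cost
CLOUD_MULTIPLIER_LOW = 3     # "three to six times as expensive"
CLOUD_MULTIPLIER_HIGH = 6


def cloud_cost_range(on_prem_cost_usd: int, low: int, high: int) -> tuple[int, int]:
    """Return the (low, high) cloud cost implied by a multiplier range."""
    if low > high:
        raise ValueError("low multiplier must not exceed high multiplier")
    return on_prem_cost_usd * low, on_prem_cost_usd * high


cloud_low, cloud_high = cloud_cost_range(
    PHASE1_COST_USD, CLOUD_MULTIPLIER_LOW, CLOUD_MULTIPLIER_HIGH
)
print(f"Implied cloud cost: ${cloud_low:,} to ${cloud_high:,}")
```

Under that assumption, equivalent cloud resources would cost roughly $4.5M to $9M, and unlike purchased equipment they would leave nothing behind once the funds ran out.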

Phase 2 : FY2022

While Phase 1 will provide resources for many researchers to continue their research at its current level, it will not provide adequate resources for researchers to expand the scope of their work, or to attract new researchers.

The purpose of Phase 2 is to expand on the foundation implemented in Phase 1 and provide resources commensurate with the goals of an R1 university.

The funding required for Phase 2 is expected to be about the same as that for Phase 1.