

HPCBaselineAWS

From NJIT-ARCS HPC Wiki
Revision as of 16:33, 5 October 2020 by Hpcwiki1 dept.admin (Talk | contribs) (Importing text file)


Computational Cost

The purpose of this exercise is to provide approximate pricing for a baseline HPC resource hosted entirely in AWS.

AWS pricing calculator

(Note that the above link includes pricing for support. Support is not included in the costs reported on this page or in the spreadsheet.)

AWS offers several pricing models for instances, including on-demand, reserved, and spot.

For this exercise, reserved-instance pricing was chosen as the most appropriate. An HPC resource built on reserved instances most closely resembles an on-premise, always-available resource in both functionality and structure.

Costs for on-demand and spot instances can be highly variable and unpredictable.

On-demand pricing can be more cost-effective than reserved pricing.

However, it is extremely difficult to predict the level of demand, even when accurate historical HPC usage is available:

  • Existing researchers' problem domains, size of models, and scale of analyses change
  • New researchers bring unknown needs for computational resources

On-demand instances are most cost-effective when historical HPC usage can be used to reliably predict future usage. This is not currently the case at NJIT, where the base level of HPC activity, augmented regularly by new researchers, is still changing significantly.
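
The trade-off described above can be sketched as a break-even calculation. This is a minimal illustration, not the method used for the estimate on this page: the function names are hypothetical, and the hourly rates in the usage note are placeholder values, not AWS prices. It assumes on-demand capacity is billed only for the hours used, while reserved capacity is billed for every hour of the term.

```python
def break_even_utilization(on_demand_hourly, reserved_effective_hourly):
    """Fraction of hours an instance must be busy before reserved pricing wins.

    Both rates are caller-supplied; nothing here is an actual AWS price.
    """
    if on_demand_hourly <= 0:
        raise ValueError("on_demand_hourly must be positive")
    return reserved_effective_hourly / on_demand_hourly


def cheaper_option(utilization, on_demand_hourly, reserved_effective_hourly):
    """Compare the average cost per wall-clock hour of the two pricing models."""
    # On-demand: pay only for the fraction of hours actually used.
    on_demand_cost = utilization * on_demand_hourly
    # Reserved: pay for every hour, whether or not the instance is busy.
    reserved_cost = reserved_effective_hourly
    return "on-demand" if on_demand_cost < reserved_cost else "reserved"
```

For example, with placeholder rates of 1.00 (on-demand) and 0.60 (reserved) per hour, `cheaper_option(0.4, 1.00, 0.60)` favors on-demand and `cheaper_option(0.9, 1.00, 0.60)` favors reserved. The result is only as reliable as the utilization forecast, which is the difficulty noted above.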

Spot instances can be terminated with only a brief warning (AWS issues a two-minute interruption notice), so workloads must checkpoint their state, which may be difficult to implement. This unpredictability is likely to cause significant user frustration.
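
To illustrate what checkpointing involves, the following is a minimal sketch for a toy workload (summing integers). All names are hypothetical, and the `stop_after` parameter only simulates an interruption. Real HPC applications hold far more state than this, which is where the difficulty arises.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state atomically so an interruption never leaves a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)  # atomic rename within one filesystem


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"next_step": 0, "total": 0}


def run(path, n_steps, stop_after=None):
    """Sum 0..n_steps-1, checkpointing after every step.

    stop_after simulates a termination after that many steps in this run.
    Returns the total when finished, or None if interrupted.
    """
    state = load_checkpoint(path)
    done_this_run = 0
    while state["next_step"] < n_steps:
        if stop_after is not None and done_this_run >= stop_after:
            return None  # simulated spot termination
        state["total"] += state["next_step"]
        state["next_step"] += 1
        save_checkpoint(path, state)
        done_this_run += 1
    return state["total"]
```

Calling `run(path, 10, stop_after=4)` and then `run(path, 10)` with the same checkpoint path resumes at step 4 instead of starting over.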

The instances in this exercise were chosen to most closely resemble the on-premise HPC Baseline Resource:

Compute Nodes : 25
  Total Cores : 1,800
  Total RAM : 12.8 TB
GPU Nodes : 5
  Total GPUs : 40
  Total Cores : 320
  Total RAM : 2.4 TB
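
When matching these totals to instance types, per-node averages are a useful starting point. The sketch below derives them from the figures above. It assumes every node in a group is identical, which this page does not state, and the dictionary keys and function name are hypothetical.

```python
# Totals taken from the baseline resource listed above.
compute_totals = {"nodes": 25, "cores": 1800, "ram_tb": 12.8}
gpu_totals = {"nodes": 5, "gpus": 40, "cores": 320, "ram_tb": 2.4}


def per_node(totals):
    """Average each resource over the node count (assumes uniform nodes)."""
    nodes = totals["nodes"]
    return {key: value / nodes for key, value in totals.items() if key != "nodes"}


compute_avg = per_node(compute_totals)  # cores and RAM per compute node
gpu_avg = per_node(gpu_totals)          # GPUs, cores and RAM per GPU node
```

Under the uniform-node assumption, this works out to 72 cores per compute node, and 8 GPUs and 64 cores per GPU node.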

Total Cost for Compute and GPU instances : $2,500,571.75 for 3 years

AWS pricing Gsheet

Parallel File System Cost

The on-premise HPC Baseline resource includes a 1-PB IBM Spectrum Scale (formerly called GPFS) parallel file system (PFS).

Spectrum Scale licensing is not available on the AWS pricing calculator. Instead, the cost of 1 PB of storage was estimated as 100 TB of EBS storage plus 900 TB of S3 storage.

Storage cost, 3 years : $699,738.84

Total Cost for Hosting the HPC Baseline Resource at AWS for Three Years

$2,500,571.75 (computational) + $699,738.84 (storage) = $3,200,310.59

The three-year period used in this example is a common time frame for costing cloud services.

The cost varies linearly with the length of time the service is used.
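
The total and its linear scaling can be reproduced with a short sketch. The two three-year figures come from this page. The function name is hypothetical, and linear scaling is the simplification stated above, not an AWS pricing rule (reserved instances are sold in fixed one-year and three-year terms).

```python
from decimal import Decimal

# Three-year figures reported on this page.
COMPUTE_3YR = Decimal("2500571.75")
STORAGE_3YR = Decimal("699738.84")
BASE_YEARS = 3


def total_cost(years):
    """Scale the three-year total linearly to another period, rounded to cents."""
    three_year_total = COMPUTE_3YR + STORAGE_3YR
    scaled = three_year_total * Decimal(years) / BASE_YEARS
    return scaled.quantize(Decimal("0.01"))


print(total_cost(3))  # the three-year total
print(total_cost(1))  # a one-year estimate under the linear assumption
```

Using `Decimal` instead of floating point keeps the dollar amounts exact to the cent.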