MinicondaUserMaintainedEnvs

Miniconda is an easy-to-install, minimal Python distribution. Users can use Miniconda to create virtual Python environments in which to manage Python packages. The instructions that follow are for Linux.
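Before installing, it can be worth checking whether a conda installation is already available on your PATH. A minimal, generic sketch (not NJIT-specific output):

<pre=code>
# Check for an existing conda installation; if both commands report
# "command not found", proceed with the Miniconda installation below.
which conda
conda --version
</pre>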

Installation

Download Miniconda:

<pre=code>
login-1-95 ~ >: wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
--2020-07-29 16:24:59--  https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
Resolving repo.anaconda.com (repo.anaconda.com)... 104.16.131.3, 104.16.130.3, 2606:4700::6810:8203, ...
Connecting to repo.anaconda.com (repo.anaconda.com)|104.16.131.3|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 93052469 (89M) [application/x-sh]
Saving to: ‘Miniconda3-latest-Linux-x86_64.sh’

100%[==========================================>] 93,052,469  31.6MB/s   in 2.8s

2020-07-29 16:25:03 (31.6 MB/s) - ‘Miniconda3-latest-Linux-x86_64.sh’ saved [93052469/93052469]
</pre>

Run the installation:

<pre=code>
login-1-96 ~ >: chmod +x Miniconda3-latest-Linux-x86_64.sh
login-1-96 ~ >: ./Miniconda3-latest-Linux-x86_64.sh
</pre>

Accept the license and the default installation location. After Python and some packages are installed, you will be prompted to run conda init. Enter 'yes' at the prompt.

When the installation is complete, the following appears:

<pre=code>
==> For changes to take effect, close and re-open your current shell. <==

If you'd prefer that conda's base environment not be activated on startup,
set the auto_activate_base parameter to false:

conda config --set auto_activate_base false

Thank you for installing Miniconda3!
</pre>

Since you will likely be maintaining your own virtual environments, it is recommended not to activate the base environment on startup.

<pre=code>
login-1-101 ~ >: conda config --set auto_activate_base false
</pre>

Log off and log in again.
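After logging back in, the standard conda commands below can be used to confirm that the installation works; exact output will vary by user.

<pre=code>
# Confirm the installation (standard conda commands; output will vary)
conda --version        # report the installed conda version
conda info --envs      # list environments; only "base" exists at this point
conda activate base    # base can still be activated manually when needed
conda deactivate
</pre>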

Create and Activate a Conda Virtual Environment

The following example creates a new conda environment based on Python 3.7 and installs TensorFlow in the environment.

<pre=code>
login-1-105 ~ >: conda create --name tf python=3.7
Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /home/g/guest24/miniconda3/envs/tf

  added / updated specs:
    - python=3.7


The following packages will be downloaded:

<output snipped>

Proceed ([y]/n)? y

<output snipped>

# To activate this environment, use
#
#     $ conda activate tf
#
# To deactivate an active environment, use
#
#     $ conda deactivate
</pre>
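The environment name ('tf') and the Python version are just the choices used in this example; the same pattern applies to other names and versions. A brief sketch with hypothetical names:

<pre=code>
# Hypothetical variations on the same pattern
conda create --name myproject python=3.8   # any name and supported Python version
conda env list                             # list all environments and their locations
</pre>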

Activate the new 'tf' environment:

<pre=code>
login-1-106 ~ >: conda activate tf
(tf) login-1-107 ~ >:
</pre>

Install tensorflow-gpu:

<pre=code>
(tf) login-1-107 ~ >: conda install -c anaconda tensorflow-gpu
Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /home/g/guest24/miniconda3/envs/tf

  added / updated specs:
    - tensorflow-gpu

<output snipped>

The following packages will be SUPERSEDED by a higher-priority channel:

  ca-certificates        pkgs/main --> anaconda
  certifi                pkgs/main --> anaconda
  openssl                pkgs/main --> anaconda


Proceed ([y]/n)? y

<output snipped>

mkl_fft-1.1.0        | 143 KB    | ########## | 100%
urllib3-1.25.9       | 98 KB     | ########## | 100%
cudatoolkit-10.1.243 | 513.2 MB  | ########## | 100%
protobuf-3.12.3      | 711 KB    | ########## | 100%
blinker-1.4          | 21 KB     | ########## | 100%
requests-2.24.0      | 54 KB     | ########## | 100%
werkzeug-1.0.1       | 243 KB    | ########## | 100%
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
</pre>
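To make an environment reproducible (for example, to recreate it later or on another system), conda can export it to a YAML file. A minimal sketch; the filename is arbitrary:

<pre=code>
# Export the 'tf' environment to a YAML file and recreate it later
conda env export --name tf > tf-environment.yml
conda env create -f tf-environment.yml
</pre>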

Check that TensorFlow can be loaded:

<pre=code>
(tf) login-1-108 ~ >: python
Python 3.7.7 (default, May  7 2020, 21:25:33)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>>
</pre>
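The same check can be run non-interactively, which is convenient inside scripts; a minimal sketch, assuming the 'tf' environment is active:

<pre=code>
# Print the installed TensorFlow version without starting an interactive session
python -c "import tensorflow as tf; print(tf.__version__)"
</pre>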

The following simple TensorFlow test program checks that the virtual environment can access a GPU. The program is called "tf.gpu.test.py":

<pre=code>
import tensorflow as tf

if tf.test.gpu_device_name():
    print('Default GPU Device: {}'.format(tf.test.gpu_device_name()))
else:
    print("Please install GPU version of TF")
</pre>
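tf.test.gpu_device_name() is an older-style check; on TensorFlow 2.1 and later, tf.config.list_physical_devices('GPU') provides an equivalent test. A one-line alternative sketch, run inside the activated environment (login nodes typically have no GPU, so this is most useful inside a job):

<pre=code>
# Alternative GPU check for TensorFlow 2.1+; prints an empty list if no GPU is visible
python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
</pre>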

Slurm script to submit the job:

<pre=code>
#!/bin/bash -l
#SBATCH --job-name=tf_test
#SBATCH --output=%x.%j.out   # %x.%j expands to JobName.JobID
#SBATCH --nodes=1
#SBATCH --tasks-per-node=1
#SBATCH --partition=datasci
#SBATCH --gres=gpu:1
#SBATCH --mem=4G

# Purge any module loaded by default
module purge > /dev/null 2>&1
conda activate tf
srun python tf.gpu.test.py
</pre>
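Assuming the script above is saved as tf.gpu.test.slurm (the filename is arbitrary), it is submitted and monitored with the usual Slurm commands:

<pre=code>
# Submit the job and monitor it
sbatch tf.gpu.test.slurm
squeue -u $USER              # check whether the job is pending or running
cat tf_test.<jobid>.out      # output file named by the --job-name and --output directives
</pre>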

Result:

<pre=code>
Starting /home/g/guest24/.bash_profile ... standard AFS bash profile

Home directory : /home/g/guest24 is not in AFS -- skipping quota check

On host node430 :

        17:14:13 up 1 day,  1:17,  0 users,  load average: 0.01, 0.07, 0.06

     Your Kerberos ticket and AFS token status

klist: No credentials cache found (filename: /tmp/krb5cc_22967_HvCVvuvMMX)
Kerberos : AFS  :

Loading default modules ... Create file : "/home/g/guest24/.modules" to customize.

No modules loaded

2020-07-29 17:14:19.047276: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2020-07-29 17:14:19.059941: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 2200070000 Hz
2020-07-29 17:14:19.060093: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55ea8ebfdb90 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-07-29 17:14:19.060136: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-07-29 17:14:19.061484: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1

<output snipped>

2020-07-29 17:14:19.817386: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-07-29 17:14:19.817392: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108]      0
2020-07-29 17:14:19.817397: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0:   N
2020-07-29 17:14:19.819082: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/device:GPU:0 with 15064 MB memory) -> physical GPU (device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:02:00.0, compute capability: 6.0)
Default GPU Device: /device:GPU:0
</pre>

The GPU was recognized.
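When the environment is no longer needed, it can be deactivated and removed with standard conda commands:

<pre=code>
# Remove the example environment when finished
conda deactivate
conda env remove --name tf
</pre>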