Following this procedure, a user will be able to submit jobs to Lochness or Stheno from Matlab running locally on the user's computer. The version of Matlab on the user's computer must be the same as on the cluster, currently R2021a.
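To check which release your local Matlab is running, you can use the built-in version command:

<pre> >> version('-release')   % should print '2021a', matching the cluster </pre>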
==Installing the Add-On==
From the Matlab window, click on "Add-Ons" and select "Get Add-Ons."
[[File:ClickOnAddons.png|900px]]
In the search box enter "slurm" and click on the magnifying glass icon.

Select "Parallel Computing Toolbox plugin for MATLAB Parallel Server with Slurm."

Alternatively, this Add-On can be downloaded directly from the [https://www.mathworks.com/matlabcentral/fileexchange/52807-parallel-computing-toolbox-plugin-for-matlab-parallel-server-with-slurm Mathworks] site.

[[File:SlurmAddOn.png|1000px]]

Click on "Install."

[[File:ClickOnInstall.png|1000px]]

The installation of the Add-On is complete. Click on "OK" to start the "Generic Profile Wizard for Slurm."

[[File:InstallationComplete.png|900px]]
==Creating a Profile for Lochness or Stheno==
The following steps will create a profile for Lochness (or Stheno). Click "Next" to begin.

[[File:GenericProfile1.png|800px]]

In the "Operating System" screen, "Unix" is already selected. Click "Next" to continue.

[[File:GenericProfile2.png|800px]]

The "Submission Mode" screen determines whether to use "shared" or "nonshared" submission mode. Since Matlab installed on your personal computer or laptop does not share a job storage location with the cluster, select "No" where indicated and click "Next" to continue.

[[File:GenericProfile3.png|800px]]

Click "Next" to continue.

[[File:GenericProfile4.png|800px]]

In the "Connection Details" screen, enter the cluster host, either "lochness.njit.edu" or "stheno.njit.edu." Enter your UCID for the username.

Select "No" for the "Do you want to use an identity file to log in to the cluster" option and click "Next" to continue.

[[File:GenericProfile5.png|800px]]

In the "Cluster Details" screen, enter the full path to the directory on the cluster in which to store the Matlab job files. In this case the directory is $HOME/MDCS. (MDCS stands for Matlab Distributed Computing Server, but the directory can be given any name you wish.) To determine the value of $HOME, log on to the cluster and run the following:

<pre>
login-1-45 ~ >: echo $HOME
/home/g/guest24</pre>

Make sure to check the box "Use unique subfolders."

Click "Next" to continue.

[[File:GenericProfile6.png|800px]]

In the "Workers" screen, enter "512" for the number of workers and "/opt/site/apps/matlab/R2021a" for "MATLAB installation folders for workers." Click "Next" to continue.

[[File:GenericProfile7_1.png|800px]]

In the "License" screen, make sure to select "Network license manager" and click "Next" to continue.

[[File:GenericProfile8.png|800px]]

In the "Profile Details" screen, enter either "Lochness" or "Stheno," depending on which cluster you are creating the profile for. The "Cluster description" is optional and may be left blank. Click "Next" to continue.

[[File:GenericProfile9.png|800px]]

In the "Summary" screen, make sure everything is correct and click "Create."

[[File:GenericProfile10_1.png|800px]]

In the "Profile Created Successfully" screen, check the "Set the new profile as default" box and click "Finish."

[[File:GenericProfile11.png|800px]]
==Submitting a Serial Job==
This section demonstrates how to create a cluster object and submit a simple job to the cluster. The job runs the 'hostname' command on the node assigned to the job; the output will show clearly that the job ran on the cluster and not on the local computer.

The hostname.m file used in this demonstration can be downloaded [https://www.mathworks.com/matlabcentral/fileexchange/24096-hostname-m here].

In the Matlab window enter:

<pre> >> c=parcluster </pre>

[[File:c=parcluster_1.png|900px]]
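Since the profile was set as the default in the wizard, parcluster with no argument uses it. If you created profiles for both clusters, the profile name chosen in the wizard can be passed explicitly:

<pre> >> c=parcluster('Lochness')   % select a profile by name instead of the default </pre>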
Certain arguments need to be passed to SLURM in order for the job to run properly. Here we will set values for partition, mem-per-cpu and time. In the Matlab window enter:

<pre> >> c.AdditionalProperties.AdditionalSubmitArgs=['--partition=public --mem-per-cpu=10G --time=2-00:00:00'] </pre>

To make these arguments persistent between Matlab sessions, they need to be saved to the profile. In the Matlab window enter:

<pre> >> c.saveProfile </pre>

[[File:AdditionalArguments.png|900px]]

We will now submit the hostname.m function to the cluster. In the Matlab window enter the following:

<pre>>> j=c.batch(@hostname, 1, {}, 'AutoAddClientPath', false); </pre>

@hostname: A function handle; the '@' indicates that a function, not a script, is being submitted. <br>
1: The number of output arguments from the evaluated function. <br>
{}: Cell array of input arguments to the function, in this case empty.<br>
'AutoAddClientPath', false: The client path is not available on the cluster.
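For comparison, a function that takes inputs would be submitted by filling in the cell array. A minimal sketch, where myfun is a hypothetical placeholder rather than part of this tutorial:

<pre> >> j=c.batch(@myfun, 1, {2, 10}, 'AutoAddClientPath', false);   % evaluates myfun(2,10), one output requested </pre>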
When the job is submitted, you will be prompted for your password.

For more information see the Mathworks page: [https://www.mathworks.com/help/parallel-computing/batch.html batch]

[[File:BatchEnterPasswd.png|900px]]

To wait for the job to finish, enter in the Matlab window:

<pre> >> j.wait</pre>
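Note that j.wait blocks the Matlab prompt until the job completes. As a non-blocking alternative, the job's State property can be checked at any time:

<pre> >> j.State   % e.g. 'queued', 'running', or 'finished' </pre>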
Finally, to get the results:

<pre> >> fetchOutputs(j)</pre>

As can be seen, this job ran on node720.

[[File:BatchHostname.png|900px]]
==Submitting a Parallel Function==
The "Job Monitor" is a convenient way to monitor jobs submitted to the cluster. In the Matlab window select "Parallel" and then "Monitor Jobs."

For more information see the Mathworks page: [https://www.mathworks.com/help/parallel-computing/job-monitor.html Job Monitor]

[[File:MonitorJobs.png|900px]]
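Jobs can also be inspected programmatically; the cluster object keeps a list of the jobs submitted through its profile:

<pre> >> c.Jobs                            % list all jobs known to this profile
 >> findJob(c,'State','finished')     % select only the finished ones </pre>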
Here we will submit a simple function using a "parfor" loop. The code for this example is as follows:

<pre>function t = parallel_example
% Run 16 loop iterations in parallel, pausing 2 seconds in each,
% and return the elapsed wall-clock time
t0 = tic;
parfor idx = 1:16
    A(idx) = idx;
    pause(2)
end
t = toc(t0);</pre>
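Before submitting, the function can be tried on the local machine with a small pool. A minimal sketch, assuming parallel_example.m is on your local Matlab path:

<pre> >> parpool('local',4);        % start a 4-worker pool on the local machine
 >> t=parallel_example         % the parfor loop now runs on the local pool
 >> delete(gcp('nocreate'));   % shut the pool down when done </pre>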
To submit this job:

<pre> >> j=c.batch(@parallel_example, 1, {}, 'AutoAddClientPath', false, 'Pool', 7)</pre>

Since this is a parallel job, a 'Pool' must be started. The actual number of tasks started will be one more than the pool size requested. In this case the batch command calls for a pool of seven, so eight tasks will be started on the cluster.

Also see that the state of the job in the "Job Monitor" is "running."

[[File:SubmitParallel.png|900px]]
The job takes a few minutes to run, after which the state of the job changes to "finished."

[[File:JobFinished.png|900px]]

Once again, to get the results enter:

<pre> >> fetchOutputs(j) </pre>

As can be seen, the parfor loop completed in 6.7591 seconds.

[[File:FetchOutputs.png|900px]]
==Submitting a Script Requiring a GPU==
In this section we will submit a Matlab script that uses a GPU. The results will be written to the job diary. The code for this example is as follows:

<pre>% MATLAB script that defines a random matrix and does an FFT
%
% The first FFT is without a GPU
% The second is with the GPU
%
% MATLAB knows to use the GPU the second time because it
%  is passed a gpuArray as an argument to fft
% We do the FFT many times to make using the GPU worthwhile;
%  otherwise more time is spent offloading to the GPU
%  than performing the calculation
%
% This example is meant to provide a general understanding
%  of MATLAB GPU usage
% Meaningful performance measurements depend on many factors
%  beyond the scope of this example
% Downloaded from https://projects.ncsu.edu/hpc/Software/examples/matlab/gpu/gpu_m

% Define a matrix
A1 = rand(3000,3000);

% Just use the compute node, no GPU
tic;
% Do 1000 FFTs
for i = 1:1000
    B2 = fft(A1);
end
time1 = toc;
fprintf('%s\n',"Time to run FFT on the node:")
disp(time1);

% Use the GPU
tic;
A2 = gpuArray(A1);
% Do 1000 FFTs
for i = 1:1000
    % MATLAB knows to use the GPU FFT because A2 is a gpuArray
    B2 = fft(A2);
end
time2 = toc;
fprintf('%s\n',"Time to run FFT on the GPU:")
disp(time2);

% Will be greater than 1 if the GPU is faster
speedup = time1/time2</pre>
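One caveat about the timings in this script: Matlab GPU operations are asynchronous, so toc can fire before the device has finished its queued work. For a stricter measurement, a synchronization call can be added before stopping the timer. A sketch of the GPU loop with this change (a variant, not part of the original script; it reuses A2 from above):

<pre>tic;
for i = 1:1000
    B2 = fft(A2);
end
wait(gpuDevice);   % block until all queued GPU work has completed
time2 = toc;</pre>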
We will need to change the partition to datasci and request a GPU. In the Matlab window enter:

<pre> >> c.AdditionalProperties.AdditionalSubmitArgs=['--partition=datasci --gres=gpu:1 --mem-per-cpu=10G --time=2-00:00:00'] </pre>

[[File:GpuSubmitArgs.png|900px]]
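To confirm that a worker on the datasci partition actually sees a GPU, a quick check can be submitted first; gpuDeviceCount is a standard Parallel Computing Toolbox function, and jchk is just an illustrative variable name:

<pre> >> jchk=c.batch(@gpuDeviceCount, 1, {}, 'AutoAddClientPath', false);
 >> jchk.wait; fetchOutputs(jchk)   % should return 1 or more </pre>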
Submit the job as before. Since a script is submitted, as opposed to a function, only the name of the script is included in the batch command; do not include the '@' symbol. A script takes no input or output arguments.

<pre> >> j=c.batch('gpu', 'AutoAddClientPath', false) </pre>

[[File:GpuSubmit.png|900px]]

To get the result:

<pre> >> j.diary </pre>

[[File:GpuDiary.png|900px]]
==Load and Plot Results from a Job==

In this section we will run a job on the cluster and then load and plot the results in the local Matlab workspace. The code for this example is as follows:

<pre>% plot_demo.m: set up and solve a discrete Poisson system,
% then reshape the solution vector into an (n-2)-by-(n-2) grid
n=100;
disp("n = " + n);
A = gallery('poisson',n-2);
b = convn(([1,zeros(1,n-2),1]'|[1,zeros(1,n-1)]), 0.5*ones(3,3),'valid')';
x = reshape(A\b(:),n-2,n-2)';</pre>
As before, submit the job:

<pre> >> j=c.batch('plot_demo', 'AutoAddClientPath', false);</pre>

[[File:PlotDemoSub.png|900px]]

To load 'x' into the local Matlab workspace:

<pre> >> load(j,'x') </pre>

[[File:load_x.png|900px]]
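To bring every variable from the job's workspace into the local workspace, rather than just 'x', load can be called with the job object alone:

<pre> >> load(j) </pre>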
Finally, plot the results:

<pre> >> plot(x) </pre>

[[File:plot_x.png|900px]]
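Since x is a two-dimensional array, plot(x) draws one line per matrix column; a surface view of the same data may be easier to interpret:

<pre> >> surf(x) </pre>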