Getting Started with Serial and Parallel MATLAB on Kong and Stheno
Latest revision as of 16:33, 5 October 2020
Configuration
Note that R2014a is currently available on Kong only. Stheno users must use R2013a.
- For R2013a download njit.remote.r2013a.tar.gz or njit.remote.r2013a.zip.
- For R2014a download njit.remote.r2014a.tar.gz or njit.remote.r2014a.zip.
- Unzip it and place the contents into $matlab/toolbox/local
- Start MATLAB. Configure MATLAB to run parallel jobs on the Kong or Stheno cluster by calling configCluster. For each cluster, configCluster only needs to be called once per version of MATLAB.
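The one-time setup might look like the sketch below. It assumes the downloaded files are already on the MATLAB path and that configCluster makes the new cluster profile the default, which this page does not confirm; any prompts configCluster issues are not shown.

```matlab
% Run once per cluster and per MATLAB version.
configCluster

% Get a handle to the cluster through the default profile and display the
% profile name (assumes configCluster made the new profile the default).
c = parcluster;
disp(c.Profile)
```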
Credentials
The first time a user submits a job to the cluster, MATLAB prompts for their username and then for a password. The username is stored with MATLAB, so the user is not prompted for it again.
Serial Jobs
Use the batch command to submit asynchronous jobs to the cluster. The batch command will return a job object which is used to access the output of the submitted job. See the example below and see the MATLAB documentation for more help on batch.
Note: In the example below, wait is used to ensure that the job has completed before requesting results. In regular use, one would not call wait, since a job might take a long time, and the MATLAB session can be used for other work while the submitted job executes.
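A minimal sketch of a serial submission follows. It assumes configCluster has already been run, so that parcluster returns the cluster profile. The submitted function (pwd) is only an illustrative choice, not necessarily the one used in the original example.

```matlab
% Get a handle to the cluster.
c = parcluster;

% Submit a job that runs pwd on the cluster and returns one output argument.
j = batch(c, @pwd, 1, {});

% Block until the job finishes (for demonstration only; see the note above).
wait(j)

% fetchOutputs returns a cell array with one cell per output argument.
out = fetchOutputs(j);
disp(out{1})

% Delete the job once its results are no longer needed.
delete(j)
```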
To retrieve a list of currently running or completed jobs, call parcluster to retrieve the cluster object. The cluster object stores an array of jobs that were run, are running, or are queued to run. This allows us to fetch the results of completed jobs. Retrieve and view the list of jobs as shown below.
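Listing the jobs could be sketched as follows, assuming the default profile points at the cluster. The loop is an optional addition that prints one line per job.

```matlab
c = parcluster;

% The Jobs property holds the array of jobs known to this cluster profile.
jobs = c.Jobs;

% Print the ID and state of each job.
for k = 1:numel(jobs)
    fprintf('Job %d: %s\n', jobs(k).ID, jobs(k).State);
end
```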
Once we’ve identified the job we want, we can retrieve the results as we’ve done previously. If the job produces an error, we can call the getDebugLog method to view the error log file. The error log can be lengthy and is not shown here. The example below will retrieve the results of job #3.
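One way to retrieve job #3 is sketched below. It assumes a job with ID 3 exists and that it is a serial job, for which the debug log is requested for a task rather than for the job itself. For a job that used a pool of workers, passing the job object to getDebugLog is the assumed form.

```matlab
c = parcluster;

% Look the job up by its ID (3, as in the text above).
j = findJob(c, 'ID', 3);

if strcmp(j.State, 'finished') && isempty(j.Tasks(1).Error)
    % The job completed without an error: fetch its output arguments.
    out = fetchOutputs(j);
    disp(out{1})
else
    % Otherwise, display the scheduler's log for the job's single task.
    debugLog = getDebugLog(c, j.Tasks(1));
    disp(debugLog)
end
```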
NOTE: fetchOutputs is used to retrieve function output arguments. Data that has been written to files on the cluster needs to be retrieved directly from the file system.
Parallel Jobs
Users can also submit parallel workflows with batch. Let’s use the following example for a parallel job.
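The original example function is not reproduced on this page. As a stand-in, the sketch below defines a hypothetical function, parallel_example, that times a parfor loop in which every iteration pauses for two seconds. The name, the default iteration count, and the pause length are all assumptions made for illustration.

```matlab
function t = parallel_example(iter)
% Hypothetical example: time a parfor loop whose iterations each pause.
if nargin == 0
    iter = 8;
end
t0 = tic;
parfor idx = 1:iter
    A(idx) = idx;   % sliced output, one element per iteration
    pause(2)        % stand-in for real work
end
t = toc(t0);        % elapsed wall-clock time in seconds
end
```

Save the function as parallel_example.m in a folder on the MATLAB path so that batch can find it.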
We’ll use the batch command again, but since we’re running a parallel job, we’ll also specify a MATLAB Pool.
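A pool submission might look like the sketch below. It assumes a function file named parallel_example.m (a hypothetical name) that returns the elapsed time as its single output. The 'Pool' option name applies to R2013b and later; R2013a is assumed to need 'matlabpool' in its place, so check the batch documentation for your release.

```matlab
c = parcluster;

% Request a pool of eight workers for the job. Replace 'Pool' with
% 'matlabpool' on R2013a (assumption; see the batch documentation).
j = batch(c, @parallel_example, 1, {}, 'Pool', 8);

% Wait for completion, then report the elapsed time returned by the function.
wait(j)
out = fetchOutputs(j);
fprintf('Elapsed time: %.2f seconds\n', out{1})
```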
The job ran in 4.73 seconds using eight workers. Note that these jobs will always request N+1 CPU cores, since one worker is required to manage the batch job and pool of workers. For example, a job that needs eight workers will consume nine CPU cores.
We’ll run the same simulation, but increase the Pool size. Note that for some applications there are diminishing returns when allocating too many workers. This time, to retrieve the results at a later time, we’ll keep track of the job ID.
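Submitting with a larger pool and recording the job ID could be sketched as follows, again assuming a hypothetical function file parallel_example.m and the 'Pool' option name ('matlabpool' is assumed for R2013a).

```matlab
c = parcluster;

% Same function, but with a pool of 16 workers.
j = batch(c, @parallel_example, 1, {}, 'Pool', 16);

% Display the job ID and note it down; it identifies the job later,
% even from a new MATLAB session.
id = j.ID
```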
Once we have a handle to the cluster, we’ll call the findJob method to search for the job with the specified job ID.
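The lookup might be sketched as below. The ID value is hypothetical; use the one recorded when the job was submitted.

```matlab
c = parcluster;

id = 4;                      % hypothetical: the ID noted at submission time
j = findJob(c, 'ID', id);    % search the cluster's jobs by ID

% Wait in case the job is still running, then fetch its output arguments.
wait(j)
out = fetchOutputs(j);
disp(out{1})
```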
The job now runs in 2.75 seconds using 16 workers. Run code with different numbers of workers to determine the ideal number to use.
Alternatively, to retrieve job results via a graphical user interface, use the Job Monitor (Parallel > Monitor Jobs).
Configuring Jobs
Prior to submitting the job, we can specify:
- Email Notification (when the job is running, exiting, or aborting)
- Use of GPU
- Queue name, and
- Wall time
Specification is done with ClusterInfo. The ClusterInfo class supports tab completion to ease recollection of method names.
NOTE: Any parameters set with ClusterInfo will be persistent between MATLAB sessions.
To see the values of the current configuration options, call the state method. To clear a value, assign the property an empty value ('', [], or false), or call the clear method to clear all values.
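The sketch below shows how these settings might be applied. Only state and clear are named on this page. The setter names, the wall-time format, and all values are assumptions based on the naming pattern of similar cluster integration packages, so confirm them with tab completion (type ClusterInfo. and press Tab) before relying on them.

```matlab
% ASSUMED setter names and values; verify with tab completion first.
ClusterInfo.setEmailAddress('user@example.com')  % hypothetical address
ClusterInfo.setUseGpu(true)                      % request a GPU
ClusterInfo.setQueueName('gpu')                  % hypothetical queue name
ClusterInfo.setWallTime('05:00:00')              % assumed HH:MM:SS format

% Named in the text above: display the current configuration values.
ClusterInfo.state

% Clear a single value by assigning an empty value (assumed setter name).
ClusterInfo.setEmailAddress('')

% Clear all values.
ClusterInfo.clear
```

Because these settings persist between MATLAB sessions, clearing them after a special-purpose job avoids applying them to later jobs by accident.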
Writing Figures to Files
Saving figures to files requires some changes in your .m file: create the figure with a handle and turn off its display (attempting to display figures on a server will cause an error):
fig = figure('Visible','off')
Your plot(), title(), labels, and other figure attributes are not changed. After the attributes are given you can specify writing to a FIG, PDF, or other file type:
print(fig,'-dpdf','yourfilename.pdf')  % Saves as a PDF, see print() help for many other types
saveas(fig,'yourfilename.fig')         % Saves as MATLAB FIG file, accessible with openfig()
On your Windows PC the figures will not pop up as before, but the files will appear in your Documents\MATLAB directory. On the Linux server the files will appear in your $HOME directory, so after your job runs you will have to copy your files back to your PC using MobaXterm or another SSH client (such as SSH Secure Shell or PuTTY scp).
Users of Linux or Mac can use the built-in scp command to copy files from the server.
Files written on the server in this manner are not erased by MATLAB's delete() function. You will have to log in to the server and erase them with the rm command.
%% Parallel Example with figure output
function t = parallel_savefig_example
upto = 30;
parfor x = 1:upto
    N(x) = x;
    N2(x) = x * x;
end
fig = figure('Visible','off');  % Visible=off required to run on server
plot(N,N2,'g-*')
title('Parallel Example Figure')
xlabel('Number')
ylabel('Number squared')
% On Windows, files are written to your Documents\MATLAB directory
% On the server, they are written to your $HOME directory
print(fig,'-dpdf','pxfig.pdf')  % PDF version
saveas(fig,'pxfig.fig')         % MATLAB FIG version
t = [N;N2];
To Learn More
To learn more about the MATLAB Parallel Computing Toolbox, check out these resources: