Getting Started With Serial and Parallel MATLAB on Kong and Stheno

From NJIT-ARCS HPC Wiki

Configuration

Note that R2014a is currently available on Kong only. Stheno users must use R2013a.

Mdcsimg1.png

Credentials

The first time a user submits a job to the cluster, the user will be prompted for their username.
Mdcsimg2.png
The user will then be prompted for a password. The username is stored with MATLAB so that the user is not prompted for it again.

Serial Jobs

Use the batch command to submit asynchronous jobs to the cluster. The batch command returns a job object, which is used to access the output of the submitted job. See the example below and the MATLAB documentation for more help on batch.
Note: In the example below, wait is used to ensure that the job has completed before requesting results. In regular use, one would not call wait, since a job might take a long time, and the MATLAB session can be used for other work while the submitted job executes.
Mdcsimg3.png
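The screenshot is not reproduced here. A minimal sketch of the workflow described above, assuming the default cluster profile is already configured and using the built-in pwd function as a stand-in for real work:

```matlab
% Get a handle to the cluster defined by the default profile.
c = parcluster;

% Submit pwd as a batch job: one output argument, no input arguments.
j = batch(c, @pwd, 1, {});

% Block until the job finishes (for demonstration only).
wait(j);

% fetchOutputs returns a cell array with one cell per output argument.
out = fetchOutputs(j);
disp(out{1});
```

Without the call to wait, the State property of j can be checked from time to time instead, leaving the session free for other work.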
To retrieve a list of currently running or completed jobs, call parcluster to retrieve the cluster object. The cluster object stores an array of jobs that were run, are running, or are queued to run. This allows us to fetch the results of completed jobs. Retrieve and view the list of jobs as shown below.

Mdcsimg4.png
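A short sketch of the same listing in code, assuming the default cluster profile. The loop is only one way to print the list; displaying c.Jobs directly also works.

```matlab
% Get a handle to the cluster defined by the default profile.
c = parcluster;

% The Jobs property holds every job the cluster still knows about.
jobs = c.Jobs;

% Print the ID and state (queued, running, finished, ...) of each job.
for k = 1:numel(jobs)
    fprintf('Job %d: %s\n', jobs(k).ID, jobs(k).State);
end
```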
Once we’ve identified the job we want, we can retrieve the results as we’ve done previously. If the job produces an error, we can call the getDebugLog method to view the error log file. The error log can be lengthy and is not shown here. The example below will retrieve the results of job #3.
NOTE: fetchOutputs is used to retrieve function output arguments. Data that has been written to files on the cluster needs to be retrieved directly from the file system.
Mdcsimg5.png
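A sketch of both paths, fetching results and reading the debug log, assuming a job with ID 3 exists on the default cluster (the ID is illustrative). For a serial batch job, getDebugLog is given a task of the job rather than the job itself.

```matlab
c = parcluster;

% Look the job up by its ID; 3 is illustrative.
j = findJob(c, 'ID', 3);

if isempty(j)
    disp('No job with that ID is known to the cluster.');
elseif strcmp(j.State, 'finished') && isempty(j.Tasks(1).Error)
    % The job completed cleanly: fetch its output arguments.
    out = fetchOutputs(j);
    disp(out{1});
else
    % The job failed or has not finished: show the log of its first task.
    disp(getDebugLog(c, j.Tasks(1)));
end
```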

Parallel Jobs

Users can also submit parallel workflows with batch. Let’s use the following example for a parallel job.
Mdcsimg6.png
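The screenshot is not reproduced here. The following stand-in is an assumption, not necessarily the function shown in the image: it times a parfor loop whose iterations each pause briefly to simulate work. The function name, iteration count, and pause length are all hypothetical.

```matlab
function t = parallel_example
% Hypothetical stand-in: time a parfor loop of independent iterations.
n = 16;              % number of iterations (illustrative)
A = zeros(1, n);     % preallocate the sliced output variable
t0 = tic;
parfor idx = 1:n
    A(idx) = idx;
    pause(0.5);      % simulate half a second of work per iteration
end
t = toc(t0);         % elapsed wall-clock time in seconds
end
```

Because the iterations are independent, the elapsed time should shrink as workers are added, up to the point where each worker handles a single iteration.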
We’ll use the batch command again, but since we’re running a parallel job, we’ll also specify a MATLAB Pool.

Mdcsimg7.png
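A sketch of the submission, assuming the example function above is saved on the MATLAB path as parallel_example.m (the file name is an assumption) and returns its elapsed time. The 'Pool' option name applies to R2013b and later; in R2013a the equivalent option is 'Matlabpool'.

```matlab
c = parcluster;

% Request a pool of eight workers for the parfor loop. One more worker
% runs the function itself, so nine cores are requested in total.
j = batch(c, @parallel_example, 1, {}, 'Pool', 8);

wait(j);                        % for demonstration only
out = fetchOutputs(j);
fprintf('Elapsed time: %.2f s\n', out{1});
```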

The job ran in 4.73 seconds using eight workers. Note that these jobs will always request N+1 CPU cores, since one worker is required to manage the batch job and pool of workers. For example, a job that needs eight workers will consume nine CPU cores.

We’ll run the same simulation, but increase the Pool size. Note that for some applications, there are diminishing returns when allocating too many workers. This time, to retrieve the results at a later time, we’ll keep track of the job ID.
Mdcsimg8.png

Once we have a handle to the cluster, we’ll call the findJob method to search for the job with the specified job ID.

Mdcsimg9.png
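The two steps above, recording the job ID at submission and finding the job again later, can be sketched as follows. This assumes the example function is on the path as parallel_example.m (a hypothetical name) and that 'Pool' is the option name in your release (R2013a uses 'Matlabpool').

```matlab
c = parcluster;

% Submit with a pool of 16 workers and record the job ID.
j = batch(c, @parallel_example, 1, {}, 'Pool', 16);
id = j.ID;
clear j                         % the job object itself is no longer needed

% Later, possibly in a new MATLAB session (after recreating c and id):
j2 = findJob(c, 'ID', id);
wait(j2);                       % returns immediately if already finished
out = fetchOutputs(j2);
fprintf('Job %d took %.2f s\n', id, out{1});
```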

The job now runs in 2.75 seconds using 16 workers. Run code with different numbers of workers to determine the ideal number to use.

Alternatively, to retrieve job results via a graphical user interface, use the Job Monitor (Parallel > Monitor Jobs).

Mdcsimg10.png

Configuring Jobs

Prior to submitting the job, we can specify:

  • Email Notification (when the job is running, exiting, or aborting)
  • Use of GPU
  • Queue name, and
  • Wall time

Specification is done with ClusterInfo. The ClusterInfo class supports tab completion to ease recollection of method names. NOTE: Any parameters set with ClusterInfo will be persistent between MATLAB sessions.

Mdcsimg11.png

To see the values of the current configuration options, call the state method. To clear a value, assign the property an empty value ('', [], or false), or call the clear method to clear all values.

Mdcsimg12.png
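The screenshots are not reproduced here. In the sketch below, only the state and clear methods are named in the text above. The setter names, the wall-time format, and all values are assumptions based on common versions of the ClusterInfo helper class; confirm them with tab completion (type ClusterInfo. and press Tab) before relying on them.

```matlab
% All setter names and values below are assumptions; verify with tab completion.
ClusterInfo.setEmailAddress('user@example.com');  % hypothetical address
ClusterInfo.setQueueName('queue_name');           % hypothetical queue name
ClusterInfo.setWallTime('01:00:00');              % assumed HH:MM:SS format
ClusterInfo.setUseGpu(true);

% Show the current settings (method named in the text above).
ClusterInfo.state

% Clear a single value by assigning an empty value ...
ClusterInfo.setQueueName('');

% ... or clear every value at once.
ClusterInfo.clear
```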

Writing Figures to Files

Saving figures to files requires some changes in your .m file to obtain a figure handle and to turn off display (attempting to display figures on a server will cause an error):

fig = figure('Visible','off')

Your plot(), title(), label, and other figure calls stay the same. After they have run, you can write the figure to a FIG, PDF, or other file type:

print(fig,'-dpdf','yourfilename.pdf')       % Saves as a PDF, see print() help for many other types
saveas(fig,'yourfilename.fig')              % Saves as MATLAB FIG file, accessible with openfig()

On your Windows PC the figures will not pop up as before, but the files will appear in your Documents\MATLAB directory. On the Linux server the files will appear in your $HOME directory, so after your job runs you will have to copy the files back to your PC using MobaXterm or another SSH client (SSH Secure Shell or PuTTY scp).

Users of Linux or Mac can use the built-in scp command to copy files from the server.

Files written on the server in this manner are not erased by MATLAB's delete() function. You will have to log in to the server and erase them with the rm command.

 %% Parallel example with figure output
 function t = parallel_savefig_example

	upto = 30;
	N  = zeros(1, upto);			% Preallocate the parfor outputs
	N2 = zeros(1, upto);
	parfor x = 1:upto
		N(x) = x;
		N2(x) = x * x;
	end

	fig = figure('Visible','off');		% Visible=off required to run on server
	plot(N,N2,'g-*')
	title('Parallel Example Figure')
	xlabel('Number')
	ylabel('Number squared')

	% On Windows, files are written to your Documents\MATLAB directory
	% On the server, they're written to your $HOME directory

	print(fig,'-dpdf','pxfig.pdf')		% PDF version
	saveas(fig,'pxfig.fig')			% MATLAB FIG version
	close(fig)				% Release the invisible figure

	t = [N;N2];				% Row 1: numbers, row 2: their squares
 end

To Learn More

To learn more about the MATLAB Parallel Computing Toolbox, check out these resources: