I have a MATLAB program that takes a long time to run on my desktop. Will this program run faster when I run it on HPC?
Whether your job can run faster depends on several factors.
Parallelization
In general, the secret to getting a computation to run faster on HPC is “parallelization”: breaking your task down into smaller subtasks that can be executed simultaneously by many different processors. These subtasks must be independent of each other, or at least be able to run independently for long enough to make running them in parallel worthwhile. If the subtasks depend on one another (e.g. subtask 1 has to be completed before subtask 2 can start), then they cannot be parallelized.
Take, for example, the training of a machine learning model, which is essentially an iterative parameter optimization for a high-dimensional function. At the start of the run, a set of initial guess parameters is proposed; these parameters are then improved through a complicated set of math operations. After enough iterations, the parameters yield the best function that we are after:
- The iterations themselves are usually not parallelizable, since the parameters from the n-th iteration are needed for the computation in the (n+1)-th iteration.
- However, within one iteration, there can be many independent operations. For example, many matrix multiplications (matmul) may be required along the way. Matmul is a parallelizable operation, so we can parallelize the matmul step to make each iteration take less time.
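The distinction between a sequential iteration and the parallelizable work inside it can be illustrated with a minimal sketch. The least-squares problem, the variable names, and the step-size choice below are illustrative assumptions, not part of any particular training code. MATLAB's built-in matrix products are typically multithreaded, so the two products inside the loop can use several cores without any code change, while the loop itself must still run one iteration after another.

```matlab
% Minimal sketch: gradient descent on a least-squares problem.
% The data and problem sizes are hypothetical, chosen only for illustration.
rng(0);                          % reproducible random data
nSamples = 2000;
nParams  = 50;
A = randn(nSamples, nParams);    % hypothetical data matrix
xTrue = randn(nParams, 1);       % parameters we hope to recover
b = A * xTrue;                   % noise-free targets

x = zeros(nParams, 1);           % initial guess for the parameters
stepSize = 1 / norm(A)^2;        % conservative step: 1 / (largest singular value)^2

for iter = 1:200
    % The loop is sequential: iteration n+1 needs x from iteration n.
    residual = A * x - b;        % matrix-vector product (parallelizable internally)
    gradient = A' * residual;    % another parallelizable product
    x = x - stepSize * gradient; % update the parameters
end

relativeError = norm(x - xTrue) / norm(xTrue);
fprintf('Relative error after %d iterations: %.2e\n', iter, relativeError);
```

Here the only way to shorten the run is to make each iteration cheaper, which is exactly what parallelizing the matrix products does.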
Going back to your own MATLAB computation: the independent operations, which usually take the form of a loop (or loops) with lots of iterations, can be sped up using the parfor statement. See MATLAB Parallel Computing Toolbox to learn more.
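As a rough sketch of what such a loop can look like, the example below fills an array where each entry is computed independently of the others. The function handle simulateOnce and the task count are hypothetical stand-ins for your own expensive computation. Running the loop on multiple workers requires the Parallel Computing Toolbox; without it, MATLAB executes a parfor loop like an ordinary for loop.

```matlab
% Minimal sketch: a loop whose iterations do not depend on each other.
% simulateOnce is a hypothetical stand-in for an expensive computation.
simulateOnce = @(k) sum(sin((1:1e5) * k));

nTasks = 200;
results = zeros(1, nTasks);      % preallocate the output array

parfor k = 1:nTasks
    % Iteration k writes only results(k) and reads nothing written by
    % other iterations, so the iterations can run in any order.
    results(k) = simulateOnce(k);
end

total = sum(results);            % combine the independent results afterwards
fprintf('Sum over %d independent tasks: %.6f\n', nTasks, total);
```

A loop like the gradient-descent iteration, where each pass needs the result of the previous one, cannot be converted this way; MATLAB will report an error if it detects such a dependency in a parfor body.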
Using compiled code
Another option is to replace the compute-intensive part of your computation with a subroutine written in a compiled language (for example, C++). Please see MEX to learn more.
Using a different algorithm
Another way MATLAB may run faster on HPC is by taking advantage of the memory available there. Sometimes, under memory constraints, a toolbox in MATLAB may switch to a memory-conserving algorithm at the cost of more computation. If memory is not an issue, then a faster, memory-intensive algorithm could be used. Please check your code to identify which function(s) take a long time, and whether a different algorithm can accomplish the same thing (or a nearly identical result) at the expense of more memory.
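One simple way to find out which steps take a long time is to time them separately with tic and toc. The sketch below uses a hypothetical two-stage workload (a singular value decomposition followed by a matrix product) purely as a placeholder for the stages of your own code. For a per-function breakdown of a larger program, the MATLAB Profiler (profile on, run your code, then profile viewer) gives a more detailed picture.

```matlab
% Minimal sketch: time each stage of a computation separately.
% The two stages below are hypothetical placeholders for your own code.
A = rand(400);

tStart = tic;
[U, S, V] = svd(A);              % stage 1: factorization
svdTime = toc(tStart);

tStart = tic;
B = U * S * V';                  % stage 2: rebuild the matrix from its factors
matmulTime = toc(tStart);

fprintf('SVD stage:     %.4f s\n', svdTime);
fprintf('Product stage: %.4f s\n', matmulTime);
```

Once the slowest stage is known, that is the place to look for a faster algorithm, a parfor loop, or a compiled replacement.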
Analyze your computation first!
You may find it surprising that there is no simple answer to this question. Your workstation may have a 3.5 GHz processor, while Wahab’s processors run at only 2.4 GHz. If this is the case, then your MATLAB job might actually run slower on our HPC system if you run it without any modification.
For this reason, you need to first analyze the nature of your computation to find out whether you have an opportunity to parallelize it (MATLAB or otherwise), and what strategy to use. Alternatively, you could rewrite the compute-intensive part of your code in C++ and call it through MEX. Or, if your code is nearly all compute-intensive, you may rewrite the entire code in C++.