When new hardware comes out, the big hardware companies usually send sites like Legit Reviews a sample to check out along with reviewer materials. For example, Intel usually includes a “what to expect” guide with some baseline performance numbers, along with access to a few suggested workloads that might be of interest. No pressure is put on us to run them, but we are given access to the material and can use it if we would like to.
One new workload that Intel suggested was a MATLAB script that they said was fairly fast to run. The tested functions include matrix factorizations, linear equation solving, and computation of singular values that are commonly used in machine learning, geometric modeling, and scientific applications.
We started digging into it a bit and noticed that MATLAB uses the Intel Math Kernel Library (Intel MKL) by default on both AMD and Intel processors. During our search we ran across a post on Reddit showing that AMD performance can be greatly boosted by running a script that tells MATLAB to run AMD processors in AVX2 mode. By forcing MATLAB to use a fast codepath on AMD processors, the performance gains from this simple change were said to be between 20% and 300%.
Last week @intel suggested that we take a look at MATLAB workloads for CPU testing. A recent post on Reddit shows that @AMDRyzen CPUs do poorly with Intel MKL. A proposed 'fix' runs AMD CPUs in AVX2 mode. Perf gains on 2600x are between 20% and 300%. https://t.co/dIql1FUuRF pic.twitter.com/qsxvH6cpJa
— Nathan Kirsch (@LegitReviews) November 18, 2019
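For readers who want to try the fix themselves, here is a minimal Python sketch of the idea. It rests on assumptions that are not spelled out in this article: the workaround that circulated at the time was reported to be an Intel MKL environment variable, MKL_DEBUG_CPU_TYPE, set to 5 before MATLAB starts, and the `matlab` command name below is a placeholder that assumes MATLAB is on your PATH. Newer MKL releases are reported to ignore this variable, so treat it as something to verify on your own install rather than a guaranteed switch.

```python
import os
import subprocess

# Assumption: this is the variable named in the widely shared workaround;
# the article itself does not spell it out.
MKL_OVERRIDE_VAR = "MKL_DEBUG_CPU_TYPE"
MKL_OVERRIDE_VALUE = "5"  # reported to select the AVX2 code path


def avx2_env(base_env=None):
    """Return a copy of the environment with the MKL override added."""
    env = dict(os.environ if base_env is None else base_env)
    env[MKL_OVERRIDE_VAR] = MKL_OVERRIDE_VALUE
    return env


def launch_matlab(matlab_cmd="matlab"):
    """Start MATLAB as a child process that inherits the override.

    `matlab_cmd` is a placeholder and assumes MATLAB is on the PATH.
    """
    return subprocess.Popen([matlab_cmd], env=avx2_env())
```

Setting the variable only for the child process keeps the change scoped to that one MATLAB session, which makes it easy to run the same workload with and without the override.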
Maybe Intel PR didn’t know about this minor little detail, but it seemed like something we should stay away from with regard to CPU benchmarking in our launch reviews. So, we continued our hunt for new workloads to run on the 10th Gen Intel Core i9-10980XE ‘Cascade Lake-X’ and 3rd Gen Ryzen Threadripper 3970X processors, as this seemed a bit too hot to touch. A number of people reached out to us on social media, though, and wanted us to test MATLAB anyway to see what the deal was.
So, we downloaded the Intel-provided workload, as you can see from the screenshot above, and did some testing. The script that Intel provided to the media measures the individual elapsed times that it takes MATLAB to complete 9 different linear algebra functions. Benchmarked functions include matrix factorizations, linear equation solving, and computation of singular values that are commonly used in machine learning, geometric modeling, and scientific applications. The functions performed, in no particular order, are:
We ran the 32-core, 64-thread 3rd Gen AMD Ryzen Threadripper 3970X both ways, and manually forcing AVX2 resulted in rather dramatic performance gains. We are seeing nearly a 4x performance improvement in the Matrix Multiplication function!
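Since the Reddit post talks about percentage gains and we are talking about an "x" improvement, it may help to see how the two relate. The short Python sketch below converts a pair of elapsed times into both figures. The function names are ours and the times in the usage note are made-up placeholders for illustration, not our measured results.

```python
def speedup(baseline_seconds, tuned_seconds):
    """How many times faster the tuned run is (e.g. 4.0 means 4x)."""
    return baseline_seconds / tuned_seconds


def percent_gain(baseline_seconds, tuned_seconds):
    """The same improvement expressed as a percentage gain."""
    return (speedup(baseline_seconds, tuned_seconds) - 1.0) * 100.0
```

With a hypothetical 4.0 second default run and a 1.0 second AVX2 run, `speedup(4.0, 1.0)` gives 4.0 and `percent_gain(4.0, 1.0)` gives 300.0, so a 300% gain and a 4x improvement describe the same result.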
The next step was to repeat the testing that we did on the AMD TRX40 platform on the Intel X299 platform. Now might be a good time to mention that both systems were water cooled and running 64GB of Corsair DDR4-3600 memory at XMP settings. You can find out more about our CPU test benches in our launch article for the AMD Ryzen Threadripper 3970X processor.
When we include the numbers for the new 18-core, 36-thread Intel Core i9-10980XE processor, things get rather interesting. Intel wins or ties in 8 of the 9 areas when both processors run the software without any tweaks. After we forced AVX2 to be used, Intel was winning or tied in just 2 of the 9 areas.
We went back to Intel and asked which of the 9 tests are the most important, since this software is new to us. Intel got back to LR and let us know that Matrix Multiplication is one of the most common uses of MATLAB and is used for deep learning. That happens to be one of the two tests that Intel still manages to lead in. All of the tests cover a variety of functions that are used in different fields, so you’ll have to figure out which are the most important to you!
The take-home message here is that systems with AMD Ryzen processors that are running MATLAB can see huge performance benefits from a simple codepath change.