The HPC Challenge is a benchmark that combines HPL (Linpack) and a battery of other tests on memory, floating point and network performance that can give you a pretty well-rounded view on the performance characteristics of your cluster.
For Intel processors, the Intel Math Kernel Library (MKL) can be used instead of CBLAS or ATLAS to get better performance on the HPL benchmark. So it seems preferential to compile the HPC Challenge (which includes an HPL) with the Intel MKL.
This article on the Intel Developer Zone is way out of date and doesn’t offer any help if you are compiling without Intel’s C and Fortran compilers.
Step 0: Prerequisites
- Install the Intel MKL (you can get it for free for academia)
- Install the development packages you need
- Download and extract the HPCC source files
I used the following
- Ubuntu 15.10
- MPICH2 3.1
- Intel MKL 18.104.22.168
- gcc 5.2.1
- clang 3.6.2-1
This equates to
$ apt-get install build-essential clang libmpich2-dev libiomp-dev libfftw-dev
Step 1: Build MPI MKL FFTW library and C wrapper
$ cd /opt/intel/mkl/interfaces/fftw2x_cdft
$ make libintel64 PRECISION=MKL_DOUBLE interface=ilp64 mpi=mpich2 compiler=gnu
$ cd /opt/intel/mkl/interfaces/fftw2xc
$ make libintel64 PRECISION=MKL_DOUBLE compiler=gnu
This should create the files
/opt/intel/mkl/lib/intel64. If you are using a different MPI implementation, just type
make to see a list of available options.
Step 2: Create Makefile for HPL
$ cd (path to hpcc)/hpl
$ cp setup/Make.Linux_PII_CBLAS Make.intel64
Now you should edit this
Make.intel64 file to point to MPI and MKL libraries.
Set the MPI library
For using MPICH2 on Ubuntu set MPdir, MPinc and MPlib like this
MPdir = /usr/lib/mpich MPinc = -I$(MPdir)/include MPlib = /usr/lib/x86_64-linux-gnu/libmpich.a
Set the MKL library
Set the LAdir and LAlib variables to point to the Intel MKL library
LAdir = /opt/intel/mkl/lib/intel64 LAinc = LAlib = -Wl,--start-group $(LAdir)/libfftw2x_cdft_DOUBLE_ilp64.a \ $(LAdir)/libfftw2xc_double_gnu.a \ $(LAdir)/libmkl_intel_lp64.a \ $(LAdir)/libmkl_intel_thread.a \ $(LAdir)/libmkl_core.a \ $(LAdir)/libmkl_blacs_intelmpi_lp64.a \ $(LAdir)/libmkl_cdft_core.a \ -Wl,--end-group
Set the compiler options
The compiler options are the same whether you want to use GCC or clang. clang seems to compile faster for me and looks pretty.
CC = /usr/bin/[gcc|clang] CCNOOPT = $(HPL_DEFS) CCFLAGS = $(HPL_DEFS) -fomit-frame-pointer -O3 -funroll-loops \ -Wl,--no-as-needed -ldl -lmpi -liomp5 -lpthread -lm \ -DUSING_FFTW -DMKL_INT=long -DLONG_IS_64BITS
You shouldn’t use
-lopenmp because we want it to be linked against
You should also change the linker to use
LINKER = /usr/bin/gfortran LINKFLAGS = $(CCFLAGS)
Step 3: Build HPCC
From the top level of the HPCC source tree, just run
$ make all arch=intel64
and you should get a nice shiny
hpcc executable. Now to tune those HPL parameters…
On my modest desktop with Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz and 16GB of memory, the following results were obtained with a minimal amount of HPL.dat tweaking, using this as a starting point.
Not so much difference, I wonder how it compares with the Intel compilers. I was surprised clang is slower, but maybe it will be faster second time around.