Intel Xeon Phi (KNC) MPSS software for CentOS 8.2-8.5
The latest version (as of October 4th 2019) of the MPSS software stack for the
first-generation Intel Xeon Phi (aka KNC) is 3.8.6. Unfortunately, this version
of the MPSS stack supports RHEL/CentOS only up to version 7.4. No support for
RHEL/CentOS 8 has ever been released.
One can install all but one of the RPMs from the CentOS 7 version of the MPSS software.
Some RPMs will fail on a postinstall due to a missing /usr/bin/python
interpreter, but this is relatively harmless. The biggest obstacle to running the
MPSS software on CentOS 8 is the lack of the appropriate mic kernel module.
To overcome this, I've grabbed the mpss-modules source tarball and ported it to
Linux kernel 4.15 and later. This involved some extensive changes, mostly due to the
change of the kernel timer API from init_timer to timer_setup.
The resulting kernel module loads and the Xeon Phis boot into an "online" state. I can
run offloading code as well as SSH into the Xeon Phis. So far, so good.
DISCLAIMER: I have created and tested this patch on a single development system.
I am not able to do widescale testing, nor have I extensively tested whether everything
is fully functional.
Things that work
- SSH access into the cards: ssh mic0
- Offloading using SCIF
- Offloading using OpenCL (Xeon Phi 7120 FP32 = 2.2 TFlops, FP64 = 1.1 TFlops)
- Virtual console access: minicom -s, modem=/dev/ttyMIC0
- Shutting down the mpssd daemon triggers a reset of the card from
state shutdown to state ready. This did not work in earlier builds; the bug
was fixed in build 5 of the mpss-modules RPM.
Things that do not work
- If you have more than one Xeon Phi in your system then the standard micctrl
command will get confused and might list some devices twice. This will also cause the
startup and shutdown of the mpssd daemon to fail. A patch for mpss-daemon
is provided below.
- The MPSS software stack does not support NetworkManager. Instead, it relies on the good old
initscripts-style scripts to bring up and down the micN network
interface. This no longer works in RHEL 8.
The MPSS software stack writes out the ifcfg-micN configuration files with a line
NM_CONTROLLED=no
When you remove this line and try to bring up the mic0 interface using
$ /sbin/ifup mic0
it will fail. To overcome this, parts of the old initscripts scripts were included
in the mpss-modules patch.
Alternatively, if you install the base RHEL/CentOS 8 network-scripts package then
you can continue to use the command /sbin/ifup mic0 (and ignore or suppress
the warning message).
- ???? If you find anything that does not work, please let me know!
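For the NetworkManager item above, it can help to see which ifcfg-micN files still carry the NM_CONTROLLED=no line. The following is a minimal sketch, not part of the MPSS stack: the function name find_unmanaged_mic_ifcfg is hypothetical, and the default directory /etc/sysconfig/network-scripts is an assumption (the usual RHEL location; the text above does not name it).

```shell
# Hypothetical helper: print every ifcfg-mic* file in a directory that
# opts out of NetworkManager via NM_CONTROLLED=no.
# The default directory is an assumption, not taken from the MPSS docs.
find_unmanaged_mic_ifcfg() {
    cfgdir=${1:-/etc/sysconfig/network-scripts}
    for f in "$cfgdir"/ifcfg-mic*; do
        # Skip the unexpanded pattern when no file matches.
        [ -f "$f" ] || continue
        # Accept both NM_CONTROLLED=no and NM_CONTROLLED="no".
        if grep -Eqi '^NM_CONTROLLED="?no"?[[:space:]]*$' "$f"; then
            echo "$f"
        fi
    done
}
```

Calling find_unmanaged_mic_ifcfg without arguments checks the assumed default directory; pass another directory as the first argument to check a copy of the files instead.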
MPSS 3.8.6 on CentOS 8.5
MPSS 3.8.6 on CentOS 8.4
MPSS 3.8.6 on CentOS 8.3
MPSS 3.8.6 on CentOS 8.2
Installation
You can install the mpss-modules RPM using
rpm -ivh https://github.com/jjkeijser/mpss/releases/download/MPSS3_RHEL8.5/mpss-modules-4.18.0-348.2.1.el8_5.x86_64-3.8.6-7.x86_64.rpm
Note
Normally, a weak-updates link is created when a module is installed, e.g.
$ ls -al /lib/modules/4.18.0-348.12.1.el8_5.x86_64/weak-updates/
total 4
drwxr-xr-x 2 root root 20 Dec 17 11:32 .
drwxr-xr-x 6 root root 4096 Dec 18 01:29 ..
lrwxrwxrwx 1 root root 40 Dec 17 11:32 mic.ko -> ../../4.18.0-348.2.1.el8_5.x86_64/extra/mic.ko
This link is needed for the running kernel to find the mic.ko file. If the module cannot
be found after installing the mpss-modules RPM then you can simply create the link
yourself, or re-install the kernel RPM.
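Creating the link by hand could look like the sketch below. It mirrors the relative link target shown in the listing above; the function name link_weak_update and its arguments are hypothetical, and whether your system also needs a depmod -a run afterwards is an assumption you should verify.

```shell
# Hypothetical helper: create the weak-updates link for mic.ko.
#   $1 = modules root (normally /lib/modules)
#   $2 = kernel version the module was built for
#   $3 = kernel version that should pick the module up
link_weak_update() {
    modroot=$1
    built_for=$2
    running=$3
    src="$modroot/$built_for/extra/mic.ko"
    dst_dir="$modroot/$running/weak-updates"

    # Refuse to create a dangling link.
    [ -f "$src" ] || { echo "missing $src" >&2; return 1; }

    mkdir -p "$dst_dir"
    # Same relative target as in the listing above.
    ln -sfn "../../$built_for/extra/mic.ko" "$dst_dir/mic.ko"
}
```

As root, a call such as link_weak_update /lib/modules 4.18.0-348.2.1.el8_5.x86_64 "$(uname -r)" would recreate the link for the running kernel; running depmod -a afterwards is the usual way to refresh the module dependency data.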
With RHEL/CentOS 8 and Linux kernel 4.18 the sysfs interface changed quite a bit. This can cause some odd things to happen if you have
more than one Xeon Phi installed in your system. In CentOS 7, the devices in the sysfs directory /sys/class/mic were always
listed low-to-high. In CentOS 8, however, the order is arbitrary:
# cd /sys/class/mic
# find .
.
./ctrl
./mic1
./scif
./mic0
This causes the micctrl command to choke on the output and to list mic0 twice: once as a configured
device and once as "present-but-not-configured". Needless to say, this causes some errors when stopping and starting the cards.
With a small patch to the mpss-daemon-3.8.6.tar.bz2 sources
this problem is corrected and the right number of cards is listed again.
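The ordering issue is easy to see from the shell. The sketch below is not the actual micctrl patch; it only illustrates listing the micN entries in a stable low-to-high order regardless of how the directory returns them. The function name list_mic_devices is hypothetical.

```shell
# Hypothetical helper: list micN entries under a sysfs-style directory in
# numeric order, ignoring non-card entries such as ctrl and scif.
list_mic_devices() {
    sysdir=${1:-/sys/class/mic}
    for entry in "$sysdir"/mic[0-9]*; do
        # Skip the unexpanded pattern when no card is present.
        [ -e "$entry" ] || continue
        basename "$entry"
    done | sed 's/^mic//' | sort -n | sed 's/^/mic/'
}
```

Sorting on the numeric suffix rather than on the full name keeps mic10 after mic2 on systems with many cards.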
A second issue with the micctrl command is that it has hardcoded references to the
/sbin/ifup and /sbin/ifdown commands built in.
To overcome both of these problems, a patched version of the mpss-daemon RPM has been built. The RPM and source RPM can be found here:
Installation
You can install the mpss-daemon RPM using
rpm -ivh https://github.com/jjkeijser/mpss/releases/download/MPSS3_RHEL8.5/mpss-daemon-3.8.6-4.el8.x86_64.rpm
or, if an earlier version is already installed, upgrade it using
rpm -Uvh https://github.com/jjkeijser/mpss/releases/download/MPSS3_RHEL8.5/mpss-daemon-3.8.6-4.el8.x86_64.rpm
After that, you can start and stop the mpss daemon using the regular commands
systemctl start|stop mpss
Note
The patched code was compiled using 'gcc' as the compiler instead of 'g++' in the original. There was absolutely no reason
to use the 'g++' compiler originally, so to make the code smaller I switched to 'gcc' in the mpss-daemon.spec file.