Intel Chainer Single Node Performance Test Configurations
This document describes the Intel Chainer Single Node performance test configurations for your reference.
- BIOS configurations
  - Broadwell
    - Turbo Boost Technology: on
    - Hyper-Threading (HT): off
    - NUMA: off
  - Knights Landing
    - Turbo Boost Technology: on
    - Hyper-Threading (HT): on
    - NUMA: off
    - Memory mode: cache
  - Knights Mill
    - Turbo Boost Technology: on
    - Hyper-Threading (HT): on
    - NUMA: off
    - Memory mode: cache
  - Skylake
    - Turbo Boost Technology: on
    - Hyper-Threading (HT): on
    - NUMA: on
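Once the machine has booted, the Hyper-Threading and NUMA settings listed above can be cross-checked from the operating system. The following is a minimal sketch, assuming `lscpu` from util-linux is installed. The helper name `lscpu_field` is hypothetical and is not part of Intel Chainer. With HT off, `lscpu` typically reports one thread per core, and with NUMA off it typically reports a single NUMA node.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Intel Chainer): read lscpu-style
# "Label:   value" lines on stdin and print the value for the given label.
lscpu_field() {
    awk -F: -v key="$1" '$1 == key { gsub(/[ \t]/, "", $2); print $2; exit }'
}

# Report the values that reflect the HT and NUMA BIOS settings.
# LC_ALL=C keeps the field labels in English.
if command -v lscpu >/dev/null 2>&1; then
    info="$(LC_ALL=C lscpu)"
    echo "Threads per core: $(printf '%s\n' "$info" | lscpu_field 'Thread(s) per core')"
    echo "NUMA nodes:       $(printf '%s\n' "$info" | lscpu_field 'NUMA node(s)')"
else
    echo "lscpu not found; skipping check"
fi
```

Turbo Boost and the memory mode are not covered by this sketch and would still need to be confirmed in the BIOS setup itself.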
- Optimize hardware in BIOS: set the CPU to its maximum frequency, set the fan speed to 100%, and check the cooling system.
- We recommend CentOS Linux 7.2 or newer for Intel Chainer.
- Make sure that no unnecessary processes are running during training and scoring. Intel® Distribution of Chainer* uses all available resources, so other activity (such as monitoring tools, Java processes, or network traffic) might reduce performance.
- Clean up the cache before each test to keep the results stable:

      sudo sh -c 'sync; echo 3 > /proc/sys/vm/drop_caches'

- With Intel Knights Mill, we recommend the following configurations:
  - Boot with "nohz_full=1-287 rcu_nocbs=1-287" to enable adaptive ticks in the kernel. For example, use the following boot command in grub.cfg:

        linux16 /vmlinuz-3.10.0-514.el7.x86_64 root=/dev/mapper/cl-root ro crashkernel=auto rd.lvm.lv=cl/root rd.lvm.lv=cl/swap rhgb quiet LANG=en_US.UTF-8 nohz_full=1-287 rcu_nocbs=1-287

  - Run the test with the following environment variables:

        export KMP_HW_SUBSET=1T
        export OMP_NUM_THREADS=72
        export KMP_AFFINITY=granularity=fine,compact
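The Knights Mill steps above can be combined into a small pre-flight script. This is a sketch under stated assumptions, not an official Intel Chainer tool: the helper name `has_boot_param` is hypothetical, and the script only reports whether the running kernel was booted with the two parameters from the text before exporting the environment variables.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Intel Chainer): succeed if the kernel
# command line in $1 contains the exact space-delimited parameter in $2.
has_boot_param() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Read the command line of the running kernel (empty if unavailable).
cmdline="$(cat /proc/cmdline 2>/dev/null || true)"

# Boot parameters taken from the text above.
for param in nohz_full=1-287 rcu_nocbs=1-287; do
    if has_boot_param "$cmdline" "$param"; then
        echo "ok:      $param"
    else
        echo "missing: $param"
    fi
done

# Environment variables taken from the text above.
export KMP_HW_SUBSET=1T
export OMP_NUM_THREADS=72
export KMP_AFFINITY=granularity=fine,compact
```

Sourcing the script (rather than executing it) keeps the exported variables in the shell that later starts the test.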
- With Intel Xeon Scalable processors (Skylake), we recommend the following configurations:
  - Set the CPU frequency:

        sudo cpupower frequency-set -d 2.4G -u 3.7G -g performance

  - Disable NUMA balancing after each reboot. The redirection must run as root, so wrap it in a root shell:

        sudo sh -c 'echo 0 > /proc/sys/kernel/numa_balancing'

  - Run the test with the following environment variables:

        export KMP_HW_SUBSET=1T
        export OMP_NUM_THREADS=56
        export KMP_AFFINITY=granularity=fine,compact

  - Run with numactl:

        numactl -l python performance.py -a ${arch} -b ${batchsize} -i ${insize} -d ${datasize} -e ${epoch}
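The Skylake environment and launch steps above could be wrapped in one script. The following is a minimal sketch, not the project's own launcher: the helper names `build_cmd` and `run_test` and the `RUN` switch are hypothetical, and the values for `arch`, `batchsize`, `insize`, `datasize`, and `epoch` must be supplied by the caller because the text does not fix them. By default the script only prints the command it would run.

```shell
#!/usr/bin/env bash
# Environment variables taken from the text above.
export KMP_HW_SUBSET=1T
export OMP_NUM_THREADS=56
export KMP_AFFINITY=granularity=fine,compact

# Hypothetical helper: print the numactl command line for the given
# arch, batch size, input size, data size, and epoch count.
build_cmd() {
    printf 'numactl -l python performance.py -a %s -b %s -i %s -d %s -e %s\n' \
        "$1" "$2" "$3" "$4" "$5"
}

# Hypothetical helper: run the test with local memory allocation (numactl -l).
run_test() {
    numactl -l python performance.py -a "$1" -b "$2" -i "$3" -d "$4" -e "$5"
}

# Act only when all five values are provided in the environment.
if [ -n "${arch:-}" ] && [ -n "${batchsize:-}" ] && [ -n "${insize:-}" ] \
    && [ -n "${datasize:-}" ] && [ -n "${epoch:-}" ]; then
    build_cmd "$arch" "$batchsize" "$insize" "$datasize" "$epoch"
    # RUN=1 is an assumed opt-in switch; without it nothing is executed.
    if [ "${RUN:-0}" = "1" ]; then
        run_test "$arch" "$batchsize" "$insize" "$datasize" "$epoch"
    fi
fi
```

Printing the command first makes it easy to confirm the arguments before committing to a long run. The CPU frequency and NUMA balancing steps are left out of the sketch because they need root privileges and change system-wide state.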