This repository was archived by the owner on Jan 7, 2023. It is now read-only.

Intel Chainer Single Node Performance Test Configurations

mingxiaoh edited this page Oct 24, 2017 · 3 revisions

This document describes the configurations used for Intel Chainer single-node performance tests.

Hardware / BIOS configuration

  • BIOS configurations
    • Broadwell

      Turbo Boost Technology: on

      Hyper-Threading (HT): off

      NUMA: off

    • Knights Landing

      Turbo Boost Technology: on

      Hyper-Threading (HT): on

      NUMA: off

      Memory mode: cache

    • Knights Mill

      Turbo Boost Technology: on

      Hyper-Threading (HT): on

      NUMA: off

      Memory mode: cache

    • Skylake

      Turbo Boost Technology: on

      Hyper-Threading (HT): on

      NUMA: on

  • Optimize the hardware settings in the BIOS: set the CPU to its maximum frequency, set the fan speed to 100%, and check the cooling system.
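The BIOS settings above can be sanity-checked from Linux before a test run. The following is a minimal sketch, not part of the original test procedure: the function name summarize_topology is hypothetical, and it assumes lscpu (from util-linux) prints its English-locale labels "Thread(s) per core" and "NUMA node(s)".

```shell
#!/bin/sh
# Hypothetical helper: summarize Hyper-Threading and NUMA state from
# `lscpu` output read on stdin. Assumes English-locale lscpu labels.
summarize_topology() {
    awk -F: '
        /^Thread\(s\) per core/ { threads = $2 + 0 }   # e.g. 1, 2, or 4
        /^NUMA node\(s\)/       { nodes = $2 + 0 }     # NUMA nodes seen by the OS
        END {
            printf "HT=%s NUMA_nodes=%s\n", (threads > 1 ? "on" : "off"), nodes
        }'
}

# Usage on the test machine:
#   lscpu | summarize_topology
```

More than one thread per core indicates that Hyper-Threading is enabled. How the BIOS "NUMA: off" setting shows up in the node count depends on the platform, so treat the count as a hint rather than a definitive check.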

Software / OS configuration

  • CentOS Linux 7.2 or newer is recommended for Intel Chainer.

  • Make sure that no unnecessary processes are running during training and scoring. Intel® Distribution of Chainer* uses all available resources, so other processes (monitoring tools, Java processes, network traffic, etc.) might impact performance.

  • Clear the kernel caches (page cache, dentries, and inodes) before each test to keep the results stable.

    sudo sh -c 'sync; echo 3 > /proc/sys/vm/drop_caches'

  • With Intel Knights Mill, we recommend the following configurations:

    • Boot with "nohz_full=1-287 rcu_nocbs=1-287" to enable adaptive ticks in the kernel. For example, use the following kernel line in grub.cfg: "linux16 /vmlinuz-3.10.0-514.el7.x86_64 root=/dev/mapper/cl-root ro crashkernel=auto rd.lvm.lv=cl/root rd.lvm.lv=cl/swap rhgb quiet LANG=en_US.UTF-8 nohz_full=1-287 rcu_nocbs=1-287".

    • Run the test with the following environment variables:

      export KMP_HW_SUBSET=1T

      export OMP_NUM_THREADS=72

      export KMP_AFFINITY=granularity=fine,compact

  • With Intel Xeon Scalable processors (Skylake), we recommend the following configurations:

    • Set CPU frequency:

      sudo cpupower frequency-set -d 2.4G -u 3.7G -g performance

    • Disable automatic NUMA balancing after each reboot (the redirection must run in a root shell, so wrap it in sh -c):

      sudo sh -c 'echo 0 > /proc/sys/kernel/numa_balancing'

    • Run the test with the following environment variables:

      export KMP_HW_SUBSET=1T

      export OMP_NUM_THREADS=56

      export KMP_AFFINITY=granularity=fine,compact

    • Run with numactl, using -l to allocate memory on the local NUMA node:

      numactl -l python performance.py -a ${arch} -b ${batchsize} -i ${insize} -d ${datasize} -e ${epoch}
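The per-platform settings above differ mainly in the OpenMP thread count, so they can be wrapped in small helpers. This is a minimal sketch under stated assumptions: the function names set_omp_env and has_boot_param are hypothetical and do not come from Intel Chainer, and the thread counts (72 for Knights Mill, 56 for Skylake) are the ones listed above.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical; values come from the lists above.

# Export the OpenMP settings for a given thread count
# (72 for Knights Mill, 56 for Skylake in the configurations above).
set_omp_env() {
    export KMP_HW_SUBSET=1T
    export OMP_NUM_THREADS="$1"
    export KMP_AFFINITY=granularity=fine,compact
}

# Succeed if kernel command line $1 contains the exact parameter $2.
has_boot_param() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Apply the Skylake settings and show what was exported.
set_omp_env 56
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS KMP_HW_SUBSET=$KMP_HW_SUBSET"
```

On Knights Mill, the boot parameters could be checked before a run with `has_boot_param "$(cat /proc/cmdline)" nohz_full=1-287 || echo "adaptive ticks not enabled"`, followed by `set_omp_env 72`.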
