nucci6/openblas-test

OpenBLAS Testing for the RISE SW Stack

Successfully running the following tests verifies the installation, linking, and multi-threading capabilities of OpenBLAS on the new RISE software stack.

Two C programs are provided for testing:

  • openblas_sanity.c - Verifies C-header availability, library linking, and basic math correctness.
  • openblas_benchmark.c - Tests matrix multiplication performance (GFLOP/s) and OpenMP threading.

Step 1: Environment Setup

Unlike Python packages, OpenBLAS is a compiled C/Fortran library. We do not need a virtual environment, but we do need the gcc compiler to build the test programs and pkg-config to locate the library's header and shared-object files.

Load the required modules:

# Clear existing modules
module purge

# Load the RISE software stack, compiler, and OpenBLAS
module use /storage/icds/sw8/modulefiles_rc2026/linux-rhel8-x86_64/Core
module load gcc
module load openblas/0.3.30

Step 2: Run the Core Sanity Check (openblas_sanity.c)

This program performs a simple 2x2 matrix multiplication using the cblas_dgemm function. Building and running it verifies that the compiler can find cblas.h, that the linker can find libopenblas.so, and that the library computes the correct result.

  1. Compile the code (pkg-config supplies the correct include and library paths automatically):
gcc openblas_sanity.c -o openblas_sanity $(pkg-config --cflags --libs openblas)
  2. Run the executable:
./openblas_sanity

Expected Output:

--- OpenBLAS Sanity Check ---
OpenBLAS Config: OpenBLAS 0.3.30 DYNAMIC_ARCH NO_AFFINITY USE_OPENMP USE_LOCKING Haswell MAX_THREADS=512
Detected CPU Core: Haswell

Performing 2x2 DGEMM (Matrix Multiplication)...
Result Matrix:
[ 19.0  22.0]
[ 43.0  50.0]

 OpenBLAS is linked and calculating correctly!
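
To see where the expected result matrix comes from, the product can be recomputed by hand with a naive triple loop. This is only a sketch: the README does not list the input matrices, so it assumes the common test inputs A = [[1, 2], [3, 4]] and B = [[5, 6], [7, 8]], which do produce the values shown above. The helper name matmul_ref is hypothetical and is not part of OpenBLAS or of openblas_sanity.c.

```c
#include <stddef.h>

/* Naive row-major C = A * B for square n x n matrices.
 * Intended as a reference to compare against a DGEMM result,
 * not as a replacement for it. matmul_ref is a hypothetical name. */
void matmul_ref(size_t n, const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < n; k++) {
                sum += a[i * n + k] * b[k * n + j];
            }
            c[i * n + j] = sum;
        }
    }
}
```

Calling matmul_ref(2, a, b, c) with the assumed inputs stored row by row fills c with 19, 22, 43, 50. If the sanity program prints a different matrix, either its inputs differ from this assumption or the library is not calculating correctly.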

Step 3: Run the Performance Benchmark (openblas_benchmark.c)

OpenBLAS is designed for high-performance computing, so this test allocates large (2000x2000) matrices and measures throughput in GFLOP/s (billions of floating-point operations per second).
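
As a rough guide to how a GFLOP/s figure is derived, the sketch below assumes the usual convention of counting 2*n^3 floating-point operations for an n x n matrix multiplication (one multiply and one add per inner-loop step). The README does not show how openblas_benchmark.c computes its figure, so this is an assumption, and gemm_gflops is a hypothetical helper name. The sample timings in the expected output are consistent with this convention: 2 * 2000^3 = 1.6e10 operations in 0.180 sec is about 88.89 GFLOP/s.

```c
/* GFLOP/s for multiplying two n x n matrices in `seconds` of wall time.
 * Assumes the conventional 2*n^3 operation count for dense GEMM.
 * gemm_gflops is a hypothetical name used for illustration only. */
double gemm_gflops(double n, double seconds)
{
    double flops = 2.0 * n * n * n;   /* total floating-point operations */
    return flops / seconds / 1e9;     /* convert to billions per second */
}
```

For example, gemm_gflops(2000, 0.190) evaluates to roughly 84.21, matching the last sample line of the expected output.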

Because this OpenBLAS module is compiled with USE_OPENMP, it relies on the OMP_NUM_THREADS environment variable to control its multi-threading. We will test it using 4 threads.
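
If you want a test program to report which thread count was requested, one option is to read the environment variable directly. The sketch below is an illustration only and is not taken from openblas_benchmark.c; requested_omp_threads is a hypothetical helper. It returns 0 when the variable is unset or not a positive number, in which case the OpenMP runtime picks its own default. If the variable holds a comma-separated list, only the first value is read.

```c
#include <limits.h>
#include <stdlib.h>

/* Returns the thread count requested through OMP_NUM_THREADS,
 * or 0 if the variable is unset or does not start with a positive integer.
 * requested_omp_threads is a hypothetical name used for illustration. */
int requested_omp_threads(void)
{
    const char *value = getenv("OMP_NUM_THREADS");
    if (value == NULL) {
        return 0;
    }

    char *end = NULL;
    long n = strtol(value, &end, 10);
    if (end == value || n <= 0 || n > INT_MAX) {
        return 0;   /* nothing parsed, or value out of range */
    }
    return (int)n;
}
```

After export OMP_NUM_THREADS=4, this function returns 4, which should match the "Threads:" column in the benchmark output.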

  1. Compile the code:
gcc openblas_benchmark.c -o openblas_benchmark $(pkg-config --cflags --libs openblas)
  2. Set the thread count and run:
export OMP_NUM_THREADS=4
./openblas_benchmark

Expected Output (the exact Time and GFLOP/s values will vary with the CPU architecture of your current compute node):

--- OpenBLAS Performance & Threading Test ---
Allocating 2000x2000 matrices...
Threads: 4 | Time:  0.180 sec | Performance:    88.89 GFLOP/s
Threads: 4 | Time:  0.182 sec | Performance:    87.91 GFLOP/s
Threads: 4 | Time:  0.185 sec | Performance:    86.49 GFLOP/s
Threads: 4 | Time:  0.190 sec | Performance:    84.21 GFLOP/s

 Benchmark complete!

Step 4: Cleanup

Once testing is complete, you can remove the compiled executables:

rm openblas_sanity openblas_benchmark
