Compile for a Specific x86-64 Microarchitecture Level Using gcc or g++

Modern x86-64 processors support different instruction set extensions. To simplify optimization targeting, the x86-64 psABI defines standardized microarchitecture levels on top of the original x86-64 baseline: x86-64-v2, x86-64-v3, and x86-64-v4. Each level is a fixed bundle of CPU features such as SSE4.2, AVX2, or AVX-512, so software can be compiled for a defined capability baseline. The levels are cumulative: each one includes everything in the levels below it. This tutorial explains how to compile for a specific x86-64 microarchitecture level using the gcc or g++ compiler.

  • x86-64-v2 - adds SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, CMPXCHG16B, and LAHF/SAHF to the original x86-64 baseline.
  • x86-64-v3 - adds AVX, AVX2, BMI1, BMI2, FMA, F16C, LZCNT, MOVBE, and OSXSAVE.
  • x86-64-v4 - adds the AVX-512 subsets AVX512F, AVX512BW, AVX512CD, AVX512DQ, and AVX512VL.

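Selecting a higher level lets the compiler emit wider SIMD instructions, which can improve performance on CPUs that support them. The trade-off is portability: a binary built for a given level will not run on a CPU below that level and typically terminates with an illegal-instruction fault.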
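Before choosing a level, it can help to check what the machine at hand supports. The following is a minimal sketch, assuming GCC or Clang on x86-64, that uses the __builtin_cpu_supports built-in. The function name highest_supported_level is a hypothetical helper, and the check is an approximation: it tests a representative subset of each level's features and leaves out CMPXCHG16B, LAHF/SAHF, LZCNT, MOVBE, F16C, and OSXSAVE.

```c
/* Hypothetical helper (not part of any library): returns the highest
   x86-64 level (1..4) whose checked features are all reported by the
   running CPU, or 0 when not built for x86-64 with GCC or Clang.
   Approximation: only a subset of each level's features is tested. */
int highest_supported_level(void) {
#if defined(__x86_64__) && defined(__GNUC__)
    __builtin_cpu_init();  /* make sure the CPU feature data is loaded */

    /* x86-64-v2 subset */
    if (!(__builtin_cpu_supports("sse3") &&
          __builtin_cpu_supports("ssse3") &&
          __builtin_cpu_supports("sse4.1") &&
          __builtin_cpu_supports("sse4.2") &&
          __builtin_cpu_supports("popcnt"))) {
        return 1;  /* only the original x86-64 baseline */
    }

    /* x86-64-v3 subset */
    if (!(__builtin_cpu_supports("avx") &&
          __builtin_cpu_supports("avx2") &&
          __builtin_cpu_supports("bmi") &&
          __builtin_cpu_supports("bmi2") &&
          __builtin_cpu_supports("fma"))) {
        return 2;
    }

    /* x86-64-v4 subset */
    if (!(__builtin_cpu_supports("avx512f") &&
          __builtin_cpu_supports("avx512bw") &&
          __builtin_cpu_supports("avx512cd") &&
          __builtin_cpu_supports("avx512dq") &&
          __builtin_cpu_supports("avx512vl"))) {
        return 3;
    }

    return 4;
#else
    return 0;  /* not an x86-64 build, or an unsupported compiler */
#endif
}
```

Calling highest_supported_level() from main and printing the result gives a quick indication of which -march level is likely safe on that machine. Because the check runs at run time, the checking program itself should be built without a raised -march level.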

Consider a simple program that performs element-wise division on an array:

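#include <stddef.h>
#include <stdio.h>

/* Divide each of the n elements of data by divisor, in place. */
void divide(float *data, const size_t n, const float divisor) {
    for (size_t i = 0; i < n; ++i) {
        data[i] /= divisor;
    }
}

int main(void) {
    float a[] = {
        0, 14, 28, 42, 56, 70, 84, 98, 112, 126, 140, 154, 168, 182, 196, 210, 224, 255
    };
    /* Derive the element count from the array instead of hard-coding it. */
    const size_t n = sizeof a / sizeof a[0];

    divide(a, n, 255);
    for (size_t i = 0; i < n; ++i) {
        printf("%f ", a[i]);
    }
    printf("\n");

    return 0;
}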

The loop is simple and has no dependencies between iterations, so the compiler can auto-vectorize it using the wider SIMD instructions available at higher microarchitecture levels.

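The -march option selects the x86-64 microarchitecture level; GCC accepts these level names since version 11. Combined with an optimization option such as -O3, it enables efficient code generation. Use one of the following commands, depending on the target level: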

gcc -O3 -march=x86-64-v2 main.c -o test
gcc -O3 -march=x86-64-v3 main.c -o test
gcc -O3 -march=x86-64-v4 main.c -o test

The same options apply when using g++ for C++ programs.
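To confirm which level a given build actually targeted, one option is to inspect the feature macros that GCC predefines according to -march. The sketch below assumes those predefined macros (__SSE4_2__, __AVX2__, __AVX512F__, and so on); the function name compiled_simd_baseline is hypothetical, and the mapping is a heuristic based on a few macros per level rather than an exhaustive test.

```c
/* Hypothetical helper: names the level suggested by the feature macros
   the compiler predefined for this translation unit. Heuristic only:
   a few representative macros are checked per level. */
const char *compiled_simd_baseline(void) {
#if defined(__AVX512F__) && defined(__AVX512BW__) && defined(__AVX512CD__) && \
    defined(__AVX512DQ__) && defined(__AVX512VL__)
    return "x86-64-v4";
#elif defined(__AVX2__) && defined(__FMA__) && defined(__BMI2__)
    return "x86-64-v3";
#elif defined(__SSE4_2__) && defined(__POPCNT__)
    return "x86-64-v2";
#elif defined(__x86_64__)
    return "x86-64";
#else
    return "not x86-64";
#endif
}
```

Printing the returned string from a program built with each of the commands above should reflect the level passed to -march. This reports what the binary was compiled for, not what the CPU running it supports.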
