Modern x86-64 processors support different sets of instruction set extensions. To simplify optimization targeting, standardized microarchitecture levels are available: x86-64-v2, x86-64-v3, and x86-64-v4. Each level represents a bundle of CPU features such as SSE, AVX, and AVX-512, enabling software to be compiled for a defined capability baseline. This tutorial explains how to compile for a specific x86-64 microarchitecture level using the gcc or g++ compiler (GCC 11 or newer).
- x86-64-v2 - extends the original x86-64 baseline with SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, and related extensions.
- x86-64-v3 - adds AVX, AVX2, BMI1, BMI2, FMA, and other extensions on top of x86-64-v2.
- x86-64-v4 - adds the AVX-512 subsets AVX512F, AVX512BW, AVX512CD, AVX512DQ, and AVX512VL on top of x86-64-v3.
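Selecting a higher level allows the compiler to emit more advanced SIMD instructions, which can improve performance on compatible CPUs. A binary built for a given level will not run correctly on a CPU that lacks the required features; it typically terminates with an illegal instruction error.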
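To get a rough idea of which level a given machine can run, GCC and Clang provide the __builtin_cpu_supports built-in. The sketch below is only an illustration under assumptions: the helper names approx_x86_64_level and print_approx_level are hypothetical, and only a few representative features are tested per level, so the result is an estimate rather than a full conformance check.

```c
#include <stdio.h>

/* Hypothetical helper: estimates the highest x86-64 level the running CPU
 * supports. Only some features of each level are tested, so the result is
 * an approximation. Returns 0 when the check is not available. */
int approx_x86_64_level(void) {
#if defined(__x86_64__) && defined(__GNUC__)
    __builtin_cpu_init();  /* initialize the CPU feature data */

    if (!(__builtin_cpu_supports("sse4.2") &&
          __builtin_cpu_supports("popcnt"))) {
        return 1;  /* original x86-64 baseline */
    }
    if (!(__builtin_cpu_supports("avx2") &&
          __builtin_cpu_supports("bmi2") &&
          __builtin_cpu_supports("fma"))) {
        return 2;
    }
    if (!(__builtin_cpu_supports("avx512f") &&
          __builtin_cpu_supports("avx512bw") &&
          __builtin_cpu_supports("avx512cd") &&
          __builtin_cpu_supports("avx512dq") &&
          __builtin_cpu_supports("avx512vl"))) {
        return 3;
    }
    return 4;
#else
    return 0;  /* not an x86-64 GCC/Clang build */
#endif
}

/* Prints the estimate in a human-readable form. */
void print_approx_level(void) {
    int level = approx_x86_64_level();
    if (level == 0) {
        printf("CPU level check is not available in this build\n");
    } else if (level == 1) {
        printf("CPU appears to support only the x86-64 baseline\n");
    } else {
        printf("CPU appears to support up to x86-64-v%d\n", level);
    }
}
```

Calling print_approx_level() from main before choosing a -march value gives a quick hint about the highest level that is safe to target on that machine.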
Consider a simple program that performs element-wise division on an array:
#include <stdio.h>

/* Divide every element of data by divisor, in place. */
void divide(float *data, const size_t n, const float divisor) {
    for (size_t i = 0; i < n; ++i) {
        data[i] /= divisor;
    }
}

int main(void) {
    float a[] = {
        0, 14, 28, 42, 56, 70, 84, 98, 112, 126, 140, 154, 168, 182, 196, 210, 224, 255
    };
    /* Derive the element count from the array so the two stay in sync. */
    const size_t n = sizeof(a) / sizeof(a[0]);

    divide(a, n, 255);
    for (size_t i = 0; i < n; ++i) {
        printf("%f ", a[i]);
    }
    printf("\n");
    return 0;
}
Each iteration of the loop is independent of the others, so the compiler can vectorize it and take advantage of the wider SIMD instructions available in higher microarchitecture levels.
The -march option specifies the x86-64 microarchitecture level. Combined with an optimization option such as -O3, it lets the compiler generate code that uses the instructions available at that level.
gcc -O3 -march=x86-64-v2 main.c -o test
gcc -O3 -march=x86-64-v3 main.c -o test
gcc -O3 -march=x86-64-v4 main.c -o test
The same options apply when using g++ for C++ programs.
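To confirm which level a particular build actually targeted, one option is to inspect the feature macros that GCC predefines for the selected -march value. The sketch below is an illustration under assumptions: the helper names compiled_level and print_compiled_level are hypothetical, and only a few representative macros are checked per level, so the reported level is an approximation.

```c
#include <stdio.h>

/* Hypothetical helper: names the highest level implied by the feature
 * macros the compiler predefined for this translation unit. Only a few
 * representative macros are checked per level. */
const char *compiled_level(void) {
#if defined(__AVX512F__) && defined(__AVX512BW__) && defined(__AVX512CD__) && \
    defined(__AVX512DQ__) && defined(__AVX512VL__)
    return "at least x86-64-v4";
#elif defined(__AVX2__) && defined(__BMI2__) && defined(__FMA__)
    return "at least x86-64-v3";
#elif defined(__SSE4_2__) && defined(__POPCNT__)
    return "at least x86-64-v2";
#else
    return "baseline (no higher level detected)";
#endif
}

/* Prints the level this file was compiled for. */
void print_compiled_level(void) {
    printf("Compiled for: %s\n", compiled_level());
}
```

Calling print_compiled_level() from main in a build made with -march=x86-64-v3 should report at least x86-64-v3, while a build without any -march option on a typical x86-64 toolchain should report the baseline.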