Loop vectorization is a compiler optimization in which repetitive operations (such as array processing) are transformed into SIMD instructions. This lets the CPU perform several operations in parallel using vector registers, which can substantially speed up math-heavy code. This tutorial explains how to check which loops were vectorized when compiling with gcc or g++.
Here's a simple function that divides each element in an array by a constant:
#include <stdio.h>

void divide(float *data, const size_t n, const float divisor) {
    for (size_t i = 0; i < n; ++i) {
        data[i] /= divisor;
    }
}

int main() {
    const int n = 18;
    float a[] = {
        0, 14, 28, 42, 56, 70, 84, 98, 112, 126, 140, 154, 168, 182, 196, 210, 224, 255
    };
    divide(a, n, 255);
    for (size_t i = 0; i < n; ++i) {
        printf("%f ", a[i]);
    }
    return 0;
}
This loop is a good candidate for vectorization: there are no function calls inside the loop, no pointer aliasing, and the memory access pattern is regular.
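The aliasing condition starts to matter as soon as a loop touches two arrays. As a minimal sketch (divide_into is a hypothetical variant, not part of the program above), the restrict qualifier promises the compiler that dst and src never overlap. Without that promise, the compiler generally has to assume the arrays might overlap, and it may add a runtime overlap check or skip vectorization; the report is the way to confirm what it actually did.

```c
#include <stddef.h>

/* Hypothetical two-array variant of divide().
   "restrict" tells the compiler that dst and src do not overlap,
   so each iteration can be treated as independent of the others. */
void divide_into(float *restrict dst, const float *restrict src,
                 size_t n, float divisor) {
    for (size_t i = 0; i < n; ++i) {
        dst[i] = src[i] / divisor;
    }
}
```

Calling this function with overlapping arrays would break the promise made by restrict, so it should only be used when the caller can guarantee separate buffers.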
The -fopt-info-vec option tells GCC to output a report of vectorized loops. To compile and generate the report, run the following command:
gcc -O3 -fopt-info-vec=vec.log main.c
- -O3: enables aggressive optimizations (including vectorization).
- -fopt-info-vec=vec.log: writes vectorization messages to the vec.log file.
After compiling, inspect the vec.log file:
main.c:4:26: optimized: loop vectorized using 16 byte vectors
main.c:4:26: optimized: loop vectorized using 16 byte vectors
main.c:5:17: optimized: basic block part vectorized using 16 byte vectors
This means:
- The loop in divide was successfully vectorized.
- The compiler used 16-byte wide vectors (i.e., 128-bit SIMD, typical of SSE).
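To make "16 byte vectors" concrete: assuming a 4-byte float, one 16-byte vector holds four elements. The following plain C sketch (divide_chunked and LANES are illustrative names, not something GCC emits) models the general shape of a vectorized loop: a main part that handles four elements per step, followed by a scalar loop for whatever is left over. The machine code GCC actually generates will differ in detail.

```c
#include <stddef.h>

/* Assumption: float is 4 bytes, so a 16-byte vector holds 4 elements. */
enum { LANES = 4 };

/* Conceptual model of a vectorized divide(); illustrative only. */
void divide_chunked(float *data, size_t n, float divisor) {
    size_t i = 0;

    /* Main part: each outer step stands in for one vector operation
       that divides LANES elements at once. */
    for (; i + LANES <= n; i += LANES) {
        for (size_t lane = 0; lane < LANES; ++lane) {
            data[i + lane] /= divisor;
        }
    }

    /* Remainder: elements that do not fill a whole vector
       are processed one at a time. */
    for (; i < n; ++i) {
        data[i] /= divisor;
    }
}
```

For the 18-element array above, this model performs four 4-element steps and then handles the last two elements individually.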
The -fopt-info-vec option also works with g++ when compiling C++ code.