When working with numerical data, a common task is to convert arrays of integers into arrays of floating-point numbers. Although the basic scalar approach works perfectly fine, it may not be the most efficient when handling large datasets. Modern CPUs offer SIMD capabilities that allow us to process multiple elements in parallel, significantly accelerating such conversions.
The straightforward way to convert an array of int32_t to float is to iterate through each element and cast it manually. Here's how it looks:
#include <cstdint>
#include <iostream>
#include <vector>

void int32float(const int32_t *data, float *result, const size_t n) {
    for (size_t i = 0; i < n; ++i) {
        result[i] = (float) data[i];
    }
}

int main() {
    std::vector<int32_t> a = {
        -1, 0, 1, -2, 3, -4, 5, -6, 7, -8, 9, -10, 11, -12, 13, -14, 15, -16,
    };
    std::vector<float> result(a.size());
    int32float(a.data(), result.data(), a.size());
    for (auto value : result) {
        std::cout << value << " ";
    }
    return 0;
}
While this method is simple and portable, it doesn't take advantage of CPU vectorization capabilities.
Here's the optimized implementation using AVX2:
#include <cstddef>
#include <cstdint>
#include <immintrin.h>

void int32float(const int32_t *data, float *result, const size_t n) {
    size_t i = 0;
    // Convert 8 elements per iteration using AVX2.
    for (; i + 8 <= n; i += 8) {
        __m256i va = _mm256_loadu_si256((const __m256i *) &data[i]);
        __m256 vresult = _mm256_cvtepi32_ps(va);
        _mm256_storeu_ps(&result[i], vresult);
    }
    // Scalar fallback for the remaining elements.
    for (; i < n; ++i) {
        result[i] = (float) data[i];
    }
}
Explanation:
- _mm256_loadu_si256 loads 8 int32_t elements from the array.
- _mm256_cvtepi32_ps converts the 8 signed integers into 8 floating-point values.
- _mm256_storeu_ps writes these 8 floats into the result array.
The scalar fallback loop ensures that if the number of elements isn't a multiple of 8, the leftovers are processed at the end.