In performance-critical applications such as scientific computing, graphics, game development, or data processing, converting large arrays of floating-point numbers to integers is a common operation. While a basic scalar loop is simple and effective for small arrays, it quickly becomes a bottleneck when dealing with large datasets. Modern CPUs offer SIMD capabilities, which allow you to process multiple data elements in parallel using a single instruction. This can dramatically speed up such conversions.
The simplest way to convert a float array to int32_t is by looping through each element and casting it individually. Here's what that approach looks like:
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <vector>

// Convert n floats to int32_t one element at a time.
// The cast truncates toward zero (for example, -1.4 becomes -1).
void floatint32(const float *data, int32_t *result, const size_t n) {
    for (size_t i = 0; i < n; ++i) {
        result[i] = (int32_t) data[i];
    }
}

int main() {
    std::vector<float> a = {
        -1.4, 0, 1.5, -2.6, 3.4, -4.5, 5.6, -6.4, 7.5,
        -8.6, 9.4, -10.5, 11.6, -12.4, 13.5, -14.6, 15.4, -16.5,
    };
    std::vector<int32_t> result(a.size());

    floatint32(a.data(), result.data(), a.size());

    // Print the converted values separated by spaces.
    for (auto value: result) {
        std::cout << value << " ";
    }
    return 0;
}
This method is straightforward and portable, but it does not use the CPU's vector instructions explicitly; any speedup depends on whether the compiler auto-vectorizes the loop.
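The cast in the loop above truncates toward zero rather than rounding to the nearest integer, which matters when comparing results against other conversion methods. The following is a minimal sketch of the difference; the helper names truncate_to_int32 and round_to_int32 are hypothetical and not part of the original code, and std::lround is used here only as one possible rounding mode for comparison.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper: same conversion as the cast in the scalar loop.
// The fractional part is discarded, so the result moves toward zero.
int32_t truncate_to_int32(float x) {
    return static_cast<int32_t>(x);
}

// Hypothetical helper for comparison: round to the nearest integer.
// std::lround rounds halfway cases away from zero.
int32_t round_to_int32(float x) {
    return static_cast<int32_t>(std::lround(x));
}
```

With these helpers, truncate_to_int32(-2.6f) gives -2 while round_to_int32(-2.6f) gives -3. Both assume the input fits in the int32_t range; casting an out-of-range float to an integer is undefined behavior in C++.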
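Below is the optimized version, which uses 256-bit AVX intrinsics to convert 8 values per iteration. The three intrinsics used here are part of AVX, so they are available on every AVX2-capable CPU; with GCC or Clang, compile with -mavx or -mavx2: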
#include <immintrin.h>

#include <cstddef>
#include <cstdint>

void floatint32(const float *data, int32_t *result, const size_t n) {
    size_t i = 0;

    // Main loop: convert 8 floats per iteration.
    for (; i + 8 <= n; i += 8) {
        __m256 va = _mm256_loadu_ps(&data[i]);
        __m256i vresult = _mm256_cvttps_epi32(va);
        _mm256_storeu_si256((__m256i*) &result[i], vresult);
    }

    // Scalar tail: convert the remaining (fewer than 8) elements.
    for (; i < n; ++i) {
        result[i] = (int32_t) data[i];
    }
}
Explanation:
_mm256_loadu_ps loads 8 floating-point values from the array.
_mm256_cvttps_epi32 converts 8 floating-point values to 8 signed integers using truncation, not rounding.
_mm256_storeu_si256 writes these 8 integers into the result array.
The scalar fallback loop handles any remaining elements at the end when the total count isn't a multiple of 8.
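One way to gain confidence in the vectorized path is to run it next to the scalar loop and compare the outputs, including sizes that are not a multiple of 8 so the tail loop is exercised. The sketch below is one possible setup, not the author's: the names float_to_int32_scalar, float_to_int32_avx, float_to_int32, and matches_scalar are hypothetical, and it assumes GCC or Clang on x86, since it relies on the target attribute and __builtin_cpu_supports to pick the AVX path at run time.

```cpp
#include <immintrin.h>

#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar reference: one truncating cast per element.
void float_to_int32_scalar(const float *data, int32_t *result, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        result[i] = static_cast<int32_t>(data[i]);
    }
}

// 256-bit version. The target attribute (a GCC/Clang extension, assumed here)
// enables AVX for this one function, so the file builds without -mavx.
__attribute__((target("avx")))
void float_to_int32_avx(const float *data, int32_t *result, size_t n) {
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        __m256 va = _mm256_loadu_ps(&data[i]);
        __m256i vresult = _mm256_cvttps_epi32(va);
        _mm256_storeu_si256(reinterpret_cast<__m256i *>(&result[i]), vresult);
    }
    // Tail: fewer than 8 elements remain.
    for (; i < n; ++i) {
        result[i] = static_cast<int32_t>(data[i]);
    }
}

// Hypothetical dispatcher: use AVX only when the running CPU reports it.
void float_to_int32(const float *data, int32_t *result, size_t n) {
    if (__builtin_cpu_supports("avx")) {
        float_to_int32_avx(data, result, n);
    } else {
        float_to_int32_scalar(data, result, n);
    }
}

// Hypothetical check: true when the dispatched result equals the scalar one.
bool matches_scalar(const std::vector<float> &input) {
    std::vector<int32_t> expected(input.size());
    std::vector<int32_t> actual(input.size());
    float_to_int32_scalar(input.data(), expected.data(), input.size());
    float_to_int32(input.data(), actual.data(), input.size());
    return expected == actual;
}
```

The comparison is only meaningful for inputs that fit in the int32_t range. For out-of-range values and NaN, _mm256_cvttps_epi32 returns 0x80000000 in the affected lane, while the scalar cast has undefined behavior, so the two paths are not guaranteed to agree there.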