Perform Element-wise Subtraction of Arrays using C++ SIMD

Element-wise subtraction is a common operation in numerical computing, signal processing, and data science, where corresponding elements of two arrays are subtracted and stored in a result array. Modern CPUs provide SIMD (single instruction, multiple data) instructions that operate on several elements at once and can greatly accelerate this process.

Here's the basic implementation:

#include <iostream>
#include <vector>

void vectorSubtract(const float *a, const float *b, float *result, const size_t n) {
    for (size_t i = 0; i < n; ++i) {
        result[i] = a[i] - b[i];
    }
}

int main() {
    std::vector<float> a = {
        17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0,
    };
    std::vector<float> b = {
        1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
    };
    std::vector<float> result(a.size());

    vectorSubtract(a.data(), b.data(), result.data(), a.size());

    for (auto value: result) {
        std::cout << value << " ";
    }

    return 0;
}

In this code, we iterate over each element in the arrays, subtracting them and storing the result. Output:

16 14 12 10 8 6 4 2 0 -2 -4 -6 -8 -10 -12 -14 -16 -18

However, this loop handles one element per iteration, so for large arrays it can be slow unless the compiler manages to vectorize it automatically.

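Here's the optimized implementation using 256-bit AVX intrinsics (the floating-point operations used here need only AVX, so they also run on any CPU that supports AVX2):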

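#include <immintrin.h>

void vectorSubtract(const float *a, const float *b, float *result, const size_t n) {
    size_t i = 0;

    // Main loop: process 8 floats (one 256-bit register) per iteration.
    for (; i + 8 <= n; i += 8) {
        __m256 va = _mm256_loadu_ps(&a[i]);      // load a[i..i+7], no alignment required
        __m256 vb = _mm256_loadu_ps(&b[i]);      // load b[i..i+7], no alignment required
        __m256 vresult = _mm256_sub_ps(va, vb);  // 8 subtractions in one instruction
        _mm256_storeu_ps(&result[i], vresult);   // store result[i..i+7]
    }

    // Fallback loop: handle the remaining n % 8 elements one at a time.
    for (; i < n; ++i) {
        result[i] = a[i] - b[i];
    }
}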

Here's how it works:

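  • _mm256_loadu_ps loads 8 single-precision floating-point numbers from an array into a 256-bit register. The "u" means the address does not have to be 32-byte aligned.
  • _mm256_sub_ps subtracts the second vector from the first, element by element, producing 8 results at once.
  • _mm256_storeu_ps stores the 8 results to the output array, again without an alignment requirement.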

When n is not a multiple of 8, the remaining elements (at most 7) are handled one at a time in a scalar fallback loop.
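To see how the two loops divide the work, the sketch below pairs the AVX routine with the scalar version so their results can be compared. With the 18-element arrays from the first example, the vector loop covers elements 0 to 15 in two iterations and the fallback loop handles the last two. The names subtractScalar, subtractAvx, and subtractVectors, the HAVE_AVX_PATH macro, the target("avx") attribute, and the __builtin_cpu_supports check are assumptions of this sketch (they rely on GCC or Clang on x86) and are not part of the original code. They let the file build without the -mavx flag that the listing above would typically need on those compilers, and they fall back to the scalar loop when AVX is unavailable.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption: GCC or Clang on x86. Elsewhere only the scalar path is compiled.
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <immintrin.h>
#define HAVE_AVX_PATH 1
#else
#define HAVE_AVX_PATH 0
#endif

// Scalar reference: same logic as the basic implementation.
void subtractScalar(const float *a, const float *b, float *result, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        result[i] = a[i] - b[i];
    }
}

#if HAVE_AVX_PATH
// The target attribute enables AVX code generation for this function only.
__attribute__((target("avx")))
void subtractAvx(const float *a, const float *b, float *result, std::size_t n) {
    std::size_t i = 0;

    // Vector loop: 8 floats per iteration.
    for (; i + 8 <= n; i += 8) {
        __m256 va = _mm256_loadu_ps(a + i);
        __m256 vb = _mm256_loadu_ps(b + i);
        _mm256_storeu_ps(result + i, _mm256_sub_ps(va, vb));
    }

    // Fallback loop: the last n % 8 elements.
    for (; i < n; ++i) {
        result[i] = a[i] - b[i];
    }
}
#endif

// Hypothetical convenience wrapper: picks the AVX path when the CPU reports
// support for it at run time, otherwise uses the scalar loop.
std::vector<float> subtractVectors(const std::vector<float> &a,
                                   const std::vector<float> &b) {
    assert(a.size() == b.size());
    std::vector<float> result(a.size());
#if HAVE_AVX_PATH
    if (__builtin_cpu_supports("avx")) {
        subtractAvx(a.data(), b.data(), result.data(), a.size());
        return result;
    }
#endif
    subtractScalar(a.data(), b.data(), result.data(), a.size());
    return result;
}
```

Calling subtractVectors on the arrays a and b from the first example should give the same 18 values as the scalar loop, since each lane of the vector subtraction performs the same single-precision operation as the scalar code.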
