Perform Element-wise Subtraction of Arrays using C++ SIMD

November 10, 2024
C++

Element-wise subtraction is a common operation in numerical computing, signal processing, and data science: corresponding elements of two arrays are subtracted and the differences are stored in a result array. Modern CPUs provide SIMD (single instruction, multiple data) instructions that operate on several elements at once and can greatly accelerate this process.

Here's the basic implementation:

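#include <cstddef>
#include <iostream>
#include <vector>

// Subtracts b from a element by element and writes the differences to result.
// All three arrays must hold at least n elements.
void vectorSubtract(const float *a, const float *b, float *result, const size_t n) {
    for (size_t i = 0; i < n; ++i) {
        result[i] = a[i] - b[i];
    }
}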

int main() {
    std::vector<float> a = {
        17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0,
    };
    std::vector<float> b = {
        1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
    };
    std::vector<float> result(a.size());

    vectorSubtract(a.data(), b.data(), result.data(), a.size());
    for (auto value: result) {
        std::cout << value << " ";
    }

    return 0;
}

This code walks both arrays one index at a time, subtracts b[i] from a[i], and stores the difference in result[i]. Output:

16 14 12 10 8 6 4 2 0 -2 -4 -6 -8 -10 -12 -14 -16 -18
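The same scalar loop can also be expressed with the standard library. The following is only a minimal sketch: the name `subtractTransform` is made up for this example and does not appear in the original listing.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>

// Hypothetical helper (not from the original listing): scalar subtraction
// written with std::transform instead of a hand-written loop.
void subtractTransform(const float *a, const float *b, float *result, std::size_t n) {
    // Applies std::minus to each pair (a[i], b[i]) and writes to result[i].
    std::transform(a, a + n, b, result, std::minus<float>());
}
```

It can be called exactly like `vectorSubtract`, for example `subtractTransform(a.data(), b.data(), result.data(), a.size());`.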

However, this loop handles one element per iteration, so for large arrays it can be slow unless the compiler manages to auto-vectorize it.

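Here's the optimized implementation using AVX intrinsics. The single-precision operations used here need only AVX, so the code also runs on any AVX2-capable CPU. Compile with -mavx or -mavx2 on GCC and Clang, or /arch:AVX on MSVC: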

#include <cstddef>
#include <immintrin.h>

void vectorSubtract(const float *a, const float *b, float *result, const size_t n) {
    size_t i = 0;
    // Main loop: process 8 floats (one 256-bit register) per iteration.
    for (; i + 8 <= n; i += 8) {
        __m256 va = _mm256_loadu_ps(&a[i]);
        __m256 vb = _mm256_loadu_ps(&b[i]);
        __m256 vresult = _mm256_sub_ps(va, vb);
        _mm256_storeu_ps(&result[i], vresult);
    }

    // Tail loop: handle the last n % 8 elements one at a time.
    for (; i < n; ++i) {
        result[i] = a[i] - b[i];
    }
}

Here is how the vectorized loop works:

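_mm256_loadu_ps loads 8 single-precision floats from memory into a 256-bit register; the address does not need to be aligned.
_mm256_sub_ps subtracts the second vector from the first, lane by lane.
_mm256_storeu_ps writes the 8 results back to memory, again without an alignment requirement.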

When n is not a multiple of 8, the remaining elements (at most 7) are handled by the scalar fallback loop.
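To make the split between the two loops concrete, and to check that the vectorized version agrees with the scalar one, here is a rough sketch. The names `WorkSplit`, `splitWork`, `subtractScalarRef`, `subtractSimd`, and `simdMatchesScalar` are made up for this example. The AVX path is compiled only when the compiler targets AVX (the `__AVX__` macro); otherwise the sketch runs entirely in the scalar loop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

#if defined(__AVX__)
#include <immintrin.h>
#endif

// Hypothetical helper: how many elements the 8-wide loop and the tail loop cover.
struct WorkSplit {
    std::size_t vectorized;  // elements handled 8 at a time
    std::size_t tail;        // elements left for the scalar loop (0..7)
};

WorkSplit splitWork(std::size_t n) {
    return {n - n % 8, n % 8};
}

// Scalar reference, same logic as the basic implementation.
void subtractScalarRef(const float *a, const float *b, float *result, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        result[i] = a[i] - b[i];
    }
}

// Vectorized version; the AVX loop is only present when built for AVX.
void subtractSimd(const float *a, const float *b, float *result, std::size_t n) {
    std::size_t i = 0;
#if defined(__AVX__)
    for (; i + 8 <= n; i += 8) {
        __m256 va = _mm256_loadu_ps(&a[i]);
        __m256 vb = _mm256_loadu_ps(&b[i]);
        _mm256_storeu_ps(&result[i], _mm256_sub_ps(va, vb));
    }
#endif
    for (; i < n; ++i) {
        result[i] = a[i] - b[i];
    }
}

// Returns true when both versions produce the same values (finite inputs assumed).
bool simdMatchesScalar(const std::vector<float> &a, const std::vector<float> &b) {
    assert(a.size() == b.size());
    std::vector<float> expected(a.size()), actual(a.size());
    subtractScalarRef(a.data(), b.data(), expected.data(), a.size());
    subtractSimd(a.data(), b.data(), actual.data(), a.size());
    return expected == actual;
}
```

For the 18-element arrays from the first listing, `splitWork(18)` gives 16 vectorized elements (two iterations of the 8-wide loop) and a tail of 2.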
