I am optimizing my matrix multiplication code.
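    for (int i = 0; i < SIZE; i++) {
        for (int j = 0; j < SIZE; j++) {
            __m128 v1, v2, vMul;
            __m128 vRes = _mm_setzero_ps();  // reset the accumulator for each result[i][j]
            for (int k = 0; k < SIZE; k += 4) {
                v1 = _mm_load_ps(&m1[i][k]);  // four floats from row i of m1
                v2 = _mm_load_ps(&m2[j][k]);  // four floats from row j of m2
                vMul = _mm_mul_ps(v1, v2);
                vRes = _mm_add_ps(vRes, vMul);
            }
            vRes = _mm_hadd_ps(vRes, vRes);  // two horizontal adds reduce the
            vRes = _mm_hadd_ps(vRes, vRes);  // four partial sums to a single value
            _mm_store_ss(&result[i][j], vRes);
        }
    }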
But g++ reports the error "*'_mm_hadd_ps' was not declared in this scope*". Why does this happen? I can use other SSE functions such as _mm_add_ps.
… #include <pmmintrin.h>? - Mysticial
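To make the comment's hint concrete: _mm_hadd_ps is an SSE3 intrinsic declared in <pmmintrin.h>, whereas _mm_add_ps is plain SSE and comes from <xmmintrin.h>, which would explain why only the former is undeclared. Below is a minimal sketch of the inner reduction with the SSE3 header included. The helper name dot_sse3, the unaligned loads, and the requirement that n be a multiple of 4 are assumptions made for illustration, not part of the question's code.

```cpp
#include <cassert>
#include <cstddef>
#include <pmmintrin.h>  // SSE3 header: declares _mm_hadd_ps and pulls in the SSE/SSE2 headers

// Hypothetical helper (not from the question): dot product of two float arrays.
// The target attribute is a GCC/Clang extension that enables SSE3 for this one
// function; it is not needed when the whole file is compiled with -msse3.
__attribute__((target("sse3")))
float dot_sse3(const float* a, const float* b, std::size_t n) {
    assert(n % 4 == 0);  // this sketch has no scalar loop for leftover elements

    __m128 acc = _mm_setzero_ps();  // four partial sums, all starting at zero
    for (std::size_t k = 0; k < n; k += 4) {
        __m128 va = _mm_loadu_ps(a + k);  // unaligned loads: no 16-byte alignment assumed
        __m128 vb = _mm_loadu_ps(b + k);
        acc = _mm_add_ps(acc, _mm_mul_ps(va, vb));
    }

    acc = _mm_hadd_ps(acc, acc);  // lanes become (s0+s1, s2+s3, s0+s1, s2+s3)
    acc = _mm_hadd_ps(acc, acc);  // every lane now holds s0+s1+s2+s3
    return _mm_cvtss_f32(acc);    // extract the lowest lane as a float
}
```

With g++ the usual route is to build with `g++ -msse3 file.cpp` (or an `-march` value that implies SSE3), in which case the attribute can be dropped. In the loop from the question, `result[i][j] = dot_sse3(m1[i], m2[j], SIZE);` would replace the inner k loop and the two horizontal adds, assuming the rows are contiguous float arrays.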