These patches have been committed to the master branch:
8e16f26ca9f - i386: Support partial vectorized V2BF/V4BF plus/minus/mult/div/sqrt
62df24e5003 - i386: Support partial vectorized V2BF/V4BF smaxmin
f82fa0da4d9 - i386: Support vectorized BF16 add/sub/mul/div with AVX10.2
6d294fb8ac9 - i386: Support vectorized BF16 FMA with AVX10.2
29ef601973d - i386: Support vectorized BF16 smaxmin with AVX10.2
e19f65b0be1 - i386:…
Month: August 2024
AVX10.2: Support FP16/BF16/FP8 convert instructions
This patch has been committed to the master branch:
2a046117a83 - AVX10.2: Support convert instructions
This patch adds GCC intrinsic support for the AVX10.2 conversion instructions, a new set of instructions for converting between FP16, BF16, FP8 (HF8/BF8), and FP32 formats with various rounding and saturation options.
Overview
AVX10.2 introduces a family of vector…