Add RVV optimization for ZSTD_row_getMatchMask
This pull request introduces a RISC-V Vector (RVV) specific optimization for the `ZSTD_row_getMatchMask` function, replacing the generic SWAR implementation on RV64 platforms with V-extension support. The goal is to leverage RVV's parallel computation capabilities to improve performance on the RISC-V architecture.
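For readers unfamiliar with the function, the following is a minimal, simplified sketch of what a match-mask routine computes and how a SWAR fallback can produce it. It is not the zstd source and not the RVV code in this PR: the names `match_mask_scalar` and `match_mask_swar` are hypothetical, the head rotation applied by the real function is omitted, and `rowEntries` is assumed to be a multiple of 8 and at most 64.

```c
#include <stdint.h>

/* Reference semantics (simplified): bit i of the result is set
 * when row[i] equals tag. Hypothetical helper, not zstd code. */
static uint64_t match_mask_scalar(const uint8_t *row, uint8_t tag,
                                  unsigned rowEntries)
{
    uint64_t mask = 0;
    for (unsigned i = 0; i < rowEntries; i++)
        mask |= (uint64_t)(row[i] == tag) << i;
    return mask;
}

/* SWAR sketch: compares 8 tags per step using plain 64-bit integer
 * arithmetic. Assumes rowEntries is a multiple of 8 and <= 64. */
static uint64_t match_mask_swar(const uint8_t *row, uint8_t tag,
                                unsigned rowEntries)
{
    const uint64_t x01    = 0x0101010101010101ULL;
    const uint64_t x7F    = 0x7F7F7F7F7F7F7F7FULL;
    const uint64_t gather = 0x0102040810204080ULL;
    uint64_t mask = 0;

    for (unsigned i = 0; i < rowEntries; i += 8) {
        /* Endian-independent load: byte b lands in bits 8b..8b+7. */
        uint64_t chunk = 0;
        for (unsigned b = 0; b < 8; b++)
            chunk |= (uint64_t)row[i + b] << (8 * b);

        /* Bytes equal to tag become zero. */
        chunk ^= tag * x01;

        /* Exact zero-byte detection: bit 7 of each byte of `zero`
         * is set only where the corresponding byte of chunk is 0. */
        uint64_t nonzero = ((chunk & x7F) + x7F) | chunk;
        uint64_t zero    = ~(nonzero | x7F);

        /* Gather the 8 per-byte flags into one byte (bit b = byte b). */
        uint64_t bits = ((zero >> 7) * gather) >> 56;
        mask |= bits << i;
    }
    return mask;
}
```

A vector implementation replaces the inner arithmetic with a single element-wise compare that yields the mask directly, which is consistent with the gap widening as the row grows.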
Performance
Microbenchmark Results
A microbenchmark isolating the `ZSTD_row_getMatchMask` function shows a significant speedup over the SWAR fallback, growing with the row size:
rowEntries | Speedup |
---|---|
16 bytes | 5.87x |
32 bytes | 9.63x |
64 bytes | 17.98x |
Fullbench
The overall impact on `fullbench` is modest: the new implementation shows a small but consistent improvement and, most importantly, no performance regression.
Validation
- All quick checks passed (`make check`).
- All long-running tests passed (`make test`).
- Static analysis reports no new issues (`make staticAnalyze`).