index_byte
last_index_byte
- Assume unaligned loads are cheap
- Explicitly use 256-bit or 128-bit SIMD to avoid AVX512
- Limit "vectorized" scanning to 128 bits if SIMD is emulated via SWAR
- Add a few more benchmark cases
core:simd/util
core:bytes
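
For readers unfamiliar with SWAR (SIMD within a register), here is a rough C sketch of a forward byte search that examines one 64-bit word at a time. It is not the code in this change: the function name `index_byte_swar` is hypothetical, and the actual implementation in `core:bytes` may differ in word width, loop structure, and tail handling. It also assumes unaligned loads are cheap, expressed here with `memcpy`.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, for illustration only.
 * Returns the index of the first byte equal to c in s[0..n), or -1. */
ptrdiff_t index_byte_swar(const unsigned char *s, size_t n, unsigned char c) {
    const uint64_t ones    = 0x0101010101010101ULL;
    const uint64_t highs   = 0x8080808080808080ULL;
    const uint64_t pattern = ones * c; /* c repeated in all 8 byte lanes */
    size_t i = 0;

    /* Scan 8 bytes per iteration. */
    for (; i + 8 <= n; i += 8) {
        uint64_t w;
        memcpy(&w, s + i, sizeof w);  /* unaligned load */
        uint64_t x = w ^ pattern;     /* lanes equal to c become zero */
        /* Classic "has zero byte" test: nonzero iff some lane of x is zero. */
        if ((x - ones) & ~x & highs) {
            break; /* a match is somewhere in this word */
        }
    }

    /* Locate the exact position (or finish the tail) one byte at a time.
     * Scanning bytes here keeps the result independent of endianness. */
    for (; i < n; i++) {
        if (s[i] == c) {
            return (ptrdiff_t)i;
        }
    }
    return -1;
}
```

Each emulated "vector" operation here costs several scalar instructions, which is presumably why wide emulated vectors pay off less than native ones.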