Mirror of https://github.com/Ed94/Odin.git (synced 2026-06-13 01:21:38 -07:00)
12dd0cb72a
This new algorithm uses a Scalar->Vector->Scalar iteration loop, so no incomplete chunk of data ever needs to be masked off. The vector width was also reduced from 64 bytes to 32, as I found this to be about as fast as the previous 64-byte x86 version.
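For illustration only, here is a minimal C sketch of what a Scalar->Vector->Scalar loop can look like. It is not the code from this commit: the byte-search task, the name `index_byte_sketch`, the `CHUNK` constant and the alignment step in the first phase are all assumptions made for the example, and the "vector" phase is written as a plain fixed-width loop (which a compiler may auto-vectorize) rather than with real SIMD intrinsics.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed chunk size: 32 bytes, mirroring the width mentioned above. */
#define CHUNK 32

/* Hypothetical helper: index of the first byte equal to `c`, or -1 if absent. */
ptrdiff_t index_byte_sketch(const unsigned char *data, size_t len, unsigned char c) {
    size_t i = 0;

    /* Phase 1 (scalar): step byte by byte until the address is CHUNK-aligned
       (assumption: the head phase exists to reach alignment). */
    while (i < len && ((uintptr_t)(data + i) % CHUNK) != 0) {
        if (data[i] == c) return (ptrdiff_t)i;
        i++;
    }

    /* Phase 2 (vector): consume whole chunks only. Because a partial chunk is
       never loaded here, nothing has to be masked off. */
    while (len - i >= CHUNK) {
        unsigned hit = 0;
        for (size_t j = 0; j < CHUNK; j++) {
            hit |= (unsigned)(data[i + j] == c);
        }
        if (hit) {
            /* Some byte in this chunk matched; locate the first one. */
            for (size_t j = 0; j < CHUNK; j++) {
                if (data[i + j] == c) return (ptrdiff_t)(i + j);
            }
        }
        i += CHUNK;
    }

    /* Phase 3 (scalar): fewer than CHUNK bytes remain; finish byte by byte. */
    for (; i < len; i++) {
        if (data[i] == c) return (ptrdiff_t)i;
    }
    return -1;
}
```

The point of the layout is that the leftover bytes on either side of the chunked region are handled by ordinary scalar loops, so the chunked loop never sees a partially filled chunk.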