Commit Graph

98 Commits

Author SHA1 Message Date
Barinzaya 6ebd30033f Removed an extra character that slipped into a comment. 2025-05-03 13:24:22 -04:00
Barinzaya 8b6436201e Fixed a reduce_add proc doing multiplication instead. 2025-05-03 13:04:11 -04:00
Barinzaya 7e34d707bb core:simd helpers: indices and reduce_add/mul
The indices proc simply creates a vector where each lane contains its
own lane index. This can be useful for use in generating masks for loads
and stores at the beginning/end of slices, among other things.

The new reduce_add/reduce_mul procs perform the corresponding arithmetic
reduction, in different orders than just "in sequential order". These
alternative orders can often be faster to calculate, as they can offer
better SIMD hardware utilization.

Two different orders are added for these: pair-wise (operating on
adjacent pairs of elements) or split-wise (operating element-wise on the
two halves of the vector).

This doesn't actually cover the *fastest* way for arbitrarily-sized
vectors. That would be an ordered reduction across the native vector
width, then reducing the resulting vector to a scalar in an appropriate
parallel fashion. I'd created an implementation of that, but it required
multiple procs and a fair bit more trickery than I was comfortable with
submitting to `core`, so it's not included yet. Maybe in the future.
2025-05-03 11:55:52 -04:00
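The two reduction orders described in the commit message above can be illustrated with a small scalar model. This is only a sketch in Python over plain lists, not the Odin implementation: the names `indices`, `tail_mask`, `reduce_pairs` and `reduce_split` are hypothetical, and a power-of-two lane count is assumed.

```python
from operator import add, mul

def indices(lanes):
    # Each lane holds its own lane index: [0, 1, 2, ...].
    return list(range(lanes))

def tail_mask(lanes, remaining):
    # Hypothetical helper: enable only the first `remaining` lanes,
    # e.g. for a masked load at the end of a slice.
    return [i < remaining for i in indices(lanes)]

def reduce_pairs(v, op):
    # Pair-wise order: combine adjacent pairs, halving the vector each step.
    # Assumes len(v) is a power of two.
    v = list(v)
    while len(v) > 1:
        v = [op(v[i], v[i + 1]) for i in range(0, len(v), 2)]
    return v[0]

def reduce_split(v, op):
    # Split-wise order: combine the low half with the high half element-wise.
    # Assumes len(v) is a power of two.
    v = list(v)
    while len(v) > 1:
        half = len(v) // 2
        v = [op(a, b) for a, b in zip(v[:half], v[half:])]
    return v[0]

def show(a, b):
    # Records the grouping instead of computing a value.
    return f"({a}+{b})"

print(reduce_pairs("abcd", show))  # ((a+b)+(c+d))
print(reduce_split("abcd", show))  # ((a+c)+(b+d))
```

For integer addition and multiplication both orders give the same result as a sequential reduction; for floating-point lanes the different grouping can change rounding, which is why the order is worth naming.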
Jeroen van Rijn f7c4c80ef3 Fix broken examples in documentation tester.
No more:
```
We could not find the procedure "pkg_foo_example :: proc()" needed to test the example created for "pkg.foo"
The following procedures were found:
   bar()
```
2025-04-05 16:36:26 +02:00
Yawning Angel 982ab11aa1 core/crypto/sha2: Use hardware SHA224/256 when available (AMD64) 2025-03-23 19:14:33 +09:00
flysand7 70daf40cb1 Fix documentation for simd_shuffle 2025-03-02 20:42:13 +11:00
flysand7 698c510ba7 Merge branch 'master' into docs-simd 2025-03-02 20:05:55 +11:00
Barinzaya 4afedbc051 Added simd_extract_lsbs intrinsic as well.
Equivalent to the simd_extract_msbs intrinsic, except it extracts the
least significant bit of each element instead.
2025-02-24 08:49:57 -05:00
Barinzaya 33a3aab791 Added simd_extract_msbs intrinsic. 2025-02-24 08:39:32 -05:00
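As a rough model of what the two entries above describe, the following Python sketch gathers one bit from each element into an integer mask. It is not the intrinsic itself: the function names, the fixed 8-bit element width, and the choice to place lane `i` in bit `i` of the result are all assumptions made for illustration.

```python
def extract_bits(lanes, bit):
    # Collect bit number `bit` of every element; lane i goes to bit i
    # of the result (an assumed ordering, not confirmed by the source).
    mask = 0
    for i, x in enumerate(lanes):
        mask |= ((x >> bit) & 1) << i
    return mask

def extract_msbs(lanes, width=8):
    # Most significant bit of each element, assuming `width`-bit elements.
    return extract_bits(lanes, width - 1)

def extract_lsbs(lanes):
    # Least significant bit of each element.
    return extract_bits(lanes, 0)

v = [0x80, 0x01, 0xFF, 0x00]
print(bin(extract_msbs(v)))  # 0b101: lanes 0 and 2 have their top bit set
print(bin(extract_lsbs(v)))  # 0b110: lanes 1 and 2 have their low bit set
```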
flysand7 5d290dce06 Merge branch 'simd-docs' into docs-simd 2025-01-21 11:45:51 +11:00
flysand7 b7afbd6d57 Suggestion fixes 2025-01-21 11:15:00 +11:00
flysand7 dfe3073cef [simd] Fixes to inputs/result/example/output sections & grammar fixes 2025-01-08 16:47:12 +03:00
flysand7 49b8abe3ef Apply suggestions from code review
Co-authored-by: Laytan <laytanlaats@hotmail.com>
2024-12-04 22:50:10 +11:00
flysand7 ba6224b61a Apply suggestions from code review
Co-authored-by: Laytan <laytanlaats@hotmail.com>
2024-12-04 19:11:21 +11:00
flysand7 8387561d0a [core/simd]: Write package documentation 2024-12-02 01:24:33 +11:00
flysand7 d41c7d52e7 Fix indentation 2024-12-01 11:50:00 +11:00
flysand7 d48c351330 Fix indentation 2024-12-01 11:48:52 +11:00
flysand7 596921fb7a First pass 2024-12-01 11:42:24 +11:00
Antonino Simone Di Stefano 357c8f6f34 Replace "." with "," in parameter list 2024-09-22 23:19:36 +02:00
Karl Zylinski 19f0127e55 Moved all packages in core, base, vendor, tests and examples to use new #+ file tag syntax. 2024-09-14 18:27:49 +02:00
Laytan Laats 58e5078b66 add riscv to simd.IS_EMULATED 2024-08-22 14:17:45 +02:00
Yawning Angel 7020e9b66a core/simd: Add IS_EMULATED so there is one place to look for potatoes 2024-08-18 22:52:39 +09:00
gingerBill f56abf3780 Add intrinsics.masked_expand_load and intrinsics.masked_compress_store 2024-08-05 14:54:09 +01:00
gingerBill 78919f8524 Fix typos 2024-08-05 14:48:55 +01:00
gingerBill 84ac56f778 Add intrinsics.simd_masked_load and intrinsics.simd_masked_store 2024-08-05 14:08:41 +01:00
gingerBill 7e701d1677 Add `intrinsics.simd_gather` and `intrinsics.simd_scatter` 2024-08-05 13:46:24 +01:00
gingerBill b67ed78afd add_sat -> saturating_add 2024-08-05 13:21:27 +01:00
gingerBill 90fc52c2ee Rename add_sat -> saturating_add 2024-08-05 13:19:01 +01:00
gingerBill 9a01a13914 Add simd_reduce_any and simd_reduce_all 2024-08-05 13:13:19 +01:00
Yawning Angel 69026852ce core/crypto/aes: Add Intel AES-NI support
This supports AES-NI + PCLMUL, and provides optimized key schedule, ECB,
CTR, and GCM.  Other modes are trivial to add later if required.
2024-07-16 01:29:43 +09:00
Yawning Angel f578994fa6 core/simd/x86: Make the AES-NI intrinsics consistent with Intel 2024-07-16 01:29:43 +09:00
Yawning Angel 390cd3c30d core/simd/x86: Fix some intrinsics
- _mm_slli_si128 produced totally incorrect output
- _mm_storeu_si128 referred to an LLVM intrinsic that is missing
2024-07-16 01:29:43 +09:00
Jeroen van Rijn 7b31acd2d7 Let simd/x86 pass new transmute/cast vet. 2024-07-09 16:50:55 +02:00
Yawning Angel f49575f1fb core/simd/x86: Add the AES-NI intrinsics 2024-06-01 22:55:42 +09:00
gingerBill 1b593fc1ca Correct core:intrinsics to base:intrinsics 2024-05-13 13:27:44 +01:00
Laytan Laats 25f1d0906d compiler: improve target features support 2024-05-02 00:59:52 +02:00
gingerBill 3e7e779abf Replace core:* to base:* where appropriate 2024-01-28 22:18:51 +00:00
Yawning Angel 8d7c37e384 core/simd/x86: Use the none calling convention for intrinsics
The LLVM intrinsics that live under `llvm.x86` are not actual functions,
so trying to invoke them as such using the platform's native C
calling convention causes incorrect types to be emitted in the IR.

Thanks to laytanl for assistance in testing.
2024-01-07 20:04:40 +09:00
Yawning Angel 9235e82451 core/simd/x86: Correct a target feature name 2024-01-07 20:04:40 +09:00
jakubtomsu b06583133a Fix the other bit_* intrinsic calls 2023-10-22 20:59:19 +02:00
jakubtomsu a2e6fc5909 change and_not to bit_and_not 2023-10-22 20:52:35 +02:00
gingerBill 63f755554b Rename simd bitwise operations from intrinsics.simd_and to intrinsics.simd_bit_and etc 2023-09-28 16:42:08 +01:00
Jeroen van Rijn d5f94d73ad [sys/info] Initial version. 2022-09-01 00:43:47 +02:00
gingerBill bb7f291f5f Remove simd_rem; Disallow simd_div for integers 2022-06-02 12:10:43 +01:00
gingerBill 4e49d24df9 Add enable_target_feature to ABM 2022-05-30 16:08:06 +01:00
gingerBill 68222cb8ab Add SSE4.2 2022-05-30 16:06:31 +01:00
gingerBill 912d29af83 Add @(require_results) to all appropriate procedures 2022-05-30 15:59:48 +01:00
gingerBill 51707032d1 Add SSE4.1 2022-05-30 15:17:02 +01:00
gingerBill f3aefbc443 @(require_target_feature=<string>) @(enable_target_feature=<string>)
require_target_feature - required by the target micro-architecture
enable_target_feature - will be enabled for the specified procedure only
2022-05-30 14:53:12 +01:00
gingerBill cef022539e Rename to lanes_rotate_left, lanes_rotate_right, lanes_reverse 2022-05-29 15:13:14 +01:00