Commit 09abf11

update README.md (#411)
This PR updates the README to clarify minimum hardware architecture requirements for different data types. The documentation now specifies which instruction sets are needed for each function type (f32, bf16, int8).

Signed-off-by: Vishal <Vishal.Akula@amd.com>
1 parent 9caecbb commit 09abf11

1 file changed

Lines changed: 12 additions & 7 deletions

README.md

@@ -103,15 +103,20 @@ The library provides specialized element-wise operations:
 | f32os32 | float input to int32_t output |
 | f32os8 | float input to int8_t output |
 
-## Hardware Support
+## Hardware Requirements
 
-AOCL-DLP is optimized for AMD processors with the following instruction sets:
-- AVX2/FMA3 (available on Zen1 and newer)
-- AVX512 (available on Zen4 and newer)
-- AVX512_VNNI (available on Zen4 and newer, for int8 operations)
-- AVX512_BF16 (available on Zen4 and newer, for bfloat16 operations)
+AOCL-DLP is optimized for AMD processors and requires specific minimum architecture support based on the functions being used:
 
-It also runs on any x86_64 (AMD64) CPU that supports these instruction sets.
+### Minimum Architecture Requirements
+
+| Function Type | Minimum Required ISA | Available On |
+|------------------------|---------------------------|-----------------------------------------------------|
+| f32 (float) | AVX2/FMA3 | AMD Zen1 and newer, Intel Haswell and newer |
+| bf16 (bfloat16) | AVX2/FMA3 | AMD Zen1 and newer, Intel Haswell and newer |
+| | AVX512_BF16 (optimal) | AMD Zen4 and newer, Intel Cooper Lake and newer |
+| int8 (int8, uint8) | AVX512_VNNI | AMD Zen4 and newer, Intel Cascade Lake and newer |
+
+While optimized for AMD processors, the library is compatible with any x86_64 CPU that meets these minimum requirements. For best performance on AMD processors, it is recommended to use Zen4 or newer architectures which support all instruction sets.
 
 ## Build
 
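To check a given machine against the requirements table in the diff above, a small script can compare the CPU's advertised feature flags with the ISA each function type needs. The following is a minimal sketch and not part of AOCL-DLP: the helper names `parse_cpu_flags` and `supported_function_types` are hypothetical, and it assumes a Linux host where `/proc/cpuinfo` reports the flags `avx2`, `fma`, `avx512_vnni`, and `avx512_bf16`.

```python
# Hypothetical helper, not part of AOCL-DLP. It maps Linux /proc/cpuinfo
# flag names onto the function types listed in the README table.

# Minimum flags per function type (assumed mapping of the table's ISA names
# to Linux flag names: AVX2/FMA3 -> avx2 + fma, AVX512_VNNI -> avx512_vnni).
REQUIRED_FLAGS = {
    "f32": {"avx2", "fma"},
    "bf16": {"avx2", "fma"},
    "int8": {"avx512_vnni"},
}

# Extra flags the table marks as "optimal" for a function type.
OPTIMAL_FLAGS = {
    "bf16": {"avx512_bf16"},
}


def parse_cpu_flags(cpuinfo_text):
    """Return the flag set from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def supported_function_types(flags):
    """Classify each function type as 'optimal', 'supported', or 'unsupported'."""
    result = {}
    for name, required in REQUIRED_FLAGS.items():
        if not required <= flags:
            result[name] = "unsupported"
        elif name in OPTIMAL_FLAGS and OPTIMAL_FLAGS[name] <= flags:
            result[name] = "optimal"
        else:
            result[name] = "supported"
    return result


if __name__ == "__main__":
    # Linux only: other platforms expose CPU features differently.
    with open("/proc/cpuinfo") as f:
        print(supported_function_types(parse_cpu_flags(f.read())))
```

The check only reflects what the table states; whether the library itself performs any runtime dispatch of this kind is not described in the commit.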