Eltwise Guide
Abhiram S edited this page Mar 9, 2026 · 3 revisions
AOCL-DLP provides standalone element-wise operations that apply transformations to a matrix without performing a GEMM. These are different from GEMM post-ops -- use eltwise ops when you already have a computed matrix and want to apply activations, type conversions, or other element-wise transforms to it.
| Scenario | Use |
|---|---|
| Apply activation after GEMM (same call) | GEMM with dlp_metadata_t post-ops -- see Post-Ops Guide |
| Apply activation to a matrix that was not produced by GEMM | Standalone eltwise ops (this page) |
| Convert matrix between data types with fused operations | Standalone eltwise ops (this page) |
All eltwise functions share the same parameter pattern:
```c
aocl_gemm_eltwise_ops_<input_type>o<output_type>(
    const char order,          // 'R' = row-major, 'C' = column-major
    const char transa,         // transpose option for input
    const char transb,         // transpose option for output
    const md_t m,              // number of rows
    const md_t n,              // number of columns
    const <in_t>* a,           // input matrix
    const md_t lda,            // leading dimension of input
    <out_t>* b,                // output matrix
    const md_t ldb,            // leading dimension of output
    dlp_metadata_t* metadata   // post-operations to apply
);
```

The metadata parameter controls which operations are applied. Configure it exactly as described in the Post-Ops Guide -- the same dlp_metadata_t struct, seq_vector, and post-op types work here.
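To make the roles of `m`, `n`, `lda`, and `ldb` concrete, here is a minimal plain-C sketch of what a row-major, non-transposed eltwise call with a single RELU post-op computes. It is an illustration only, not the library's implementation: `ref_eltwise_relu_f32` is a hypothetical helper name, and `size_t` stands in for `md_t`, whose definition is not shown on this page.

```c
#include <stddef.h>

/* Hypothetical reference helper (not part of AOCL-DLP).
 * Applies RELU to an m x n row-major matrix. Row i of the input starts at
 * a + i * lda and row i of the output at b + i * ldb, so each leading
 * dimension may be larger than n when the matrix is a view into a wider
 * buffer. */
static void ref_eltwise_relu_f32(size_t m, size_t n,
                                 const float *a, size_t lda,
                                 float *b, size_t ldb)
{
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            float x = a[i * lda + j];
            b[i * ldb + j] = (x > 0.0f) ? x : 0.0f;
        }
    }
}
```

A reference loop like this can be useful for spot-checking the library's output on small matrices, for example a 2 x 3 view with `lda = 4` into a padded input buffer.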
| Input | Output | Function |
|---|---|---|
| bfloat16 | float | aocl_gemm_eltwise_ops_bf16of32 |
| bfloat16 | bfloat16 | aocl_gemm_eltwise_ops_bf16obf16 |
| float | float | aocl_gemm_eltwise_ops_f32of32 |
| float | bfloat16 | aocl_gemm_eltwise_ops_f32obf16 |
| float | int32_t | aocl_gemm_eltwise_ops_f32os32 |
| float | int8_t | aocl_gemm_eltwise_ops_f32os8 |
| float | uint8_t | aocl_gemm_eltwise_ops_f32ou8 |
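For the bfloat16 rows in this table, it can help to see what the type conversion means at the bit level. The sketch below assumes the standard bfloat16 layout (the upper 16 bits of an IEEE-754 binary32 value) and uses round-to-nearest-even when narrowing. Both function names are hypothetical, raw `uint16_t` bit patterns stand in for the library's `bfloat16` type, and the rounding mode AOCL-DLP actually uses is not stated on this page, so treat this as a reference for reasoning rather than a description of the library.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: widen a bfloat16 bit pattern to float.
 * bfloat16 is the top 16 bits of a binary32 value, so widening is exact. */
static float bf16_bits_to_f32(uint16_t h)
{
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Hypothetical helper: narrow a float to a bfloat16 bit pattern using
 * round-to-nearest-even (an assumption; the library may round differently). */
static uint16_t f32_to_bf16_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    if ((bits & 0x7FFFFFFFu) > 0x7F800000u) {
        /* NaN: keep the sign and exponent, force a quiet-NaN mantissa bit. */
        return (uint16_t)((bits >> 16) | 0x0040u);
    }
    /* Add half of the discarded range, plus one more on exact ties when the
     * kept LSB is odd, then truncate. */
    bits += 0x7FFFu + ((bits >> 16) & 1u);
    return (uint16_t)(bits >> 16);
}
```

Widening bf16 to f32 is lossless, while f32 to bf16 discards 16 mantissa bits, which is worth keeping in mind when choosing between the `of32` and `obf16` output variants.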
Example: apply GELU to an f32 matrix.

```c
#include <aocl_dlp.h>

// Input matrix (M x N, row-major); M and N are compile-time
// constants defined elsewhere
float input[M * N] = { /* ... */ };
float output[M * N] = {0};

// Configure GELU post-op
dlp_post_op_eltwise gelu_op = {
    .sf = NULL,
    .algo = {
        .alpha = NULL,
        .beta = NULL,
        .algo_type = GELU_TANH,
        .stor_type = DLP_F32
    }
};

DLP_POST_OP_TYPE seq[] = { ELTWISE };
dlp_metadata_t meta = {0};
meta.seq_length = 1;
meta.seq_vector = seq;
meta.eltwise = &gelu_op;
meta.num_eltwise = 1;

// Row-major, no transpose; leading dimensions equal the column count N
aocl_gemm_eltwise_ops_f32of32(
    'R', 'N', 'N', M, N,
    input, N,
    output, N,
    &meta);
// output[i][j] = GELU(input[i][j])
```

Example: convert a bf16 matrix to f32 and apply RELU.

```c
bfloat16 bf16_data[M * N] = { /* ... */ };
float f32_output[M * N] = {0};

// RELU post-op
dlp_post_op_eltwise relu_op = {
    .sf = NULL,
    .algo = { .alpha = NULL, .beta = NULL, .algo_type = RELU, .stor_type = DLP_F32 }
};

DLP_POST_OP_TYPE seq[] = { ELTWISE };
dlp_metadata_t meta = {0};
meta.seq_length = 1;
meta.seq_vector = seq;
meta.eltwise = &relu_op;
meta.num_eltwise = 1;

aocl_gemm_eltwise_ops_bf16of32(
    'R', 'N', 'N', M, N,
    bf16_data, N,
    f32_output, N,
    &meta);
// f32_output[i][j] = RELU( bf16_to_f32(bf16_data[i][j]) )
```

See also:

- Post-Ops Guide -- Fusing operations with GEMM
- GEMM Guide -- GEMM data types and parameters
- Examples -- eltwise_ops.c
- API Reference -- Generated documentation