Eltwise Guide

Eltwise Operations Guide

AOCL-DLP provides standalone element-wise (eltwise) operations that transform a matrix without performing a GEMM. They differ from GEMM post-ops: use eltwise ops when you already have a computed matrix and want to apply activations, type conversions, or other element-wise transforms to it.

When to Use Eltwise vs GEMM Post-Ops

| Scenario | Use |
| --- | --- |
| Apply activation after GEMM (same call) | GEMM with `dlp_metadata_t` post-ops -- see Post-Ops Guide |
| Apply activation to a matrix that was not produced by GEMM | Standalone eltwise ops (this page) |
| Convert matrix between data types with fused operations | Standalone eltwise ops (this page) |

Function Signature

All eltwise functions share the same parameter pattern:

aocl_gemm_eltwise_ops_<input_type>o<output_type>(
    const char      order,     // 'R' = row-major, 'C' = column-major
    const char      transa,    // transpose option for input
    const char      transb,    // transpose option for output
    const md_t      m,         // number of rows
    const md_t      n,         // number of columns
    const <in_t>*   a,         // input matrix
    const md_t      lda,       // leading dimension of input
    <out_t>*        b,         // output matrix
    const md_t      ldb,       // leading dimension of output
    dlp_metadata_t* metadata   // post-operations to apply
);

The `metadata` parameter controls which operations are applied. Configure it exactly as described in the Post-Ops Guide: the same `dlp_metadata_t` struct, `seq_vector`, and post-op types work here.
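The `order`, `lda`, and `ldb` parameters determine where element (i, j) lives in each buffer. The sketch below shows the usual BLAS-style convention, which this guide's signature appears to follow; that the library uses exactly this layout is an assumption, and `ref_offset` and `ref_min_elems` are hypothetical helpers, not AOCL-DLP functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of AOCL-DLP): offset of element (i, j)
 * in a buffer stored with leading dimension ld.
 * Assumes the conventional layout: 'R' = row-major, 'C' = column-major. */
static size_t ref_offset(char order, size_t i, size_t j, size_t ld)
{
    assert(order == 'R' || order == 'C');
    return (order == 'R') ? i * ld + j : j * ld + i;
}

/* Hypothetical helper: smallest buffer length (in elements) that can hold
 * an m x n matrix with the given leading dimension. */
static size_t ref_min_elems(char order, size_t m, size_t n, size_t ld)
{
    size_t inner = (order == 'R') ? n : m;  /* contiguous dimension */
    size_t outer = (order == 'R') ? m : n;  /* strided dimension    */
    assert(ld >= inner);
    return (outer == 0 || inner == 0) ? 0 : (outer - 1) * ld + inner;
}
```

Under this convention, a densely packed row-major matrix uses a leading dimension equal to `n`, which is what the examples on this page pass for both `lda` and `ldb`.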

Supported Type Combinations

| Input | Output | Function |
| --- | --- | --- |
| bfloat16 | float | `aocl_gemm_eltwise_ops_bf16of32` |
| bfloat16 | bfloat16 | `aocl_gemm_eltwise_ops_bf16obf16` |
| float | float | `aocl_gemm_eltwise_ops_f32of32` |
| float | bfloat16 | `aocl_gemm_eltwise_ops_f32obf16` |
| float | int32_t | `aocl_gemm_eltwise_ops_f32os32` |
| float | int8_t | `aocl_gemm_eltwise_ops_f32os8` |
| float | uint8_t | `aocl_gemm_eltwise_ops_f32ou8` |

Example: Apply GELU to a Float Matrix

#include <aocl_dlp.h>

// Input matrix (M x N, row-major); M and N are compile-time constants
float input[M * N]  = { /* ... */ };
float output[M * N] = {0};

// Configure GELU post-op
dlp_post_op_eltwise gelu_op = {
    .sf   = NULL,
    .algo = {
        .alpha     = NULL,
        .beta      = NULL,
        .algo_type = GELU_TANH,
        .stor_type = DLP_F32
    }
};
DLP_POST_OP_TYPE seq[] = { ELTWISE };

dlp_metadata_t meta = {0};
meta.seq_length  = 1;
meta.seq_vector  = seq;
meta.eltwise     = &gelu_op;
meta.num_eltwise = 1;

aocl_gemm_eltwise_ops_f32of32(
    'R', 'N', 'N', M, N,   // same M, N used to size the arrays above
    input, N,              // lda = N for a packed row-major matrix
    output, N,             // ldb = N
    &meta);
// output[i][j] = GELU(input[i][j])

Example: Convert BF16 to F32 with RELU

bfloat16 bf16_data[M * N] = { /* ... */ };
float    f32_output[M * N] = {0};

// RELU post-op
dlp_post_op_eltwise relu_op = {
    .sf   = NULL,
    .algo = { .alpha = NULL, .beta = NULL, .algo_type = RELU, .stor_type = DLP_F32 }
};
DLP_POST_OP_TYPE seq[] = { ELTWISE };

dlp_metadata_t meta = {0};
meta.seq_length  = 1;
meta.seq_vector  = seq;
meta.eltwise     = &relu_op;
meta.num_eltwise = 1;

aocl_gemm_eltwise_ops_bf16of32(
    'R', 'N', 'N', M, N,   // same M, N used to size the arrays above
    bf16_data, N,          // lda = N for a packed row-major matrix
    f32_output, N,         // ldb = N
    &meta);
// f32_output[i][j] = RELU( bf16_to_f32(bf16_data[i][j]) )
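The conversion-plus-activation in the comment above can be written out in plain C for reference. This is a minimal sketch, assuming a bfloat16 value is stored as the upper 16 bits of an IEEE-754 single-precision float and held in a `uint16_t`; the library's own `bfloat16` typedef may be defined differently. `ref_bf16_to_f32` and `ref_bf16_relu_f32` are hypothetical helpers, not AOCL-DLP functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Widen a bfloat16 bit pattern to float by placing it in the upper
 * 16 bits of a 32-bit word. Assumes 32-bit IEEE-754 float. */
static float ref_bf16_to_f32(uint16_t h)
{
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);  /* bit copy, avoids aliasing issues */
    return f;
}

/* Hypothetical reference loop: b[i][j] = max(0, bf16_to_f32(a[i][j]))
 * for row-major m x n matrices with leading dimensions lda and ldb. */
static void ref_bf16_relu_f32(size_t m, size_t n,
                              const uint16_t *a, size_t lda,
                              float *b, size_t ldb)
{
    assert(lda >= n && ldb >= n);
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            float v = ref_bf16_to_f32(a[i * lda + j]);
            b[i * ldb + j] = (v > 0.0f) ? v : 0.0f;
        }
    }
}
```

Widening bfloat16 to float is exact, so for this type combination a reference like this should be comparable to the library output without a tolerance, provided the storage assumption holds.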
