Why cast_lean() before conv op? #483

@MirkoCalvi

Description

In all code-generated convolution calls (Conv, Conv_Clip, Conv_Relu, ...), cast_lean() is called on the convolution kernel before the conv operation:

        //--KERNEL CAST--
        if (need_kernel_cast) {
            // Generate cast for kernel
            const kernel_name = try IR_utils.getSanitizedName(self.op_Conv.input_W.name);
            _ = try writer.print(
                \\
                \\    // Cast kernel from {s} to {s}
                \\    var tensor_{s}_casted = Tensor({s}).fromShape(&allocator, @constCast(param_lib.tensor_{s}.shape)) catch return -2;
                \\    defer tensor_{s}_casted.deinit();
                \\    tensMath.cast_lean({s}, {s}, @constCast(&param_lib.tensor_{s}), &tensor_{s}_casted, zant.onnx.DataType.FLOAT) catch return -1;
                \\
            , .{
                self.op_Conv.input_W.ty.toString(),
                target_type,
                kernel_name,
                target_type,
                kernel_name,
                kernel_name,
                self.op_Conv.input_W.ty.toString(),
                target_type,
                kernel_name,
                kernel_name,
            });
            final_kernel_string = try std.mem.concat(allocator, u8, &[_][]const u8{ "@constCast(&tensor_", kernel_name, "_casted)" });
            need_free_kernel = true;
        } else {
            final_kernel_string = tensor_W_string;
        }

Why? Isn't this an unnecessary step? Shouldn't the kernels already be in the correct form? If not, why don't we cast the values where they are actually needed, instead of adding a separate element-wise operation before the convolution?
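To make the question concrete, here is a minimal sketch of the two options. It is not Zant code: the function names `dotPreCast` and `dotCastOnUse` are hypothetical, a 1-D dot product stands in for the convolution inner loop, and the `i8` kernel with `f32` input is only an assumed type pair for illustration.

```zig
const std = @import("std");

// Hypothetical stand-in for what the generated code does today: cast the
// whole kernel into a scratch buffer first, then run the multiply-accumulate.
fn dotPreCast(input: []const f32, kernel: []const i8, scratch: []f32) f32 {
    // Extra element-wise pass over the kernel, plus a second buffer.
    for (kernel, 0..) |k, i| scratch[i] = @floatFromInt(k);
    var acc: f32 = 0;
    for (input, 0..) |x, i| acc += x * scratch[i];
    return acc;
}

// Alternative raised in this issue: convert each kernel element at the
// point of use, with no scratch tensor and no separate pass.
fn dotCastOnUse(input: []const f32, kernel: []const i8) f32 {
    var acc: f32 = 0;
    for (input, kernel) |x, k| {
        const kf: f32 = @floatFromInt(k);
        acc += x * kf;
    }
    return acc;
}

test "both variants compute the same result" {
    const input = [_]f32{ 1.0, 2.0, 3.0 };
    const kernel = [_]i8{ 2, -1, 4 };
    var scratch: [3]f32 = undefined;
    const a = dotPreCast(&input, &kernel, &scratch);
    const b = dotCastOnUse(&input, &kernel);
    try std.testing.expectEqual(a, b);
}
```

Both variants should give the same result; the second one avoids the temporary tensor and the extra pass over the kernel. Whether it is actually faster inside the real conv implementation would need to be measured.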

Expected Behavior

The convolution works correctly without calling cast_lean() on the kernel.

Metadata

Labels: bug, enhancement
