TTNN Operation Parameter Consistency: Problem Statement & Parameter Builder Solution #4931
Replies: 4 comments 1 reply
-
Additional Critical Issue: ConstantOp API Divergence

Beyond the parameter inconsistencies documented above, there is a more fundamental API-level inconsistency affecting ConstantOp.

ConstantOp Dual API Problem

OpModel Implementation:

```cpp
return ::ttnn::graph::query_op_constraints(
    ::ttnn::from_buffer, device, rawData, getShape(value),
    getDataType(value), device, metalLayout,
    detail::getNullableMemoryConfig(outputLayout));
```

Runtime Implementation:

```cpp
::ttnn::Tensor out = utils::toTTNNTensor(op->data(), shape, dtype,
                                         meshDevice, layout, memoryConfig);
// Which internally calls:
::ttnn::Tensor tensor = ::ttnn::Tensor::from_vector(data, tensorSpec, device);
```

The Problem

This represents a fundamental architectural inconsistency:
- the OpModel path models tensor creation via `::ttnn::from_buffer`
- the runtime path creates the tensor via `::ttnn::Tensor::from_vector`
This goes beyond parameter mismatches - entirely different TTNN APIs are used for the same logical operation, potentially causing different tensor creation behavior, memory layouts, and runtime characteristics.

Solution Integration

The parameter builder pattern should be extended to handle API consistency in addition to parameter consistency. For ConstantOp, this means:

```cpp
struct ConstantOpParameterPack {
  // Unified data and parameters.

  // Single tensor creation method that both paths use.
  ::ttnn::Tensor createTensor() const;
  auto queryConstraints(::ttnn::MeshDevice *device) const;
};
```

This ensures both validation and execution use identical tensor creation logic, eliminating both parameter inconsistencies and API-level divergence.
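A minimal sketch of how the two call sites could then share the pack; the `fromMLIR`/`fromFlatbuffer` factories and the surrounding variables are hypothetical, not existing APIs:

```cpp
// OpModel path: validation goes through the pack (hypothetical factory).
auto mlirPack = ConstantOpParameterPack::fromMLIR(constantOp, outputLayout);
auto constraints = mlirPack.queryConstraints(device);

// Runtime path: execution uses the exact same tensor creation method.
auto fbPack = ConstantOpParameterPack::fromFlatbuffer(fbOp, context);
::ttnn::Tensor out = fbPack.createTensor();
```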
-
I'm probably missing something, but I don't see how this would result in a consistent result across components. The biggest issue is that the inputs into the runtime and OpModel conversions are different, though they are transitively related. As I understand it,

```cpp
// Test
TEST(Conv2dParameterConsistency, SameParametersFromBothSources) {
  auto mlirPack = Conv2dParameterPack::fromMLIR(mlirOp, inputSpec);
  auto runtimePack = Conv2dParameterPack::fromFlatbuffer(fbOp, context);
  EXPECT_TRUE(mlirPack.isEquivalent(runtimePack));
}
```

this can indeed be useful for testing, but what I'm struggling with is that I cannot find some other place where this would be useful (at least as it stands for now). What we have is two conversion paths, and what we want is to verify that starting from the same TTNN dialect op, we end up with the same lib op. So the question of composability can be (and should be) tested regardless of which approach we take. There are a few problematic configs for which we should assert that

```cpp
opmodel_conversion::x(ttnnAttr) == fb_to_runtime::x(ttnn_to_fb::x(ttnnAttr));
```

This, of course, wouldn't guarantee that the calls won't be inconsistent, but at least it would give us some level of confidence that individual parameters are correctly converted. We can sync offline tomorrow to iterate faster through this idea, as there might be important pieces that I miss here.

One idea that I had for some time now is that we should design our own TTNN API (let's call it ttmlirnn) that wraps TTNN ops. In the trivial case a wrapper would just forward:

```cpp
ttnn::Tensor ttmlirnn::add(ttnn::Tensor a, ttnn::Tensor b /*, whatever */) {
  return ttnn::add(a, b /*, whatever */);
}
```

but there could also be something like

```cpp
ttnn::Tensor ttmlirnn::sigmoid(ttnn::Tensor input, ttnn::MemoryConfig memoryConfig) {
  return ttnn::sigmoid(input, VecMode::RC, false, memoryConfig);
}
```

IMO, this has several advantages:
Some disadvantages:
To add to the last point, we would also have to add pybindings for EmitPy if we want to use that new API, although I think we would still want to use TTNN directly for EmitPy for portability reasons. With proper testing of the aforementioned non-trivial conversions, this would mitigate almost all of the risks of inconsistency, even though it wouldn't always be able to guarantee consistency (I'm pretty sure that's impossible with the current architecture).
-
I agree we can't reach an ultimate single point of definition for conversions. My proposal is more towards having a unified operation call builder, so we avoid inconsistencies where we call a different API for an operation, or pass a different number of parameters because some of them have defaults. Also, we can reduce the number of utilities across our code - for example, the conv2d op invocation uses conversion::getConv2dConfig() on one path and utils::createConv2dConfig() on the other.

Another potential idea comes to mind: have Flatbuffer as our main op representation (like the thin wrapper you mentioned), but instead of introducing a 3rd IR format, use the existing one from FB. Then all we need to do is convert MLIR->FB, and from then on all conversions and builders are the same; see the sketch after this comment. What makes this an easy solution is that we already have the MLIR->FB conversion.
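A minimal sketch of that flow, assuming hypothetical names (`serializeToFlatbuffer`, `Conv2dParameterPack`, and its methods are illustrative, not existing tt-mlir APIs):

```cpp
// Hypothetical sketch: the flatbuffer op is the single representation.

// 1. Compiler side: serialize the TTNN dialect op once.
const ::tt::target::ttnn::Conv2dOp *fbOp = serializeToFlatbuffer(mlirConv2dOp);

// 2. Both consumers build their parameters from the same flatbuffer op,
//    so there is only one conversion to keep correct.
auto pack = Conv2dParameterPack::fromFlatbuffer(fbOp, context);
auto constraints = pack.queryConstraints(device);       // OpModel / validation
auto result = pack.invoke(input, weight, bias, device); // runtime execution
```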
-
That makes a lot of sense to me. With that approach, this picture becomes a single chain (MLIR -> FB -> lib op), which means we no longer need to 'prove' commutativity of the graph, because there is only one path. If we take that approach,
-
TTNN Operation Parameter Consistency: Problem Statement & Parameter Builder Solution
Executive Summary
The tt-mlir compiler has two distinct paths for invoking TTNN operations that risk parameter inconsistency:
- `::ttnn::graph::query_op_constraints()` for validation
- direct TTNN API calls for runtime execution

This document analyzes the parameter consistency challenges and proposes a Parameter Builder Pattern to ensure both paths use identical parameters.
Problem Statement
The Dual Invocation Challenge
The tt-mlir compiler architecture requires two different invocation paths for the same logical operations:
Path 1: OpModel Query Path (lib/OpModel/TTNN/TTNNOpModel.cpp)
- Uses `::ttnn::graph::query_op_constraints()` to query operation feasibility
- Converts parameters via `conversion::` functions
- Passes `std::nullopt` for optional parameters

Path 2: Runtime Execution Path (runtime/lib/ttnn/operations/*/)
- Invokes TTNN operations directly
- Converts parameters via `utils::` conversion functions

Concrete Example: Conv2d Operation Inconsistency
OpModel Conv2d Query (TTNNOpModel.cpp:3264-3276):
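The original snippet is not reproduced here; the following is a hedged sketch of the query's shape, inferred from the inconsistencies listed below (argument names are illustrative, not the actual code):

```cpp
// Sketch of the OpModel-side query (the real call lives at
// TTNNOpModel.cpp:3264-3276; names here are illustrative).
// Note: no dram_slice_config argument, unlike the runtime call below.
return ::ttnn::graph::query_op_constraints(
    ::ttnn::conv2d, device, inputSpec, weightSpec, device,
    inChannels, outChannels, batchSize, inputHeight, inputWidth,
    kernelSize, stride, padding, dilation, groups, outputDtype, biasSpec,
    conversion::getConv2dConfig(conv2dConfig),
    conversion::getDeviceComputeKernelConfig(deviceComputeKernelConfig),
    detail::getNullableMemoryConfig(outputLayout));
```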
Runtime Conv2d Call (runtime/lib/ttnn/operations/conv/conv2d.cpp:78-82):
```cpp
ResultWithOptions result = ::ttnn::conv2d(
    input, weight, &targetDevice, op->in_channels(), op->out_channels(),
    op->batch_size(), op->input_height(), op->input_width(), kernelSize,
    stride, padding, dilation, op->groups(), outputDtype, bias, conv2dConfig,
    computeConfig, outputMemoryConfig, /*dram_slice_config_=*/std::nullopt);
```

Identified Parameter Inconsistencies
Conv2d Configuration:
- OpModel: `conv2dConfigConverted = conversion::getConv2dConfig(conv2dConfig)`
- Runtime: `conv2dConfig = utils::createConv2dConfig(op->conv2d_config())`

Compute Configuration:
- OpModel: `deviceComputeKernelConfigConverted = conversion::getDeviceComputeKernelConfig(deviceComputeKernelConfig)`
- Runtime: `computeConfig = utils::createDeviceComputeKernelConfig(op->compute_config())`

Memory Configuration:
- OpModel: `detail::getNullableMemoryConfig(outputLayout)`
- Runtime: `outputMemoryConfig = createMemoryConfigIfNeeded(getTensorRefMemoryConfig(op->out()))`

Missing Parameters:
- OpModel omits the `dram_slice_config` parameter entirely
- Runtime explicitly passes `/*dram_slice_config_=*/std::nullopt`

Impact of Inconsistencies
Solution: Parameter Builder Pattern
Core Architecture
The Parameter Builder Pattern centralizes parameter construction logic by creating unified parameter packs that both invocation paths use:
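As a sketch, a Conv2d pack could look like the following; all field and type names are assumptions for illustration, not the actual implementation:

```cpp
// Hypothetical unified parameter pack for Conv2d (names illustrative).
struct Conv2dParameterPack {
  // Scalar shape parameters.
  uint32_t inChannels = 0, outChannels = 0, batchSize = 0;
  uint32_t inputHeight = 0, inputWidth = 0, groups = 1;
  std::array<uint32_t, 2> kernelSize{}, stride{}, padding{}, dilation{};

  // Every optional parameter is stored explicitly, so defaults (including
  // dram_slice_config) are decided in exactly one place.
  std::optional<::ttnn::DataType> outputDtype;
  std::optional<Conv2dConfig> conv2dConfig;          // assumed TTNN alias
  std::optional<DeviceComputeKernelConfig> computeConfig;
  std::optional<::ttnn::MemoryConfig> outputMemoryConfig;
  std::optional<Conv2dSliceConfig> dramSliceConfig;  // assumed TTNN alias

  // Both invocation paths are constructed from this one pack.
  static Conv2dParameterPack fromMLIR(mlir::tt::ttnn::Conv2dOp op);
  static Conv2dParameterPack
  fromFlatbuffer(const ::tt::target::ttnn::Conv2dOp *op);
  bool isEquivalent(const Conv2dParameterPack &other) const;
};
```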
Key Innovation: Single Configuration Constructor
Instead of dual conversion functions, the parameter pack uses one unified builder:
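For example, instead of `conversion::getConv2dConfig()` on the OpModel path and `utils::createConv2dConfig()` on the runtime path, both factories could funnel through one function. A sketch, where `Conv2dConfigSource` is a hypothetical normalized view of either the MLIR attribute or the flatbuffer table, and the config fields shown are illustrative:

```cpp
// One constructor for Conv2dConfig, shared by both paths (hypothetical).
// Defaults are applied here, and only here.
Conv2dConfig buildConv2dConfig(const Conv2dConfigSource &src) {
  Conv2dConfig cfg;
  cfg.weights_dtype = src.weightsDtype.value_or(kDefaultWeightsDtype);
  cfg.activation = src.activation.value_or("");
  // ... every remaining field handled the same way ...
  return cfg;
}
```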
Implementation Strategy
1. Adapter Pattern for Source Normalization
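A sketch of such a source adapter, assuming a common read-only view over both inputs (all names hypothetical; the MLIR accessors follow tablegen conventions):

```cpp
// Normalized, source-agnostic view of conv2d parameters (hypothetical;
// only a few fields shown, the real thing would carry all of them).
struct Conv2dSourceAdapter {
  uint32_t inChannels, outChannels, batchSize, inputHeight, inputWidth;

  // One adapter constructor per source; both yield the same view, so all
  // downstream building logic is shared.
  static Conv2dSourceAdapter fromMLIR(mlir::tt::ttnn::Conv2dOp op) {
    return {op.getInChannels(), op.getOutChannels(), op.getBatchSize(),
            op.getInputHeight(), op.getInputWidth()};
  }
  static Conv2dSourceAdapter
  fromFlatbuffer(const ::tt::target::ttnn::Conv2dOp *op) {
    return {op->in_channels(), op->out_channels(), op->batch_size(),
            op->input_height(), op->input_width()};
  }
};
```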
2. Unified Factory Methods
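The factory methods then become thin and symmetric; a sketch, where `buildFromAdapter` is a hypothetical shared builder that applies defaults exactly once:

```cpp
// Both factories delegate to one shared builder, so neither path can drift.
Conv2dParameterPack Conv2dParameterPack::fromMLIR(mlir::tt::ttnn::Conv2dOp op) {
  return buildFromAdapter(Conv2dSourceAdapter::fromMLIR(op));
}

Conv2dParameterPack
Conv2dParameterPack::fromFlatbuffer(const ::tt::target::ttnn::Conv2dOp *op) {
  return buildFromAdapter(Conv2dSourceAdapter::fromFlatbuffer(op));
}
```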
3. Unified Invocation Points
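Both call sites then expand the same pack, so the argument lists cannot diverge. A sketch; `inputSpec`/`weightSpec`/`biasSpec` would be stored in the pack for the query path, and the signatures are illustrative:

```cpp
// Validation: the pack expands itself into the constraint query.
auto Conv2dParameterPack::queryConstraints(::ttnn::MeshDevice *device) const {
  return ::ttnn::graph::query_op_constraints(
      ::ttnn::conv2d, device, inputSpec, weightSpec, device,
      inChannels, outChannels, batchSize, inputHeight, inputWidth,
      kernelSize, stride, padding, dilation, groups, outputDtype, biasSpec,
      conv2dConfig, computeConfig, outputMemoryConfig, dramSliceConfig);
}

// Execution: the exact same fields, in the exact same order.
auto Conv2dParameterPack::invoke(const ::ttnn::Tensor &input,
                                 const ::ttnn::Tensor &weight,
                                 const std::optional<::ttnn::Tensor> &bias,
                                 ::ttnn::MeshDevice *device) const {
  return ::ttnn::conv2d(
      input, weight, device, inChannels, outChannels, batchSize,
      inputHeight, inputWidth, kernelSize, stride, padding, dilation, groups,
      outputDtype, bias, conv2dConfig, computeConfig, outputMemoryConfig,
      dramSliceConfig);
}
```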
4. Built-in Validation & Testing
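Backed by an `isEquivalent()` along these lines (a sketch that assumes the config types define `operator==`), which is what the consistency test shown earlier in this thread would exercise:

```cpp
// Field-by-field comparison used for automated consistency tests
// (hypothetical; assumes comparable config types).
bool Conv2dParameterPack::isEquivalent(const Conv2dParameterPack &other) const {
  return inChannels == other.inChannels && outChannels == other.outChannels &&
         batchSize == other.batchSize && inputHeight == other.inputHeight &&
         inputWidth == other.inputWidth && kernelSize == other.kernelSize &&
         stride == other.stride && padding == other.padding &&
         dilation == other.dilation && groups == other.groups &&
         outputDtype == other.outputDtype &&
         conv2dConfig == other.conv2dConfig &&
         computeConfig == other.computeConfig &&
         outputMemoryConfig == other.outputMemoryConfig &&
         dramSliceConfig == other.dramSliceConfig;
}
```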
Benefits of the Parameter Builder Pattern
1. Single Source of Truth
2. Explicit Default Management
3. Validation & Testing
isEquivalent()method enables automated consistency testing4. Maintainability
5. Debugging Support
6. Future-Proofing
Implementation Roadmap
Current Parameter Inconsistencies Across All Operations
Operations Analysis Summary
Based on analysis of the codebase, 64 runtime operation files contain TTNN calls, while the OpModel contains 50+ query_op_constraints calls. Here are the major operations with dual invocation paths and their parameter inconsistencies:
1. Conv2d Operations
Files:
conv/conv2d.cpp, conv/prepare_conv2d_weights.cpp, conv/prepare_conv2d_bias.cpp

OpModel Query:
Runtime Call:
Inconsistencies:
- `conversion::getConv2dConfig()` vs `utils::createConv2dConfig()`
- `conversion::getDeviceComputeKernelConfig()` vs `utils::createDeviceComputeKernelConfig()`

2. Matmul Operations
File:
matmul/matmul.cpp

OpModel Query:
Runtime Call:
Inconsistencies:
3. Sigmoid Operation
File:
eltwise/unary/unary.cpp

OpModel Query:

```cpp
::ttnn::sigmoid, device, inputSpec, vectorMode, approximateMode,
    detail::getNullableMemoryConfig(outputLayout)
```

Runtime Call:
Inconsistencies:
- OpModel uses `VecMode::RC` directly, Runtime casts to int

4. Softmax Operation
File:
normalization/softmax.cpp

OpModel Query:

```cpp
::ttnn::softmax, device, inputSpec, dimArg,
    detail::getNullableMemoryConfig(outputLayout), std::nullopt, numericStable
```

Runtime Call:
Inconsistencies:
- OpModel passes `std::nullopt` for compute_kernel_config, Runtime omits it entirely

5. Reshape Operation
File:
data_movement/reshape.cpp

OpModel Query:

```cpp
::ttnn::reshape, device, inputSpec, conversion::getShape(outputShape),
    detail::getNullableMemoryConfig(outputLayout)
```

Runtime Call:

```cpp
::ttnn::reshape(in, shape, memoryConfig)
```

Inconsistencies:
- OpModel uses `conversion::getShape()`, Runtime uses the raw vector directly

6. Slice Operations
File:
data_movement/slice.cpp

OpModel Query:

```cpp
::ttnn::slice, device, inputSpec, beginsSpan, endsSpan, stepSpan,
    detail::getNullableMemoryConfig(outputLayout), std::nullopt, std::nullopt
```

Runtime Call:

```cpp
::ttnn::slice(in, begins, ends, step, memoryConfig)
```

Inconsistencies:
Common Parameter Inconsistency Patterns
Pattern 1: Dual Configuration Constructors
- OpModel: `conversion::getXXXConfig()`
- Runtime: `utils::createXXXConfig()`

Operations Affected: Conv2d, Matmul, DeviceComputeKernel configurations
Pattern 2: Missing Optional Parameters
Operations Affected: Matmul, Linear, Most unary operations
Pattern 3: Memory Configuration Handling
- OpModel: `detail::getNullableMemoryConfig(outputLayout)`
- Runtime: `createMemoryConfigIfNeeded(getTensorRefMemoryConfig(op->out()))`

Operations Affected: All operations with memory configuration
Pattern 4: Parameter Order Differences
Operations Affected: Softmax, some ternary operations
Impact Assessment by Priority
High Priority (Immediate Risk)
Medium Priority
Low Priority
Estimated Operations Requiring Parameter Builders
Based on the analysis:
The parameter builder pattern would address all these inconsistencies by providing a single, unified parameter construction and validation system for each operation type.