- 
                Notifications
    
You must be signed in to change notification settings  - Fork 5
 
          feat: introduce ListType<T> type and attribute
          #121
        
          New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
  
    feat: introduce ListType<T> type and attribute
  
  #121
              Conversation
ListType<T> type and attributeListType<T> type and attribute
      ListType<T> type and attributeListType<T> type and attribute
      There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice progress! A few comments, most of them minor and some optional.
| let description = [{ | ||
| This attribute represents a list of atomic attributes, parameterized with a ListType. | ||
| }]; | ||
| let parameters = (ins "mlir::ArrayAttr":$value, "ListType":$type); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we typically use Substrait_ListType here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh no wait: I'm pretty sure that ListType is the syntax I should use in this file, not Substrait_ListType. I did the same thing for my other attributes that referenced types, i.e. VarCharAttr has a parameter called VarCharType, NOT Substrait_VarCharType.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And the other syntax doesn't work? Maybe we've been doing it wrong the whole time? I don't have the details in my head and haven't checked, so maybe there is a reason why we are doing it this way...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://mlir.llvm.org/docs/DefiningDialects/AttributesAndTypes/#parameters
Substrait_ListType is not AttrParameter or TypeParameter, but rather a TypeDef. Therefore, based on my understanding of what the above link says, we have to use the "raw c++Type string instead", which in this case would be ListType.
(Other syntax does not work btw, I tried)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, cool, thanks for confirming.
| def Substrait_ListAttr : Substrait_Attr<"List", "list", [TypedAttrInterface]> { | ||
| let summary = "Substrait list attribute"; | ||
| let description = [{ | ||
| This attribute represents a list of atomic attributes, parameterized with a ListType. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the restriction to atomic element types intentional? If so, we should probably add a TODO (here and to the type if it has the same restriction). Note that we can have "lists of lists" and "structs of lists of structs of lists" etc. in Substrait.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will add TODO. I think better to do in a separate PR.
| let description = [{ | ||
| This type represents a list type. | ||
| }]; | ||
| let parameters = (ins Substrait_AtomicType:$type); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here: if the restriction to atomic types is intentional, then add a TODO. Adding support for arbitrary nesting of lists and structs is probably not trivial and it may be a good strategy to do it in two steps.
| } | ||
| 
               | 
          ||
| if (typedAttr.getType() != expectedType) { | ||
| return emitError() << "Mismatched element type in ListAttr: expected " | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nits: error messages start with lower case and single-quoted attribute name.
| return emitError() << "Mismatched element type in ListAttr: expected " | |
| return emitError() << "mismatched element type in 'ListAttr': expected " | 
| auto typedAttr = mlir::dyn_cast<mlir::TypedAttr>(attr); | ||
| if (!typedAttr) { | ||
| return emitError() | ||
| << "ListAttr values must be typed attributes, but got: " << attr; | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: single-quoted class name.
| << "ListAttr values must be typed attributes, but got: " << attr; | |
| << "'ListAttr' values must be typed attributes, but got: " << attr; | 
| }]; | ||
| let parameters = (ins Substrait_AtomicType:$type); | ||
| let assemblyFormat = [{ `<` $type `>` }]; | ||
| let genVerifyDecl = 1; | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is possible and most certainly preferrable to verify this using tablegen (more). Take a look at TupleOf in CommonTypeConstraints.td upstream. (NestedTupleOf will then be helpful as inspiration for nested Substrait structs, lists, and maps...). One problem here is that we don't see in this file which types can be used, plus we now have to maintain two lists in two different files, which risk to run out of sync at some point.
| // `IntegerType`s. | ||
| if (auto intType = dyn_cast<IntegerType>(literalType)) { | ||
| if (!intType.isSigned()) | ||
| op->emitOpError("has integer value with unsupported signedness"); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this change (here and two times below)?
Maybe the intention was to only emit an error message below in exportOperation(LiteralOp). However, we have more information here and I think it's better to emit the more precise message here (and none below).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I created a function called exportAttribute(Attribute value) that contains all of the logic that was previously in exportOperation(LiteralOp).
Then I altered exportOperation(LiteralOp op) to look like this.
FailureOr<std::unique_ptr<Expression>>
SubstraitExporter::exportOperation(LiteralOp op) {
  // Build `Literal` message depending on type.
  Attribute value = op.getValue();
  auto literal = exportAttribute(value);
  if (failed(literal))
    return op->emitOpError("has unsupported value");
  // Build `Expression` message.
  auto expression = std::make_unique<Expression>();
  expression->set_allocated_literal(literal->release());
  return expression;
}
Since exportAttribute(Attribute value) does not have access to the op, I can't emit an error message via the op. And not sure how to pass the error strings into failure() failures. Wanted to talk to you about this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see. Three patterns come to mind: (1) Use InFlightDiagnostics but maybe you have the same problem of not having an op; (2) provide a Location loc argument and emit the error at that location; or (3) provide an emitError function argument and use that to emit the error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay will do. And do you agree with the restructuring? Should I similarly implement exportAttribute(Attribute value) ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, cool. Yeah, I think it's a good idea to cut this mega function into pieces :) Thanks!
        
          
                lib/Target/SubstraitPB/Import.cpp
              
                Outdated
          
        
      | llvm::SmallVector<Attribute> listElements; | ||
| listElements.reserve(listType.values_size()); | ||
| for (const Expression_Literal &element : listType.values()) { | ||
| // TODO: Create importAttribute function to avoid creating redundant | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this TODO still up-to-date?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This TODO is to implement the same logic in the import.cpp, as what I did in export.cpp (i.e. with the creation of exportAttribute(Attribute value) and augmentation of exportOperation(LiteralOp).)
Did not want to implement it yet until we spoke about you're thoughts on my export.cpp implementation
        
          
                test/Dialect/Substrait/literal.mlir
              
                Outdated
          
        
      | %0 = named_table @t1 as ["a"] : tuple<si1> | ||
| %1 = project %0 : tuple<si1> -> tuple<si1, !substrait.list<!substrait.fixed_binary<4>>> { | ||
| ^bb0(%arg : tuple<si1>): | ||
| %bytes = literal #substrait.list<[ | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: variable name.
| %bytes = literal #substrait.list<[ | |
| %list = literal #substrait.list<[ | 
| %bytes = literal #substrait.list<[ | ||
| #substrait.fixed_binary<"8181">, | ||
| #substrait.fixed_binary<"8181">, | ||
| #substrait.fixed_binary<"8181">], !substrait.list<!substrait.fixed_binary<4>>> | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a lot of redundancy in the assembly format but I think it's hard to reduce. Unfortunately, there is no built-in way to remove the #substrait. and !substrait. prefixes, for example. In the LLVM dialect, there is a work-around that parses type names nested in other LLVM types manually, so the !llvm. prefix can be omitted there. We could do something like that. What would be even more concise is if we could do something like #substrait.list<["8181", "8181", "8181"], list<fixed_binary<4>>> and use the type for the parsing of the values (potentially swapping the order of the values and the type). Both are probably quite a lift, and that "only" for the assembly format, so this is a clear candidate for skipping now (and potentially forever).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea it's super redundant, also wasn't quite sure how to handle. :( Should I maybe add a TODO to implement a printer/parser?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, TODO is definitely fine. As I said, all alternatives that come to my mind are significantly more complex to implement, which is a good reason not to do it, at least immediately.
| 
           Note to self: Final TODO for this PR, implement   | 
    
…om parsing/printing
bf5d254    to
    3eaa9cf      
    Compare
  
    
This PR introduces the container type
ListType<T>andListAttr<T>which represent values of type T, where T can be an atomic or parameterized type.