- 
                Notifications
    
You must be signed in to change notification settings  - Fork 68
 
Implement simple hash function for Fusion #5453
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
| 
          
 Review updated until commit b6cf1fb Description
 Changes walkthrough 📝
 PR Reviewer Guide 🔍Here are some key observations to aid the review process: 
  | 
    
9be8437    to
    b6cf1fb      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Greptile Overview
Greptile Summary
Implements hash function for Fusion by recursively hashing inputs, expressions, and outputs. Each Statement combines its sequential name_ with content-based hash from getHash().
Critical Issues Found:
Val::getHash()usesstd::get<PrimDataType>which will throwstd::bad_variant_accesswhendtype_.typecontainsPointerType,ArrayType,StructType, orOpaqueType(all actively used in codebase)Statement::hash()includesname_field, causing semantically identical statements from different containers to produce different hashes, defeating cache lookup purpose
Implementation Details:
Fusion::hash()sequentially combines hashes of inputs → exprs → outputsVal::getHash()hashesvtype_,dtype_, andvalue_Expr::getHash()hashes operation string and all inputs/outputsPolymorphicValue_functions::hash()handles monostate, complex, double, int64_t, bool
Confidence Score: 1/5
- This PR has critical bugs that will cause runtime crashes and incorrect cache behavior
 - Two critical logic errors: (1) 
Val::getHash()will crash withstd::bad_variant_accesswhen hashing Vals with non-primitive DataTypes like PointerType, which are used throughout the codebase, and (2) including Statementname_in hash defeats the cache lookup purpose since equivalent fusions from different containers will have different hashes - csrc/ir/base_nodes.cpp and csrc/ir/base_nodes.h require immediate fixes before merge
 
Important Files Changed
File Analysis
| Filename | Score | Overview | 
|---|---|---|
| csrc/ir/base_nodes.cpp | 1/5 | Added Val::getHash() and Expr::getHash() with critical bug: assumes dtype_ is always PrimDataType, will crash on PointerType/ArrayType/StructType | 
| csrc/ir/base_nodes.h | 1/5 | Added Statement::hash() that includes name_, causing semantically identical statements to hash differently across containers | 
| csrc/fusion.cpp | 3/5 | Implements Fusion::hash() by combining hashes of inputs, exprs, and outputs in order | 
| csrc/polymorphic_value.cpp | 4/5 | Implements hash for PolymorphicValue supporting std::monostate, std::complex<double>, double, int64_t, bool | 
Sequence Diagram
sequenceDiagram
    participant User
    participant Fusion
    participant Val
    participant Expr
    participant Statement
    participant PolymorphicValue
    
    User->>Fusion: hash()
    Fusion->>Fusion: hash = 0
    
    loop For each input
        Fusion->>Val: hash()
        Val->>Statement: hash()
        Statement->>Val: getHash()
        Val->>Val: hash vtype_
        Val->>Val: hash dtype_ (PrimDataType only)
        Val->>PolymorphicValue: hash(value_)
        PolymorphicValue-->>Val: value hash
        Val-->>Statement: content hash
        Statement->>Statement: combine name_ + content hash
        Statement-->>Fusion: val hash
        Fusion->>Fusion: hashCombine(hash, val_hash)
    end
    
    loop For each expr
        Fusion->>Expr: hash()
        Expr->>Statement: hash()
        Statement->>Expr: getHash()
        Expr->>Expr: hash opString
        loop For each input/output
            Expr->>Val: hash()
            Val-->>Expr: val hash
        end
        Expr-->>Statement: content hash
        Statement->>Statement: combine name_ + content hash
        Statement-->>Fusion: expr hash
        Fusion->>Fusion: hashCombine(hash, expr_hash)
    end
    
    loop For each output
        Fusion->>Val: hash()
        Val-->>Fusion: val hash
        Fusion->>Fusion: hashCombine(hash, val_hash)
    end
    
    Fusion-->>User: final hash
    8 files reviewed, 2 comments
| hashCombine( | ||
| hash, | ||
| std::hash<int>()(static_cast<int>(std::get<PrimDataType>(dtype_.type)))); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: std::get<PrimDataType> will throw std::bad_variant_access if dtype_.type holds ArrayType, PointerType, StructType, or OpaqueType. The codebase uses these types (e.g., csrc/ir/builder.cpp:144, csrc/kernel_ir.cpp:1965).
| hashCombine( | |
| hash, | |
| std::hash<int>()(static_cast<int>(std::get<PrimDataType>(dtype_.type)))); | |
| hashCombine(hash, std::hash<ValType>()(vtype_)); | |
| std::visit([&hash](auto&& arg) { | |
| using T = std::decay_t<decltype(arg)>; | |
| if constexpr (std::is_same_v<T, PrimDataType>) { | |
| hashCombine(hash, std::hash<int>()(static_cast<int>(arg))); | |
| } else { | |
| // Handle other DataType variants - for now, hash the type index | |
| hashCombine(hash, std::hash<size_t>()(dtype_.type.index())); | |
| } | |
| }, dtype_.type); | 
| size_t hash() const { | ||
| size_t hash = 0; | ||
| hashCombine(hash, std::hash<StmtNameType>()(name_)); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: Hashing name_ means two Fusion objects with semantically identical definitions will have different hashes if their statements have different names (which are assigned sequentially per container). This defeats the purpose of using hash for cache lookups where equivalent fusions should map to the same cache entry.
| 
           !test  | 
    
This PR creates a simple hash function for
Fusion. It will be used to createLRU Cachefor mappingFusiondefinitions toFusionExecutorCache.PR Stack