Poor loop optimization in BilinearInterpol benchmark #31831

Open
@CarolEidt

Description

The Vector<T> version of this benchmark (BilinearInterpol_Vector) has a number of weaknesses:

First, although the temporary array doubleTemp is allocated with a constant length:

STMT00015 (IL 0x0AD...  ???)
               [000113] -ACXG-------              *  ASG       ref   
               [000112] D------N----              +--*  LCL_VAR   ref    V24 loc15        
               [000111] --CXG-------              \--*  CALL help ref    HELPER.CORINFO_HELP_NEWARR_1_VC
               [000110] ------------ arg0            +--*  CNS_INT(h) long   0x7ffafe875b00 class
               [000109] ------------ arg1            \--*  CNS_INT   long   4 vector element count

The loop cloning code is unable to determine V24.Length:

Considering condition 0: (4 LE V24.Length), could not be evaluated

So it decides to clone the loop, AFAICT so that it can eliminate the range check on doubleTemp in one of the clones. But it then eliminates the range check from both clones, so we end up with two identical loops. Furthermore, although the exact loop count is available, none of the 4 original loops, nor their identical clones, is unrolled.
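
For readers who want to reproduce the shape of the problem, here is a minimal sketch of the pattern described above. It is a hypothetical reduction, not the benchmark's actual source: the class name `RangeCheckSketch`, the method `SumLanes`, and the arithmetic are assumptions made for illustration. Only the array name `doubleTemp` and the constant-length allocation come from the issue.

```csharp
using System;
using System.Numerics;

public static class RangeCheckSketch
{
    // Hypothetical reduction of the pattern in the issue; not the benchmark's code.
    public static double SumLanes(Vector<double> v)
    {
        // The length shows up as the constant 4 in the dump above
        // ("vector element count"); the value depends on the hardware.
        double[] doubleTemp = new double[Vector<double>.Count];

        // The loop bound equals the allocation length, so every
        // doubleTemp[i] access is in bounds by construction.
        for (int i = 0; i < Vector<double>.Count; i++)
        {
            doubleTemp[i] = v[i] * 2.0;
        }

        // Same array, but bounded by doubleTemp.Length instead of the constant.
        double sum = 0.0;
        for (int i = 0; i < doubleTemp.Length; i++)
        {
            sum += doubleTemp[i];
        }
        return sum;
    }
}
```

Dumping the JIT output for a method like this should show whether loop cloning fires on the first loop and whether either loop is unrolled. Whether this reduction triggers exactly the same behavior as the benchmark is an assumption that would need checking.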

category:cq
theme:loop-opt
skill-level:expert
cost:large

Labels: area-CodeGen-coreclr (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI), optimization, tenet-performance (Performance related issue)
