feat: VJP utility based on autodiff_thunk #2309
gdalle wants to merge 27 commits into EnzymeAD:main
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@           Coverage Diff           @@
##             main    #2309   +/-   ##
=======================================
  Coverage   28.01%   28.01%
=======================================
  Files           2        2
  Lines         207      207
=======================================
  Hits           58       58
  Misses        149      149
|
|
In some way, this feels equivalent to implementing I think of |
|
We could also call it |
|
|
And how would you see the order of arguments?
autodiff(Reverse, f, seed, args...)
autodiff(Reverse, f, args...; seed=...) |
|
The first variant, since that is already the convention used for the activity of the return. |
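For reference, here is a short sketch of what the seed means mathematically. It assumes a differentiable function with real-valued inputs and outputs; the symbols $J_f$, $\bar{x}$ and $\bar{z}$ are notation introduced here, not names from the PR.

```latex
% Seeded reverse mode (VJP / pullback) for z = f(x), f : R^n -> R^m,
% with a user-supplied seed \bar{z} in R^m:
\[
  \bar{x} \;=\; J_f(x)^{\top}\,\bar{z}
          \;=\; \nabla_x \,\langle \bar{z},\, f(x) \rangle .
\]
% For a tuple-valued output z = (a(x, y), b(x, y)) with scalar seeds (da, db):
\[
  \bar{x} \;=\; da\,\frac{\partial a}{\partial x} \;+\; db\,\frac{\partial b}{\partial x} .
\]
% A scalar output with seed \bar{z} = 1 recovers the ordinary gradient.
```

Under the first variant, the seed $\bar{z}$ sits right after the function, in the position where the return activity is otherwise given.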
|
Your PR requires formatting changes to meet the project's style guidelines. Suggested changes:
diff --git a/src/sugar.jl b/src/sugar.jl
index f15e6e2..3ed38de 100644
--- a/src/sugar.jl
+++ b/src/sugar.jl
@@ -1367,10 +1367,10 @@ julia> Enzyme.batchify_activity(Duplicated{Vector{Float64}}, Val(2))
BatchDuplicated{Vector{Float64}, 2}
"""
"""
Wrapper for a single adjoint to the return value in reverse mode.
@@ -1395,10 +1395,10 @@ Wrapper for a tuple of adjoints to the return value in reverse mode.
@@ -1417,7 +1417,7 @@ Useful for computing pullbacks / VJPs for functions whose output is not a scalar
@testset "Batchify activity" begin
the base case is a function returning (a(x, y), b(x, y))
@@ -56,11 +56,11 @@ dx_ref = da * 2x * y .+ db * abs2(y)
input derivatives, (a+b) case
@@ -69,11 +69,11 @@ dx1_ref = (da + db) * (2x * y .+ abs2(y))
output seeds, weird cases
@@ -99,7 +99,7 @@ dzs6 = (MyMixedStruct(das[1], [dbs[1]]), MyMixedStruct(das[2], [dbs[2]]))
validation
function validate_seeded_autodiff(f, dz, dzs)
|
|
@vchuravy how do I generalize addition beyond arrays to accumulate the adjoint into the shadow? |
|
@vchuravy any further comments? |
|
Only when looking at the examples I feel like we should fuse |
|
I guess that requires some form of automatic activity detection in the general case. Should I use |
|
Bump @vchuravy, in general do we want automatic activity detection for the seed? |
|
Sorry, I was on vacation.
Yes, I think so. At least it is consistent with the current activity deduction for the return, while still allowing specification of the seed value. |
|
I put the activity detection inside |
|
Here's what I get with the
julia> autodiff(Reverse, Const(f6), Seed(dz6), Duplicated(x, zero(x)), Active(y))
ERROR: MethodError: no method matching (::Enzyme.Compiler.AdjointThunk{…})(::Const{…}, ::Duplicated{…}, ::Active{…}, ::@NamedTuple{…})
This error has been manually thrown, explicitly, so the method may exist but be intentionally marked as unimplemented.
Closest candidates are:
(::Enzyme.Compiler.AdjointThunk{PT, FA, RT, TT, Width, TapeT})(::FA, ::Any...) where {PT, FA, Width, RT, TT, TapeT}
@ Enzyme ~/Documents/GitHub/Julia/Enzyme.jl/src/compiler.jl:4901
Stacktrace:
[1] macro expansion
@ ~/Documents/GitHub/Julia/Enzyme.jl/src/compiler.jl:5041 [inlined]
[2] enzyme_call(::Val{…}, ::Ptr{…}, ::Type{…}, ::Val{…}, ::Val{…}, ::Type{…}, ::Type{…}, ::Const{…}, ::Type{…}, ::Duplicated{…}, ::Active{…}, ::@NamedTuple{…})
@ Enzyme.Compiler ~/Documents/GitHub/Julia/Enzyme.jl/src/compiler.jl:4997
[3] AdjointThunk
@ ~/Documents/GitHub/Julia/Enzyme.jl/src/compiler.jl:4901 [inlined]
[4] autodiff(::ReverseMode{…}, ::Const{…}, ::Seed{…}, ::Duplicated{…}, ::Active{…})
@ Enzyme ~/Documents/GitHub/Julia/Enzyme.jl/src/sugar.jl:1214
[5] top-level scope
@ ~/Documents/GitHub/Julia/Enzyme.jl/test/seeded.jl:165
Some type information was truncated. Use `show(err)` to see complete types. |
|
I think I figured out the method error, it seems in the
julia> autodiff(Reverse, Const(f6), BatchSeed(dzs6), BatchDuplicated(x, (zero(x), zero(x))), Active(y))
Stored value type does not match pointer operand type!
store [2 x { double, {} addrspace(10)* }] %102, { double, {} addrspace(10)* }* %103, align 8, !dbg !1119
[2 x { double, {} addrspace(10)* }]; Function Attrs: mustprogress nofree willreturn
define "enzyme_type"="{[0]:Float@double, [8]:Pointer, [8,0]:Pointer, [8,0,-1]:Float@double, [8,8]:Pointer, [8,8,0]:Integer, [8,8,1]:Integer, [8,8,2]:Integer, [8,8,3]:Integer, [8,8,4]:Integer, [8,8,5]:Integer, [8,8,6]:Integer, [8,8,7]:Integer, [8,8,8]:Pointer, [8,8,8,-1]:Float@double, [8,16]:Integer, [8,17]:Integer, [8,18]:Integer, [8,19]:Integer, [8,20]:Integer, [8,21]:Integer, [8,22]:Integer, [8,23]:Integer}" "enzymejl_parmtype"="13289153232" "enzymejl_parmtype_ref"="1" { double, {} addrspace(10)* } @preprocess_julia_f6_43005_inner.3({} addrspace(10)* nocapture nofree noundef nonnull readonly align 8 dereferenceable(24) "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,0,-1]:Float@double, [-1,8]:Pointer, [-1,8,0]:Integer, [-1,8,1]:Integer, [-1,8,2]:Integer, [-1,8,3]:Integer, [-1,8,4]:Integer, [-1,8,5]:Integer, [-1,8,6]:Integer, [-1,8,7]:Integer, [-1,8,8]:Pointer, [-1,8,8,-1]:Float@double, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, [-1,23]:Integer}" "enzymejl_parmtype"="5136209872" "enzymejl_parmtype_ref"="2" %0, double "enzyme_type"="{[-1]:Float@double}" "enzymejl_parmtype"="5190083824" "enzymejl_parmtype_ref"="0" %1) local_unnamed_addr #10 !dbg !371 {
entry:
%pgcstack.i = call {}*** @julia.get_pgcstack() #20, !noalias !372
%ptls_field.i16 = getelementptr inbounds {}**, {}*** %pgcstack.i, i64 2
%2 = bitcast {}*** %ptls_field.i16 to i64***
%ptls_load.i1718 = load i64**, i64*** %2, align 8, !tbaa !13, !noalias !372
%3 = getelementptr inbounds i64*, i64** %ptls_load.i1718, i64 2
%safepoint.i = load i64*, i64** %3, align 8, !tbaa !17, !noalias !372
fence syncscope("singlethread") seq_cst
call void @julia.safepoint(i64* %safepoint.i) #20, !dbg !376, !noalias !372
fence syncscope("singlethread") seq_cst
%4 = call fastcc double @julia__mapreduce_43055({} addrspace(10)* nocapture nofree noundef nonnull readonly align 8 dereferenceable(24) %0) #21, !dbg !378, !noalias !372
%5 = bitcast {} addrspace(10)* %0 to i8 addrspace(10)*, !dbg !387
%6 = addrspacecast i8 addrspace(10)* %5 to i8 addrspace(11)*, !dbg !387
%7 = getelementptr inbounds i8, i8 addrspace(11)* %6, i64 16, !dbg !387
%8 = bitcast i8 addrspace(11)* %7 to i64 addrspace(11)*, !dbg !387
%9 = load i64, i64 addrspace(11)* %8, align 8, !dbg !387, !tbaa !30, !alias.scope !33, !noalias !401, !enzyme_type !41, !enzymejl_source_type_Int64 !0, !enzymejl_byref_BITS_VALUE !0, !enzyme_inactive !0
switch i64 %9, label %L34.i [
i64 0, label %julia_f6_43005_inner.exit
i64 1, label %L15.i
], !dbg !402
L15.i: ; preds = %entry
%10 = bitcast {} addrspace(10)* %0 to { i8*, {} addrspace(10)* } addrspace(10)*, !dbg !403
%11 = addrspacecast { i8*, {} addrspace(10)* } addrspace(10)* %10 to { i8*, {} addrspace(10)* } addrspace(11)*, !dbg !403
%12 = bitcast {} addrspace(10)* %0 to {} addrspace(10)** addrspace(10)*, !dbg !403
%13 = addrspacecast {} addrspace(10)** addrspace(10)* %12 to {} addrspace(10)** addrspace(11)*, !dbg !403
%14 = load {} addrspace(10)**, {} addrspace(10)** addrspace(11)* %13, align 8, !dbg !403, !tbaa !49, !alias.scope !33, !noalias !401, !enzyme_type !51, !enzymejl_byref_BITS_VALUE !0, !enzymejl_source_type_Ptr\7BFloat64\7D !0, !enzyme_nocache !0
%15 = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(11)* %11, i64 0, i32 1, !dbg !403
%16 = load {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %15, align 8, !dbg !403, !tbaa !49, !alias.scope !33, !noalias !401, !dereferenceable_or_null !54, !align !55, !enzyme_type !56, !enzymejl_source_type_Memory\7BFloat64\7D !0, !enzymejl_byref_MUT_REF !0
%17 = call "enzyme_type"="{[-1]:Pointer, [-1,-1]:Float@double}" {} addrspace(10)* addrspace(13)* @julia.gc_loaded({} addrspace(10)* noundef %16, {} addrspace(10)** noundef %14) #20, !dbg !403
%18 = bitcast {} addrspace(10)* addrspace(13)* %17 to double addrspace(13)*, !dbg !403
%19 = load double, double addrspace(13)* %18, align 8, !dbg !403, !tbaa !58, !alias.scope !61, !noalias !405, !enzyme_type !63, !enzymejl_byref_BITS_VALUE !0, !enzymejl_source_type_Float64 !0
br label %julia_f6_43005_inner.exit, !dbg !406
L34.i: ; preds = %entry
%20 = icmp sgt i64 %9, 15, !dbg !407
br i1 %20, label %L99.i, label %L36.i, !dbg !409
L36.i: ; preds = %L34.i
%21 = bitcast {} addrspace(10)* %0 to { i8*, {} addrspace(10)* } addrspace(10)*, !dbg !410
%22 = addrspacecast { i8*, {} addrspace(10)* } addrspace(10)* %21 to { i8*, {} addrspace(10)* } addrspace(11)*, !dbg !410
%23 = bitcast {} addrspace(10)* %0 to {} addrspace(10)** addrspace(10)*, !dbg !410
%24 = addrspacecast {} addrspace(10)** addrspace(10)* %23 to {} addrspace(10)** addrspace(11)*, !dbg !410
%25 = load {} addrspace(10)**, {} addrspace(10)** addrspace(11)* %24, align 8, !dbg !410, !tbaa !49, !alias.scope !33, !noalias !401, !enzyme_type !51, !enzymejl_byref_BITS_VALUE !0, !enzymejl_source_type_Ptr\7BFloat64\7D !0, !enzyme_nocache !0
%26 = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(11)* %22, i64 0, i32 1, !dbg !410
%27 = load {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %26, align 8, !dbg !410, !tbaa !49, !alias.scope !33, !noalias !401, !enzyme_type !56, !enzymejl_source_type_Memory\7BFloat64\7D !0, !enzymejl_byref_MUT_REF !0
%28 = call "enzyme_type"="{[-1]:Pointer, [-1,-1]:Float@double}" {} addrspace(10)* addrspace(13)* @julia.gc_loaded({} addrspace(10)* noundef %27, {} addrspace(10)** noundef %25) #20, !dbg !410
%29 = bitcast {} addrspace(10)* addrspace(13)* %28 to double addrspace(13)*, !dbg !410
%30 = load double, double addrspace(13)* %29, align 8, !dbg !410, !tbaa !58, !alias.scope !61, !noalias !405, !enzyme_type !63, !enzymejl_byref_BITS_VALUE !0, !enzymejl_source_type_Float64 !0
%31 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(13)* %28, i64 1, !dbg !412
%32 = bitcast {} addrspace(10)* addrspace(13)* %31 to double addrspace(13)*, !dbg !412
%33 = load double, double addrspace(13)* %32, align 8, !dbg !412, !tbaa !58, !alias.scope !61, !noalias !405, !enzyme_type !63, !enzymejl_byref_BITS_VALUE !0, !enzymejl_source_type_Float64 !0
%34 = fadd double %30, %33, !dbg !414
%.not2223 = icmp sgt i64 %9, 2, !dbg !417
br i1 %.not2223, label %L77.i.preheader, label %julia_f6_43005_inner.exit, !dbg !419
L77.i.preheader: ; preds = %L36.i
br label %L77.i, !dbg !419
L77.i: ; preds = %L77.i.preheader, %L77.i
%iv = phi i64 [ 0, %L77.i.preheader ], [ %iv.next, %L77.i ]
%value_phi3.i24 = phi double [ %40, %L77.i ], [ %34, %L77.i.preheader ]
%35 = add nuw nsw i64 %iv, 2, !dbg !420
%iv.next = add nuw nsw i64 %iv, 1, !dbg !420
%36 = add nuw nsw i64 %35, 1, !dbg !420
%37 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(13)* %28, i64 %35, !dbg !422
%38 = bitcast {} addrspace(10)* addrspace(13)* %37 to double addrspace(13)*, !dbg !422
%39 = load double, double addrspace(13)* %38, align 8, !dbg !422, !tbaa !58, !alias.scope !61, !noalias !405
%40 = fadd double %value_phi3.i24, %39, !dbg !423
%exitcond.not = icmp eq i64 %36, %9, !dbg !417
br i1 %exitcond.not, label %julia_f6_43005_inner.exit.loopexit, label %L77.i, !dbg !419
L99.i: ; preds = %L34.i
%41 = call fastcc double @julia_mapreduce_impl_43034({} addrspace(10)* nocapture nofree noundef nonnull readonly align 8 dereferenceable(24) %0, i64 noundef signext 1, i64 noundef signext %9) #21, !dbg !426, !noalias !372
br label %julia_f6_43005_inner.exit, !dbg !428
julia_f6_43005_inner.exit.loopexit: ; preds = %L77.i
br label %julia_f6_43005_inner.exit, !dbg !429
julia_f6_43005_inner.exit: ; preds = %julia_f6_43005_inner.exit.loopexit, %L99.i, %L36.i, %L15.i, %entry
%value_phi.i = phi double [ %19, %L15.i ], [ %41, %L99.i ], [ 0.000000e+00, %entry ], [ %34, %L36.i ], [ %40, %julia_f6_43005_inner.exit.loopexit ]
%42 = fmul double %4, %1, !dbg !429
%current_task1.i15 = getelementptr inbounds {}**, {}*** %pgcstack.i, i64 -14
%43 = fmul double %1, %1, !dbg !430
%44 = fmul double %43, %value_phi.i, !dbg !432
%45 = call noalias "enzyme_type"="{[-1]:Pointer, [-1,0]:Integer, [-1,1]:Integer, [-1,2]:Integer, [-1,3]:Integer, [-1,4]:Integer, [-1,5]:Integer, [-1,6]:Integer, [-1,7]:Integer, [-1,8]:Pointer, [-1,8,-1]:Float@double}" {} addrspace(10)* @jl_alloc_genericmemory({} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 5135345376 to {}*) to {} addrspace(10)*), i64 noundef 1) #22, !dbg !433, !noalias !372
%46 = bitcast {} addrspace(10)* %45 to { i64, {} addrspace(10)** } addrspace(10)*, !dbg !436
%47 = addrspacecast { i64, {} addrspace(10)** } addrspace(10)* %46 to { i64, {} addrspace(10)** } addrspace(11)*, !dbg !436
%48 = getelementptr inbounds { i64, {} addrspace(10)** }, { i64, {} addrspace(10)** } addrspace(11)* %47, i64 0, i32 1, !dbg !436
%49 = bitcast {} addrspace(10)** addrspace(11)* %48 to i8* addrspace(11)*, !dbg !436
%50 = load i8*, i8* addrspace(11)* %49, align 8, !dbg !436, !tbaa !17, !alias.scope !199, !noalias !438, !nonnull !0, !enzyme_type !51, !enzymejl_byref_BITS_VALUE !0, !enzymejl_source_type_Ptr\7BFloat64\7D !0, !enzyme_nocache !0
%51 = bitcast {}*** %current_task1.i15 to {}*, !dbg !439
%52 = call noalias nonnull align 8 dereferenceable(24) "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,0,-1]:Float@double, [-1,8]:Pointer, [-1,8,0]:Integer, [-1,8,1]:Integer, [-1,8,2]:Integer, [-1,8,3]:Integer, [-1,8,4]:Integer, [-1,8,5]:Integer, [-1,8,6]:Integer, [-1,8,7]:Integer, [-1,8,8]:Pointer, [-1,8,8,-1]:Float@double, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, [-1,23]:Integer}" {} addrspace(10)* @julia.gc_alloc_obj({}* nonnull %51, i64 noundef 24, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 5136209872 to {}*) to {} addrspace(10)*)) #23, !dbg !439, !noalias !372
%53 = bitcast {} addrspace(10)* %52 to { i8*, {} addrspace(10)* } addrspace(10)*, !dbg !439
%54 = addrspacecast { i8*, {} addrspace(10)* } addrspace(10)* %53 to { i8*, {} addrspace(10)* } addrspace(11)*, !dbg !439
%.repack = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(11)* %54, i64 0, i32 0, !dbg !439
store i8* %50, i8* addrspace(11)* %.repack, align 8, !dbg !439, !tbaa !49, !alias.scope !33, !noalias !440
%.repack19 = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(11)* %54, i64 0, i32 1, !dbg !439
store {} addrspace(10)* %45, {} addrspace(10)* addrspace(11)* %.repack19, align 8, !dbg !439, !tbaa !49, !alias.scope !33, !noalias !440
%55 = bitcast {} addrspace(10)* %52 to i8 addrspace(10)*, !dbg !439
%56 = addrspacecast i8 addrspace(10)* %55 to i8 addrspace(11)*, !dbg !439
%57 = getelementptr inbounds i8, i8 addrspace(11)* %56, i64 16, !dbg !439
%58 = bitcast i8 addrspace(11)* %57 to i64 addrspace(11)*, !dbg !439
store i64 1, i64 addrspace(11)* %58, align 8, !dbg !439, !tbaa !30, !alias.scope !33, !noalias !440
%59 = bitcast i8* %50 to {} addrspace(10)**, !dbg !443
%60 = call "enzyme_type"="{[-1]:Pointer, [-1,-1]:Float@double}" {} addrspace(10)* addrspace(13)* @julia.gc_loaded({} addrspace(10)* noundef %45, {} addrspace(10)** noundef %59) #20, !dbg !446
%61 = bitcast {} addrspace(10)* addrspace(13)* %60 to double addrspace(13)*, !dbg !446
store double %44, double addrspace(13)* %61, align 8, !dbg !446, !tbaa !58, !alias.scope !61, !noalias !447
%.fca.0.insert = insertvalue { double, {} addrspace(10)* } poison, double %42, 0, !dbg !448
%.fca.1.insert = insertvalue { double, {} addrspace(10)* } %.fca.0.insert, {} addrspace(10)* %52, 1, !dbg !448
ret { double, {} addrspace(10)* } %.fca.1.insert, !dbg !448
}
; Function Attrs: mustprogress nofree
define internal "enzymejl_parmtype"="13289153232" "enzymejl_parmtype_ref"="1" { { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { double, {} addrspace(10)* } } @augmented_julia_f6_43005_inner.3({} addrspace(10)* nocapture nofree noundef nonnull readonly align 8 dereferenceable(24) "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,0,-1]:Float@double, [-1,8]:Pointer, [-1,8,0]:Integer, [-1,8,1]:Integer, [-1,8,2]:Integer, [-1,8,3]:Integer, [-1,8,4]:Integer, [-1,8,5]:Integer, [-1,8,6]:Integer, [-1,8,7]:Integer, [-1,8,8]:Pointer, [-1,8,8,-1]:Float@double, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, [-1,23]:Integer}" "enzymejl_parmtype"="5136209872" "enzymejl_parmtype_ref"="2" %0, [2 x {} addrspace(10)*] %"'", double "enzyme_type"="{[-1]:Float@double}" "enzymejl_parmtype"="5190083824" "enzymejl_parmtype_ref"="0" %1) local_unnamed_addr #19 !dbg !986 {
entry:
%2 = alloca { { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { double, {} addrspace(10)* } }, align 8
%3 = getelementptr inbounds { { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { double, {} addrspace(10)* } }, { { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { double, {} addrspace(10)* } }* %2, i32 0, i32 0
%4 = getelementptr { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }* %3, i64 0, i32 2
store {} addrspace(10)* @ejl_jl_nothing, {} addrspace(10)** %4, align 8
%5 = getelementptr { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }* %3, i64 0, i32 0, i32 2
store {} addrspace(10)* @ejl_jl_nothing, {} addrspace(10)** %5, align 8
%6 = getelementptr { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }* %3, i64 0, i32 3, i32 0
store {} addrspace(10)* @ejl_jl_nothing, {} addrspace(10)** %6, align 8
%7 = getelementptr { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }* %3, i64 0, i32 3, i32 1
store {} addrspace(10)* @ejl_jl_nothing, {} addrspace(10)** %7, align 8
%8 = getelementptr { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }* %3, i64 0, i32 4, i32 0
store {} addrspace(10)* @ejl_jl_nothing, {} addrspace(10)** %8, align 8
%9 = getelementptr { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }* %3, i64 0, i32 4, i32 1
store {} addrspace(10)* @ejl_jl_nothing, {} addrspace(10)** %9, align 8
%10 = getelementptr { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }* %3, i64 0, i32 0, i32 0, i32 0
store {} addrspace(10)* @ejl_jl_nothing, {} addrspace(10)** %10, align 8
%11 = getelementptr { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }* %3, i64 0, i32 0, i32 0, i32 1
store {} addrspace(10)* @ejl_jl_nothing, {} addrspace(10)** %11, align 8
%12 = getelementptr { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }* %3, i64 0, i32 0, i32 1, i32 0
store {} addrspace(10)* @ejl_jl_nothing, {} addrspace(10)** %12, align 8
%13 = getelementptr { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }* %3, i64 0, i32 0, i32 1, i32 1
store {} addrspace(10)* @ejl_jl_nothing, {} addrspace(10)** %13, align 8
%"iv'ac" = alloca i64, align 8
%pgcstack.i = call {}*** @julia.get_pgcstack() #20, !noalias !987
%ptls_field.i16 = getelementptr inbounds {}**, {}*** %pgcstack.i, i64 2
%14 = bitcast {}*** %ptls_field.i16 to i64***
%ptls_load.i1718 = load i64**, i64*** %14, align 8, !tbaa !13, !alias.scope !991, !noalias !994
%15 = getelementptr inbounds i64*, i64** %ptls_load.i1718, i64 2
%safepoint.i = load i64*, i64** %15, align 8, !tbaa !17, !alias.scope !1000, !noalias !1003
fence syncscope("singlethread") seq_cst
call void @julia.safepoint(i64* %safepoint.i) #20, !dbg !1006, !noalias !987
fence syncscope("singlethread") seq_cst
%_augmented = call fastcc { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double } @augmented_julia__mapreduce_43055({} addrspace(10)* nocapture nofree readonly align 8 %0, [2 x {} addrspace(10)*] %"'"), !dbg !1008
%subcache = extractvalue { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double } %_augmented, 0, !dbg !1008
%16 = getelementptr inbounds { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }* %3, i32 0, i32 0, !dbg !1008
store { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* } %subcache, { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }* %16, align 8, !dbg !1008
%17 = extractvalue { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double } %_augmented, 1, !dbg !1008
%18 = getelementptr inbounds { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }* %3, i32 0, i32 1, !dbg !1017
store double %17, double* %18, align 8, !dbg !1017
%19 = bitcast {} addrspace(10)* %0 to i8 addrspace(10)*, !dbg !1017
%20 = addrspacecast i8 addrspace(10)* %19 to i8 addrspace(11)*, !dbg !1017
%21 = getelementptr inbounds i8, i8 addrspace(11)* %20, i64 16, !dbg !1017
%22 = bitcast i8 addrspace(11)* %21 to i64 addrspace(11)*, !dbg !1017
%23 = load i64, i64 addrspace(11)* %22, align 8, !dbg !1017, !tbaa !30, !alias.scope !1031, !noalias !1034, !enzyme_type !41, !enzymejl_source_type_Int64 !0, !enzymejl_byref_BITS_VALUE !0, !enzyme_inactive !0
switch i64 %23, label %L34.i [
i64 0, label %julia_f6_43005_inner.exit
i64 1, label %L15.i
], !dbg !1037
L15.i: ; preds = %entry
%24 = extractvalue [2 x {} addrspace(10)*] %"'", 0, !dbg !1038
%"'ipc" = bitcast {} addrspace(10)* %24 to { i8*, {} addrspace(10)* } addrspace(10)*, !dbg !1038
%25 = extractvalue [2 x {} addrspace(10)*] %"'", 1, !dbg !1038
%"'ipc10" = bitcast {} addrspace(10)* %25 to { i8*, {} addrspace(10)* } addrspace(10)*, !dbg !1038
%26 = bitcast {} addrspace(10)* %0 to { i8*, {} addrspace(10)* } addrspace(10)*, !dbg !1038
%"'ipc11" = addrspacecast { i8*, {} addrspace(10)* } addrspace(10)* %"'ipc" to { i8*, {} addrspace(10)* } addrspace(11)*, !dbg !1038
%"'ipc12" = addrspacecast { i8*, {} addrspace(10)* } addrspace(10)* %"'ipc10" to { i8*, {} addrspace(10)* } addrspace(11)*, !dbg !1038
%27 = addrspacecast { i8*, {} addrspace(10)* } addrspace(10)* %26 to { i8*, {} addrspace(10)* } addrspace(11)*, !dbg !1038
%28 = extractvalue [2 x {} addrspace(10)*] %"'", 0, !dbg !1038
%"'ipc15" = bitcast {} addrspace(10)* %28 to {} addrspace(10)** addrspace(10)*, !dbg !1038
%29 = extractvalue [2 x {} addrspace(10)*] %"'", 1, !dbg !1038
%"'ipc16" = bitcast {} addrspace(10)* %29 to {} addrspace(10)** addrspace(10)*, !dbg !1038
%30 = bitcast {} addrspace(10)* %0 to {} addrspace(10)** addrspace(10)*, !dbg !1038
%"'ipc17" = addrspacecast {} addrspace(10)** addrspace(10)* %"'ipc15" to {} addrspace(10)** addrspace(11)*, !dbg !1038
%"'ipc18" = addrspacecast {} addrspace(10)** addrspace(10)* %"'ipc16" to {} addrspace(10)** addrspace(11)*, !dbg !1038
%31 = addrspacecast {} addrspace(10)** addrspace(10)* %30 to {} addrspace(10)** addrspace(11)*, !dbg !1038
%"'ipl19" = load {} addrspace(10)**, {} addrspace(10)** addrspace(11)* %"'ipc17", align 8, !dbg !1038, !tbaa !49, !alias.scope !1040, !noalias !1041
%"'ipl20" = load {} addrspace(10)**, {} addrspace(10)** addrspace(11)* %"'ipc18", align 8, !dbg !1038, !tbaa !49, !alias.scope !1042, !noalias !1043
%32 = load {} addrspace(10)**, {} addrspace(10)** addrspace(11)* %31, align 8, !dbg !1038, !tbaa !49, !alias.scope !1031, !noalias !1034, !enzyme_type !51, !enzymejl_byref_BITS_VALUE !0, !enzymejl_source_type_Ptr\7BFloat64\7D !0, !enzyme_nocache !0
%"'ipg" = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(11)* %"'ipc11", i64 0, i32 1, !dbg !1038
%"'ipg13" = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(11)* %"'ipc12", i64 0, i32 1, !dbg !1038
%33 = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(11)* %27, i64 0, i32 1, !dbg !1038
%"'ipl" = load {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %"'ipg", align 8, !dbg !1038, !tbaa !49, !alias.scope !1040, !noalias !1041, !dereferenceable_or_null !54
%"'ipl14" = load {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %"'ipg13", align 8, !dbg !1038, !tbaa !49, !alias.scope !1042, !noalias !1043, !dereferenceable_or_null !54
%34 = load {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %33, align 8, !dbg !1038, !tbaa !49, !alias.scope !1031, !noalias !1034, !dereferenceable_or_null !54, !align !55, !enzyme_type !56, !enzymejl_source_type_Memory\7BFloat64\7D !0, !enzymejl_byref_MUT_REF !0
%35 = call {} addrspace(10)* addrspace(13)* @julia.gc_loaded({} addrspace(10)* %"'ipl", {} addrspace(10)** %"'ipl19"), !dbg !1038
%36 = call {} addrspace(10)* addrspace(13)* @julia.gc_loaded({} addrspace(10)* %"'ipl14", {} addrspace(10)** %"'ipl20"), !dbg !1038
%37 = call "enzyme_type"="{[-1]:Pointer, [-1,-1]:Float@double}" {} addrspace(10)* addrspace(13)* @julia.gc_loaded({} addrspace(10)* noundef %34, {} addrspace(10)** noundef %32) #20, !dbg !1038
%38 = bitcast {} addrspace(10)* addrspace(13)* %37 to double addrspace(13)*, !dbg !1038
%39 = load double, double addrspace(13)* %38, align 8, !dbg !1038, !tbaa !58, !alias.scope !1044, !noalias !1047, !enzyme_type !63, !enzymejl_byref_BITS_VALUE !0, !enzymejl_source_type_Float64 !0
br label %julia_f6_43005_inner.exit, !dbg !1050
L34.i: ; preds = %entry
%40 = icmp sgt i64 %23, 15, !dbg !1051
br i1 %40, label %L99.i, label %L36.i, !dbg !1053
L36.i: ; preds = %L34.i
%41 = extractvalue [2 x {} addrspace(10)*] %"'", 0, !dbg !1054
%"'ipc21" = bitcast {} addrspace(10)* %41 to { i8*, {} addrspace(10)* } addrspace(10)*, !dbg !1054
%42 = extractvalue [2 x {} addrspace(10)*] %"'", 1, !dbg !1054
%"'ipc22" = bitcast {} addrspace(10)* %42 to { i8*, {} addrspace(10)* } addrspace(10)*, !dbg !1054
%43 = bitcast {} addrspace(10)* %0 to { i8*, {} addrspace(10)* } addrspace(10)*, !dbg !1054
%"'ipc23" = addrspacecast { i8*, {} addrspace(10)* } addrspace(10)* %"'ipc21" to { i8*, {} addrspace(10)* } addrspace(11)*, !dbg !1054
%"'ipc24" = addrspacecast { i8*, {} addrspace(10)* } addrspace(10)* %"'ipc22" to { i8*, {} addrspace(10)* } addrspace(11)*, !dbg !1054
%44 = addrspacecast { i8*, {} addrspace(10)* } addrspace(10)* %43 to { i8*, {} addrspace(10)* } addrspace(11)*, !dbg !1054
%45 = extractvalue [2 x {} addrspace(10)*] %"'", 0, !dbg !1054
%"'ipc29" = bitcast {} addrspace(10)* %45 to {} addrspace(10)** addrspace(10)*, !dbg !1054
%46 = extractvalue [2 x {} addrspace(10)*] %"'", 1, !dbg !1054
%"'ipc30" = bitcast {} addrspace(10)* %46 to {} addrspace(10)** addrspace(10)*, !dbg !1054
%47 = bitcast {} addrspace(10)* %0 to {} addrspace(10)** addrspace(10)*, !dbg !1054
%"'ipc31" = addrspacecast {} addrspace(10)** addrspace(10)* %"'ipc29" to {} addrspace(10)** addrspace(11)*, !dbg !1054
%"'ipc32" = addrspacecast {} addrspace(10)** addrspace(10)* %"'ipc30" to {} addrspace(10)** addrspace(11)*, !dbg !1054
%48 = addrspacecast {} addrspace(10)** addrspace(10)* %47 to {} addrspace(10)** addrspace(11)*, !dbg !1054
%"'ipl33" = load {} addrspace(10)**, {} addrspace(10)** addrspace(11)* %"'ipc31", align 8, !dbg !1054, !tbaa !49, !alias.scope !1040, !noalias !1041
%"'ipl34" = load {} addrspace(10)**, {} addrspace(10)** addrspace(11)* %"'ipc32", align 8, !dbg !1054, !tbaa !49, !alias.scope !1042, !noalias !1043
%49 = load {} addrspace(10)**, {} addrspace(10)** addrspace(11)* %48, align 8, !dbg !1054, !tbaa !49, !alias.scope !1031, !noalias !1034, !enzyme_type !51, !enzymejl_byref_BITS_VALUE !0, !enzymejl_source_type_Ptr\7BFloat64\7D !0, !enzyme_nocache !0
%"'ipg25" = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(11)* %"'ipc23", i64 0, i32 1, !dbg !1054
%"'ipg26" = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(11)* %"'ipc24", i64 0, i32 1, !dbg !1054
%50 = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(11)* %44, i64 0, i32 1, !dbg !1054
%"'ipl27" = load {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %"'ipg25", align 8, !dbg !1054, !tbaa !49, !alias.scope !1040, !noalias !1041
%"'ipl28" = load {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %"'ipg26", align 8, !dbg !1054, !tbaa !49, !alias.scope !1042, !noalias !1043
%51 = load {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %50, align 8, !dbg !1054, !tbaa !49, !alias.scope !1031, !noalias !1034, !enzyme_type !56, !enzymejl_source_type_Memory\7BFloat64\7D !0, !enzymejl_byref_MUT_REF !0
%52 = call {} addrspace(10)* addrspace(13)* @julia.gc_loaded({} addrspace(10)* %"'ipl27", {} addrspace(10)** %"'ipl33"), !dbg !1054
%53 = call {} addrspace(10)* addrspace(13)* @julia.gc_loaded({} addrspace(10)* %"'ipl28", {} addrspace(10)** %"'ipl34"), !dbg !1054
%54 = call "enzyme_type"="{[-1]:Pointer, [-1,-1]:Float@double}" {} addrspace(10)* addrspace(13)* @julia.gc_loaded({} addrspace(10)* noundef %51, {} addrspace(10)** noundef %49) #20, !dbg !1054
%55 = bitcast {} addrspace(10)* addrspace(13)* %54 to double addrspace(13)*, !dbg !1054
%56 = load double, double addrspace(13)* %55, align 8, !dbg !1054, !tbaa !58, !alias.scope !1056, !noalias !1059, !enzyme_type !63, !enzymejl_byref_BITS_VALUE !0, !enzymejl_source_type_Float64 !0
%57 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(13)* %54, i64 1, !dbg !1062
%58 = bitcast {} addrspace(10)* addrspace(13)* %57 to double addrspace(13)*, !dbg !1062
%59 = load double, double addrspace(13)* %58, align 8, !dbg !1062, !tbaa !58, !alias.scope !1056, !noalias !1059, !enzyme_type !63, !enzymejl_byref_BITS_VALUE !0, !enzymejl_source_type_Float64 !0
%60 = fadd double %56, %59, !dbg !1064
%.not2223 = icmp sgt i64 %23, 2, !dbg !1067
br i1 %.not2223, label %L77.i.preheader, label %julia_f6_43005_inner.exit, !dbg !1069
L77.i.preheader: ; preds = %L36.i
%61 = add i64 %23, -3, !dbg !1069
br label %L77.i, !dbg !1069
L77.i: ; preds = %L77.i, %L77.i.preheader
%iv = phi i64 [ 0, %L77.i.preheader ], [ %iv.next, %L77.i ]
%value_phi3.i24 = phi double [ %67, %L77.i ], [ %60, %L77.i.preheader ]
%iv.next = add nuw nsw i64 %iv, 1, !dbg !1070
%62 = add nuw nsw i64 %iv, 2, !dbg !1070
%63 = add nuw nsw i64 %62, 1, !dbg !1070
%64 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(13)* %54, i64 %62, !dbg !1072
%65 = bitcast {} addrspace(10)* addrspace(13)* %64 to double addrspace(13)*, !dbg !1072
%66 = load double, double addrspace(13)* %65, align 8, !dbg !1072, !tbaa !58, !alias.scope !1056, !noalias !1059
%67 = fadd double %value_phi3.i24, %66, !dbg !1073
%exitcond.not = icmp eq i64 %63, %23, !dbg !1067
br i1 %exitcond.not, label %julia_f6_43005_inner.exit.loopexit, label %L77.i, !dbg !1069
L99.i: ; preds = %L34.i
%_augmented35 = call fastcc { {} addrspace(10)*, double } @augmented_julia_mapreduce_impl_43034({} addrspace(10)* nocapture nofree readonly align 8 %0, [2 x {} addrspace(10)*] %"'", i64 signext 1, i64 signext %23), !dbg !1076
%subcache36 = extractvalue { {} addrspace(10)*, double } %_augmented35, 0, !dbg !1076
%68 = getelementptr inbounds { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }* %3, i32 0, i32 2, !dbg !1076
store {} addrspace(10)* %subcache36, {} addrspace(10)** %68, align 8, !dbg !1076
%69 = extractvalue { {} addrspace(10)*, double } %_augmented35, 1, !dbg !1076
br label %julia_f6_43005_inner.exit, !dbg !1078
julia_f6_43005_inner.exit.loopexit: ; preds = %L77.i
br label %julia_f6_43005_inner.exit, !dbg !1079
julia_f6_43005_inner.exit: ; preds = %julia_f6_43005_inner.exit.loopexit, %L99.i, %L36.i, %L15.i, %entry
%value_phi.i = phi double [ %39, %L15.i ], [ %69, %L99.i ], [ 0.000000e+00, %entry ], [ %60, %L36.i ], [ %67, %julia_f6_43005_inner.exit.loopexit ]
%70 = getelementptr inbounds { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }* %3, i32 0, i32 5
store double %value_phi.i, double* %70, align 8
%current_task1.i15 = getelementptr inbounds {}**, {}*** %pgcstack.i, i64 -14
%71 = fmul double %1, %1, !dbg !1080
%72 = fmul double %71, %value_phi.i, !dbg !1082
%73 = call {} addrspace(10)* @jl_alloc_genericmemory({} addrspace(10)* addrspacecast ({}* inttoptr (i64 5135345376 to {}*) to {} addrspace(10)*), i64 1), !dbg !1083
%74 = bitcast {} addrspace(10)* %73 to <{ i64, i8* }> addrspace(10)*, !dbg !1083
%75 = getelementptr inbounds <{ i64, i8* }>, <{ i64, i8* }> addrspace(10)* %74, i32 0, i32 1, !dbg !1083
%76 = load i8*, i8* addrspace(10)* %75, align 8, !dbg !1083
call void @llvm.memset.p0i8.i64(i8* align 8 %76, i8 0, i64 8, i1 false), !dbg !1083
%77 = insertvalue [2 x {} addrspace(10)*] undef, {} addrspace(10)* %73, 0, !dbg !1083
%78 = call {} addrspace(10)* @jl_alloc_genericmemory({} addrspace(10)* addrspacecast ({}* inttoptr (i64 5135345376 to {}*) to {} addrspace(10)*), i64 1), !dbg !1083
%79 = bitcast {} addrspace(10)* %78 to <{ i64, i8* }> addrspace(10)*, !dbg !1083
%80 = getelementptr inbounds <{ i64, i8* }>, <{ i64, i8* }> addrspace(10)* %79, i32 0, i32 1, !dbg !1083
%81 = load i8*, i8* addrspace(10)* %80, align 8, !dbg !1083
call void @llvm.memset.p0i8.i64(i8* align 8 %81, i8 0, i64 8, i1 false), !dbg !1083
%82 = insertvalue [2 x {} addrspace(10)*] %77, {} addrspace(10)* %78, 1, !dbg !1083
%83 = getelementptr inbounds { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }* %3, i32 0, i32 4, !dbg !1083
store [2 x {} addrspace(10)*] %82, [2 x {} addrspace(10)*]* %83, align 8, !dbg !1083
%84 = call noalias "enzyme_type"="{[-1]:Pointer, [-1,0]:Integer, [-1,1]:Integer, [-1,2]:Integer, [-1,3]:Integer, [-1,4]:Integer, [-1,5]:Integer, [-1,6]:Integer, [-1,7]:Integer, [-1,8]:Pointer, [-1,8,-1]:Float@double}" {} addrspace(10)* @jl_alloc_genericmemory({} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 5135345376 to {}*) to {} addrspace(10)*), i64 noundef 1) #21, !dbg !1083, !noalias !987
%"'ipc61" = bitcast {} addrspace(10)* %73 to { i64, {} addrspace(10)** } addrspace(10)*, !dbg !1086
%"'ipc62" = bitcast {} addrspace(10)* %78 to { i64, {} addrspace(10)** } addrspace(10)*, !dbg !1086
%85 = bitcast {} addrspace(10)* %84 to { i64, {} addrspace(10)** } addrspace(10)*, !dbg !1086
%"'ipc63" = addrspacecast { i64, {} addrspace(10)** } addrspace(10)* %"'ipc61" to { i64, {} addrspace(10)** } addrspace(11)*, !dbg !1086
%"'ipc64" = addrspacecast { i64, {} addrspace(10)** } addrspace(10)* %"'ipc62" to { i64, {} addrspace(10)** } addrspace(11)*, !dbg !1086
%86 = addrspacecast { i64, {} addrspace(10)** } addrspace(10)* %85 to { i64, {} addrspace(10)** } addrspace(11)*, !dbg !1086
%"'ipg65" = getelementptr inbounds { i64, {} addrspace(10)** }, { i64, {} addrspace(10)** } addrspace(11)* %"'ipc63", i64 0, i32 1, !dbg !1086
%"'ipg66" = getelementptr inbounds { i64, {} addrspace(10)** }, { i64, {} addrspace(10)** } addrspace(11)* %"'ipc64", i64 0, i32 1, !dbg !1086
%87 = getelementptr inbounds { i64, {} addrspace(10)** }, { i64, {} addrspace(10)** } addrspace(11)* %86, i64 0, i32 1, !dbg !1086
%"'ipc67" = bitcast {} addrspace(10)** addrspace(11)* %"'ipg65" to i8* addrspace(11)*, !dbg !1086
%"'ipc68" = bitcast {} addrspace(10)** addrspace(11)* %"'ipg66" to i8* addrspace(11)*, !dbg !1086
%88 = bitcast {} addrspace(10)** addrspace(11)* %87 to i8* addrspace(11)*, !dbg !1086
%"'ipl69" = load i8*, i8* addrspace(11)* %"'ipc67", align 8, !dbg !1086, !tbaa !17, !alias.scope !1088, !noalias !1091, !nonnull !0
%"'ipl70" = load i8*, i8* addrspace(11)* %"'ipc68", align 8, !dbg !1086, !tbaa !17, !alias.scope !1094, !noalias !1095, !nonnull !0
%89 = load i8*, i8* addrspace(11)* %88, align 8, !dbg !1086, !tbaa !17, !alias.scope !1096, !noalias !1097, !nonnull !0, !enzyme_type !51, !enzymejl_byref_BITS_VALUE !0, !enzymejl_source_type_Ptr\7BFloat64\7D !0, !enzyme_nocache !0
%90 = bitcast {}*** %current_task1.i15 to {}*, !dbg !1098
%"'mi58" = call noalias nonnull align 8 dereferenceable(24) "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,0,-1]:Float@double, [-1,8]:Pointer, [-1,8,0]:Integer, [-1,8,1]:Integer, [-1,8,2]:Integer, [-1,8,3]:Integer, [-1,8,4]:Integer, [-1,8,5]:Integer, [-1,8,6]:Integer, [-1,8,7]:Integer, [-1,8,8]:Pointer, [-1,8,8,-1]:Float@double, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, [-1,23]:Integer}" {} addrspace(10)* @julia.gc_alloc_obj({}* nonnull %90, i64 noundef 24, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 5136209872 to {}*) to {} addrspace(10)*)) #22, !dbg !1098
%91 = bitcast {} addrspace(10)* %"'mi58" to i8 addrspace(10)*, !dbg !1098
call void @llvm.memset.p10i8.i64(i8 addrspace(10)* nonnull dereferenceable(24) dereferenceable_or_null(24) %91, i8 0, i64 24, i1 false), !dbg !1098
%92 = insertvalue [2 x {} addrspace(10)*] undef, {} addrspace(10)* %"'mi58", 0, !dbg !1098
%"'mi59" = call noalias nonnull align 8 dereferenceable(24) "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,0,-1]:Float@double, [-1,8]:Pointer, [-1,8,0]:Integer, [-1,8,1]:Integer, [-1,8,2]:Integer, [-1,8,3]:Integer, [-1,8,4]:Integer, [-1,8,5]:Integer, [-1,8,6]:Integer, [-1,8,7]:Integer, [-1,8,8]:Pointer, [-1,8,8,-1]:Float@double, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, [-1,23]:Integer}" {} addrspace(10)* @julia.gc_alloc_obj({}* nonnull %90, i64 noundef 24, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 5136209872 to {}*) to {} addrspace(10)*)) #22, !dbg !1098
%93 = bitcast {} addrspace(10)* %"'mi59" to i8 addrspace(10)*, !dbg !1098
call void @llvm.memset.p10i8.i64(i8 addrspace(10)* nonnull dereferenceable(24) dereferenceable_or_null(24) %93, i8 0, i64 24, i1 false), !dbg !1098
%94 = insertvalue [2 x {} addrspace(10)*] %92, {} addrspace(10)* %"'mi59", 1, !dbg !1098
%95 = getelementptr inbounds { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }* %3, i32 0, i32 3, !dbg !1098
store [2 x {} addrspace(10)*] %94, [2 x {} addrspace(10)*]* %95, align 8, !dbg !1098
%"'ipc50" = bitcast {} addrspace(10)* %"'mi58" to { i8*, {} addrspace(10)* } addrspace(10)*, !dbg !1098
%"'ipc51" = bitcast {} addrspace(10)* %"'mi59" to { i8*, {} addrspace(10)* } addrspace(10)*, !dbg !1098
%"'ipc52" = addrspacecast { i8*, {} addrspace(10)* } addrspace(10)* %"'ipc50" to { i8*, {} addrspace(10)* } addrspace(11)*, !dbg !1098
%"'ipc53" = addrspacecast { i8*, {} addrspace(10)* } addrspace(10)* %"'ipc51" to { i8*, {} addrspace(10)* } addrspace(11)*, !dbg !1098
%".repack'ipg" = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(11)* %"'ipc52", i64 0, i32 0, !dbg !1098
%".repack'ipg55" = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(11)* %"'ipc53", i64 0, i32 0, !dbg !1098
store i8* %"'ipl69", i8* addrspace(11)* %".repack'ipg", align 8, !dbg !1098, !tbaa !49, !alias.scope !1099, !noalias !1102
store i8* %"'ipl70", i8* addrspace(11)* %".repack'ipg55", align 8, !dbg !1098, !tbaa !49, !alias.scope !1107, !noalias !1108
%".repack19'ipg" = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(11)* %"'ipc52", i64 0, i32 1, !dbg !1098
%".repack19'ipg54" = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(11)* %"'ipc53", i64 0, i32 1, !dbg !1098
store {} addrspace(10)* %73, {} addrspace(10)* addrspace(11)* %".repack19'ipg", align 8, !dbg !1098, !tbaa !49, !alias.scope !1099, !noalias !1102
store {} addrspace(10)* %78, {} addrspace(10)* addrspace(11)* %".repack19'ipg54", align 8, !dbg !1098, !tbaa !49, !alias.scope !1107, !noalias !1108
%"'ipc39" = bitcast {} addrspace(10)* %"'mi58" to i8 addrspace(10)*, !dbg !1098
%"'ipc40" = bitcast {} addrspace(10)* %"'mi59" to i8 addrspace(10)*, !dbg !1098
%"'ipc41" = addrspacecast i8 addrspace(10)* %"'ipc39" to i8 addrspace(11)*, !dbg !1098
%"'ipc42" = addrspacecast i8 addrspace(10)* %"'ipc40" to i8 addrspace(11)*, !dbg !1098
%"'ipg43" = getelementptr inbounds i8, i8 addrspace(11)* %"'ipc41", i64 16, !dbg !1098
%"'ipg44" = getelementptr inbounds i8, i8 addrspace(11)* %"'ipc42", i64 16, !dbg !1098
%"'ipc45" = bitcast i8 addrspace(11)* %"'ipg43" to i64 addrspace(11)*, !dbg !1098
%"'ipc46" = bitcast i8 addrspace(11)* %"'ipg44" to i64 addrspace(11)*, !dbg !1098
store i64 1, i64 addrspace(11)* %"'ipc45", align 8, !dbg !1098, !tbaa !30, !alias.scope !1099, !noalias !1102
store i64 1, i64 addrspace(11)* %"'ipc46", align 8, !dbg !1098, !tbaa !30, !alias.scope !1107, !noalias !1108
%"'ipc37" = bitcast i8* %"'ipl69" to {} addrspace(10)**, !dbg !1109
%"'ipc38" = bitcast i8* %"'ipl70" to {} addrspace(10)**, !dbg !1109
%96 = bitcast i8* %89 to {} addrspace(10)**, !dbg !1109
%97 = call {} addrspace(10)* addrspace(13)* @julia.gc_loaded({} addrspace(10)* %73, {} addrspace(10)** %"'ipc37"), !dbg !1112
%98 = call {} addrspace(10)* addrspace(13)* @julia.gc_loaded({} addrspace(10)* %78, {} addrspace(10)** %"'ipc38"), !dbg !1112
%99 = call "enzyme_type"="{[-1]:Pointer, [-1,-1]:Float@double}" {} addrspace(10)* addrspace(13)* @julia.gc_loaded({} addrspace(10)* noundef %84, {} addrspace(10)** noundef %96) #20, !dbg !1112
%100 = bitcast {} addrspace(10)* addrspace(13)* %99 to double addrspace(13)*, !dbg !1112
store double %72, double addrspace(13)* %100, align 8, !dbg !1112, !tbaa !58, !alias.scope !1113, !noalias !1116
%".fca.1.insert'ipiv" = insertvalue { double, {} addrspace(10)* } zeroinitializer, {} addrspace(10)* %"'mi58", 1, !dbg !1119
%101 = insertvalue [2 x { double, {} addrspace(10)* }] undef, { double, {} addrspace(10)* } %".fca.1.insert'ipiv", 0, !dbg !1119
%".fca.1.insert'ipiv72" = insertvalue { double, {} addrspace(10)* } zeroinitializer, {} addrspace(10)* %"'mi59", 1, !dbg !1119
%102 = insertvalue [2 x { double, {} addrspace(10)* }] %101, { double, {} addrspace(10)* } %".fca.1.insert'ipiv72", 1, !dbg !1119
%103 = getelementptr inbounds { { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { double, {} addrspace(10)* } }, { { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { double, {} addrspace(10)* } }* %2, i32 0, i32 1, !dbg !1119
store [2 x { double, {} addrspace(10)* }] %102, { double, {} addrspace(10)* }* %103, align 8, !dbg !1119
%104 = load { { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { double, {} addrspace(10)* } }, { { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { double, {} addrspace(10)* } }* %2, align 8, !dbg !1119
ret { { { [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], {} addrspace(10)*, i64, double, double, double, double* }, double, {} addrspace(10)*, [2 x {} addrspace(10)*], [2 x {} addrspace(10)*], double }, { double, {} addrspace(10)* } } %104, !dbg !1119
}
ERROR: LLVM error: augmented function failed verification (3)
Stacktrace:
[1] handle_error(reason::Cstring)
@ LLVM ~/.julia/packages/LLVM/2JPxT/src/core/context.jl:194 |
|
Gentle bump here, do you happen to know what this latest error means? Stored value type does not match pointer operand type! |
|
Mini-bump here, happy to take suggestions on test case number 6 |
|
@gdalle so I think the things needed to move the needle here are opening issues [that don't require this PR] for whatever failures exist (ideally as simple as possible) |
|
Opened the first one in #2514 |
|
@gdalle compilation errors are gone, but there are a lot of correctness errors currently in this branch. Can you try to look at and resolve them? We're getting closer. |
|
Thanks for the quick fixes, I'll take a look at correctness. It might be me incorrectly defining reference values, I'll double check with another autodiff backend |
|
gentle ping @gdalle |
|
@wsmoses I think the correctness errors are due to using Enzyme
struct MyMixedStruct
bar::Float64
foo::Vector{Float64}
end
shadow_result = Ref(MyMixedStruct(0.0, [0.0]))
dresult_dval = MyMixedStruct(1.0, [2.0])
Enzyme.Compiler.recursive_accumulate(shadow_result, Ref(dresult_dval))
The result is
julia> shadow_result
Base.RefValue{MyMixedStruct}(MyMixedStruct(1.0, [0.0]))
and I don't understand why the second field doesn't get incremented too. |
|
Perhaps it has to do with the |
|
@wsmoses small bump, would love some guidance on the incrementation of the shadow with |
|
recursive_accumulate is an internal function whose semantics only add up values in the top-level pointer data structure |
|
Here is the only user of the utility (and currently defines its necessary semantics): Line 480 in c889e43 |
|
I find it hard to deduce from one use and without documentation what that function is supposed to do. Could you please take a look at https://github.com/gdalle/Enzyme.jl/blob/21873f91f01c4e2a05d489575ce567b015fa9169/src/sugar.jl#L1431-L1434 and help me figure out if I'm using it right? |
|
I mean, it is an internal function, but yeah, you're using it for a purpose it was not designed for. You need to recursively accumulate beyond the first pointer depth, so that utility function does not apply; I think you'll need to make a different one. |
|
I don't know how to do that. |
|
The problem here is not limited to MixedDuplicated; it equally applies to Duplicated, for example something that returns Vector{Vector{Float64}}. |
|
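For illustration, here is a minimal sketch in plain Julia, independent of Enzyme, of what accumulating an adjoint into a shadow of type Vector{Vector{Float64}} could look like. The name `deep_add!` is hypothetical and not part of Enzyme's API, and the sketch assumes both arguments share the same nesting and shapes.

```julia
# Hypothetical helper (not an Enzyme function): add `src` into `dst` in place,
# descending through nested arrays until arrays of floats are reached.
function deep_add!(dst::AbstractArray{<:AbstractFloat}, src::AbstractArray{<:AbstractFloat})
    dst .+= src  # leaf case: elementwise in-place accumulation
    return dst
end

function deep_add!(dst::AbstractArray, src::AbstractArray)
    # container case: recurse into each element; eachindex checks matching indices
    for i in eachindex(dst, src)
        deep_add!(dst[i], src[i])
    end
    return dst
end

shadow = [[0.0, 0.0], [0.0]]
adjoint = [[1.0, 2.0], [3.0]]
deep_add!(shadow, adjoint)
# shadow is now [[1.0, 2.0], [3.0]]
```

An accumulation that stops at the outer vector would leave the inner vectors untouched, which is the failure mode described above.
|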
@vchuravy any idea on how to write the right variant to |
|
I now finally understand @wsmoses's objection to #1852, which was essentially trying to implement deep_recursive_accumulate. The challenge here is the treatment of immutable values, and it feels like we are re-implementing https://github.com/JuliaObjects/Accessors.jl |
|
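To make the difficulty with immutable values concrete, here is a rough sketch in plain Julia, under the assumption that arrays are updated in place while immutable structs are rebuilt field by field and written back by the caller. `accumulate_value` is a hypothetical name, not an Enzyme function, and the sketch only covers floats, float arrays, and immutable structs with a default constructor.

```julia
struct MyMixedStruct
    bar::Float64
    foo::Vector{Float64}
end

# Hypothetical helper (not part of Enzyme): return the accumulated value.
# Floats are immutable, so a new value is returned.
accumulate_value(x::AbstractFloat, dx::AbstractFloat) = x + dx

# Arrays of floats are mutable, so they are updated in place and returned.
function accumulate_value(x::AbstractArray{<:AbstractFloat}, dx::AbstractArray{<:AbstractFloat})
    x .+= dx
    return x
end

# Fallback: immutable structs are rebuilt from their accumulated fields.
function accumulate_value(x, dx)
    T = typeof(x)
    (typeof(dx) === T && isstructtype(T) && !ismutabletype(T)) ||
        throw(ArgumentError("unsupported type $T in this sketch"))
    fields = ntuple(i -> accumulate_value(getfield(x, i), getfield(dx, i)), fieldcount(T))
    return T(fields...)  # assumes the default positional constructor exists
end

shadow_result = Ref(MyMixedStruct(0.0, [0.0]))
dresult_dval = MyMixedStruct(1.0, [2.0])
# The immutable struct cannot be mutated, so the rebuilt value is stored back.
shadow_result[] = accumulate_value(shadow_result[], dresult_dval)
# shadow_result[].bar == 1.0 and shadow_result[].foo == [2.0]
```

The write-back step is where the immutable case differs from the mutable one: every immutable layer between the Ref and the leaf has to be reconstructed, which is the kind of work Accessors.jl is designed for.
|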
What does it mean for the current PR? I'd rather get something merged with a clean error message for the cases we don't handle than keep the hacky version inside DI. Since it is a new feature I think it's okay to start small and then improve? |
|
Gentle bump here, if no one can lend a hand on recursive accumulation there is no path to getting this merged. |
Fixes #1853
Todo:
Related: