Add rev mode support for memref.alloca_scope#2814

Open
vimarsh6739 wants to merge 11 commits into memref-alloca from alloca_scope

vimarsh6739 commented May 7, 2026

Re-opening the alloca_scope PR in the repo

I don't think this will work if the alloca_scope has a memref.alloca inside it (caching it will be unsafe across the 2 alloca_scope ops that are emitted).

Something like this will fail:

func.func @foo(%x : f64) -> f64 {
    %out = memref.alloca_scope -> (f64) {
      %buf = memref.alloca() : memref<f64>
      memref.store %x, %buf[] : memref<f64>
      %y = memref.load %buf[] : memref<f64>
      memref.alloca_scope.return %y : f64
    }
    return %out : f64
}

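func.func @dfoo(%x : f64, %dout : f64) -> f64 {
    %dx = enzyme.autodiff @foo(%x, %dout) { activity=[#enzyme<activity enzyme_active>], ret_activity=[#enzyme<activity enzyme_activenoneed>] } : (f64, f64) -> f64
    return %dx : f64
}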

The primal is basically:

out = alloca_scope{
buf[] = x
y = buf[]
return y
}

Reverse mode should give us:

dy += dout
d(buf) += dy // load
dx += d(buf) // store
d(buf) = 0 //alloca
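
To sanity-check the adjoint statements above, here is a minimal Python sketch of the primal and the hand-written reverse sweep. It is only a toy model: a one-element list stands in for the `memref<f64>`, and the function names `foo` and `foo_reverse` are made up for illustration, not anything Enzyme emits.

```python
def foo(x):
    """Primal: write x into a scratch buffer, read it back, return it."""
    buf = [0.0]   # stands in for memref.alloca
    buf[0] = x    # memref.store
    y = buf[0]    # memref.load
    return y


def foo_reverse(x, dout):
    """Hand-written reverse sweep mirroring the adjoint statements above."""
    dbuf = [0.0]  # shadow of buf, zero-initialized
    dx = 0.0
    dy = 0.0

    dy += dout        # terminator
    dbuf[0] += dy     # adjoint of the load
    dx += dbuf[0]     # adjoint of the store
    dbuf[0] = 0.0     # reset the shadow
    return dx
```

Since `foo` is the identity on `x`, the expected gradient is simply `dout`, which is what the sweep returns.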

We want this to lower to something like this (need to discuss what the caching should look like, since I don't think we can just push the memref to the cache here):

//forward
memref.alloca_scope {
    %buf = memref.alloca()
    %dbuf = memref.alloca()
    zero %dbuf

    enzyme.push %cache_store %dbuf
    memref.store %x, %buf

    enzyme.push %cache_load %dbuf
    %y = memref.load %buf

    memref.alloca_scope.return %y
}

//reverse
memref.alloca_scope {
    // terminator
    dy += dout
    // load
    %dbuf1 = enzyme.pop %cache_load
    %old1 = memref.load %dbuf1
    memref.store (%old1 + dy), %dbuf1
    // store
    %dbuf2 = enzyme.pop %cache_store
    %old2 = memref.load %dbuf2
    dx += %old2

    memref.store 0, %dbuf2
}

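The issue is that the cache push/pop will not be valid across the 2 alloca_scopes here: %dbuf is stack-allocated inside the forward alloca_scope and is released when that scope exits, so the memref popped in the reverse alloca_scope refers to memory that is no longer live.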
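
The lifetime problem can be reproduced in a small Python model. This is a sketch under stated assumptions, not Enzyme code: `Buffer`, `AllocaScope`, `forward`, `reverse`, and `demo` are hypothetical names, two Python lists play the role of `%cache_store` and `%cache_load`, and a use after scope exit raises an exception so the bug is visible (in real lowered code it would be an invalid stack access instead).

```python
class Buffer:
    """Toy stack buffer that is only valid while its owning scope is open."""

    def __init__(self):
        self.value = 0.0   # zero-initialized, like the shadow after "zero %dbuf"
        self.alive = True

    def _check(self):
        if not self.alive:
            raise RuntimeError("buffer used after its alloca_scope exited")

    def load(self):
        self._check()
        return self.value

    def store(self, value):
        self._check()
        self.value = value


class AllocaScope:
    """Context manager standing in for memref.alloca_scope."""

    def __init__(self):
        self._buffers = []

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Leaving the scope releases every buffer allocated inside it.
        for buf in self._buffers:
            buf.alive = False
        return False

    def alloca(self):
        buf = Buffer()
        self._buffers.append(buf)
        return buf


def forward(x, cache_store, cache_load):
    with AllocaScope() as scope:
        buf = scope.alloca()
        dbuf = scope.alloca()
        cache_store.append(dbuf)   # enzyme.push %cache_store %dbuf
        buf.store(x)
        cache_load.append(dbuf)    # enzyme.push %cache_load %dbuf
        y = buf.load()
    return y


def reverse(dout, cache_store, cache_load):
    with AllocaScope():
        dy = dout
        dbuf1 = cache_load.pop()          # enzyme.pop %cache_load
        dbuf1.store(dbuf1.load() + dy)    # fails: dbuf1 died with the forward scope
        dbuf2 = cache_store.pop()         # enzyme.pop %cache_store
        dx = dbuf2.load()
        dbuf2.store(0.0)
    return dx


def demo():
    """Run forward then reverse; return the primal result and the error text, if any."""
    cache_store, cache_load = [], []
    y = forward(3.0, cache_store, cache_load)
    try:
        reverse(2.0, cache_store, cache_load)
    except RuntimeError as err:
        return y, str(err)
    return y, None
```

Calling `demo()` runs the forward pass normally and then fails at the first shadow access in the reverse pass, which is the situation described above.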

vimarsh6739 marked this pull request as draft May 7, 2026 19:54
xys-syx force-pushed the alloca_scope branch 2 times, most recently from ad79e42 to 1f42ce5 May 14, 2026 05:14

xys-syx commented May 14, 2026

For %dbuf's lifetime, I feel like there are two ways to fix it:

  • Hoist the shadow allocation to a block that dominates both the forward and reverse blocks.
  • Follow scf.for and affine.for, which build a new region for the reverse loop and emit everything into its body: stop pushing the shadow into the cache, load and store the adjoints inside the reverse builder, and modify the cache protocol accordingly.
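
The two options listed above can be contrasted in a toy Python model. This is only a sketch: the helpers (`AllocaScope`, `load`, `store`, `adjoint_sweep`, `grad_hoisted`, `grad_reverse_local`) are hypothetical, and the second variant is one possible reading of the proposal (the reverse scope owns its own shadow and nothing is pushed to a cache), not a description of what the PR implements.

```python
class AllocaScope:
    """Context manager standing in for memref.alloca_scope."""

    def __init__(self):
        self._live = []

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Buffers allocated in this scope die when it exits.
        for buf in self._live:
            buf["alive"] = False
        return False

    def alloca(self):
        buf = {"value": 0.0, "alive": True}
        self._live.append(buf)
        return buf


def load(buf):
    if not buf["alive"]:
        raise RuntimeError("buffer used after its alloca_scope exited")
    return buf["value"]


def store(buf, value):
    if not buf["alive"]:
        raise RuntimeError("buffer used after its alloca_scope exited")
    buf["value"] = value


def adjoint_sweep(dbuf, dout):
    """Reverse statements for: buf[] = x; y = buf[]; return y."""
    dy = dout                        # terminator
    store(dbuf, load(dbuf) + dy)     # adjoint of the load
    dx = load(dbuf)                  # adjoint of the store
    store(dbuf, 0.0)                 # reset the shadow
    return dx


def grad_hoisted(x, dout):
    """Option 1: the shadow lives in a scope enclosing forward and reverse."""
    with AllocaScope() as outer:
        dbuf = outer.alloca()        # hoisted shadow allocation
        with AllocaScope() as fwd:
            buf = fwd.alloca()
            store(buf, x)
            y = load(buf)
        with AllocaScope():
            dx = adjoint_sweep(dbuf, dout)
    return y, dx


def grad_reverse_local(x, dout):
    """Option 2 (one reading): the reverse scope owns its shadow, no cache."""
    with AllocaScope() as fwd:
        buf = fwd.alloca()
        store(buf, x)
        y = load(buf)
    with AllocaScope() as rev:
        dbuf = rev.alloca()          # shadow created inside the reverse region
        dx = adjoint_sweep(dbuf, dout)
    return y, dx
```

In both variants the shadow is live for every access in the reverse sweep, so neither trips the lifetime check, and both return the primal value together with `dx == dout` for this identity-like example.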

xys-syx changed the base branch from main to memref-alloca May 21, 2026 06:19
xys-syx marked this pull request as ready for review May 21, 2026 06:39
xys-syx requested a review from wsmoses May 21, 2026 06:40