Commit df4f24d
Add V cache write-back with interleaved KV cache layout
Extend the KV cache prefill design to write both K and V caches to DDR
during flash attention computation. Uses a single CacheWB channel with
an interleaved KV cache layout [K_c0, V_c0, K_c1, V_c1, ...] where
both K and V data are staged through kwb_buf before DMA transfer.
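The interleaved layout described above can be sketched in plain Python. This is only an illustration of the slot ordering [K_c0, V_c0, K_c1, V_c1, ...]; the function names, the `slot_index` helper, and the tile contents are hypothetical and are not taken from the kv_cache_prefill example itself.

```python
def interleave_kv(k_chunks, v_chunks):
    """Pack per-chunk K and V tiles as [K_c0, V_c0, K_c1, V_c1, ...]."""
    if len(k_chunks) != len(v_chunks):
        raise ValueError("K and V must have the same number of chunks")
    cache = []
    for k_c, v_c in zip(k_chunks, v_chunks):
        cache.append(k_c)  # even slot: K tile for this chunk
        cache.append(v_c)  # odd slot: V tile for this chunk
    return cache


def slot_index(chunk, is_value):
    """Slot of a tile in the interleaved cache (hypothetical helper)."""
    return 2 * chunk + (1 if is_value else 0)


def deinterleave_kv(cache):
    """Recover the separate K and V chunk lists from the interleaved cache."""
    return cache[0::2], cache[1::2]


# Hypothetical tiles: three chunks, each tile a short list of numbers.
k_chunks = [[10, 11], [20, 21], [30, 31]]
v_chunks = [[-10, -11], [-20, -21], [-30, -31]]
cache = interleave_kv(k_chunks, v_chunks)
```

Under this ordering, a reader that wants only K or only V walks the cache with a stride of two slots, which is what `deinterleave_kv` does.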
Key design choices:
- A single CacheWB channel avoids shim S2MM channel exhaustion (no
  packet switching needed).
- A shared kwb_buf staging buffer prevents a DMA race between the
  CacheWB read and the V2L1 write on the v buffer.
- An scf.for loop in the launch body enables compiler BD folding,
  preventing BD exhaustion at large sequence lengths (tested up to
  12 heads x 4096).
Compiler changes (AIRToAIEPass.cpp):
- Fix packet BD attribute lookup for L1-to-L3 dma_packet channels
  (getExistingPacketFlowOpFromDevice now searches both flow maps).
- Place the outbound MM2S lock acquire before the channel put and the
  release after it, enabling an interleaved lock pattern for multiple
  puts that share the same staging buffer.
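The acquire-before-put and release-after-put ordering for several puts through one staging buffer can be modelled conceptually with two semaphores. This is a host-side sketch under stated assumptions, not the lock code that AIRToAIEPass generates and not AIE lock semantics: the names `run_writeback`, `staging`, `empty`, and `full` are hypothetical, and a Python list stands in for DDR.

```python
import threading


def run_writeback(k_chunks, v_chunks):
    """Model K and V tiles passing through one shared staging slot."""
    staging = [None]                 # stands in for the shared kwb_buf
    empty = threading.Semaphore(1)   # slot is free for the producer to fill
    full = threading.Semaphore(0)    # slot holds a tile ready to send out
    ddr = []                         # stands in for the cache in DDR

    def outbound_dma():
        # One transfer per staged tile: a K and a V tile for every chunk.
        for _ in range(2 * len(k_chunks)):
            full.acquire()           # acquire before the put
            ddr.append(staging[0])   # the put: move the tile out
            empty.release()          # release after the put

    dma = threading.Thread(target=outbound_dma)
    dma.start()
    for k_c, v_c in zip(k_chunks, v_chunks):
        for tile in (k_c, v_c):      # K first, then V, through the same slot
            empty.acquire()          # wait until the previous tile has left
            staging[0] = tile
            full.release()           # hand the slot to the outbound side
    dma.join()
    return ddr


result = run_writeback(["K0", "K1"], ["V0", "V1"])
```

Because each put is bracketed by its own acquire and release, the V tile cannot overwrite the staging slot until the K tile has been sent, and the tiles arrive in interleaved order.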
Performance: 12 heads x 4096 seq_len achieves 2460 peak GFLOPS with
zero overhead vs K-only writeback.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent bc95ac7 commit df4f24d
4 files changed
Lines changed: 344 additions & 149 deletions
File tree
- mlir
- lib/Conversion
- test/Conversion/AIRToAIE
- programming_examples/flash_attention/kv_cache_prefill