Commit 367a9bf

Add documentation on dCache
1 parent 92475ad commit 367a9bf

File tree

8 files changed

+283
-66
lines changed

README.md

Lines changed: 1 addition & 1 deletion
@@ -27,8 +27,8 @@ FRISCV is a SystemVerilog implementation of the [RISCV ISA](https://riscv.org):
2727
- Support global and software interrupts
2828
- Clint extension
2929
- In-order execution
30-
- Instruction & data cache
3130
- AXI4-lite for instruction and data bus
31+
- Instruction & data cache units
3232

3333
The core is [compliant](./test/riscv-tests/README.md) with the official RISCV
3434
testsuite.

doc/architecture.md

Lines changed: 118 additions & 43 deletions
@@ -1,5 +1,7 @@
11
# Architecture
22

3+
All drawings were created with draw.io; the source document is stored in the `doc` folder.
4+
35
## Interfaces & Communication Protocol
46

57
The core connects its modules internally following the AMBA philosophy. AMBA proposes a simple way to connect
@@ -136,6 +138,7 @@ Features:
136138
- Direct-mapped placement policy
137139
- Parametrizable cache depth
138140
- Parametrizable cache line width
141+
- Parametrizable number of outstanding requests
139142
- Software-based flush control with FENCE.i instruction
140143
- Transparent operation for user, no need of any kind of management
141144
- Cache prefetch can be activated in the internal memory controller to enhance efficiency
@@ -174,6 +177,75 @@ addr = | tag | index | offset |
174177
- tag: the remaining MSBs, the part helping to determine a cache hit/miss
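
The `tag | index | offset` split above can be sketched in a few lines of Python. This is only an illustrative model, not the RTL: `line_bytes` and `num_lines` are hypothetical names standing in for the cache line width and cache depth parameters, and both are assumed to be powers of two.

```python
def split_address(addr, line_bytes, num_lines):
    """Split a byte address into (tag, index, offset) for a direct-mapped cache.

    line_bytes and num_lines are hypothetical parameter names (cache line
    width in bytes and number of cache lines); both must be powers of two.
    """
    for value in (line_bytes, num_lines):
        if value <= 0 or value & (value - 1):
            raise ValueError("line_bytes and num_lines must be powers of two")
    offset_bits = line_bytes.bit_length() - 1
    index_bits = num_lines.bit_length() - 1
    offset = addr & (line_bytes - 1)                  # LSBs: byte within the line
    index = (addr >> offset_bits) & (num_lines - 1)   # selects the cache line
    tag = addr >> (offset_bits + index_bits)          # remaining MSBs
    return tag, index, offset
```

A hit would then be a stored tag at `index` matching `tag`; any other value (or an invalid line) is a miss.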
175178

176179

180+
#### Data Cache
181+
182+
<p align="center"> <img src="./assets/dCache-top.png"> </p>
183+
184+
The data cache (dCache) relies on the same read flow as the iCache. The differences are that the dCache
185+
implements a write flow and manages read reordering.
186+
187+
Features:
188+
189+
- Direct-mapped placement policy
190+
- Write-through policy for write management
191+
- Parametrizable cache depth
192+
- Parametrizable cache line width
193+
- Parametrizable number of outstanding requests
194+
- Configurable IO region to manage uncacheable requests
195+
- Cache prefetch can be activated in the internal memory controller to enhance efficiency
196+
- AXI4-lite slave interface to fetch an instruction
197+
- AXI4 master interface to read/write the system memory
198+
199+
##### Write Path
200+
201+
<p align="center"> <img src="./assets/dCache-pusher.png"> </p>
202+
203+
The Pusher stage manages the write path, updating the cache blocks if the address to write is cached
204+
and issuing the write request to the memory. It can buffer a number of write requests to improve
205+
performance; this number is configurable with a parameter. If a write request targets an IO
206+
region, the application indicates with AWCACHE that the request is not cacheable and needs to be written
207+
directly to the system memory and not to the cache blocks.
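
The write-through behavior described above can be modeled in Python as a minimal sketch. It is a behavioral illustration under stated assumptions, not the Pusher implementation: the class, its attributes and the byte-wide `write` method are all hypothetical, the `cacheable` flag stands in for the AWCACHE attribute, and write buffering is left out.

```python
class WriteThroughCache:
    """Behavioral model of a direct-mapped, write-through cache (hypothetical)."""

    def __init__(self, num_lines, line_bytes):
        self.num_lines = num_lines
        self.line_bytes = line_bytes
        self.tags = [None] * num_lines                       # None = invalid line
        self.data = [bytearray(line_bytes) for _ in range(num_lines)]
        self.memory = {}                                     # byte address -> byte

    def _locate(self, addr):
        offset = addr % self.line_bytes
        index = (addr // self.line_bytes) % self.num_lines
        tag = addr // (self.line_bytes * self.num_lines)
        return tag, index, offset

    def write(self, addr, value, cacheable=True):
        """Write one byte; return True if a cache block was also updated."""
        # Write-through: the system memory is always written.
        self.memory[addr] = value & 0xFF
        tag, index, offset = self._locate(addr)
        # The cache block is updated only on a hit to a cacheable address.
        if cacheable and self.tags[index] == tag:
            self.data[index][offset] = value & 0xFF
            return True
        return False
```

In this sketch a write miss does not allocate a line, which matches the description of updating the blocks only "if the address to write is cached".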
208+
209+
##### Read Path
210+
211+
<p align="center"> <img src="./assets/dCache-read-path.png"> </p>
212+
213+
If the read path needs to manage IO region (uncacheable) reads, it multiplexes the Block-Fetcher and
214+
IO-Fetcher modules based on the ARCACHE attribute. The IO-Fetcher is always serviced first when issuing
215+
requests to the memory controller.
216+
217+
218+
##### Read Out-Of-Order Management
219+
220+
<p align="center"> <img src="./assets/dCache-ooo.png"> </p>
221+
222+
A read request can target either an IO region or a cacheable region; the application needs to
223+
indicate this information with ARCACHE. The Block-Fetcher stage (same module as in the iCache) manages the
224+
read requests in the cache blocks, while the IO-Fetcher routes the IO requests directly to the memory
225+
through the memory controller. Because read requests can come back out of order, the latency
226+
differing between cache blocks and memory, the dCache uses one more module to manage that. The OoO Manager
227+
module substitutes ARID to make it unique for each read request and uses these IDs to reorder the read
228+
data completions returned to the application. This stage can be deactivated if not necessary, i.e. if the
229+
application can manage the reordering by itself or if it doesn't target an IO region (the Block-Fetcher always
230+
completes requests in order).
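
The reordering idea can be sketched in Python as follows. This is a simplified behavioral model under assumptions, not the OoO Manager RTL: the class and method names are hypothetical, the internal tag is an unbounded counter (hardware would use a bounded tag pool), and data resizing is ignored.

```python
from collections import deque


class ReorderManager:
    """Hypothetical model: tag each read uniquely, release completions in issue order."""

    def __init__(self):
        self._next_tag = 0
        self._pending = deque()   # internal tags, in issue order
        self._orig_id = {}        # internal tag -> original ARID
        self._done = {}           # internal tag -> read data received so far

    def issue(self, arid):
        """Substitute the application's ARID with a unique internal tag."""
        tag = self._next_tag
        self._next_tag += 1
        self._pending.append(tag)
        self._orig_id[tag] = arid
        return tag

    def complete(self, tag, data):
        """Record a completion; return the (ARID, data) pairs now releasable in order."""
        if tag not in self._orig_id:
            raise KeyError("unknown or already released tag")
        self._done[tag] = data
        released = []
        # Release only from the head of the issue queue to preserve ordering.
        while self._pending and self._pending[0] in self._done:
            head = self._pending.popleft()
            released.append((self._orig_id.pop(head), self._done.pop(head)))
        return released
```

A completion that arrives early is simply held until every older request has completed, at which point both are returned together with their original ARID restored.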
231+
232+
The module also manages the data interface resizing, the cache block and memory interface being
233+
always wider than XLEN (32 or 64 bits).
234+
235+
##### AXI4 Ordering Rules
236+
237+
AXI doesn't provide advanced ordering rules and instructs the user to first issue a sequence of
238+
writes, then a sequence of reads only once all write completions have been received (and vice versa).
239+
Internally, the cache could still be processing or waiting for write requests while the application is
240+
already able to issue a new series of R/W requests. The cache manages that situation by monitoring all
241+
read and write modules and blocking any situation that could lead to a read/write collision and data
242+
corruption.
243+
244+
However, the read and write paths always buffer requests in FIFOs to avoid slowing down the
245+
application. Only the processing of the requests is stopped; the communication with
246+
the cache remains active as long as the FIFOs are not full.
247+
248+
177249
### CSR Unit
178250

179251
The core implements in a dedicated module the supported registers described in the ISA manual volume
@@ -192,6 +264,9 @@ The core implements the following CSR registers into the dedicated module:
192264
- mepc (RW)
193265
- mcause (RW)
194266
- mtval (RW)
267+
- rdcycle (RO)
268+
- rdtime (RO)
269+
- rdinstret (RO)
195270

196271
Next CSRs are available as a memory-mapped peripheral:
197272

@@ -220,25 +295,25 @@ The core always handles in its clock domain the interrupts by synchronizing them
220295
FFDs.
221296

222297

223-
##### Register 0: MSIP Output [RW] - Address 0x0
298+
#### MSIP Output [RW] - Address 0x0
224299

225300
Output software interrupt MSIP to trigger another core (1 bit)
226301

227-
##### Register 1: MTIME LSB [RW] - Address 0x4
302+
#### MTIME LSB [RW] - Address 0x4
228303

229-
MTIME CSR, bits 0 to 31
304+
MTIME CSR, `bit 31:0`
230305

231-
##### Register 1: MTIME MSB [RW] - Address 0x8
306+
#### MTIME MSB [RW] - Address 0x8
232307

233-
MTIME CSR, bits 32 to 63
308+
MTIME CSR, `bit 63:32`
234309

235-
##### Register 1: MTIMECMP LSB [RW] - Address 0xC
310+
#### MTIMECMP LSB [RW] - Address 0xC
236311

237-
MTIMECMP CSR, bits 0 to 31
312+
MTIMECMP CSR, `bit 31:0`
238313

239-
##### Register 1: MTIMECMP MSB [RW] - Address 0x10
314+
#### MTIMECMP MSB [RW] - Address 0x10
240315

241-
MTIMECMP CSR, bits 32 to 63
316+
MTIMECMP CSR, `bit 63:32`
242317

243318

244319
### IO Peripherals
@@ -251,11 +326,11 @@ an APB interconnect
251326

252327
The GPIOs are bound behind two registers:
253328

254-
##### Register 0: Outputs [RW] - Address 0x0
329+
##### OUTPUTS [RW] - Address 0x0
255330

256331
XLEN wide general purpose outputs
257332

258-
##### Register 1: Inputs [RW] - Address 0x4
333+
##### INPUTS [RW] - Address 0x4
259334

260335
XLEN wide general purpose inputs
261336

@@ -266,70 +341,70 @@ Reading and writing a GPIOs' register is never blocking.
266341

267342
The UART uses a few IOs:
268343

269-
- rx: serial input, data from an external transmitter
270-
- tx: serial output, data to an external receiver
271-
- rts: back-pressure flag to indicate the core can't receive anymore data
272-
- cts: back-pressure flag to indicate the external receiver can't receive data anymore
344+
- `rx`: serial input, data from an external transmitter
345+
- `tx`: serial output, data to an external receiver
346+
- `rts`: back-pressure flag to indicate the core can't receive any more data
347+
- `cts`: back-pressure flag to indicate the external receiver can't receive data anymore
273348

274349
The UART uses a FIFO to store data to transmit, and another to store data received. If the FIFOs are
275350
full, the UART can't receive any more data and raises the RTS flag, or can't transmit any more and
276351
blocks the APB bus until the receiver deasserts its CTS flag.
277352

278-
The UART owns few registers. Any attempt to write in a read-only (RO) register or a reserved field
353+
The UART owns a few registers. Any attempt to write in a read-only (`RO`) register or a reserved field
279354
will have no effect and can change neither the register content nor the engine behavior. Read-write
280-
(RW) registers can be written partially by setting properly the WSTRB signal. A read in a write-only
281-
(WO) register is not garanteed to return a valid value written previously.
355+
(`RW`) registers can be written partially by properly setting the WSTRB signal. A read in a write-only
356+
(`WO`) register is not guaranteed to return a valid value written previously.
282357

283358
If a transfer (RX or TX) is active and the enable bit is set back to 0, the transfer will
284359
terminate only after the complete frame transmission.
285360

286361

287-
##### Register 0: Control and Status [RW/RO] - Address 0x0
362+
##### CONTROL AND STATUS [RW/RO] - Address 0x0
288363

289-
- bit 0 : Enable the UART engine (both RX and TX) [RW]
290-
- bit 1 : Loopback mode, every received data will be stored in RX FIFO and forwarded back to TX [RW]
291-
- bit 2 : Enable parity bit [RW]
292-
- bit 3 : 0 for even parity, 1 for odd parity [RW]
293-
- bit 4 : 0 for one stop bit, 1 for two stop bits [RW]
294-
- bit 7:5 : Reserved
295-
- bit 8 : Busy flag, the UART engine is processing (RX or TX) [RO]
296-
- bit 9 : TX FIFO is empty [RO]
297-
- bit 10 : TX FIFO is full [RO]
298-
- bit 11 : RX FIFO is empty [RO]
299-
- bit 12 : RX FIFO is full [RO]
300-
- bit 13 : UART RTS, flagging it can't receive anymore data [RO]
301-
- bit 14 : UART CTS, flagging it can't send anymore data [RO]
302-
- bit 15 : Parity error of the last RX transaction [RO]
303-
- bit 31:16 : Reserved
364+
- `Bit 0` : Enable the UART engine (both RX and TX) [RW]
365+
- `Bit 1` : Loopback mode, every received data will be stored in RX FIFO and forwarded back to TX [RW]
366+
- `Bit 2` : Enable parity bit [RW]
367+
- `Bit 3` : 0 for even parity, 1 for odd parity [RW]
368+
- `Bit 4` : 0 for one stop bit, 1 for two stop bits [RW]
369+
- `Bit 7:5` : Reserved
370+
- `Bit 8` : Busy flag, the UART engine is processing (RX or TX) [RO]
371+
- `Bit 9` : TX FIFO is empty [RO]
372+
- `Bit 10` : TX FIFO is full [RO]
373+
- `Bit 11` : RX FIFO is empty [RO]
374+
- `Bit 12` : RX FIFO is full [RO]
375+
- `Bit 13` : UART RTS, flagging it can't receive any more data [RO]
376+
- `Bit 14` : UART CTS, flagging it can't send any more data [RO]
377+
- `Bit 15` : Parity error of the last RX transaction [RO]
378+
- `Bit 31:16` : Reserved
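
The bit layout of the control and status register can be decoded with a small Python helper. It is a sketch only: the bit positions follow the list above, while the flag names and the function name are hypothetical choices made for the example.

```python
# Bit positions from the CONTROL AND STATUS register description; names are hypothetical.
UART_STATUS_FLAGS = {
    "enable": 0,
    "loopback": 1,
    "parity_enable": 2,
    "parity_odd": 3,
    "two_stop_bits": 4,
    "busy": 8,
    "tx_empty": 9,
    "tx_full": 10,
    "rx_empty": 11,
    "rx_full": 12,
    "rts": 13,
    "cts": 14,
    "parity_error": 15,
}


def decode_uart_status(word):
    """Return a dict mapping each flag name to True/False for a 32-bit register value."""
    return {name: bool((word >> bit) & 1) for name, bit in UART_STATUS_FLAGS.items()}
```

For instance, software could poll `decode_uart_status(value)["tx_full"]` before pushing a byte, rather than letting the APB write block.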
304379

305380

306-
##### Register 1: UART Clock Divider [RW] - Address 0x4
381+
##### UART CLOCK DIVIDER [RW] - Address 0x4
307382

308383
The number of CPU core cycles to divide down to get the UART data bit rate (baud rate).
309384

310-
- Bit 15:0 : Clock divider
311-
- Bit 31:16 : Reserved
385+
- `Bit 15:0` : Clock divider
386+
- `Bit 31:16` : Reserved
312387

313388
An update during an ongoing operation will certainly compromise the transfer integrity and
314389
possibly make the UART engine unstable. The user is advised to configure the baud rate during
315390
start-up and be sure the engine is disabled before changing this value.
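
As a rough illustration, the divider value can be derived from the core clock frequency and the target baud rate. The Python sketch below assumes the register holds the number of core cycles per bit directly (no oversampling factor and no minus-one encoding); that assumption should be checked against the RTL before use. The function name is hypothetical.

```python
def uart_clock_divider(core_clk_hz, baud_rate):
    """Return the clock divider for a given core clock and baud rate.

    Assumes the register stores core cycles per bit as-is, which is an
    assumption of this sketch, not a statement about the hardware.
    """
    if core_clk_hz <= 0 or baud_rate <= 0:
        raise ValueError("frequencies must be positive")
    divider = (core_clk_hz + baud_rate // 2) // baud_rate   # round to nearest
    if not 1 <= divider <= 0xFFFF:                          # field is Bit 15:0
        raise ValueError("divider does not fit in the 16-bit field")
    return divider
```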
316391

317-
##### Register 2: TX FIFO [WO] - Address 0x8
392+
##### TX FIFO [WO] - Address 0x8
318393

319394
Push data into TX FIFO. Writing into this register will block the APB write request if TX FIFO is
320395
full, until the engine transmits a new word.
321396

322-
- Bit 7:0 : data to write
323-
- Bit 31:8 : Reserved
397+
- `Bit 7:0` : data to write
398+
- `Bit 31:8` : Reserved
324399

325400

326-
##### Register 3: RX FIFO [RO] - Address 0xC
401+
##### RX FIFO [RO] - Address 0xC
327402

328403
Pull data from RX FIFO. Reading from this register will block the APB read request if FIFO is empty,
329404
until the engine receives a new word.
330405

331-
- Bit 7:0 : data ready to be read
332-
- Bit 31:8 : Reserved
406+
- `Bit 7:0` : data ready to be read
407+
- `Bit 31:8` : Reserved
333408

334409
Current limitations:
335410
- only supports 8-bit wide data words

doc/assets/dCache-ooo.png

35.2 KB

doc/assets/dCache-pusher.png

84.6 KB

doc/assets/dCache-read-path.png

22.9 KB

doc/assets/dCache-top.png

109 KB

doc/friscv.drawio

Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.
