Skip to content

Commit a1ddf8b

Browse files
committed
Add detailed MAC RX example
1 parent bb0c3f0 commit a1ddf8b

File tree

2 files changed

+227
-73
lines changed

2 files changed

+227
-73
lines changed
Lines changed: 208 additions & 54 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,3 @@
1-
{-# language FlexibleContexts #-}
21
{-# LANGUAGE ViewPatterns #-}
32
{-# OPTIONS_GHC -fplugin Protocols.Plugin #-}
43

@@ -7,12 +6,110 @@ Copyright : (C) 2024, QBayLogic B.V.
76
License : BSD2 (see the file LICENSE)
87
Maintainer : QBayLogic B.V. <[email protected]>
98
10-
Provides the entire receive stack as a circuit.
9+
This module contains an example of a fully modular Ethernet MAC receive stack
10+
which allows the reception of packets over Ethernet II and supports any
11+
output data width bigger than zero.
12+
13+
Example usage:
14+
15+
>>> :set -XViewPatterns
16+
>>> import Clash.Cores.Crc (HardwareCrc, deriveHardwareCrc)
17+
>>> import Clash.Cores.Crc.Catalog (Crc32_ethernet(..))
18+
>>> import Clash.Cores.Ethernet.Mac
19+
>>> import Clash.Prelude
20+
>>> import Protocols
21+
>>> import Protocols.PacketStream
22+
23+
The Ethernet RX PHY is completely interchangeable with this stack. In the
24+
example below, we use a dummy. You have to replace this dummy variable with
25+
an Ethernet RX PHY circuit for your specific hardware (e.g. RGMII, MII or SGMII)
26+
that is adapted to the `PacketStream` protocol, i.e. with type:
27+
28+
>>> :{
29+
dummyRxPhy ::
30+
(HiddenClockResetEnable domEthRx) =>
31+
Circuit (PacketStream domEthRx 1 ()) (PacketStream domEthRx 1 ())
32+
dummyRxPhy = undefined
33+
:}
34+
35+
For example, the Lattice ECP5 Colorlight 5A-75B board uses an RGMII PHY,
36+
found at `Clash.Cores.Ethernet.Rgmii.unsafeRgmiiRxC`.
37+
38+
`macRxStack` is the most common Ethernet MAC RX stack that will be sufficient
39+
for most people. That is, it assumes that you want to process the received
40+
bytes in a different clock domain than the Ethernet RX domain. To use it,
41+
all you have to do is specify the data width (in this example 4), the clock
42+
domains, and the RX PHY you want to use.
43+
44+
The stack uses `Clash.Cores.Crc.crcValidator` internally to validate the frame
45+
check sequence of each transmitted Ethernet frame. To be able to use this
46+
component, we need to use `Clash.Cores.Crc.deriveHardwareCrc` to derive the
47+
necessary instance.
48+
49+
>>> :{
50+
$(deriveHardwareCrc Crc32_ethernet d8 d1)
51+
myRxStack ::
52+
(HiddenClockResetEnable dom) =>
53+
(KnownDomain domEthRx) =>
54+
(Clock domEthRx) ->
55+
(Reset domEthRx) ->
56+
(Enable domEthRx) ->
57+
Signal dom MacAddress ->
58+
Circuit (PacketStream domEthRx 1 ()) (PacketStream dom 4 EthernetHeader)
59+
myRxStack ethRxClk ethRxRst ethRxEn ourMacS =
60+
exposeClockResetEnable dummyRxPhy ethRxClk ethRxRst ethRxEn
61+
|> macRxStack @4 ethRxClk ethRxRst ethRxEn ourMacS
62+
:}
63+
64+
While this pre-defined stack is very simple to use, it might not be want you
65+
want. Maybe you want to use a vendor-specific async fifo, or maybe you want
66+
some components that are currently operating in the internal domain @dom@ to
67+
operate in the Ethernet RX domain @domEthRx@ (or vice versa). Timing
68+
requirements differ greatly across different PHY protocols and FPGA boards
69+
or ASICs. Maybe you need to add skid buffers (`registerBoth`, `registerBwd`,
70+
or `registerFwd`) between components to make timing pass, or maybe you can
71+
remove them if they are not necessary in order to save resources.
72+
73+
In our standard stack, FCS validation is done in the Ethernet RX domain,
74+
because that allows us to do it at data width 1. This saves a significant
75+
amount of logic resources, even when having to place extra skid buffers to make
76+
timing pass. For very high speed Ethernet standards you might have to do less
77+
work in the Ethernet RX clock domain.
78+
79+
In any case, it is easy to create a custom stack. All you have to do is import
80+
all the necessary components and connect them with the `|>` operator, creating
81+
one big `Circuit`. For example:
82+
83+
>>> :{
84+
$(deriveHardwareCrc Crc32_ethernet d8 d8)
85+
myCustomRxStack ::
86+
(HiddenClockResetEnable dom) =>
87+
(KnownDomain domEthRx) =>
88+
(Clock domEthRx) ->
89+
(Reset domEthRx) ->
90+
(Enable domEthRx) ->
91+
Signal dom MacAddress ->
92+
Circuit (PacketStream domEthRx 1 ()) (PacketStream dom 8 EthernetHeader)
93+
myCustomRxStack ethRxClk ethRxRst ethRxEn ourMacS =
94+
exposeClockResetEnable dummyRxPhy ethRxClk ethRxRst ethRxEn
95+
|> exposeClockResetEnable preambleStripperC ethRxClk ethRxRst ethRxEn
96+
|> exposeClockResetEnable upConverterC ethRxClk ethRxRst ethRxEn
97+
|> asyncFifoC d4 ethRxClk ethRxRst ethRxEn hasClock hasReset hasEnable
98+
|> fcsValidatorC
99+
|> fcsStripperC
100+
|> macDepacketizerC
101+
|> filterMetaS (isForMyMac <$> ourMacS)
102+
where
103+
isForMyMac myMac (_macDst -> to) = to == myMac || to == broadcastMac
104+
:}
105+
106+
This custom RX stack does almost everything in the internal domain. It also
107+
doesn't use any skid buffers.
11108
-}
12-
module Clash.Cores.Ethernet.Examples.RxStacks
13-
( macRxStack
14-
, ipRxStack
15-
) where
109+
module Clash.Cores.Ethernet.Examples.RxStacks (
110+
macRxStack,
111+
ipRxStack,
112+
) where
16113

17114
import Clash.Cores.Crc
18115
import Clash.Cores.Crc.Catalog
@@ -23,57 +120,114 @@ import Protocols.PacketStream
23120

24121
import Clash.Cores.Ethernet.IP.IPPacketizers
25122
import Clash.Cores.Ethernet.IP.IPv4Types
26-
import Clash.Cores.Ethernet.Mac.EthernetTypes
27-
import Clash.Cores.Ethernet.Mac.FrameCheckSequence ( fcsValidatorC, fcsStripperC )
28-
import Clash.Cores.Ethernet.Mac.MacPacketizers ( macDepacketizerC )
29-
import Clash.Cores.Ethernet.Mac.Preamble ( preambleStripperC )
30-
31-
-- | Processes received ethernet frames
32-
macRxStack
33-
:: forall (dataWidth :: Nat) (dom :: Domain) (domEth :: Domain)
34-
. ( HiddenClockResetEnable dom
35-
, KnownDomain domEth
36-
, HardwareCrc Crc32_ethernet 8 dataWidth
37-
, KnownNat dataWidth
38-
, 1 <= dataWidth
39-
)
40-
=> Clock domEth
41-
-> Reset domEth
42-
-> Enable domEth
43-
-> Signal dom MacAddress
44-
-> Circuit (PacketStream domEth 1 ()) (PacketStream dom dataWidth EthernetHeader)
45-
macRxStack ethClk ethRst ethEn macAddressS =
46-
exposeClockResetEnable preambleStripperC ethClk ethRst ethEn
47-
|> upConverterC'
48-
|> asyncFifoC'
49-
|> fcsValidatorC
123+
import Clash.Cores.Ethernet.Mac
124+
125+
{- |
126+
Ethernet MAC RX block. Assumes @dom@ is a different domain than
127+
@domEthRx@. For this stack to work, the input @dataWidth@
128+
__MUST__ satisfy the following formula:
129+
130+
@dataWidth * DomainPeriod dom <= DomainPeriod domEthRx@
131+
132+
Processing is done in the following way, in order:
133+
134+
1. The PHY passes raw Ethernet packets to us, which first arrive at
135+
`preambleStripperC`. This component removes the preamble and SFD from each
136+
packet in the stream.
137+
138+
2. A pipeline skid buffer ('registerBoth') is inserted along the path in order
139+
to improve timing.
140+
141+
3. `fcsValidatorC` computes the FCS of each packet in the resulting stream.
142+
If the FCS did not match, the packet is aborted (but not dropped, yet).
143+
144+
4. `upConverterC` upsizes the stream from @1@ byte to @dataWidth@ bytes wide.
145+
This is necessary for full throughput, because we will operate in a slower
146+
clock domain soon.
147+
148+
5. `asyncFifoC` is used to cross clock domains, because the clock domain of
149+
the Ethernet RX PHY is usually different from the clock domain that is used
150+
internally.
151+
152+
6. `fcsStripperC` removes the FCS field from each packet in the stream
153+
(that is, the last 4 bytes).
154+
155+
7. `macDepacketizerC` parses the first 14 bytes of each packet in the stream
156+
into an Ethernet MAC header and puts it in the metadata.
157+
158+
8. Lastly, we drop any packets that are not destined for either our MAC
159+
address or the broadcast MAC address.
160+
161+
The output stream is now ready for further higher-level processing.
162+
For example, it may be routed via the EtherType field in the metadata to a
163+
network layer in hardware. It could also be written straight to RAM via DMA
164+
for further processing by a CPU.
165+
-}
166+
macRxStack ::
167+
forall
168+
(dataWidth :: Nat)
169+
(dom :: Domain)
170+
(domEthRx :: Domain).
171+
(HiddenClockResetEnable dom) =>
172+
(KnownDomain domEthRx) =>
173+
(KnownNat dataWidth) =>
174+
(1 <= dataWidth) =>
175+
(HardwareCrc Crc32_ethernet 8 1) =>
176+
-- | Clock signal in the Ethernet RX domain
177+
Clock domEthRx ->
178+
-- | Reset signal in the Ethernet RX domain
179+
Reset domEthRx ->
180+
-- | Enable signal in the Ethernet RX domain
181+
Enable domEthRx ->
182+
-- | Our MAC address
183+
Signal dom MacAddress ->
184+
Circuit
185+
(PacketStream domEthRx 1 ())
186+
(PacketStream dom dataWidth EthernetHeader)
187+
macRxStack ethRxClk ethRxRst ethRxEn ourMacS =
188+
withClockResetEnable ethRxClk ethRxRst ethRxEn ethRxCkt
189+
|> asyncFifoC d4 ethRxClk ethRxRst ethRxEn hasClock hasReset hasEnable
50190
|> fcsStripperC
51191
|> macDepacketizerC
52-
|> filterMetaS (isForMyMac <$> macAddressS)
53-
where
54-
upConverterC' = exposeClockResetEnable upConverterC ethClk ethRst ethEn
55-
asyncFifoC' = asyncFifoC d4 ethClk ethRst ethEn hasClock hasReset hasEnable
56-
isForMyMac myMac (_macDst -> to) = to == myMac || to == broadcastMac
192+
|> filterMetaS (isForMyMac <$> ourMacS)
193+
where
194+
ethRxCkt ::
195+
(HiddenClockResetEnable domEthRx) =>
196+
Circuit (PacketStream domEthRx 1 ()) (PacketStream domEthRx dataWidth ())
197+
ethRxCkt =
198+
preambleStripperC
199+
|> registerBoth
200+
|> fcsValidatorC
201+
|> upConverterC
202+
203+
isForMyMac myMac (_macDst -> to) = to == myMac || to == broadcastMac
57204

58205
-- | Processes received IP packets
59-
ipRxStack
60-
:: forall (dataWidth :: Nat) (dom :: Domain) (domEth :: Domain)
61-
. ( HiddenClockResetEnable dom
62-
, KnownDomain domEth
63-
, HardwareCrc Crc32_ethernet 8 dataWidth
64-
, KnownNat dataWidth
65-
, 1 <= dataWidth
66-
)
67-
=> Clock domEth
68-
-> Reset domEth
69-
-> Enable domEth
70-
-> Signal dom MacAddress
71-
-> Signal dom (IPv4Address, IPv4SubnetMask)
72-
-> Circuit (PacketStream domEth 1 ()) (PacketStream dom dataWidth IPv4HeaderLite)
73-
ipRxStack ethClk ethRst ethEn macAddressS ipS = circuit $ \raw -> do
74-
ethernetFrames <- macRxStack ethClk ethRst ethEn macAddressS -< raw
206+
ipRxStack ::
207+
forall (dataWidth :: Nat) (dom :: Domain) (domEthRx :: Domain).
208+
(HiddenClockResetEnable dom) =>
209+
(KnownDomain domEthRx) =>
210+
(HardwareCrc Crc32_ethernet 8 1) =>
211+
(KnownNat dataWidth) =>
212+
(1 <= dataWidth) =>
213+
-- | Clock signal in the Ethernet RX domain
214+
Clock domEthRx ->
215+
-- | Reset signal in the Ethernet RX domain
216+
Reset domEthRx ->
217+
-- | Enable signal in the Ethernet RX domain
218+
Enable domEthRx ->
219+
-- | Our MAC address
220+
Signal dom MacAddress ->
221+
-- | (Our IPv4, Our subnet mask)
222+
Signal dom (IPv4Address, IPv4SubnetMask) ->
223+
Circuit
224+
(PacketStream domEthRx 1 ())
225+
(PacketStream dom dataWidth IPv4HeaderLite)
226+
ipRxStack ethRxClk ethRxRst ethRxEn ourMacS ipS = circuit $ \raw -> do
227+
ethernetFrames <- macRxStack ethRxClk ethRxRst ethRxEn ourMacS -< raw
75228
[ip] <- packetDispatcherC (isIpv4 :> Nil) -< ethernetFrames
76229
ipDepacketizerLiteC |> filterMetaS (isForMyIp <$> ipS) -< ip
77-
where
78-
isIpv4 = (== 0x0800) . _etherType
79-
isForMyIp (ip, subnet) (_ipv4lDestination -> to) = to == ip || to == subnetBroadcast subnet ip
230+
where
231+
isIpv4 = (== 0x0800) . _etherType
232+
isForMyIp (ip, subnet) (_ipv4lDestination -> to) =
233+
to == ip || to == subnetBroadcast subnet ip

src/Clash/Cores/Ethernet/Examples/TxStacks.hs

Lines changed: 19 additions & 19 deletions
Original file line numberDiff line numberDiff line change
@@ -6,8 +6,8 @@ License : BSD2 (see the file LICENSE)
66
Maintainer : QBayLogic B.V. <[email protected]>
77
88
This module contains an example of a fully modular MAC transmit stack which
9-
allows the transmission of packets over Ethernet II and supports any data width
10-
bigger than zero.
9+
allows the transmission of packets over Ethernet II and supports any input
10+
data width bigger than zero.
1111
1212
Example usage:
1313
@@ -30,8 +30,8 @@ dummyTxPhy ::
3030
dummyTxPhy = undefined
3131
:}
3232
33-
For example, the Lattice ECP5 board uses an RGMII PHY, found at
34-
'Clash.Cores.Ethernet.Rgmii.rgmiiTxC'.
33+
For example, the Lattice ECP5 Colorlight 5A-75B board uses an RGMII PHY,
34+
found at 'Clash.Cores.Ethernet.Rgmii.rgmiiTxC'.
3535
3636
'macTxStack' is the most common Ethernet MAC TX stack that will be sufficient
3737
for most people. That is, it inserts an interpacket gap of 12 bytes, pads the
@@ -43,7 +43,7 @@ and the TX PHY you want to use.
4343
The stack uses 'Clash.Cores.Crc.crcEngine' internally to calculate the frame
4444
check sequence of each transmitted Ethernet frame, so that it can be appended
4545
to the packet. To be able to use this component, we need to use
46-
'Clash.Cores.Crc.deriveHardwareCrc' to derive a necessary instance.
46+
'Clash.Cores.Crc.deriveHardwareCrc' to derive the necessary instance.
4747
4848
>>> :{
4949
$(deriveHardwareCrc Crc32_ethernet d8 d1)
@@ -62,11 +62,11 @@ myTxStack ethTxClk ethTxRst ethTxEn =
6262
While this pre-defined stack is very simple to use, it might not be want you
6363
want. Maybe you want to use a vendor-specific async fifo, or maybe you want
6464
some components that are currently operating in the internal domain @dom@ to
65-
operate in the Ethernet TX domain @domEthTx@ (or vice versa). Timing requirements
66-
differ greatly across different PHY protocols and FPGA boards or ASICs. Maybe
67-
you need to add skid buffers ('registerBoth') between components to make timing
68-
pass, or maybe you can remove them if they are not necessary in order to save
69-
resources.
65+
operate in the Ethernet TX domain @domEthTx@ (or vice versa). Timing
66+
requirements differ greatly across different PHY protocols and FPGA boards or
67+
ASICs. Maybe you need to add skid buffers (`registerBoth`, `registerBwd`, or
68+
`registerFwd`) between components to make timing pass, or maybe you can remove
69+
them if they are not necessary in order to save resources.
7070
7171
In our standard stack, FCS insertion is done in the Ethernet TX domain, because
7272
that allows us to do it at data width 1. This saves a significant amount of
@@ -119,8 +119,8 @@ import Protocols (Circuit, (|>))
119119
import Protocols.PacketStream
120120

121121
{- |
122-
Processes bytes to transmit over Ethernet. Assumes @dom@ is a slower clock
123-
domain than @domEthTx@. For this stack to work, the input @dataWidth@
122+
Ethernet MAC TX block. Assumes @dom@ is a different domain than
123+
@domEthTx@. For this stack to work, the input @dataWidth@
124124
__MUST__ satisfy the following formula:
125125
126126
@DomainPeriod dom <= DomainPeriod domEthTx * dataWidth@
@@ -132,16 +132,16 @@ at 'macPacketizerC', which prepends this header to the stream. This header
132132
contains the source and destination MAC addresses, and the EtherType of the
133133
payload.
134134
135-
2. Because the clock domain of the Ethernet TX PHY is usually different from
136-
the clock domain that is used internally, `asyncFifoC` is used to cross clock
137-
domains.
135+
5. `asyncFifoC` is used to cross clock domains, because the clock domain of
136+
the Ethernet TX PHY is usually different from the clock domain that is used
137+
internally.
138138
139139
3. A pipeline skid buffer ('registerBoth') is inserted along the path in order
140140
to improve timing.
141141
142-
4. 'downConverterC' downsizes the stream from @n@ bytes to @1@ byte wide. This
143-
makes the coming upcoming components more resource-efficient, and it is
144-
possible because we now operate in a faster domain.
142+
4. 'downConverterC' downsizes the stream from @dataWidth@ bytes to @1@ byte
143+
wide. This makes the coming upcoming components more resource-efficient, and
144+
it is possible because we now operate in a faster domain.
145145
146146
5. 'paddingInserterC' pads the Ethernet frame to 60 bytes with null bytes if
147147
necessary. Just 60 bytes, because the FCS is not inserted yet. Inserting that
@@ -195,7 +195,7 @@ macTxStack ethTxClk ethTxRst ethTxEn =
195195
|> preambleInserterC
196196
|> interpacketGapInserterC d12
197197

198-
-- | Sends IP packets to a known mac address
198+
-- | Sends IP packets to a known MAC address
199199
ipTxStack ::
200200
forall
201201
(dataWidth :: Nat)

0 commit comments

Comments
 (0)