
Commit e3a0b10

Add SCEP certificate enrollment and 802.1x port-based network access control (PNAC)
SCEP (Simple Certificate Enrollment Protocol) enables automated certificate enrollment from a CA server. IEEE 802.1x is a port-based network access control (PNAC) standard that restricts network access on a switch port until the device authenticates using a certificate (or other EAP method).

The 802.1x authentication is implemented using wpa_supplicant. The SCEP client is implemented using github.com/smallstep/scep.

The end-to-end workflow is as follows:

1. Device sends a DHCP request with a vendor class identifier that identifies it as EVE OS.
2. The network switch places the port into a non-authenticated VLAN. Because it detects EVE OS, it allows the device to reach the controller and fetch network and SCEP configuration (needed to bootstrap the enrollment).
3. Device follows the SCEP profile to enroll a certificate, either by talking directly to the SCEP server or through a controller-provided SCEP proxy (essentially an HTTP proxy).
4. Device uses the enrolled certificate to authenticate the port via 802.1x. The switch then moves the port to the authenticated VLAN.
5. After a configurable delay, the device repeats the DHCP request to obtain an IP address from the authenticated VLAN.

EVE publishes PNAC status, enrolled certificate status and PNAC metrics to the controller.

Signed-off-by: Milan Lenco <milan@zededa.com>
1 parent b9201b1 commit e3a0b10

70 files changed: +10451 -198 lines changed

.github/CODEOWNERS

Lines changed: 1 addition & 0 deletions
@@ -31,6 +31,7 @@
 /pkg/pillar/cmd/downloader/ @milan-zededa @rouming
 /pkg/pillar/cmd/ledmanager/ @rucoder @rene
 /pkg/pillar/cmd/nim/ @milan-zededa
+/pkg/pillar/cmd/scepclient/ @milan-zededa
 /pkg/pillar/cmd/tpmmgr/ @rucoder @shjala
 /pkg/pillar/cmd/volumemgr/ @OhmSpectator @rouming @europaul
 /pkg/pillar/cmd/zedagent/ @OhmSpectator @milan-zededa @rouming @uncleDecart

docs/CONFIG-PROPERTIES.md

Lines changed: 4 additions & 0 deletions
@@ -91,6 +91,9 @@
 | diag.probe.remote.http.endpoint | string | `"http://www.google.com"` | - | - | Remote endpoint (URL, IP instead of hostname is accepted) queried over HTTP to assess the state of network connectivity whenever the controller is not reachable. Used only for diagnostics (no functional impact). Set to an empty string to disable. |
 | diag.probe.remote.https.endpoint | string | `"https://www.google.com"` | - | - | Remote endpoint (URL, IP instead of hostname is NOT accepted) queried over HTTPS to assess the state of network connectivity whenever the controller is not reachable. Used only for diagnostics (no functional impact). Set to an empty string to disable. |
 | app.enable.tcp.mss.clamping | bool | true | - | - | Configuration property that enables EVE to automatically adjust (clamp) the TCP MSS on forwarded application traffic to match the path MTU, preventing fragmentation and connectivity issues on lower-MTU links. |
+| scep.retry.interval | timer in seconds | 300 (5 minutes) | 60 (1 minute) | 3600 (1 hour) | Interval between retry attempts for certificates that previously failed to enroll/renew or returned PENDING from the SCEP server. |
+| pnac.dhcp.reacquire.delay | integer (seconds) | 5 | 0 | 60 (1 minute) | Delay before the DHCP client reacquires a lease after a PNAC (802.1X) port authentication state change. This is needed when the network switch reassigns the port to a different access VLAN based on the authentication result. Setting this value to 0 disables DHCP reacquire. |
+| dhcp.enable.vendorclassid | bool | true | - | - | Enables sending the DHCP Vendor Class Identifier (Option 60) to identify the device as EVE OS. This allows networks or DHCP servers to apply policies such as VLAN assignment or granting access to the EVE controller. Some badly configured DHCP servers may reject unknown vendor class IDs. Setting this to false disables sending the vendor class ID. |
 
 ## Log levels
 
@@ -156,3 +159,4 @@ Right now the following agents support per-agent log level settings:
 * msrv
 * domainmgr
 * diag
+* scepclient
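The default, minimum and maximum columns for the two new integer properties can be read as a simple range check. The sketch below is illustrative only: the `intProperty` type and its `validate` method are hypothetical, and falling back to the default on an out-of-range value is an assumption, not confirmed EVE behavior.

```go
package main

import (
	"errors"
	"fmt"
)

// intProperty is a hypothetical description of an integer config property.
// The bounds mirror the table rows above; the type itself is not EVE code.
type intProperty struct {
	Name    string
	Default uint32
	Min     uint32
	Max     uint32
}

var (
	scepRetryInterval  = intProperty{"scep.retry.interval", 300, 60, 3600}
	pnacReacquireDelay = intProperty{"pnac.dhcp.reacquire.delay", 5, 0, 60}
)

var errOutOfRange = errors.New("value out of range")

// validate returns v when it lies within [Min, Max]. Otherwise it returns
// the default together with an error (assumed fallback policy).
func (p intProperty) validate(v uint32) (uint32, error) {
	if v < p.Min || v > p.Max {
		return p.Default, fmt.Errorf("%s=%d: %w [%d, %d]",
			p.Name, v, errOutOfRange, p.Min, p.Max)
	}
	return v, nil
}

func main() {
	if v, err := scepRetryInterval.validate(30); err != nil {
		fmt.Println("rejected:", err, "-> using", v)
	}
	// A delay of 0 is within range and means "do not reacquire".
	delay, _ := pnacReacquireDelay.validate(0)
	fmt.Println("DHCP reacquire disabled:", delay == 0)
}
```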

docs/DEVICE-CONNECTIVITY.md

Lines changed: 96 additions & 0 deletions
@@ -604,6 +604,102 @@ There are two levels of errors:
 - A particular management port could not be used to reach the controller. In that case
   the `ErrorInfo` for the particular `DevicePort` is set to indicate the error and timestamp.
 
+## Port-Based Network Access Control (802.1X) and SCEP Certificate Enrollment
+
+EVE supports IEEE 802.1X Port-Based Network Access Control (PNAC), allowing network switches
+to restrict port-level access until the device authenticates with a valid certificate.
+IEEE 802.1X is a standard for port-based network access control that works at Layer 2
+of the network stack. A switch port starts in an unauthorized state and only grants full
+network access after the connected device (the supplicant) successfully authenticates
+against an authentication server (typically a RADIUS server) via the switch (the authenticator).
+
+To obtain the certificate required for authentication, EVE implements SCEP (Simple Certificate
+Enrollment Protocol), a protocol designed for automated certificate enrollment from
+a Certificate Authority (CA). SCEP allows a device to generate a key pair, submit
+a Certificate Signing Request (CSR) to a SCEP server, and receive a signed certificate
+in return.
+
+The 802.1X supplicant is implemented using [wpa_supplicant](https://w1.fi/wpa_supplicant/)
+with EAP-TLS as the authentication method. The SCEP client is implemented using
+the [github.com/smallstep/scep](https://github.com/smallstep/scep) Go library.
+
+### Bootstrapping workflow
+
+The full workflow from an unauthenticated device to an authenticated network port is:
+
+1. **DHCP with vendor class identification**: The device sends a DHCP request that includes
+   a Vendor Class Identifier (DHCP Option 60) set to `LFEDGE-EVE`. This identifies the device
+   as running EVE OS to the network infrastructure.
+
+2. **Non-authenticated VLAN access**: The network switch places the port into a non-authenticated
+   (bootstrap) VLAN. Because the switch detects the EVE OS vendor class identifier, it allows
+   the device to reach the controller and fetch the network configuration including the SCEP
+   enrollment profile. This step is critical for bootstrapping: the device needs connectivity
+   to obtain the certificate it will later use for authentication.
+
+3. **SCEP certificate enrollment**: The device follows the SCEP profile received from
+   the controller to enroll a certificate. It can communicate with the SCEP server in one
+   of two ways:
+   - **Directly**: The device contacts the SCEP server URL specified in the profile.
+   - **Via controller proxy**: The device routes SCEP requests through a controller-provided
+     SCEP proxy (essentially an HTTP proxy), which is useful when the SCEP server is not
+     directly reachable from the bootstrap VLAN.
+
+4. **802.1X port authentication**: Once the certificate is enrolled, the device uses it
+   to authenticate the port via 802.1X EAP-TLS. Upon successful authentication, the switch
+   moves the port to the authenticated VLAN, granting full network access.
+
+5. **DHCP reacquisition**: After a configurable delay
+   ([`pnac.dhcp.reacquire.delay`](CONFIG-PROPERTIES.md), default 5 seconds), the device
+   repeats the DHCP request to obtain an IP address from the authenticated VLAN.
+   The delay ensures the switch has completed the VLAN transition before the DHCP client
+   attempts to acquire a new lease.
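The five steps above form a linear sequence in which each step must succeed before the next begins. A minimal sketch of that ordering as a state machine follows. The state names and the `next` function are hypothetical, and retrying a failed step in place is a simplifying assumption that does not describe how EVE structures its code.

```go
package main

import "fmt"

// state enumerates the bootstrap steps in order (names are illustrative).
type state int

const (
	stBootstrapDHCP state = iota // DHCP with Option 60, bootstrap VLAN
	stFetchConfig                // reach controller, fetch SCEP profile
	stEnrollCert                 // SCEP enrollment (direct or via proxy)
	stAuthenticate               // 802.1X EAP-TLS
	stReacquireDHCP              // new lease in the authenticated VLAN
	stOnline
)

var stateNames = [...]string{
	"bootstrap DHCP", "fetch config", "enroll certificate",
	"authenticate port", "reacquire DHCP", "online",
}

func (s state) String() string { return stateNames[s] }

// next advances by one step when the current step succeeded.
// A failed step is retried, and the final state is terminal.
func next(s state, ok bool) state {
	if !ok || s == stOnline {
		return s
	}
	return s + 1
}

func main() {
	s := stBootstrapDHCP
	for s != stOnline {
		fmt.Println("step:", s)
		s = next(s, true)
	}
	fmt.Println("done:", s)
}
```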
+
+### Configuration
+
+PNAC and SCEP are configured through the controller using the device API:
+
+- **SCEP profiles** are defined in `EdgeDevConfig.ScepProfiles` and specify the SCEP server URL,
+  whether to use the controller proxy, a challenge password (encrypted), trusted CA certificates,
+  and CSR parameters (subject DN, SANs, key type, hash algorithm, renewal period).
+
+- **PNAC configurations** are defined in `EdgeDevConfig.Pnacs`, each referencing a network adapter
+  and a SCEP profile by logical name. They specify the EAP method (currently EAP-TLS),
+  an optional EAP identity, and trusted CA certificates for verifying the authentication
+  server's TLS certificate.
+
+Relevant [configuration properties](CONFIG-PROPERTIES.md):
+
+| Property | Default | Description |
+|---|---|---|
+| `scep.retry.interval` | 300s (5 min) | Interval between retry attempts for failed or pending SCEP enrollments |
+| `pnac.dhcp.reacquire.delay` | 5s | Delay before DHCP reacquire after 802.1X authentication state change. Set to 0 to disable |
+| `dhcp.enable.vendorclassid` | true | Enables sending DHCP Vendor Class Identifier (Option 60) as `LFEDGE-EVE` |
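For reference, a DHCP option is carried on the wire as a one-byte code, a one-byte length and the value. The sketch below encodes Option 60 with the `LFEDGE-EVE` identifier named above. The `encodeVendorClassID` helper is hypothetical and only illustrates the byte layout; it is not taken from EVE's DHCP client.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// optVendorClassID is the DHCP option code for the Vendor Class Identifier.
const optVendorClassID = 60

// encodeVendorClassID returns the option as code, length, value bytes.
func encodeVendorClassID(id string) ([]byte, error) {
	if len(id) == 0 || len(id) > 255 {
		return nil, errors.New("vendor class identifier must be 1..255 bytes")
	}
	opt := []byte{optVendorClassID, byte(len(id))}
	return append(opt, id...), nil
}

func main() {
	opt, err := encodeVendorClassID("LFEDGE-EVE")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("% x\n", opt) // hex dump of the encoded option
}
```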
+
+### Certificate lifecycle
+
+The enrolled certificate is stored on the device along with its private key (kept in the vault
+for protection). EVE monitors the certificate's validity and automatically initiates renewal
+when the configured percentage of the certificate's lifetime has elapsed
+(controlled by `RenewPeriodPercent` in the CSR profile). If the SCEP server or CSR profile
+configuration changes, EVE will re-enroll the certificate against the new parameters.
+
+### Status and metrics reporting
+
+EVE publishes the following information to the controller:
+
+- **PNAC status** (per-port): Whether 802.1X is enabled, the current supplicant state
+  (e.g. associating, authenticating, connected, disconnected), the timestamp of the last
+  successful authentication, and any authentication errors.
+
+- **Enrolled certificate status**: Details of the installed certificate including subject,
+  issuer, SANs, validity period, SHA-256 fingerprint, key type, and current certificate status
+  (e.g. valid, expired, pending enrollment).
+
+- **PNAC metrics** (per-port): EAPOL frame counters including frames received/transmitted,
+  EAPOL-Start and EAPOL-Logoff frames, EAP-Request/Response frames, and counts of invalid
+  or malformed frames.
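A SHA-256 certificate fingerprint is conventionally the hash of the DER-encoded certificate. The sketch below computes it with the Go standard library. The throwaway self-signed certificate exists only to make the example runnable, and the lowercase hex encoding is an assumption about presentation rather than the format EVE reports.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/sha256"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/hex"
	"fmt"
	"math/big"
	"os"
	"time"
)

// fingerprintSHA256 hashes the DER encoding of the certificate.
func fingerprintSHA256(cert *x509.Certificate) string {
	sum := sha256.Sum256(cert.Raw)
	return hex.EncodeToString(sum[:])
}

// selfSigned creates a throwaway certificate for demonstration only.
func selfSigned() (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "example-device"},
		NotBefore:    time.Now(),
		NotAfter:     time.Now().Add(24 * time.Hour),
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

func main() {
	cert, err := selfSigned()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("subject:", cert.Subject.CommonName)
	fmt.Println("sha256: ", fingerprintSHA256(cert))
}
```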
+
 ## Air-Gap Mode
 
 Air-Gap mode allows a device to operate without connectivity to the main controller,

pkg/edgeview/src/network.go

Lines changed: 1 addition & 1 deletion
@@ -1415,7 +1415,7 @@ func runWireless() {
 		_, _ = runCmd(prog, args, true)
 
 		retbytes, err = os.ReadFile(
-			fmt.Sprintf("/run/nim/wpa_supplicant.%s.conf", port.IfName))
+			fmt.Sprintf("/run/nim/wpa_supplicant-%s.conf", port.IfName))
 		if err != nil {
 			continue
 		}

pkg/pillar/base/logobjecttypes.go

Lines changed: 2 additions & 0 deletions
@@ -179,6 +179,8 @@ const (
 	KubeAppFailover LogObjectType = "kube_app_failover"
 	// EvalStatusLogType : type for EvalStatus log entries
 	EvalStatusLogType LogObjectType = "eval_status"
+	// EnrolledCertStatusLogType : type for EnrolledCertificateStatus log entries.
+	EnrolledCertStatusLogType LogObjectType = "enrolled_cert_status"
 )
 
 // RelationObjectType :

pkg/pillar/cipher/cipher.go

Lines changed: 1 addition & 0 deletions
@@ -31,6 +31,7 @@ func getEncryptionBlock(
 	decBlock.ProtectedUserData = zconfigDecBlockPtr.ProtectedUserData
 	decBlock.ClusterToken = zconfigDecBlockPtr.ClusterToken
 	decBlock.GzipRegistrationManifestYaml = zconfigDecBlockPtr.GzipRegistrationManifestYaml
+	decBlock.SCEPChallengePassword = zconfigDecBlockPtr.ScepChallengePassword
 	return decBlock
 }
 

pkg/pillar/cmd/nim/nim.go

Lines changed: 119 additions & 6 deletions
@@ -13,6 +13,7 @@ import (
 	"strings"
 	"time"
 
+	eveinfo "github.com/lf-edge/eve-api/go/info"
 	"github.com/lf-edge/eve/pkg/pillar/agentbase"
 	"github.com/lf-edge/eve/pkg/pillar/agentlog"
 	"github.com/lf-edge/eve/pkg/pillar/base"
@@ -80,6 +81,8 @@ type nim struct {
 	subNetworkInstanceConfig pubsub.Subscription
 	subEdgeNodeClusterStatus pubsub.Subscription
 	subKubeUserServices pubsub.Subscription
+	subVaultStatus pubsub.Subscription
+	subEnrolledCertStatus pubsub.Subscription
 
 	// Publications
 	pubDummyDevicePortConfig pubsub.Publication // For logging
@@ -91,10 +94,13 @@ type nim struct {
 	pubCipherMetrics pubsub.Publication
 	pubCachedResolvedIPs pubsub.Publication
 	pubWwanConfig pubsub.Publication
+	pubPNACMetrics pubsub.Publication
 
 	// Metrics
-	agentMetrics  *controllerconn.AgentMetrics
-	cipherMetrics *cipher.AgentMetrics
+	agentMetrics   *controllerconn.AgentMetrics
+	cipherMetrics  *cipher.AgentMetrics
+	metricInterval uint32 // In seconds
+	publishTicker  *flextimer.FlexTickerHandle
 
 	// Configuration
 	globalConfig types.ConfigItemValueMap
@@ -219,11 +225,12 @@ func (n *nim) run(ctx context.Context) (err error) {
 	stillRunning := time.NewTicker(stillRunTime)
 	n.PubSub.StillRunning(agentName, warningTime, errorTime)
 
-	// Publish metrics for zedagent every 10 seconds
-	interval := 10 * time.Second
+	// Publish network metrics
+	interval := time.Duration(n.metricInterval) * time.Second
 	max := float64(interval)
 	min := max * 0.3
-	publishTimer := flextimer.NewRangeTicker(time.Duration(min), time.Duration(max))
+	publishTicker := flextimer.NewRangeTicker(time.Duration(min), time.Duration(max))
+	n.publishTicker = &publishTicker
 
 	// Periodically resolve the controller hostname to keep its DNS entry cached,
 	// reducing the need for DNS lookups on every controller API request.
@@ -243,6 +250,8 @@ func (n *nim) run(ctx context.Context) (err error) {
 		n.subWwanStatus,
 		n.subNetworkInstanceConfig,
 		n.subKubeUserServices,
+		n.subVaultStatus,
+		n.subEnrolledCertStatus,
 	}
 	for _, sub := range inactiveSubs {
 		if err = sub.Activate(); err != nil {
@@ -292,8 +301,16 @@ func (n *nim) run(ctx context.Context) (err error) {
 		case change := <-n.subKubeUserServices.MsgChan():
 			n.subKubeUserServices.ProcessChange(change)
 
-		case <-publishTimer.C:
+		case change := <-n.subVaultStatus.MsgChan():
+			n.subVaultStatus.ProcessChange(change)
+
+		case change := <-n.subEnrolledCertStatus.MsgChan():
+			n.subEnrolledCertStatus.ProcessChange(change)
+			n.handleEnrolledCertUpdate()
+
+		case <-publishTicker.C:
 			start := time.Now()
+			n.publishPNACMetrics()
 			err = n.cipherMetrics.Publish(n.Log, n.pubCipherMetrics, "global")
 			if err != nil {
 				n.Log.Error(err)
@@ -408,6 +425,14 @@ func (n *nim) initPublications() (err error) {
 	if err != nil {
 		return err
 	}
+
+	n.pubPNACMetrics, err = n.PubSub.NewPublication(pubsub.PublicationOptions{
+		AgentName: agentName,
+		TopicType: types.PNACMetricsList{},
+	})
+	if err != nil {
+		return err
+	}
 	return nil
 }
 
@@ -613,6 +638,27 @@ func (n *nim) initSubscriptions() (err error) {
 	if err != nil {
 		return err
 	}
+
+	n.subVaultStatus, err = n.PubSub.NewSubscription(pubsub.SubscriptionOptions{
+		AgentName:     "vaultmgr",
+		MyAgentName:   agentName,
+		TopicImpl:     types.VaultStatus{},
+		Activate:      false,
+		CreateHandler: n.handleVaultStatusCreate,
+		ModifyHandler: n.handleVaultStatusModify,
+		WarningTime:   warningTime,
+		ErrorTime:     errorTime,
+	})
+
+	n.subEnrolledCertStatus, err = n.PubSub.NewSubscription(pubsub.SubscriptionOptions{
+		AgentName:   "scepclient",
+		MyAgentName: agentName,
+		TopicImpl:   types.EnrolledCertificateStatus{},
+		Activate:    false,
+		Persistent:  true,
+		WarningTime: warningTime,
+		ErrorTime:   errorTime,
+	})
 	return nil
 }
 
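As shown in this hunk, both new `NewSubscription` calls assign `err` without checking it before `return nil`, unlike the preceding subscription, which is followed by an `if err != nil` check. A minimal sketch of the checked pattern follows. The `subscription`, `newSubscription` and `agent` names are hypothetical stand-ins and not the real pubsub API.

```go
package main

import (
	"errors"
	"fmt"
)

// subscription and newSubscription are hypothetical stand-ins for the
// pubsub types. They exist only to make the error flow visible.
type subscription struct{ topic string }

func newSubscription(topic string) (*subscription, error) {
	if topic == "" {
		return nil, errors.New("empty topic")
	}
	return &subscription{topic: topic}, nil
}

type agent struct {
	subVaultStatus        *subscription
	subEnrolledCertStatus *subscription
}

// initSubscriptions checks each error before the next call overwrites it.
func (a *agent) initSubscriptions(vaultTopic, certTopic string) (err error) {
	a.subVaultStatus, err = newSubscription(vaultTopic)
	if err != nil {
		return fmt.Errorf("vault status subscription: %w", err)
	}
	a.subEnrolledCertStatus, err = newSubscription(certTopic)
	if err != nil {
		return fmt.Errorf("enrolled cert status subscription: %w", err)
	}
	return nil
}

func main() {
	a := &agent{}
	// The first call fails, and the error is reported instead of being lost.
	fmt.Println(a.initSubscriptions("", "EnrolledCertificateStatus"))
}
```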

@@ -661,6 +707,17 @@ func (n *nim) applyGlobalConfig(gcp *types.ConfigItemValueMap) {
 	timeout := gcp.GlobalValueInt(types.NetworkTestTimeout)
 	n.connTester.TestTimeout = time.Second * time.Duration(timeout)
 	n.connTester.DiagRemoteEndpoints = types.GetDiagRemoteEndpointURLs(n.Log, gcp)
+	metricInterval := gcp.GlobalValueInt(types.MetricInterval)
+	if metricInterval != 0 && n.metricInterval != metricInterval {
+		if n.publishTicker != nil {
+			interval := time.Duration(metricInterval) * time.Second
+			maxTime := float64(interval)
+			minTime := maxTime * 0.3
+			n.publishTicker.UpdateRangeTicker(
+				time.Duration(minTime), time.Duration(maxTime))
+		}
+		n.metricInterval = metricInterval
+	}
 	n.gcInitialized = true
 }
 

@@ -850,6 +907,31 @@ func (n *nim) handleKubeUserServicesDelete(_ interface{}, _ string, _ interface{
 	n.dpcManager.UpdateKubeUserServices(types.KubeUserServices{})
 }
 
+func (n *nim) handleVaultStatusCreate(_ interface{}, key string, statusArg interface{}) {
+	n.handleVaultStatusImpl(key, statusArg)
+}
+
+func (n *nim) handleVaultStatusModify(_ interface{}, key string, statusArg, _ interface{}) {
+	n.handleVaultStatusImpl(key, statusArg)
+}
+
+func (n *nim) handleVaultStatusImpl(_ string, statusArg interface{}) {
+	status := statusArg.(types.VaultStatus)
+	vaultIsReady := status.Name == types.DefaultVaultName &&
+		status.ConversionComplete &&
+		status.Status != eveinfo.DataSecAtRestStatus_DATASEC_AT_REST_ERROR
+	n.dpcManager.UpdateVaultReadiness(vaultIsReady)
+}
+
+func (n *nim) handleEnrolledCertUpdate() {
+	var enrolledCerts []types.EnrolledCertificateStatus
+	for _, item := range n.subEnrolledCertStatus.GetAll() {
+		certStatus := item.(types.EnrolledCertificateStatus)
+		enrolledCerts = append(enrolledCerts, certStatus)
+	}
+	n.dpcManager.UpdateEnrolledCerts(enrolledCerts)
+}
+
 func (n *nim) listPublishedDPCs(directory string) (dpcFilePaths []string) {
 	locations, err := os.ReadDir(directory)
 	if err != nil {
@@ -962,3 +1044,34 @@ func (n *nim) ingestDevicePortConfigFile(oldDirname string, newDirname string, n
 			filename, err)
 	}
 }
+
+func (n *nim) publishPNACMetrics() {
+	var pnacMetricsList types.PNACMetricsList
+	dnsObj, err := n.pubDeviceNetworkStatus.Get("global")
+	if err != nil {
+		return
+	}
+	dns, ok := dnsObj.(types.DeviceNetworkStatus)
+	if !ok {
+		return
+	}
+	for _, port := range dns.Ports {
+		if port.IfName == "" || !port.PNAC.Enabled {
+			continue
+		}
+		ifIndex, exists, err := n.networkMonitor.GetInterfaceIndex(port.IfName)
+		if !exists || err != nil {
+			continue
+		}
+		metrics, err := n.networkMonitor.GetPNACMetrics(ifIndex)
+		if err != nil {
+			n.Log.Error(err)
+		} else {
+			pnacMetricsList.Ports = append(pnacMetricsList.Ports, metrics)
+		}
+	}
+	err = n.pubPNACMetrics.Publish(pnacMetricsList.Key(), pnacMetricsList)
+	if err != nil {
+		n.Log.Error(err)
+	}
+}
