Add SCEP certificate enrollment and 802.1x port-based network access control (PNAC) #5691

Open
milan-zededa wants to merge 2 commits into lf-edge:master from milan-zededa:scep-and-pnac
Contributor

milan-zededa commented Mar 19, 2026

Description

SCEP (Simple Certificate Enrollment Protocol) enables automated certificate
enrollment from a CA server. IEEE 802.1x is a port-based network access control
(PNAC) standard that restricts network access on a switch port until the device
authenticates using a certificate (or other EAP method).

The 802.1x authentication is implemented using wpa_supplicant.
The SCEP client is implemented using github.com/smallstep/scep.

The end-to-end workflow is as follows:

  1. Device sends a DHCP request with a vendor class identifier that identifies
    it as EVE OS: LFEDGE-EVE.
  2. The network switch places the port into a non-authenticated VLAN. Because
    it detects EVE OS, it allows the device to reach the controller and fetch
    network and SCEP configuration (needed to bootstrap the enrollment).
  3. Device follows the SCEP profile to enroll a certificate, either by talking
    directly to the SCEP server or through a controller-provided SCEP proxy
    (essentially an HTTP proxy).
  4. Device uses the enrolled certificate to authenticate the port via 802.1x.
    The switch then moves the port to the authenticated VLAN.
  5. After a configurable delay, the device repeats the DHCP request to obtain
    an IP address from the authenticated VLAN.

EVE publishes PNAC status, enrolled certificate status and PNAC metrics
to the controller.

How to test and validate this PR

Prerequisites

Network switch with 802.1X support

A managed network switch capable of IEEE 802.1X port-based authentication is required.
The switch must be configured with:

  • 802.1X enabled on the port(s) connected to the EVE device.
  • RADIUS server configured as the authentication backend (e.g. FreeRADIUS).
    The RADIUS server must be set up to validate EAP-TLS client certificates.
  • Two VLANs:
    • Non-authenticated (bootstrap) VLAN: Provides limited network access -- enough
      for the device to reach the EVE controller and the SCEP server (or the
      controller's SCEP proxy). The switch should assign this VLAN to the port by default
      when the device is not yet authenticated (i.e. as the "guest" or "auth-fail" VLAN).
    • Authenticated VLAN: Provides full network access. The switch moves the port
      to this VLAN after a successful 802.1X authentication.
  • Vendor Class ID-based policy (optional but recommended): Configure the switch
    or DHCP server to recognize the vendor class identifier LFEDGE-EVE (DHCP Option 60)
    and allow the device to reach the controller from the non-authenticated VLAN.
    Without this, bootstrap connectivity must be ensured by other means (e.g. allowing
    all traffic to the controller IP from the guest VLAN).

SCEP server

A SCEP server must be deployed and reachable from the device -- either directly or via
the controller-provided SCEP proxy. The SCEP server must be configured with:

  • A CA certificate and key for signing enrolled certificates.
  • A challenge password (shared secret) that devices use to authenticate their enrollment requests.

The CA certificate used by the SCEP server should also be configured in the RADIUS server's
EAP-TLS settings as a trusted CA for client certificate validation.

Configuration

1. SCEP profile

Configure a SCEP enrollment profile on the controller (EdgeDevConfig.ScepProfiles) with:

  • Profile name: A logical identifier (e.g. pnac-cert).
  • SCEP server URL: Full URL including path (e.g. https://scep.example.com/scep).
  • Controller proxy: Enable UseControllerProxy if the SCEP server is not directly
    reachable from the non-authenticated VLAN. In this case, the controller acts as an HTTP
    proxy for SCEP traffic.
  • Challenge password: The SCEP challenge, encrypted using the device's cipher context.
  • CA certificate(s): Trusted CA chain for validating SCEP server responses.
  • CSR parameters: Subject DN, SANs, key type (e.g. RSA 2048 or ECDSA P-256),
    hash algorithm, and renewal period percentage.

2. PNAC configuration

Configure 802.1X PNAC for the device port (EdgeDevConfig.Pnacs) with:

  • Network adapter reference: The logical label of the port to authenticate.
  • EAP method: EAP_TLS.
  • Certificate enrollment profile: Reference the SCEP profile name created above
    (e.g. pnac-cert).
  • EAP identity (optional): If not set, EVE derives it from the enrolled certificate's
    subject CN or SAN URI.
  • CA certificate(s): Trusted CA chain for verifying the authentication server's
    (switch/RADIUS) TLS certificate during the EAP-TLS handshake.
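
The EAP identity fallback described above can be sketched as follows. This is only an illustration: the description says the identity comes from the subject CN or a SAN URI, but the order of preference (CN first, then the first SAN URI) and the names `pickIdentity` and `identityFromCert` are assumptions, not the PR's code.

```go
package main

import (
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"net/url"
)

// pickIdentity returns the configured EAP identity if set. Otherwise it
// falls back to the certificate data.
// ASSUMPTION: CN is preferred over the first SAN URI; the PR description
// only says "subject CN or SAN URI" without stating an order.
func pickIdentity(configured, commonName string, sanURIs []string) string {
	if configured != "" {
		return configured
	}
	if commonName != "" {
		return commonName
	}
	if len(sanURIs) > 0 {
		return sanURIs[0]
	}
	return ""
}

// identityFromCert (hypothetical helper) extracts the CN and SAN URIs
// from a parsed certificate and applies pickIdentity.
func identityFromCert(configured string, cert *x509.Certificate) string {
	uris := make([]string, 0, len(cert.URIs))
	for _, u := range cert.URIs {
		if u != nil {
			uris = append(uris, u.String())
		}
	}
	return pickIdentity(configured, cert.Subject.CommonName, uris)
}

func main() {
	// Made-up certificate contents, for illustration only.
	cert := &x509.Certificate{
		Subject: pkix.Name{CommonName: "edge-node-01"},
		URIs:    []*url.URL{{Scheme: "urn", Opaque: "example:device:1234"}},
	}
	fmt.Println(identityFromCert("", cert))
}
```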

Validation steps

1. Verify bootstrap connectivity

  • Confirm that the device boots with the port in the non-authenticated VLAN.
  • Verify that the device can reach the controller and retrieve its configuration
    (check DPC verify logs and device info messages in the controller).

2. Verify SCEP certificate enrollment

  • Check the enrolled certificate status published by the device:
    cat /persist/status/scepclient/EnrolledCertificateStatus/*.json | jq .
  • Confirm CertStatus is CERT_STATUS_AVAILABLE.
  • Verify that Subject, Issuer, SAN, IssueTimestamp, ExpirationTimestamp,
    and SHA256Fingerprint are populated and match the expected values from the SCEP server.
  • Confirm the certificate and private key files exist at the paths indicated by
    CertFilepath and PrivateKeyFilepath.

3. Verify 802.1X authentication

  • Check the PNAC state file to confirm the port is authenticated:
    cat /run/nim/pnac.state/<interface-name>
    Expected: STATE: CONNECTED with a recent timestamp.
  • Check wpa_supplicant status via the control socket:
    wpa_cli -p /run/nim/wpa_supplicant -i <interface-name> status
    Confirm wpa_state=COMPLETED and EAP state=SUCCESS.
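
For scripted validation, the `wpa_cli ... status` output can be checked programmatically. The sketch below assumes the output is a list of `key=value` lines (as `wpa_cli status` prints) and tests the two fields named above. The function names and the sample text are illustrative and not taken from the PR.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseWpaStatus turns the key=value lines printed by "wpa_cli status"
// into a map. Lines without "=" are skipped.
func parseWpaStatus(output string) map[string]string {
	fields := make(map[string]string)
	scanner := bufio.NewScanner(strings.NewReader(output))
	for scanner.Scan() {
		key, value, found := strings.Cut(strings.TrimSpace(scanner.Text()), "=")
		if !found {
			continue
		}
		fields[key] = value
	}
	return fields
}

// isAuthenticated reports whether the status shows a completed supplicant
// state and a successful EAP exchange.
func isAuthenticated(fields map[string]string) bool {
	return fields["wpa_state"] == "COMPLETED" && fields["EAP state"] == "SUCCESS"
}

func main() {
	// Hand-written sample; real output contains more fields.
	sample := "wpa_state=COMPLETED\nSupplicant PAE state=AUTHENTICATED\nEAP state=SUCCESS\n"
	fmt.Println(isAuthenticated(parseWpaStatus(sample)))
}
```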

4. Verify VLAN transition and IP assignment

  • After successful authentication, the switch should move the port to the authenticated VLAN.
  • After the DHCP reacquire delay (default 5 seconds, configurable via
    pnac.dhcp.reacquire.delay), the device should obtain a new IP address from the
    authenticated VLAN's DHCP range.
  • Verify the assigned IP:
    ip addr show <interface-name>
    Confirm the IP belongs to the authenticated VLAN subnet, not the bootstrap VLAN subnet.

5. Verify full network access

  • Confirm that the device can access networks and services that are only reachable from
    the authenticated VLAN (e.g. external internet, internal services blocked on the guest VLAN).
  • Verify that applications deployed on the device have connectivity through the authenticated
    network.

6. Verify published status and metrics on the controller

  • PNAC status: Check that the controller shows the correct supplicant state,
    last authentication timestamp, and no errors for the port.
  • Enrolled certificate status: Verify the certificate details (subject, issuer, validity,
    fingerprint, status) are visible on the controller.
  • PNAC metrics: Confirm that EAPOL frame counters are being reported and incrementing
    (EAPOLFramesRx/Tx, EAPOLStartFramesTx, EAPOLReqIdFramesRx, EAPOLRespFramesTx).
    Check that EAPOLInvalidFramesRx and EAPLengthErrorFramesRx remain at zero.

7. Verify certificate renewal

  • To test renewal without waiting for the full certificate lifetime, either:
    • Issue a short-lived certificate from the SCEP server (e.g. 10 minutes validity), or
    • Set RenewPeriodPercent to a low value (e.g. 10) so renewal triggers early.
  • Confirm that the device re-enrolls the certificate before expiration.
  • Verify that 802.1X re-authentication succeeds with the renewed certificate and the port
    remains in the authenticated VLAN without disruption.
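
To estimate when renewal should kick in during such a test, the following sketch may help. It assumes that RenewPeriodPercent is the percentage of the certificate lifetime that must elapse before renewal starts, which is consistent with "a low value triggers early" but is not confirmed by the PR text. `renewAfter` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"time"
)

// renewAfter returns how long after issuance renewal would start.
// ASSUMPTION: renewal starts once renewPercent of the lifetime has elapsed.
// Dividing before multiplying avoids int64 overflow for multi-year lifetimes.
func renewAfter(lifetime time.Duration, renewPercent uint32) time.Duration {
	if renewPercent > 100 {
		renewPercent = 100
	}
	return lifetime / 100 * time.Duration(renewPercent)
}

func main() {
	// A 10-minute certificate with RenewPeriodPercent set to 10.
	issued := time.Date(2026, time.March, 19, 12, 0, 0, 0, time.UTC)
	expires := issued.Add(10 * time.Minute)
	offset := renewAfter(expires.Sub(issued), 10)
	fmt.Println("renew after:", offset)
	fmt.Println("renew at:   ", issued.Add(offset).Format(time.RFC3339))
}
```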

8. Verify error recovery

  • Lost certificate: Remove the certificate file from the device and wait for the next
    retry interval (scep.retry.interval, default 5 minutes). Confirm the device detects
    the missing file, sets CERT_STATUS_ENROLLMENT_FAILED with an appropriate error message,
    and re-enrolls automatically.
  • Lost private key: Remove the private key file and verify the same recovery behavior,
    including cleanup of the orphaned certificate file.
  • SCEP server unavailable: Stop the SCEP server and trigger enrollment (e.g. by modifying
    the SCEP profile). Confirm the device sets a failure status with an error and retries
    at the configured interval. Restart the SCEP server and confirm enrollment succeeds
    on the next retry.

Changelog notes

  • New feature: IEEE 802.1X Port-Based Network Access Control (PNAC)

    • Added 802.1X supplicant support using wpa_supplicant with EAP-TLS authentication
    • Supports per-port PNAC configuration referencing a certificate enrollment profile
    • DHCP Vendor Class Identifier (LFEDGE-EVE, Option 60) allows network infrastructure to identify EVE devices and grant bootstrap access
    • Configurable DHCP lease reacquisition delay after VLAN transition (pnac.dhcp.reacquire.delay)
    • Publishes per-port PNAC status (supplicant state, last auth timestamp, errors) and EAPOL metrics to the controller
  • New feature: SCEP certificate enrollment

    • Added SCEP client (scepclient microservice) using github.com/smallstep/scep library
    • Supports direct SCEP server communication or routing through a controller-provided SCEP proxy
    • Configurable CSR parameters: subject DN, SANs, key type, hash algorithm
    • Automatic certificate renewal based on configurable lifetime percentage (RenewPeriodPercent)
    • Automatic retry on enrollment failure or SCEP PENDING response (scep.retry.interval)
    • Automatic recovery when certificate or private key files are lost
    • Publishes enrolled certificate status (subject, issuer, validity, fingerprint, cert status) to the controller
  • New configuration properties

    • scep.retry.interval: interval between SCEP enrollment retry attempts (default: 5 min)
    • pnac.dhcp.reacquire.delay: delay before DHCP reacquire after 802.1X auth state change (default: 5s)
    • dhcp.enable.vendorclassid: toggle DHCP Vendor Class Identifier (default: enabled)

PR Backports

New feature, not to be backported.

Checklist

  • I've provided a proper description
  • I've added the proper documentation
  • I've tested my PR on amd64 device
  • I've tested my PR on arm64 device
  • I've written the test verification instructions
  • I've set the proper labels to this PR
  • I've checked the boxes above, or I've provided a good reason why I didn't check them.

milan-zededa self-assigned this Mar 19, 2026
milan-zededa added the main-quest and new feature labels Mar 19, 2026
@uncleDecart
Member

> Prerequisites
> Network switch with 802.1X support
> A managed network switch capable of IEEE 802.1X port-based authentication is required.
> The switch must be configured with:

Won't it be possible to test this with a software-defined switch?

Contributor Author

milan-zededa commented Mar 19, 2026

> Prerequisites
> Network switch with 802.1X support
> A managed network switch capable of IEEE 802.1X port-based authentication is required.
> The switch must be configured with:
>
> Won't it be possible to test this with a software-defined switch?

Yes, it is possible to test this with a software-defined switch. The PR description does not impose any requirement for physical hardware, only that the switch supports IEEE 802.1X functionality.

milan-zededa force-pushed the scep-and-pnac branch 2 times, most recently from e3a0b10 to fa82d0f, March 19, 2026 14:13
AgentName: "zedagent",
MyAgentName: agentName,
TopicImpl: types.ConfigItemValueMap{},
Persistent: true,
Contributor Author

I will update this once #5584 is merged

codecov bot commented Mar 19, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 29.45%. Comparing base (2281599) to head (ef34639).
⚠️ Report is 346 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5691      +/-   ##
==========================================
+ Coverage   19.52%   29.45%   +9.92%     
==========================================
  Files          19       18       -1     
  Lines        3021     2417     -604     
==========================================
+ Hits          590      712     +122     
+ Misses       2310     1554     -756     
- Partials      121      151      +30     

milan-zededa force-pushed the scep-and-pnac branch 2 times, most recently from 6009d12 to 2a0c6e0, March 19, 2026 16:54
milan-zededa requested a review from shjala March 20, 2026 09:02
| diag.probe.remote.http.endpoint | string | `"http://www.google.com"` | - | - | Remote endpoint (URL, IP instead of hostname is accepted) queried over HTTP to assess the state of network connectivity whenever the controller is not reachable. Used only for diagnostics (no functional impact). Set to an empty string to disable. |
| diag.probe.remote.https.endpoint | string | `"https://www.google.com"` | - | - | Remote endpoint (URL, IP instead of hostname is NOT accepted) queried over HTTPS to assess the state of network connectivity whenever the controller is not reachable. Used only for diagnostics (no functional impact). Set to an empty string to disable. |
| app.enable.tcp.mss.clamping | bool | true | - | - | Configuration property that enables EVE to automatically adjust (clamp) the TCP MSS on forwarded application traffic to match the path MTU, preventing fragmentation and connectivity issues on lower-MTU links. |
| scep.retry.interval | timer in seconds | 300 (5 minutes) | 60 (1 minute) | 3600 (1 hour) | Interval between retry attempts for certificates that previously failed to enroll/renew or returned PENDING from the SCEP server. |
Contributor

When you get PENDING will you wait for 5 minutes by default before checking the status? That seems like a long time.

Contributor Author

PENDING is returned when certificate enrollment requires manual approval from an administrator. Given this, it’s reasonable to expect the process to take at least a few minutes.

Contributor Author

Note: the SCEP RFC does not define a recommended polling interval, thus I made it configurable.

| diag.probe.remote.https.endpoint | string | `"https://www.google.com"` | - | - | Remote endpoint (URL, IP instead of hostname is NOT accepted) queried over HTTPS to assess the state of network connectivity whenever the controller is not reachable. Used only for diagnostics (no functional impact). Set to an empty string to disable. |
| app.enable.tcp.mss.clamping | bool | true | - | - | Configuration property that enables EVE to automatically adjust (clamp) the TCP MSS on forwarded application traffic to match the path MTU, preventing fragmentation and connectivity issues on lower-MTU links. |
| scep.retry.interval | timer in seconds | 300 (5 minutes) | 60 (1 minute) | 3600 (1 hour) | Interval between retry attempts for certificates that previously failed to enroll/renew or returned PENDING from the SCEP server. |
| pnac.dhcp.reacquire.delay | integer (seconds) | 5 | 0 | 60 (1 minute) | Delay before the DHCP client reacquires a lease after a PNAC (802.1X) port authentication state change. This is needed when the network switch reassigns the port to a different access VLAN based on the authentication result. Setting this value to 0 disables DHCP reacquire. |
Contributor

Will setting this to too low a value mean that DHCP might acquire an IP address on the unauthenticated VLAN, so that communication will fail?
Can we assume that we will get an IP address in a different subnet after the authentication or use the fact that we got the same subnet as a hint that we should ask DHCP to validate its lease or get a new lease later?

Contributor Author

milan-zededa Mar 21, 2026

Yes, this could indeed happen.
However, a switch does not have to use different VLANs and IP subnets for authenticated and non-authenticated ports. It can keep devices in the same VLAN and enforce access via ACL rules, so the IP address could remain the same even after successful port authentication. I'm therefore not sure it is a good idea to keep reacquiring a DHCP lease until we get an IP from a different subnet. I guess we could do that, but with a configurable retry limit (where 0 would disable re-leasing).

Contributor Author

milan-zededa Mar 23, 2026

I rewrote this and replaced the delay interval, which was quite fragile, with pnac.dhcp.reacquire.max.retries: After the port authentication state changes, the device retries the DHCP request with exponential backoff (2s, 4s, 8s, ...) to obtain an IP address from the authenticated VLAN. Retries continue until the IP subnet changes (indicating the VLAN transition completed) or the configured maximum number of retries is reached.
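
The retry logic described in this comment might look roughly like the sketch below. It only illustrates the schedule (2s, 4s, 8s, ...) and the stop condition (subnet change or retry limit). The function names, the sample addresses, and the way leases are obtained are assumptions, and the loop prints the waits instead of sleeping.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// backoffDelays returns the wait before each DHCP retry: 2s, 4s, 8s, ...
// A maxRetries of 0 (or less) yields no retries at all.
func backoffDelays(maxRetries int) []time.Duration {
	if maxRetries <= 0 {
		return nil
	}
	delays := make([]time.Duration, 0, maxRetries)
	delay := 2 * time.Second
	for i := 0; i < maxRetries; i++ {
		delays = append(delays, delay)
		delay *= 2
	}
	return delays
}

// subnetChanged reports whether the newly leased address lies outside the
// subnet the port had before authentication.
func subnetChanged(before *net.IPNet, after net.IP) bool {
	return before != nil && after != nil && !before.Contains(after)
}

func main() {
	_, bootstrap, err := net.ParseCIDR("192.168.10.0/24") // made-up bootstrap subnet
	if err != nil {
		panic(err)
	}
	// Made-up lease results for successive retries.
	leases := []string{"192.168.10.57", "192.168.10.57", "10.20.0.14"}
	for attempt, wait := range backoffDelays(4) {
		if attempt >= len(leases) {
			break
		}
		ip := net.ParseIP(leases[attempt])
		done := subnetChanged(bootstrap, ip)
		fmt.Printf("retry %d after %v: got %v, subnet changed: %t\n", attempt+1, wait, ip, done)
		if done {
			break
		}
	}
}
```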

Comment on lines +648 to +650
4. **802.1X port authentication**: Once the certificate is enrolled, the device uses it
to authenticate the port via 802.1X EAP-TLS. Upon successful authentication, the switch
moves the port to the authenticated VLAN, granting full network access.
Contributor

Does this mean that the port will be part of the unauthenticated VLAN while 802.1X is doing multiple round trips to authenticate?

Does DHCP get triggered pnac.dhcp.reacquire.delay after the 802.1X exchange is successful?

If the port is connected to no VLAN while 802.1X is in progress, then there will be fewer concerns about needing to delay the DHCP request.

Contributor Author

First we need an IP address from the unauthenticated VLAN to access the cloud, get the config and enroll a certificate from the SCEP server.
Then we start wpa_supplicant configured with the enrolled certificate, and once the authentication succeeds, we wait for pnac.dhcp.reacquire.delay, then obtain a new DHCP lease.

Contributor Author

I rewrote this and replaced the delay interval, which was quite fragile, with pnac.dhcp.reacquire.max.retries: After the port authentication state changes, the device retries the DHCP request with exponential backoff (2s, 4s, 8s, ...) to obtain an IP address from the authenticated VLAN. Retries continue until the IP subnet changes (indicating the VLAN transition completed) or the configured maximum number of retries is reached.


### Certificate lifecycle

The enrolled certificate is stored on the device along with its private key (kept in the vault
Contributor

Does "the vault" mean /persist/vault?

Can the private key be needed immediately after a reboot to (re)authenticate over 802.1X?

Contributor Author

milan-zededa Mar 21, 2026

> Does "the vault" mean /persist/vault?

Yes

Port authentication is needed for application connectivity.
Controller is accessible even through the non-authenticated management ports.
So we do not need to perform 802.1X until we are about to start applications. Therefore we do not need to access the private key before the vault is unlocked.

}

func (c *SCEPClient) getPrivateKeyFilePath(profileName string) string {
return filepath.Join(privateKeyDir, profileName+"-key.pem")
Contributor

the 'profileName' probably should be sanitized.

Contributor Author

Good catch, I fixed it by using URL-safe base64 encoding of the profile name.
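
A minimal sketch of that kind of fix is shown below. It assumes the unpadded URL-safe alphabet (`base64.RawURLEncoding`); whether the PR uses the padded or unpadded variant is not stated here. The function name is shortened for illustration, and `privateKeyDir` matches the constant shown elsewhere in this PR.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"path/filepath"
)

// privateKeyDir matches the constant shown elsewhere in this PR.
const privateKeyDir = "/persist/vault/pnac"

// privateKeyFilePath encodes the profile name with URL-safe base64 so the
// resulting file name cannot contain "/" or "." sequences that would
// escape privateKeyDir.
// ASSUMPTION: unpadded encoding; the actual fix may use padding.
func privateKeyFilePath(profileName string) string {
	safeName := base64.RawURLEncoding.EncodeToString([]byte(profileName))
	return filepath.Join(privateKeyDir, safeName+"-key.pem")
}

func main() {
	fmt.Println(privateKeyFilePath("pnac-cert"))
	// A hostile name stays inside privateKeyDir after encoding.
	fmt.Println(privateKeyFilePath("../../etc/passwd"))
}
```

The encoding is reversible, so the original profile name can still be recovered from the file name if needed.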

# Auto-generated by NIM.
# Invoked by: wpa_cli -a <script>

IFACE="%s"
Contributor

not used anymore

Contributor Author

Good catch, removed.

Contributor

naiming-zededa commented Mar 21, 2026

[Image: eve-scep-pnac-workflow]

Uploading the AI-generated diagram of the flow, so that its correctness can be checked and in case you want to use it.

Contributor Author

milan-zededa commented Mar 23, 2026

> Upload the AI generated diagram flow, to see the correctness, and if want to use this.

@naiming-zededa Do you have the source code for this diagram? I would need to make a few edits to make it 100% correct and up-to-date.

Update EVE-API to include the recently added SCEP
(Simple Certificate Enrollment Protocol) and PNAC (802.1x) support.

Signed-off-by: Milan Lenco <milan@zededa.com>

### Bootstrapping workflow

The full workflow from an unauthenticated device to an authenticated network port is:
Member

Will it be the default workflow for every port, or just for the ones we marked as PNAC-required?

Contributor Author

Just for those with PNAC enabled.

Member

Do we notify if a "non-PNAC enabled" port has no connectivity because it's most likely in a PNAC-enabled network?

Contributor Author

Detection of 802.1X is not included in this PR and may be added as a future enhancement, as it was not required for the current scope.
It should be feasible to implement (e.g., by running wpa_supplicant temporarily and inspecting EAP RX metrics), but I chose not to introduce additional complexity in this already sizable PR.

Member

Makes sense, was just poking around use case, thanks

cipherMetrics *cipher.AgentMetrics
agentMetrics *controllerconn.AgentMetrics
cipherMetrics *cipher.AgentMetrics
metricInterval uint32 // In seconds
Member

Technically it's a limitation, but I don't think anyone would set up metrics collection less frequently than once every 136 years :)

Contributor Author

Max allowed value for timer.metric.interval is 3600 (1 hour)

Comment on lines +32 to +33
errorTime = 3 * time.Minute
warningTime = 40 * time.Second
Member

I know other agents also have those hardcoded, but in case of PNAC we also don't want those to be configurable?

Contributor Author

This is for process/task watchdog, not related to PNAC or certificate enrollment.

privateKeyDir = "/persist/vault/pnac"

defaultKeyType = eveconfig.KeyType_KEY_TYPE_RSA_2048
defaultHashAlgorithm = eveconfig.HashAlgorithm_HASH_ALGORITHM_SHA256
Member

do we allow to change HashAlgorithms and KeyType? If yes, how do we handle if user changes those?

Contributor Author

milan-zededa Mar 23, 2026

Yes, we allow such change. Device will create new private+public keys and enroll a new certificate.

Member

would we need to get inside "quarantine VPN" for that?

Contributor Author

No, authenticated ("production") VLAN also provides access to the SCEP server. This is required for certificate renewal to work.

…control (PNAC)

SCEP (Simple Certificate Enrollment Protocol) enables automated certificate
enrollment from a CA server. IEEE 802.1x is a port-based network access control
(PNAC) standard that restricts network access on a switch port until the device
authenticates using a certificate (or other EAP method).

The 802.1x authentication is implemented using wpa_supplicant.
The SCEP client is implemented using github.com/smallstep/scep.

The end-to-end workflow is as follows:
1. Device sends a DHCP request with a vendor class identifier that identifies
   it as EVE OS.
2. The network switch places the port into a non-authenticated VLAN. Because
   it detects EVE OS, it allows the device to reach the controller and fetch
   network and SCEP configuration (needed to bootstrap the enrollment).
3. Device follows the SCEP profile to enroll a certificate, either by talking
   directly to the SCEP server or through a controller-provided SCEP proxy
   (essentially an HTTP proxy).
4. Device uses the enrolled certificate to authenticate the port via 802.1x.
   The switch then moves the port to the authenticated VLAN.
5. The device retries the DHCP request with exponential backoff (2s, 4s, 8s, ...)
   to obtain an IP address from the authenticated VLAN. Retries continue until
   the IP subnet changes (indicating the VLAN transition completed) or the
   configured maximum number of retries (pnac.dhcp.reacquire.max.retries,
   default 4) is reached. Setting this to 0 disables DHCP reacquire.

EVE publishes PNAC status, enrolled certificate status and PNAC metrics
to the controller.

Signed-off-by: Milan Lenco <milan@zededa.com>
Copilot AI left a comment

Pull request overview

Adds support for SCEP-based certificate enrollment and 802.1X Port-Based Network Access Control (PNAC) to EVE’s networking stack, including status/metrics reporting to the controller and DHCP reacquisition logic to handle VLAN changes after port authentication.

Changes:

  • Introduces scepclient agent and vendors github.com/smallstep/scep (+ pkcs7) for SCEP enrollment.
  • Adds PNAC configuration/status/metrics plumbing through NIM/netmonitor → DPC manager/reconciler → zedagent reporting.
  • Extends DHCP client management (vendor class ID option + declarative “reacquire” restarts) and documents new global properties.

Reviewed changes

Copilot reviewed 41 out of 80 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| pkg/pillar/zedbox/zedbox.go | Registers scepclient as a runnable zedbox entrypoint. |
| pkg/pillar/vendor/modules.txt | Updates vendored module list (eve-api bump + smallstep deps). |
| pkg/pillar/vendor/github.com/smallstep/scep/x509util/x509util.go | Vendored SCEP helper for CSR/challengePassword handling. |
| pkg/pillar/vendor/github.com/smallstep/scep/logger.go | Vendored SCEP logging interface. |
| pkg/pillar/vendor/github.com/smallstep/scep/cryptoutil/cryptoutil.go | Vendored SCEP crypto utilities. |
| pkg/pillar/vendor/github.com/smallstep/scep/certs_selector.go | Vendored SCEP certificate selection helpers. |
| pkg/pillar/vendor/github.com/smallstep/scep/README.md | Vendored SCEP README. |
| pkg/pillar/vendor/github.com/smallstep/scep/Makefile | Vendored SCEP build tooling. |
| pkg/pillar/vendor/github.com/smallstep/scep/LICENSE | Vendored SCEP license. |
| pkg/pillar/vendor/github.com/smallstep/scep/.gitignore | Vendored SCEP gitignore. |
| pkg/pillar/vendor/github.com/smallstep/pkcs7/verify.go | Vendored PKCS7 verification support. |
| pkg/pillar/vendor/github.com/smallstep/pkcs7/pkcs7.go | Vendored PKCS7 core implementation (+ legacy x509 fallback toggle). |
| pkg/pillar/vendor/github.com/smallstep/pkcs7/internal/legacy/x509/verify.go | Vendored legacy x509 parsing helpers for PKCS7. |
| pkg/pillar/vendor/github.com/smallstep/pkcs7/internal/legacy/x509/pkcs1.go | Vendored legacy x509 parsing helper types. |
| pkg/pillar/vendor/github.com/smallstep/pkcs7/internal/legacy/x509/oid.go | Vendored legacy OID parsing/representation. |
| pkg/pillar/vendor/github.com/smallstep/pkcs7/internal/legacy/x509/doc.go | Vendored legacy x509 package docs. |
| pkg/pillar/vendor/github.com/smallstep/pkcs7/internal/legacy/x509/debug.go | Vendored legacy godebug stub for parser behavior. |
| pkg/pillar/vendor/github.com/smallstep/pkcs7/encrypt.go | Vendored PKCS7 encryption support. |
| pkg/pillar/vendor/github.com/smallstep/pkcs7/decrypt.go | Vendored PKCS7 decryption support. |
| pkg/pillar/vendor/github.com/smallstep/pkcs7/ber.go | Vendored BER→DER conversion helper. |
| pkg/pillar/vendor/github.com/smallstep/pkcs7/README.md | Vendored PKCS7 README. |
| pkg/pillar/vendor/github.com/smallstep/pkcs7/Makefile | Vendored PKCS7 build tooling. |
| pkg/pillar/vendor/github.com/smallstep/pkcs7/LICENSE | Vendored PKCS7 license. |
| pkg/pillar/vendor/github.com/smallstep/pkcs7/.gitignore | Vendored PKCS7 gitignore. |
| pkg/pillar/vendor/github.com/lf-edge/eve-api/go/evecommon/acipherinfo.pb.go | Updates eve-api proto bindings to carry SCEP challenge password in encryption block. |
| pkg/pillar/utils/proc/process.go | Adds optional stdout/stderr capture and non-forking child reaping/IsRunning improvements. |
| pkg/pillar/utils/proc/logger.go | Adds processLogger to forward child stdout/stderr into EVE logs. |
| pkg/pillar/types/scep.go | Introduces SCEP profile types and enrolled certificate status model. |
| pkg/pillar/types/pnac.go | Introduces PNAC config/status/metrics types + state/ctrl-socket constants. |
| pkg/pillar/types/global_test.go | Extends global settings test coverage for new keys. |
| pkg/pillar/types/global.go | Adds global properties for SCEP retry interval, PNAC DHCP reacquire retries, vendor class ID toggle. |
| pkg/pillar/types/dpc.go | Adds PNAC config into NetworkPortConfig and includes it in DPC equality logic. |
| pkg/pillar/types/dns.go | Adds PNAC status into port status and includes it in DNS equality logic. |
| pkg/pillar/types/cipherinfotypes.go | Extends decrypted encryption block with SCEP challenge password field. |
| pkg/pillar/scripts/device-steps.sh | Adds scepclient to device steps agent list. |
| pkg/pillar/netmonitor/netmonitor.go | Extends NetworkMonitor interface with PNAC status/metrics APIs + PNAC events. |
| pkg/pillar/netmonitor/mock.go | Adds mock PNAC status/metrics support and PNAC event publishing. |
| pkg/pillar/netmonitor/linux.go | Implements PNAC status/metrics via wpa_cli and watches PNAC state directory to emit PNAC events. |
| pkg/pillar/go.sum | Updates sums for eve-api bump and new smallstep deps. |
| pkg/pillar/go.mod | Adds smallstep/scep (+ pkcs7 indirect) and bumps eve-api/go version. |
| pkg/pillar/dpcreconciler/linuxitems/adapter.go | Adds GetWirelessType accessor used by reconciler logic. |
| pkg/pillar/dpcreconciler/linux.go | Wires PNAC into intended physical interface graph (wpa_supplicant 802.1x) and DHCP reacquire into L3 graph. |
| pkg/pillar/dpcreconciler/genericitems/dhcpcd_test.go | Updates unit tests for vendor class ID argument support. |
| pkg/pillar/dpcreconciler/genericitems/dhcpcd.go | Adds DHCP “reacquire counter” restarts and optional vendor class ID argument. |
| pkg/pillar/dpcreconciler/dpcreconciler.go | Extends reconciler args with vault readiness, enrolled certs, and DHCP reacquire counters. |
| pkg/pillar/dpcmanager/verify.go | Cancels DHCP reacquire tracking when switching DPC verify target. |
pkg/pillar/dpcmanager/dpcmanager_test.go Adds tests for PNAC-driven DHCP reacquire behavior and cancellation.
pkg/pillar/dpcmanager/dpcmanager.go Implements PNAC event handling, DHCP reacquire tracking, and passes new args into reconciler.
pkg/pillar/dpcmanager/dpc.go Cancels DHCP reacquire tracking when a new DPC is applied.
pkg/pillar/dpcmanager/dns.go Populates per-port PNAC status (including missing cert conditions) into DNS/device network status.
pkg/pillar/controllerconn/send.go Adds RequestOptions.CustomHeader to allow callers to override HTTP headers.
pkg/pillar/cmd/zedagent/zedagent.go Subscribes to enrolled cert status + PNAC metrics for controller publishing.
pkg/pillar/cmd/zedagent/reportinfo.go Publishes enrolled cert info and PNAC status into device info protobuf.
pkg/pillar/cmd/zedagent/handlemetrics.go Publishes PNAC metrics into device metrics protobuf.
pkg/pillar/cmd/zedagent/handleconfig.go Adds scaffolding for parsed PNAC config and SCEP profile publication tracking.
pkg/pillar/cmd/nim/nim.go Tracks vault readiness + enrolled certs for DPC reconciliation and periodically publishes PNAC metrics.
pkg/pillar/cipher/cipher.go Maps decrypted SCEP challenge password from protobuf into types.
pkg/edgeview/src/network.go Updates path used to read wpa_supplicant config file in edgeview output.
docs/DEVICE-CONNECTIVITY.md Documents the PNAC+SCEP workflow, configuration, and reporting.
docs/CONFIG-PROPERTIES.md Documents new global properties and adds scepclient to per-agent log-level list.
.github/CODEOWNERS Adds code ownership for pkg/pillar/cmd/scepclient/.
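
To make the `dhcpcd.go` row above more concrete, here is a minimal Go sketch of how an optional vendor class ID and a "reacquire counter" could drive the DHCP client. It is an illustration only, not the code from this PR: `dhcpClientConfig`, `args` and `needsRestart` are hypothetical names, and the only dhcpcd option used is `--vendorclassid` from the dhcpcd manual. The assumption is that bumping the counter makes the config compare as different, which in turn forces a client restart and a fresh DHCP request from the authenticated VLAN.

```go
package main

import "fmt"

// dhcpClientConfig is a hypothetical stand-in for the per-interface
// DHCP client settings; it is not the type used in the PR.
type dhcpClientConfig struct {
	IfName           string
	VendorClassID    string // empty: keep dhcpcd's default vendor class ID
	ReacquireCounter int    // bumped after 802.1x authentication to force a restart
}

// args builds the dhcpcd command-line arguments for this config.
// The vendor class ID is only passed when it is set.
func (c dhcpClientConfig) args() []string {
	var out []string
	if c.VendorClassID != "" {
		out = append(out, "--vendorclassid", c.VendorClassID)
	}
	return append(out, c.IfName)
}

// needsRestart reports whether the client must be restarted.
// Any field change counts, including a counter bump with otherwise
// identical settings (assumed here to be how reacquire is triggered).
func needsRestart(prev, next dhcpClientConfig) bool {
	return prev != next
}

func main() {
	before := dhcpClientConfig{IfName: "eth0", VendorClassID: "LFEDGE-EVE"}
	after := before
	after.ReacquireCounter++ // port moved to the authenticated VLAN

	fmt.Println("dhcpcd args:", before.args())
	fmt.Println("restart needed:", needsRestart(before, after))
}
```

Keeping the counter inside the compared config means no separate "restart now" signal is needed: the reconciler only has to notice that the intended state differs from the current one.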


@naiming-zededa
Contributor

Uploading an AI-generated diagram of the workflow so you can check it for correctness and decide whether you want to use it.

@naiming-zededa Do you have the source code for this diagram? I would need to make a few edits to make it 100% correct and up to date.

eve-scep-pnac-workflow.html

@milan-zededa This is the HTML file for this workflow.


Copilot AI left a comment


Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.


Labels

main-quest: The fate of the project rests on this PR. Prioritise review to advance the storyline!
new feature: Introduces a new feature
