Windows MDM enrollment row is not linked to the Fleet host record at enrollment time #45380

@getvictor

💥 Actual behavior

On Windows automatic enrollment (Autopilot, Entra-join-during-OOBE, BYOD via Settings > Access work or school > Connect), the WSTEP RequestSecurityToken does not include the SMBIOS UUID. The server inserts the mdm_windows_enrollments row with host_uuid='' and relies on osquery's directIngestMDMDeviceIDWindows to backfill that linkage on the next distributed-read cycle (~10s after orbit check-in).

During that window, any server-side code that asks "find this host's Windows MDM enrollment" via MDMWindowsGetEnrolledDeviceWithHostUUID(host.UUID) returns NotFound, even though the enrollment exists. Whatever decision that code is making — picking ESP behavior, gating cancellation, reading the enrollment mode flags, anything else keyed on the enrollment — runs against missing data.

PR #45331's BYOD-no-cancel short-circuit is one place this surfaces today: a fast-failing install that reports back before osquery's direct-ingest linkage runs hits the gap and ends up cancelling pending steps + emitting canceled_setup_experience even though the host is BYOD. We patched that one call site with a name-based fallback, but the underlying race affects any future lookup of the same shape.

The current TODO comments in the code acknowledge the gap:

  • server/service/microsoft_mdm.go: "TODO: Add check here to determine if MDM DeviceID is connected with Smbios UUID present on ./DevDetail/Ext/Microsoft/SMBIOSSerialNumber..."
  • server/service/microsoft_mdm.go: "TODO: azure enrollments come with an empty uuid, I haven't figured out a good way to identify the device here. Note that we currently do the Enrollment->Host mapping during the next refetch of the host"

🛠️ Expected behavior

mdm_windows_enrollments.host_uuid should be populated server-side at (or shortly after) enrollment, so any subsequent server-side lookup keyed on the host's UUID finds the enrollment row deterministically. The osquery direct-ingest path can remain as a backstop, but should not be the primary linkage mechanism.
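One way the primary and backstop paths could coexist is a link operation that only writes when host_uuid is still empty, so whichever path runs second becomes a no-op. The sketch below assumes the server has already obtained the host UUID by some means this issue does not specify; `LinkHostUUID` and the `Enrollment` type are hypothetical names, not existing Fleet code.

```go
package main

import "fmt"

// Enrollment is a simplified stand-in for an mdm_windows_enrollments row.
type Enrollment struct {
	MDMDeviceID string
	HostUUID    string
}

// LinkHostUUID sets HostUUID only when it is still empty and reports whether
// it changed the row. Calling it again, from the server-side path or from the
// osquery direct-ingest backstop, leaves an already linked row untouched.
func LinkHostUUID(e *Enrollment, hostUUID string) bool {
	if e == nil || hostUUID == "" || e.HostUUID != "" {
		return false
	}
	e.HostUUID = hostUUID
	return true
}

func main() {
	e := &Enrollment{MDMDeviceID: "dev-1"}

	// Server-side link at (or shortly after) enrollment.
	fmt.Println("server-side link changed row:", LinkHostUUID(e, "uuid-1"))

	// The later backstop finds the row already linked.
	fmt.Println("backstop changed row:", LinkHostUUID(e, "uuid-1"))
}
```

In a SQL-backed store the same idea would typically be an update guarded by a condition on the empty host_uuid, but that is left out here to keep the sketch self-contained.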

🧑‍💻 Steps to reproduce

These steps:

  • Have been confirmed to consistently lead to reproduction in multiple Fleet instances.
  • Describe the workflow that led to the error, but have not yet been reproduced in multiple Fleet instances.
  1. Wipe a Windows VM. Walk OOBE with a personal/local account so the desktop is reachable without MDM.
  2. On the Fleet server, enable Windows MDM and add at least one setup-experience software item flagged install_during_setup.
  3. On the VM: Settings > Accounts > Access work or school > Connect and sign in with an Entra user whose tenant routes MDM discovery to the Fleet server.
  4. Immediately after enrollment, query mdm_windows_enrollments for the new row. Observe host_uuid=''.
  5. Wait roughly 10 seconds (one osquery distributed-read cycle after fleetd starts). Re-query the same row. host_uuid is now populated.

During the window between steps 4 and 5, any server-side code that joins on host_uuid misses the row.

🕯️ More info (optional)

    Labels

    #g-power-to-pc (Power to the PC working group), bug (Something isn't working as documented), ~released bug (This bug was found in a stable release.)
