* fix(agentless-azure): handle unpurged secret key during re-setup with a different RG
* fix(agentless-azure): handle resource group mismatch on deploy/destroy
* test(agentless-azure): improve test coverage
* docs(agentless-azure): update readme
* refactor(agentless-azure): remove dead code
* add(agentless-azure): make permissions check softer in preflight if RG already exists
* perf(agentless-azure): parallelize lookup checks and resource creation
* fix(agentless-azure): skip Key Vault secret retries when Secrets Officer already exists
* perf(agentless-azure): bump Terraform parallelism 10 -> 20
* perf(agentless-azure): run preflight checks in parallel
* test(agentless-azure): improve test coverage
* fix(agentless): mark active workflow step FAILED when Azure deploy exits with error
* add(agentless-azure): add roleDefinitions/write permission in preflight
* agentless azure: derive install_id and discover existing deployments via RG tag
* agentless azure: scope storage account and key vault names to install_id
* docs(agentless-azure): update readme with required permissions and document new RG behavior
* fix(agentless-azure): drop unused local in cmd_deploy
* fix(agentless-azure): treat missing state storage account as missing metadata on first deploy
* build(agentless-azure): re-build dist scripts
azure/agentless/README.md (+97 -14: 97 additions & 14 deletions)
@@ -7,9 +7,9 @@ This script automates the deployment of Datadog Agentless Scanner on Azure using
 - Azure Cloud Shell or a machine with:
 - `az` CLI installed and authenticated (`az login`)
 - `terraform` CLI installed (>= 1.0)
-- Azure subscriptions with appropriate permissions:
-- `Owner` (or a role granting role-assignment write + resource creation) on the scanner subscription
-- A role granting `Microsoft.Authorization/roleAssignments/write` on each scanned subscription (e.g., `User Access Administrator` or `Owner`), so the scanner's managed identity can be granted the roles it needs to snapshot and read disks
+- Azure permissions (see [Permissions](#permissions) for the delegated, least-privilege variant if the engineer running the script is not subscription `Owner`):
+- `Owner` on the scanner subscription, **or** the custom role described below at the scanner subscription scope plus both `Contributor` and `User Access Administrator` on the scanner resource group (the latter is required so the script can grant itself `Storage Blob Data Contributor` on the state Storage Account and `Key Vault Secrets Officer` on the Key Vault)
+- A role granting `Microsoft.Authorization/roleAssignments/write` and `Microsoft.Authorization/roleDefinitions/write` on each scanned subscription (e.g., `User Access Administrator` or `Owner`), so the scanner's managed identity can be granted scan permissions and the custom scanning role can list the scan-target subscription in its `assignableScopes`
 - The following resource providers registered in the scanner subscription (auto-registered by the script when possible): `Microsoft.Compute`, `Microsoft.Network`, `Microsoft.ManagedIdentity`, `Microsoft.Storage`, `Microsoft.KeyVault`, `Microsoft.Authorization`
 | `SCANNER_SUBSCRIPTION` | Yes | Azure subscription ID where the scanner will be deployed |
 | `SCANNER_LOCATIONS` | Yes | Comma-separated list of Azure locations (max 4) for scanners (e.g., `eastus` or `eastus,westeurope`) |
 | `SUBSCRIPTIONS_TO_SCAN` | Yes | Comma-separated list of Azure subscription IDs to scan |
-| `SCANNER_RESOURCE_GROUP` | No | Resource group name for scanner resources (default: `datadog-agentless-scanner`) |
+| `SCANNER_RESOURCE_GROUP` | No | Resource group name for scanner resources (default: `datadog-agentless-scanner`). When unset, the script auto-discovers the resource group from the `DatadogAgentlessScanner=true` tag that the previous deploy applied, which is what lets re-runs from a fresh Cloud Shell session work without setting the env var again. To relocate an existing deployment to a different resource group, run `destroy` first. |
 | `TF_STATE_STORAGE_ACCOUNT` | No | Custom Azure Storage Account for Terraform state (see below) |

 Re-running `deploy` with new `SCANNER_LOCATIONS` or `SUBSCRIPTIONS_TO_SCAN` values merges them with the existing deployment (stored in the Terraform state storage account) instead of replacing it.
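
The merge behavior described above can be pictured as a de-duplicating union of two comma-separated lists. The following is only a minimal sketch of that idea: `merge_csv` is a hypothetical helper name, not a function from the script, and the sketch does not enforce the four-location limit or reproduce whatever normalization the real script applies.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not taken from the script): union of two
# comma-separated lists, keeping first-seen order and dropping duplicates.
merge_csv() {
  local existing="$1" new="$2"
  printf '%s,%s' "$existing" "$new" |
    tr -d '[:space:]' |          # ignore stray spaces around entries
    tr ',' '\n' |                # one entry per line
    awk 'NF && !seen[$0]++' |    # skip empty entries and repeats
    paste -sd, -                 # join back into a comma-separated list
}

merge_csv "eastus" "eastus,westeurope"   # prints: eastus,westeurope
```

Under this sketch, re-running with a value that is already deployed is a no-op for that value, which matches the "merge instead of replace" description.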
+Only one Agentless Scanner deployment is supported per scanner subscription. If the script detects more than one tagged deployment in the scanner subscription, `deploy` fails fast and lists the resource groups so you can `destroy` the ones you no longer need; it never picks one silently. (On `destroy`, set `SCANNER_RESOURCE_GROUP` to choose which one to remove.)
+
 ### Destroy

 To remove the scanner infrastructure:
@@ -67,22 +69,25 @@ If only one installation exists locally, `SCANNER_SUBSCRIPTION` can be omitted a
 | `DD_APP_KEY` | Yes | Datadog Application key |
 | `DD_SITE` | Yes | Datadog site |
 | `SCANNER_SUBSCRIPTION` | No* | Scanner subscription ID (*inferred if only one installation exists locally) |
-| `SCANNER_RESOURCE_GROUP` | No* | Resource group name (*required only if metadata does not contain it, e.g., installations created before this field was added) |
+| `SCANNER_RESOURCE_GROUP` | No* | Resource group name (*auto-discovered from the `DatadogAgentlessScanner=true` tag when exactly one tagged deployment exists in the scanner subscription; required when the resource group was not tagged at deploy time, typically because an admin pre-created it, or when multiple tagged deployments exist) |
 | `TF_STATE_STORAGE_ACCOUNT` | No | Custom Storage Account (if used during deploy) |
 | `SCANNER_LOCATIONS` | No* | Locations to destroy (*fallback only if deployment metadata cannot be read) |
 | `SUBSCRIPTIONS_TO_SCAN` | No* | Subscriptions to clean up (*fallback only if deployment metadata cannot be read) |

 The destroy command will:
 1. Run `terraform destroy` (prompts for confirmation)
 2. Disable the Agentless scan options in Datadog for each previously configured subscription
-3. Ask if you want to delete the Key Vault holding the API key (kept by default to allow reuse)
-4. Leave the resource group and Terraform state storage account intact (manual deletion instructions provided)
+3. Remove the deployment metadata blob from the state Storage Account on a successful run; if scan-options cleanup partially failed, the metadata is kept so a follow-up `destroy` can still find the subscription list
+4. Ask if you want to delete the Key Vault holding the API key (kept by default to allow reuse)
+5. Leave the resource group and Terraform state storage account intact (manual deletion instructions provided)
+
+Unlike `deploy`, `destroy` does not run a permission preflight: missing role assignments surface as `terraform destroy` errors rather than a pre-run summary.

 ### Terraform State Storage

 Terraform state is stored in an Azure Storage Account (blob container `tfstate`, key `datadog-agentless.tfstate`) to ensure persistence across runs and enable future updates or teardown.

-**Default behavior:** A storage account with a deterministic name derived from the scanner subscription ID (e.g., `datadog<hash>`) is automatically created inside the scanner resource group. If it already exists (e.g., from a previous run), it is reused. The `azurerm` backend is configured with `use_azuread_auth = true`, and the script grants the current user the `Storage Blob Data Contributor` role on the account.
+**Default behavior:** A storage account named `datadog<install-id>` (where `install-id` is the first 12 hex chars of `sha256("<scanner-subscription>|<resource-group>")`) is created inside the scanner resource group. Two deploys against the same `(SCANNER_SUBSCRIPTION, SCANNER_RESOURCE_GROUP)` pair resolve to the same Storage Account name and are therefore the same install. Re-running with a different `SCANNER_RESOURCE_GROUP` resolves to a different Storage Account, which is why this combination uniquely identifies an installation. The `azurerm` backend is configured with `use_azuread_auth = true`, and the script grants the current user the `Storage Blob Data Contributor` role on the account.
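
To make the naming rule concrete, here is a minimal sketch of the derivation as the README text describes it. The function names `compute_install_id` and `state_storage_account_name` are hypothetical, `sha256sum` is assumed to be available (it is on typical Linux environments), and the sketch assumes the inputs are hashed exactly as given; the real script may normalize them (for example, by lowercasing) before hashing.

```bash
#!/usr/bin/env bash
# Sketch only: function names are hypothetical, and input normalization
# (if any) in the real script is not reproduced here.
compute_install_id() {
  local subscription="$1" resource_group="$2"
  # printf '%s|%s' emits no trailing newline, which would change the digest.
  printf '%s|%s' "$subscription" "$resource_group" | sha256sum | cut -c1-12
}

state_storage_account_name() {
  # "datadog" + 12 hex chars = 19 characters, all lowercase alphanumerics.
  printf 'datadog%s\n' "$(compute_install_id "$1" "$2")"
}

# Placeholder subscription ID; prints a name of the form datadog<12 hex chars>.
state_storage_account_name "00000000-0000-0000-0000-000000000000" "datadog-agentless-scanner"
```

Because the resource group is part of the hashed input, changing `SCANNER_RESOURCE_GROUP` yields a different name, which is consistent with the README treating the pair as the identity of an installation.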

 **Custom storage account:** Set `TF_STATE_STORAGE_ACCOUNT` to use your own account:
```bash
@@ -95,13 +100,14 @@ The custom storage account must already exist in `SCANNER_RESOURCE_GROUP`; the s
-2. **Creates state storage** - Ensures the resource group, Storage Account, and `tfstate` blob container exist, and grants the current user blob data access
-3. **Stores API key in Key Vault** - Creates an RBAC-authorized Key Vault (or recovers a soft-deleted one) and stores the Datadog API key as a secret
-4. **Generates Terraform configuration** - Creates `main.tf` referencing the `terraform-module-datadog-agentless-scanner` Azure sub-modules (managed identity, roles, custom data, virtual network, VMSS), one virtual network + VMSS per location
-5. **Runs Terraform** - Executes `terraform init` and `terraform apply`
+1. **Discovers existing deployment** - Lists resource groups in the scanner subscription tagged `DatadogAgentlessScanner=true`. If exactly one is found and `SCANNER_RESOURCE_GROUP` is unset, the deployment is silently reused; if it disagrees with an explicitly set `SCANNER_RESOURCE_GROUP`, deploy fails with guidance to either reuse it or destroy first; if more than one is found, deploy fails (single-install policy)
+3. **Creates state storage** - Ensures the resource group, Storage Account, and `tfstate` blob container exist, and grants the current user blob data access. The resource group is tagged with `Datadog=true` and `DatadogAgentlessScanner=true` only when the script creates it; resource groups that already exist (e.g., admin-pre-created) are left untagged so the marker never appears on resources the script does not own
+4. **Stores API key in Key Vault** - Creates an RBAC-authorized Key Vault (or recovers a soft-deleted one) and stores the Datadog API key as a secret
+5. **Generates Terraform configuration** - Creates `main.tf` referencing the `terraform-module-datadog-agentless-scanner` Azure sub-modules (managed identity, roles, custom data, virtual network, VMSS), one virtual network + VMSS per location
+6. **Runs Terraform** - Executes `terraform init` and `terraform apply`
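
The discovery step in the list above amounts to a small decision table. The sketch below is one way to express it, under stated assumptions: `resolve_resource_group` is a hypothetical function name, the tagged resource group names are passed in as arguments rather than fetched, names are compared as exact strings (Azure itself treats resource group names case-insensitively), and the no-match case falls back to the documented default `datadog-agentless-scanner`.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the deploy-time decision described in step 1.
#   $1    = explicitly set SCANNER_RESOURCE_GROUP (may be empty)
#   $2... = names of resource groups tagged DatadogAgentlessScanner=true
# Prints the resource group to use, or an error on stderr with status 1.
resolve_resource_group() {
  local requested="$1"; shift
  local default_rg="datadog-agentless-scanner"

  if [ "$#" -gt 1 ]; then
    # Single-install policy: never pick one silently.
    echo "error: multiple tagged deployments found: $*" >&2
    return 1
  fi
  if [ "$#" -eq 0 ]; then
    # No existing deployment: use the explicit value or the default.
    printf '%s\n' "${requested:-$default_rg}"
    return 0
  fi

  local found="$1"
  if [ -z "$requested" ] || [ "$requested" = "$found" ]; then
    printf '%s\n' "$found"
    return 0
  fi
  echo "error: existing deployment is in '$found' but SCANNER_RESOURCE_GROUP is '$requested'; reuse '$found' or run destroy first" >&2
  return 1
}
```

In practice the tagged names would come from a lookup such as `az group list --tag DatadogAgentlessScanner=true --query '[].name' -o tsv`; keeping the decision separate from the lookup makes the policy easy to exercise without Azure access.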
-Deployment metadata (locations, subscriptions, resource group) is written to the state storage account after a successful apply so that later `deploy` runs can merge new inputs and `destroy` runs can recover the full configuration without local state.
+Deployment metadata (locations, subscriptions, resource group, `install-id`) is written to the state storage account after a successful apply so that later `deploy` runs can merge new inputs and `destroy` runs can recover the full configuration without local state.

 ## Resources Created

@@ -115,6 +121,83 @@ Deployment metadata (locations, subscriptions, resource group) is written to the
 - **Scanned Subscriptions:**
 - Role assignments granting the scanner's managed identity the permissions needed to snapshot and read disks

+## Permissions
+
+The simplest path is to run the setup as `Owner` on the scanner subscription and on every scanned subscription. Many enterprise tenants instead pre-create the resource group and grant the engineer running the setup a least-privilege custom role; this section documents that delegated path.
+
+The setup needs three independent grants:
+
+1. **Scanner resource group** - write access to the resources created inside the RG (Storage Account, Key Vault, managed identity, VNets, VMSS) and the ability to grant the running user data-plane access on the Storage Account and Key Vault.
+2. **Scanner subscription** - read access for discovery plus write access to create the custom scanning role definition at the subscription scope.
+3. **Each scanned subscription** - write access to attach the scanning role to the managed identity at the scan target's scope, plus the matching `roleDefinitions/write` so the custom role can declare the scan target in its `assignableScopes`.
+
+### 1. Scanner resource group
+
+Pre-create the resource group with the desired name and grant the engineer:
+
+- `Contributor` on the RG: covers Storage Account, Key Vault, managed identity, virtual network, NAT gateway, and VMSS creation.
+- `User Access Administrator` on the RG: covers the `roleAssignments/write` needed by the script to grant itself `Storage Blob Data Contributor` on the state Storage Account and `Key Vault Secrets Officer` on the Key Vault.
+
+The Terraform-state Storage Account is created **inside this RG** by default, so the engineer does not need any additional subscription-wide Storage permissions for state.
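
For an administrator preparing the delegated path, the two resource-group grants above map onto standard `az` commands. The sketch below only prints the commands so they can be reviewed before anything is run; `print_rg_grant_commands` is a hypothetical helper, and every argument value shown (subscription, resource group name, location, assignee) is a placeholder to replace.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints (does not execute) the commands an admin could
# run to pre-create the resource group and grant the two RG-scoped roles.
print_rg_grant_commands() {
  local subscription="$1" resource_group="$2" location="$3" assignee="$4"
  local scope="/subscriptions/${subscription}/resourceGroups/${resource_group}"
  local role

  printf 'az group create --subscription "%s" --name "%s" --location "%s"\n' \
    "$subscription" "$resource_group" "$location"

  for role in "Contributor" "User Access Administrator"; do
    printf 'az role assignment create --assignee "%s" --role "%s" --scope "%s"\n' \
      "$assignee" "$role" "$scope"
  done
}

# Placeholder values for illustration only.
print_rg_grant_commands "00000000-0000-0000-0000-000000000000" \
  "datadog-agentless-scanner" "eastus" "engineer@example.com"
```

Both assignments are scoped to the resource group rather than the subscription, which is what keeps this path narrower than granting `Owner`.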
+
+### 2. Scanner subscription: custom role for the engineer
+
+Create the following custom role at the scanner subscription scope. It bundles every read action the setup performs at the subscription level plus the `roleDefinitions/write` introduced by the custom scanning role:
+- `resourceProviders/register/action` is only exercised when the required providers are not pre-registered. You can drop it from the role and register the providers manually instead (see the prerequisites list).
+- `subscriptions/resourceGroups/read` is needed for tag-based discovery of existing deployments (the `az group list --tag DatadogAgentlessScanner=true` lookup).
+- `roleDefinitions/write` is needed because the Terraform module creates a custom scanning role whose primary scope is the scanner subscription.
+
+### 3. Each scanned subscription: custom role for the engineer
+
+For every subscription listed in `SUBSCRIPTIONS_TO_SCAN` (other than the scanner subscription), create a custom role at the scan-target scope that contains only the role-management actions:
+  "Description": "Permissions required by the engineer running the Datadog Agentless Scanner setup, on each scanned subscription.",
+  "Actions": [
+    "Microsoft.Authorization/permissions/read",
+    "Microsoft.Authorization/roleAssignments/read",
+    "Microsoft.Authorization/roleAssignments/write",
+    "Microsoft.Authorization/roleAssignments/delete",
+    "Microsoft.Authorization/roleDefinitions/read",
+    "Microsoft.Authorization/roleDefinitions/write",
+    "Microsoft.Authorization/roleDefinitions/delete"
+  ],
+  "AssignableScopes": [
+    "/subscriptions/<scan-target-subscription-id>"
+  ]
+}
```
+
+The same role is sufficient for both `deploy` and `destroy`. On `deploy`, the preflight will fail fast with a clear error listing the missing actions if any of the above are not granted; on `destroy`, missing actions surface during `terraform destroy` (no preflight is run).
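
The "list the missing actions" behavior of the preflight can be illustrated with a small comparison of required actions against granted action patterns. This is a simplified sketch, not the script's implementation: `missing_actions` is a hypothetical name, matching is case-sensitive, `NotActions` are ignored, and `*` in a granted pattern is treated as a plain shell-style wildcard.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: print every required action that no granted pattern covers.
#   $1 = comma-separated required actions
#   $2 = comma-separated granted action patterns (may contain `*`)
# The body runs in a subshell so the IFS and `set -f` changes stay local.
missing_actions() (
  required_csv=$1
  granted_csv=$2
  IFS=','   # split the arguments on commas
  set -f    # keep `*` in patterns from expanding to file names

  for action in $required_csv; do
    covered=0
    for pattern in $granted_csv; do
      # An unquoted case pattern is matched as a glob, so `*` is a wildcard.
      case "$action" in
        $pattern) covered=1; break ;;
      esac
    done
    [ "$covered" -eq 1 ] || printf '%s\n' "$action"
  done
)

# Only roleAssignments/write is granted, so roleDefinitions/write is reported.
missing_actions \
  "Microsoft.Authorization/roleAssignments/write,Microsoft.Authorization/roleDefinitions/write" \
  "Microsoft.Authorization/roleAssignments/write"
```

An empty result means every required action is covered; a caller could turn any non-empty output into the fail-fast error the README describes.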