Commit 627763d

feat(tracing): opt-in Application Insights / OpenTelemetry support (#42)
Wires azure-monitor-opentelemetry as an opt-in [tracing] extra. When APPLICATIONINSIGHTS_CONNECTION_STRING is set, src.common.tracing.configure_tracing() initialises Azure Monitor and instruments FastAPI + httpx + logging spans; otherwise it is a no-op so the offline path is unaffected. Bicep gains an enableAppInsights switch that provisions a workspace-based App Insights resource and injects the connection string + OTEL_SERVICE_NAME via Container App secrets. Docker image now installs the [tracing] extra by default so cloud deployments work out of the box. Closes the v0.8 roadmap item.
1 parent 14008d7 commit 627763d

9 files changed

Lines changed: 273 additions & 9 deletions

Dockerfile

Lines changed: 1 addition & 1 deletion
@@ -13,7 +13,7 @@ WORKDIR /app
 COPY pyproject.toml README.md LICENSE ./
 COPY src ./src

-RUN pip install --upgrade pip && pip install .
+RUN pip install --upgrade pip && pip install ".[tracing]"

 # Create non-root user.
 RUN useradd --create-home --uid 10001 appuser

README.md

Lines changed: 4 additions & 2 deletions
@@ -60,6 +60,8 @@ One `Deploy to Azure` click → Container Apps (scale-to-zero, < $5/month idle)
 | `GREYNOISE_API_KEY` | `/greynoise/*` | Free Community key from <https://viz.greynoise.io/signup>*Account → API Key*. Required for GreyNoise classification. |
 | `ABUSEIPDB_API_KEY` | `/abuseipdb/*` | Free key from <https://www.abuseipdb.com/register>*API → Create Key* (1000 req/day). Required for AbuseIPDB checks. |
 | `OTX_API_KEY` | `/otx/*` | Free key from <https://otx.alienvault.com/>*Settings → API Integration*. Required for AlienVault OTX indicator lookups. |
+| `APPLICATIONINSIGHTS_CONNECTION_STRING` | All routes (tracing) | Optional. When set, the app initialises [Azure Monitor OpenTelemetry](https://learn.microsoft.com/azure/azure-monitor/app/opentelemetry-enable?tabs=python) and ships FastAPI / `httpx` / logging spans to Application Insights. Leave unset to keep the stack fully offline. The Bicep template can provision an App Insights resource and inject this automatically: pass `enableAppInsights=true`. |
+| `OTEL_SERVICE_NAME` | Tracing | Optional override for the OpenTelemetry `service.name` resource attribute. Defaults to `copilot-mcp-soc-pack`. |

 > **Conditional registration**: Tools requiring an upstream key
 > (`abusech`, `greynoise`, `abuseipdb`, `otx`) are registered **only**
@@ -232,7 +234,7 @@ See [mcp-client-config/](./mcp-client-config/) for ready-to-use configurations.
 - [x] v0.5 AlienVault OTX + Have I Been Pwned, smoke harness, `#ExamplePrompts` planner hints, **Public Preview**
 - [x] v0.6 Reliability hardening (httpx retries with backoff, LRU-bounded TTL cache, `/ready` probe), per-tool unit tests, mypy in CI, Dependabot, single-source version, PR-based workflow
 - [x] v0.7 OSV.dev + CIRCL hashlookup + MITRE D3FEND, full upstream-retry coverage across every tool, codified `request_with_retry` convention
-- [ ] v0.8 Promptbook samples, structured eval harness, Application Insights tracing
+- [x] v0.8 Promptbook samples ([docs/promptbook.md](./docs/promptbook.md)), structured live eval harness ([docs/eval.md](./docs/eval.md)), opt-in Application Insights tracing (`APPLICATIONINSIGHTS_CONNECTION_STRING` + Bicep `enableAppInsights`)
 - [ ] v1.0 Hardening (Managed Identity inbound, custom metrics, Sentinel Workbook), GA based on Preview feedback

 ## Known limitations
@@ -242,7 +244,7 @@ This is a **Public Preview**. The following are intentional gaps today; PRs and
 - **Inbound auth is API key only.** No Managed Identity, no Entra ID inbound, no per-caller RBAC. Rotate the shared `MCP_SOC_PACK_API_KEY` regularly.
 - **In-memory TTL cache only.** Cache resets on every cold start (which is expected at scale-to-zero). v0.6 added an LRU eviction cap (default 1024 entries) so long-running replicas no longer leak memory; there is still no Redis or shared cache across replicas.
 - **Single region.** The `Deploy to Azure` button provisions one Container Apps environment. There is no multi-region active-active sample yet.
-- **Observability is logs only.** Container App logs land in a Log Analytics workspace; there are no custom metrics, traces, or a Workbook yet.
+- **Observability ships logs + opt-in traces.** Container App logs land in a Log Analytics workspace. Application Insights distributed tracing is opt-in via `APPLICATIONINSIGHTS_CONNECTION_STRING` (or the Bicep `enableAppInsights=true` switch). There are no custom metrics or a packaged Sentinel Workbook yet.
 - **`/health` and `/openapi.json` are intentionally un-authenticated** to support Container App probes and OpenAPI ingestion. Restrict ingress (Front Door, IP allow-list, private endpoint) if this is unacceptable.
 - **OpenAPI is downgraded to 3.0.1 at runtime.** Microsoft Security Copilot rejects 3.1; downstream tools that rely on 3.1 features should consume the FastAPI source instead of `/openapi.json`.
 - **No Sentinel Workbook / Foundry agent sample bundled yet.** Planned for v0.8+.
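
The README table above makes `APPLICATIONINSIGHTS_CONNECTION_STRING` the single switch for tracing, so a malformed value is worth catching before startup. The following is a minimal sketch of such a pre-flight check, not code from this commit. It assumes the usual semicolon-separated `Key=Value` layout (the same shape as the `InstrumentationKey=...;IngestionEndpoint=https://...` example in `docs/tracing.md`). The helper name `parse_connection_string`, the lower-casing of keys, and treating `InstrumentationKey` as mandatory are all assumptions of the sketch.

```python
from __future__ import annotations


def parse_connection_string(value: str) -> dict[str, str]:
    """Split a connection string into a dict of lower-cased keys (hypothetical helper)."""
    pairs: dict[str, str] = {}
    for segment in value.split(";"):
        segment = segment.strip()
        if not segment:
            continue  # tolerate a trailing or doubled semicolon
        key, sep, val = segment.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"malformed segment: {segment!r}")
        # Keys are lower-cased only to simplify lookups in this sketch.
        pairs[key.strip().lower()] = val.strip()
    # Assumption: a usable connection string always carries an InstrumentationKey.
    if "instrumentationkey" not in pairs:
        raise ValueError("connection string has no InstrumentationKey")
    return pairs
```

A check like this could run in a deployment script or a unit test so that a truncated secret fails fast instead of silently producing no telemetry.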

deploy/azuredeploy.json

Lines changed: 36 additions & 5 deletions
@@ -4,8 +4,8 @@
   "metadata": {
     "_generator": {
       "name": "bicep",
-      "version": "0.42.1.51946",
-      "templateHash": "678075599944877869"
+      "version": "0.41.2.15936",
+      "templateHash": "16082163654774733647"
     }
   },
   "parameters": {
@@ -67,6 +67,13 @@
         "description": "Free AlienVault OTX API key (https://otx.alienvault.com/, Settings -> API Integration). Required for /otx/* endpoints. Leave empty to disable."
       }
     },
+    "enableAppInsights": {
+      "type": "bool",
+      "defaultValue": false,
+      "metadata": {
+        "description": "Provision a workspace-based Application Insights resource and inject APPLICATIONINSIGHTS_CONNECTION_STRING into the container. Requires the image to ship the optional `tracing` extra (default image does)."
+      }
+    },
     "minReplicas": {
       "type": "int",
       "defaultValue": 0,
@@ -88,7 +95,8 @@
   },
   "variables": {
     "logAnalyticsName": "[format('{0}-logs', parameters('containerAppName'))]",
-    "environmentName": "[format('{0}-env', parameters('containerAppName'))]"
+    "environmentName": "[format('{0}-env', parameters('containerAppName'))]",
+    "appInsightsName": "[format('{0}-ai', parameters('containerAppName'))]"
   },
   "resources": [
     {
@@ -106,6 +114,24 @@
         }
       }
     },
+    {
+      "condition": "[parameters('enableAppInsights')]",
+      "type": "Microsoft.Insights/components",
+      "apiVersion": "2020-02-02",
+      "name": "[variables('appInsightsName')]",
+      "location": "[parameters('location')]",
+      "kind": "web",
+      "properties": {
+        "Application_Type": "web",
+        "WorkspaceResourceId": "[resourceId('Microsoft.OperationalInsights/workspaces', variables('logAnalyticsName'))]",
+        "IngestionMode": "LogAnalytics",
+        "publicNetworkAccessForIngestion": "Enabled",
+        "publicNetworkAccessForQuery": "Enabled"
+      },
+      "dependsOn": [
+        "[resourceId('Microsoft.OperationalInsights/workspaces', variables('logAnalyticsName'))]"
+      ]
+    },
     {
       "type": "Microsoft.App/managedEnvironments",
       "apiVersion": "2024-03-01",
@@ -138,7 +164,7 @@
             "transport": "auto",
             "allowInsecure": false
           },
-          "secrets": "[concat(if(empty(parameters('apiKey')), createArray(), createArray(createObject('name', 'api-key', 'value', parameters('apiKey')))), if(empty(parameters('abuseChAuthKey')), createArray(), createArray(createObject('name', 'abusech-auth-key', 'value', parameters('abuseChAuthKey')))), if(empty(parameters('greynoiseApiKey')), createArray(), createArray(createObject('name', 'greynoise-api-key', 'value', parameters('greynoiseApiKey')))), if(empty(parameters('abuseIpdbApiKey')), createArray(), createArray(createObject('name', 'abuseipdb-api-key', 'value', parameters('abuseIpdbApiKey')))), if(empty(parameters('otxApiKey')), createArray(), createArray(createObject('name', 'otx-api-key', 'value', parameters('otxApiKey')))))]"
+          "secrets": "[concat(if(empty(parameters('apiKey')), createArray(), createArray(createObject('name', 'api-key', 'value', parameters('apiKey')))), if(empty(parameters('abuseChAuthKey')), createArray(), createArray(createObject('name', 'abusech-auth-key', 'value', parameters('abuseChAuthKey')))), if(empty(parameters('greynoiseApiKey')), createArray(), createArray(createObject('name', 'greynoise-api-key', 'value', parameters('greynoiseApiKey')))), if(empty(parameters('abuseIpdbApiKey')), createArray(), createArray(createObject('name', 'abuseipdb-api-key', 'value', parameters('abuseIpdbApiKey')))), if(empty(parameters('otxApiKey')), createArray(), createArray(createObject('name', 'otx-api-key', 'value', parameters('otxApiKey')))), if(parameters('enableAppInsights'), createArray(createObject('name', 'applicationinsights-connection-string', 'value', reference(resourceId('Microsoft.Insights/components', variables('appInsightsName')), '2020-02-02').ConnectionString)), createArray()))]"
         },
         "template": {
           "containers": [
@@ -149,7 +175,7 @@
                 "cpu": "[json('0.5')]",
                 "memory": "1.0Gi"
               },
-              "env": "[concat(if(empty(parameters('apiKey')), createArray(), createArray(createObject('name', 'MCP_SOC_PACK_API_KEY', 'secretRef', 'api-key'))), if(empty(parameters('abuseChAuthKey')), createArray(), createArray(createObject('name', 'ABUSE_CH_AUTH_KEY', 'secretRef', 'abusech-auth-key'))), if(empty(parameters('greynoiseApiKey')), createArray(), createArray(createObject('name', 'GREYNOISE_API_KEY', 'secretRef', 'greynoise-api-key'))), if(empty(parameters('abuseIpdbApiKey')), createArray(), createArray(createObject('name', 'ABUSEIPDB_API_KEY', 'secretRef', 'abuseipdb-api-key'))), if(empty(parameters('otxApiKey')), createArray(), createArray(createObject('name', 'OTX_API_KEY', 'secretRef', 'otx-api-key'))))]",
+              "env": "[concat(if(empty(parameters('apiKey')), createArray(), createArray(createObject('name', 'MCP_SOC_PACK_API_KEY', 'secretRef', 'api-key'))), if(empty(parameters('abuseChAuthKey')), createArray(), createArray(createObject('name', 'ABUSE_CH_AUTH_KEY', 'secretRef', 'abusech-auth-key'))), if(empty(parameters('greynoiseApiKey')), createArray(), createArray(createObject('name', 'GREYNOISE_API_KEY', 'secretRef', 'greynoise-api-key'))), if(empty(parameters('abuseIpdbApiKey')), createArray(), createArray(createObject('name', 'ABUSEIPDB_API_KEY', 'secretRef', 'abuseipdb-api-key'))), if(empty(parameters('otxApiKey')), createArray(), createArray(createObject('name', 'OTX_API_KEY', 'secretRef', 'otx-api-key'))), if(parameters('enableAppInsights'), createArray(createObject('name', 'APPLICATIONINSIGHTS_CONNECTION_STRING', 'secretRef', 'applicationinsights-connection-string'), createObject('name', 'OTEL_SERVICE_NAME', 'value', parameters('containerAppName'))), createArray()), createArray(createObject('name', 'MCP_SOC_PACK_PUBLIC_BASE_URL', 'value', format('https://{0}.{1}', parameters('containerAppName'), reference(resourceId('Microsoft.App/managedEnvironments', variables('environmentName')), '2024-03-01').defaultDomain))))]",
               "probes": [
                 {
                   "type": "Liveness",
@@ -179,6 +205,7 @@
         }
       },
       "dependsOn": [
+        "[resourceId('Microsoft.Insights/components', variables('appInsightsName'))]",
         "[resourceId('Microsoft.App/managedEnvironments', variables('environmentName'))]"
       ]
     }
@@ -199,6 +226,10 @@
     "mcpSseUrl": {
       "type": "string",
       "value": "[format('https://{0}/mcp/', reference(resourceId('Microsoft.App/containerApps', parameters('containerAppName')), '2024-03-01').configuration.ingress.fqdn)]"
+    },
+    "appInsightsName": {
+      "type": "string",
+      "value": "[if(parameters('enableAppInsights'), variables('appInsightsName'), '')]"
     }
   }
 }

deploy/main.bicep

Lines changed: 34 additions & 1 deletion
@@ -35,6 +35,8 @@ param abuseIpdbApiKey string = ''
 @description('Free AlienVault OTX API key (https://otx.alienvault.com/, Settings -> API Integration). Required for /otx/* endpoints. Leave empty to disable.')
 @secure()
 param otxApiKey string = ''
+@description('Provision a workspace-based Application Insights resource and inject APPLICATIONINSIGHTS_CONNECTION_STRING into the container. Requires the image to ship the optional `tracing` extra (default image does).')
+param enableAppInsights bool = false
 @description('Minimum number of replicas. Set 0 for scale-to-zero.')
 @minValue(0)
 @maxValue(5)
@@ -47,6 +49,7 @@ param maxReplicas int = 3

 var logAnalyticsName = '${containerAppName}-logs'
 var environmentName = '${containerAppName}-env'
+var appInsightsName = '${containerAppName}-ai'

 resource logAnalytics 'Microsoft.OperationalInsights/workspaces@2023-09-01' = {
   name: logAnalyticsName
@@ -62,6 +65,19 @@ resource logAnalytics 'Microsoft.OperationalInsights/workspaces@2023-09-01' = {
   }
 }

+resource appInsights 'Microsoft.Insights/components@2020-02-02' = if (enableAppInsights) {
+  name: appInsightsName
+  location: location
+  kind: 'web'
+  properties: {
+    Application_Type: 'web'
+    WorkspaceResourceId: logAnalytics.id
+    IngestionMode: 'LogAnalytics'
+    publicNetworkAccessForIngestion: 'Enabled'
+    publicNetworkAccessForQuery: 'Enabled'
+  }
+}
+
 resource environment 'Microsoft.App/managedEnvironments@2024-03-01' = {
   name: environmentName
   location: location
@@ -118,7 +134,13 @@ resource containerApp 'Microsoft.App/containerApps@2024-03-01' = {
           name: 'otx-api-key'
           value: otxApiKey
         }
-      ]
+      ],
+      enableAppInsights ? [
+        {
+          name: 'applicationinsights-connection-string'
+          value: appInsights!.properties.ConnectionString
+        }
+      ] : []
     )
   }
   template: {
@@ -161,6 +183,16 @@ resource containerApp 'Microsoft.App/containerApps@2024-03-01' = {
               secretRef: 'otx-api-key'
             }
           ],
+          enableAppInsights ? [
+            {
+              name: 'APPLICATIONINSIGHTS_CONNECTION_STRING'
+              secretRef: 'applicationinsights-connection-string'
+            }
+            {
+              name: 'OTEL_SERVICE_NAME'
+              value: containerAppName
+            }
+          ] : [],
           // Public base URL injected into the OpenAPI `servers[]` block.
           // Required for Microsoft Security Copilot's Agent Builder
           // API Tool importer to resolve operation base URLs (the
@@ -207,3 +239,4 @@ output fqdn string = containerApp.properties.configuration.ingress.fqdn
 output endpoint string = 'https://${containerApp.properties.configuration.ingress.fqdn}'
 output openApiUrl string = 'https://${containerApp.properties.configuration.ingress.fqdn}/openapi.json'
 output mcpSseUrl string = 'https://${containerApp.properties.configuration.ingress.fqdn}/mcp/'
+output appInsightsName string = enableAppInsights ? appInsights!.name : ''
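
The conditional `concat(...)` pattern in the template can be hard to follow in diff form. As a reading aid only, here is a rough Python model of how the `secrets` and `env` arrays appear to be assembled. The parameter, secret, and env var names are copied from the template; the functions `build_secrets` and `build_env` are hypothetical and not part of the repository. The sketch leaves out the always-present `MCP_SOC_PACK_PUBLIC_BASE_URL` entry.

```python
from __future__ import annotations

# (template parameter, Container App secret name, env var name), as in the template.
OPTIONAL_KEYS = [
    ("apiKey", "api-key", "MCP_SOC_PACK_API_KEY"),
    ("abuseChAuthKey", "abusech-auth-key", "ABUSE_CH_AUTH_KEY"),
    ("greynoiseApiKey", "greynoise-api-key", "GREYNOISE_API_KEY"),
    ("abuseIpdbApiKey", "abuseipdb-api-key", "ABUSEIPDB_API_KEY"),
    ("otxApiKey", "otx-api-key", "OTX_API_KEY"),
]
AI_SECRET = "applicationinsights-connection-string"


def build_secrets(params: dict, connection_string: str) -> list[dict]:
    """Model of the `secrets` array: one entry per non-empty key parameter."""
    secrets = [
        {"name": secret, "value": params[param]}
        for param, secret, _ in OPTIONAL_KEYS
        if params.get(param)
    ]
    if params.get("enableAppInsights"):
        # In the template this value comes from the App Insights resource itself.
        secrets.append({"name": AI_SECRET, "value": connection_string})
    return secrets


def build_env(params: dict) -> list[dict]:
    """Model of the container `env` array (public base URL entry omitted)."""
    env = [
        {"name": env_name, "secretRef": secret}
        for param, secret, env_name in OPTIONAL_KEYS
        if params.get(param)
    ]
    if params.get("enableAppInsights"):
        env.append({"name": "APPLICATIONINSIGHTS_CONNECTION_STRING", "secretRef": AI_SECRET})
        # Plain value, not a secret: the service name is the Container App name.
        env.append({"name": "OTEL_SERVICE_NAME", "value": params["containerAppName"]})
    return env
```

In this model every `secretRef` in `env` has a matching entry in `secrets`, because both arrays are gated on the same parameter. The template needs the same invariant for the Container App to deploy.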

docs/tracing.md

Lines changed: 62 additions & 0 deletions
@@ -0,0 +1,62 @@
+# Application Insights tracing
+
+The SOC Pack ships an **opt-in** [Azure Monitor OpenTelemetry](https://learn.microsoft.com/azure/azure-monitor/app/opentelemetry-enable?tabs=python) integration that emits FastAPI HTTP server spans, outbound `httpx` client spans, and structured logs to Application Insights. It is **disabled by default**: leaving the connection string unset keeps the stack fully offline.
+
+## What gets instrumented
+
+`src/common/tracing.py` calls `azure.monitor.opentelemetry.configure_azure_monitor()` plus `FastAPIInstrumentor.instrument_app()`. With those two calls you get:
+
+- One server span per inbound HTTP request to any FastAPI route (including `/openapi.json` and the `/mcp/` SSE endpoint).
+- One client span per outbound call made through the shared `httpx.AsyncClient` in `src/common/http.py` (so every upstream call to KEV, EPSS, abuse.ch, OTX, OSV, ransomware.live, etc. is traced).
+- Standard `logging` records emitted by the app are forwarded as Application Insights traces.
+
+The `service.name` resource attribute defaults to `copilot-mcp-soc-pack` and can be overridden with `OTEL_SERVICE_NAME`.
+
+## Enabling tracing
+
+### 1. Bicep (recommended)
+
+`deploy/main.bicep` now supports an `enableAppInsights` parameter. When set to `true` it provisions a workspace-based Application Insights resource backed by the same Log Analytics workspace, then injects `APPLICATIONINSIGHTS_CONNECTION_STRING` (as a secret-backed env var) and `OTEL_SERVICE_NAME` (as a plain env var set to the Container App name) into the Container App.
+
+```bash
+az deployment group create \
+  --resource-group rg-copilot-mcp-soc-pack \
+  --template-file deploy/main.bicep \
+  --parameters apiKey=$(openssl rand -hex 32) \
+               enableAppInsights=true
+```
+
+### 2. Bring-your-own connection string
+
+If you already have an Application Insights resource (or are running outside Bicep), set the env var directly on the Container App / your local shell:
+
+```bash
+export APPLICATIONINSIGHTS_CONNECTION_STRING="InstrumentationKey=...;IngestionEndpoint=https://..."
+export OTEL_SERVICE_NAME="copilot-mcp-soc-pack-prod"
+uvicorn src.app:app --host 0.0.0.0 --port 8080
+```
+
+The published Docker image (`ghcr.io/nobufumimurata/copilot-mcp-soc-pack:latest`) is already built with the `[tracing]` extra, so no extra install steps are needed in the cloud. For local installs from source, use `pip install ".[tracing]"`.
+
+## Verifying it works
+
+After enabling tracing and triggering a couple of requests (`scripts/smoke.ps1` or any tool call), open the App Insights resource in the Azure portal and check:
+
+- **Application map**: `copilot-mcp-soc-pack` should appear with downstream nodes for each upstream API hostname (e.g. `api.first.org`, `urlhaus-api.abuse.ch`).
+- **Transaction search → Dependencies**: filter by `target` to see per-upstream latency and HTTP status counts.
+- **Failures**: any 4xx/5xx surfaced by `request_with_retry` will show up here with the retry count visible in the span attributes.
+
+A useful KQL starter query (in the App Insights Logs blade). Replace the role name with your `OTEL_SERVICE_NAME` value if you overrode it; the Bicep path sets it to the Container App name:
+
+```kusto
+dependencies
+| where cloud_RoleName == "copilot-mcp-soc-pack"
+| summarize count(), avg(duration), percentiles(duration, 50, 95) by target, resultCode
+| order by count_ desc
+```
+
+## Disabling
+
+Unset `APPLICATIONINSIGHTS_CONNECTION_STRING` (or set it to an empty string) and restart the app. `configure_tracing()` will log `Application Insights tracing disabled: APPLICATIONINSIGHTS_CONNECTION_STRING is not set.` at startup and skip all OpenTelemetry initialisation.
+
+If you want to drop the dependency entirely, install without the extra (`pip install .` instead of `pip install ".[tracing]"`). The lazy import in `src/common/tracing.py` will detect the missing module and log a warning, but the app still starts normally.
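
`src/common/tracing.py` itself is not shown in this view, so the following is only a sketch of the behaviour the document describes: a no-op when the connection string is unset or empty, a lazy import that tolerates a missing extra, and otherwise a call to `configure_azure_monitor()` followed by `FastAPIInstrumentor.instrument_app()`. The `environ` parameter, the boolean return value, and the wording of the warning are assumptions added for illustration. The real function is called as `configure_tracing(app)`.

```python
from __future__ import annotations

import logging
import os
from collections.abc import Mapping
from typing import Any

logger = logging.getLogger(__name__)

CONNECTION_STRING_ENV = "APPLICATIONINSIGHTS_CONNECTION_STRING"


def configure_tracing(app: Any, environ: Mapping[str, str] | None = None) -> bool:
    """Enable Azure Monitor tracing if configured; return True only when enabled."""
    env = os.environ if environ is None else environ
    connection_string = env.get(CONNECTION_STRING_ENV, "").strip()
    if not connection_string:
        logger.info(
            "Application Insights tracing disabled: %s is not set.",
            CONNECTION_STRING_ENV,
        )
        return False

    try:
        # Imported lazily so a plain `pip install .` never needs these packages.
        from azure.monitor.opentelemetry import configure_azure_monitor
        from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor
    except ImportError:
        # Assumed wording; the source only says that a warning is logged.
        logger.warning(
            "%s is set but the [tracing] extra is not installed; tracing is off.",
            CONNECTION_STRING_ENV,
        )
        return False

    configure_azure_monitor(connection_string=connection_string)
    FastAPIInstrumentor.instrument_app(app)
    return True
```

Checking the environment before attempting the import keeps the offline path free of any OpenTelemetry import cost, which matches the "fully offline" goal stated at the top of the file.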

pyproject.toml

Lines changed: 8 additions & 0 deletions
@@ -22,6 +22,14 @@ dependencies = [
 ]

 [project.optional-dependencies]
+tracing = [
+    # Application Insights / OpenTelemetry. Imported lazily by
+    # src.common.tracing and only initialised when
+    # APPLICATIONINSIGHTS_CONNECTION_STRING is set. Install with
+    # `pip install ".[tracing]"` (the published Docker image already
+    # ships this extra).
+    "azure-monitor-opentelemetry>=1.6",
+]
 dev = [
     "pytest>=8.0",
     "pytest-asyncio>=0.24",

src/app.py

Lines changed: 6 additions & 0 deletions
@@ -21,6 +21,7 @@
 from src import __version__
 from src.common.http import get_client
 from src.common.openapi_compat import downgrade_to_3_0_1
+from src.common.tracing import configure_tracing
 from src.tools import (
     abusech,
     abuseipdb,
@@ -206,6 +207,11 @@ def _clean_operation_id(route: APIRoute) -> str:
     allow_headers=["Authorization", "Content-Type", "X-API-Key"],
 )

+# Optional Application Insights tracing. No-op when
+# APPLICATIONINSIGHTS_CONNECTION_STRING is unset or the
+# `azure-monitor-opentelemetry` package is not installed.
+configure_tracing(app)
+

 @app.get("/health", tags=["meta"], summary="Liveness probe")
 def health() -> dict[str, str]:
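
The comment added to `src/app.py` says tracing is skipped when the `azure-monitor-opentelemetry` package is not installed. One way to confirm which case applies in a given environment, without importing the package itself, is to probe for the module. This is a sketch under that assumption; `module_available` and `tracing_extra_installed` are hypothetical helpers and do not exist in the repository.

```python
from __future__ import annotations

from importlib.util import find_spec


def module_available(name: str) -> bool:
    """Return True if `name` can be located on the import path."""
    try:
        # For dotted names find_spec imports the parent packages first and
        # raises ModuleNotFoundError when one of them is missing.
        return find_spec(name) is not None
    except ModuleNotFoundError:
        return False


def tracing_extra_installed() -> bool:
    """True when the module used by the [tracing] extra is importable."""
    return module_available("azure.monitor.opentelemetry")
```

A check like this could sit in a smoke test for the Docker image, where the `[tracing]` extra is expected to be present.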
