Commit 8676b83

alfarahn and Copilot committed
feat: integrate OpenTelemetry for enhanced telemetry and monitoring in Copilot CLI
Co-authored-by: Copilot <copilot@github.com>
1 parent d784769 commit 8676b83

6 files changed

Lines changed: 131 additions & 25 deletions

.github/copilot-instructions.md

Lines changed: 16 additions & 21 deletions
@@ -14,25 +14,18 @@ Write-Host "All consent grants revoked — user will see fresh consent screens"
 
 This ensures the incremental consent flow works as expected and the user sees the real Microsoft Entra ID consent UI for each tier.
 
-always check if you are logged into:
-Basic information
-Name
-Contoso
-Tenant ID
-51650aad-d085-4ecb-8b07-d7ed4f5355e0
-Primary domain
-MngEnvMCAP237604.onmicrosoft.com
-License
-Microsoft Entra ID Free
-Users
-2
-Groups
-3
-Applications
-3
-Devices
-2
-admin@MngEnvMCAP237604.onmicrosoft.com
+always check if you are logged into the right tenant + subscription before running Azure CLI commands:
+
+- **Tenant**: Contoso — `51650aad-d085-4ecb-8b07-d7ed4f5355e0` (`MngEnvMCAP237604.onmicrosoft.com`)
+- **Subscription**: `ME-MngEnvMCAP237604-alfarahn-1` (`3f359915-1adb-4464-a4c0-8b0bc65c7959`)
+- **Account**: `admin@MngEnvMCAP237604.onmicrosoft.com`
+
+If `az account show` returns a different tenant, run:
+
+```powershell
+az login --tenant 51650aad-d085-4ecb-8b07-d7ed4f5355e0
+az account set --subscription 3f359915-1adb-4464-a4c0-8b0bc65c7959
+```
 
 ## Project Purpose
 
@@ -58,7 +51,7 @@ The agent acts as a frontend on top of Azure Cost Management, Billing, ARM REST
 - **AI**: GitHub Copilot SDK (`GitHub.Copilot.SDK`) with BYOK (Bring Your Own Key) using Azure OpenAI via Entra ID bearer tokens. Sessions managed via `CopilotClient` / `CopilotSession`. Reasoning effort set to `xhigh`. The Copilot CLI provides built-in tools (file operations, bash, grep, glob, web fetch, memory) — custom tools handle Azure-specific APIs.
 - **Auth**: Auto-assigned anonymous sessions (no login required for chat); Microsoft Entra ID OAuth (multi-tenant) for Azure ARM, Microsoft Graph, and Log Analytics APIs
 - **Data Sources**: Azure Retail Prices API (no auth), Azure Service Health (no auth), Azure Cost Management APIs, Microsoft Graph APIs, Azure Monitor / Log Analytics APIs, ECharts visualization
-- **Observability**: OpenTelemetry + Azure Monitor (Application Insights) — structured traces via `ActivitySource("AzureFinOps.AI")` and custom metrics via `Meter("AzureFinOps.AI")` (chat requests, tool calls, errors, token refreshes, session lifecycle, duration histograms). Frontend telemetry in `client/src/main.js` captures page views, failed browser dependencies, uncaught JS errors, unhandled promise rejections, Vue component errors, and CSP violations. Third-party correlation headers are excluded for `cdn.jsdelivr.net` and `js.monitor.azure.com` so browser telemetry does not break public fetches.
+- **Observability**: OpenTelemetry end-to-end. The .NET app uses `UseAzureMonitor()` (auto-instruments HttpClient, ASP.NET Core, custom `ActivitySource("AzureFinOps.AI")` + `Meter("AzureFinOps.AI")`). The Copilot CLI subprocess emits OTLP via the SDK's built-in `TelemetryConfig` (GenAI + MCP semantic conventions — every tool call, LLM round-trip, prompt, tool args, result, token usage). Both feeds reach Application Insights via an in-container **OpenTelemetry Collector** (`otel/opentelemetry-collector-contrib`) using the `azuremonitor` exporter — config at `src/Dashboard/otel-collector-config.yaml`, launched by `entrypoint.sh` before the .NET app. Trace context (W3C `traceparent`) is auto-propagated SDK→CLI so Application Map shows one continuous transaction. Custom metrics (`finops.chat.requests`, `finops.tool.calls`, `finops.sessions.active`, etc.) keep flowing through the .NET exporter. Frontend telemetry in `client/src/main.js` captures page views, failed browser dependencies, uncaught JS errors, unhandled promise rejections, Vue component errors, and CSP violations. Third-party correlation headers are excluded for `cdn.jsdelivr.net` and `js.monitor.azure.com`.
 - **Deployment**: Azure App Service (Linux, P0v3 Premium) via Docker container image from Azure Container Registry (ACR). Multi-stage Dockerfile bakes Python 3, pip packages (python-pptx, matplotlib, pandas, numpy, lxml), and CLI tools into the image — no runtime install needed. Legacy zip deployment via `deploy.ps1` still supported for the original `finops-agent` app.
 - **Container Registry**: Azure Container Registry (`crfinopsagent.azurecr.io`) — Basic SKU, admin credentials, images built via `az acr build`
 - **Container App (staging)**: `finops-agent-container.azurewebsites.net` — Docker container on same P0v3 plan, used for testing before swapping to production
@@ -91,7 +84,9 @@ src/Dashboard/
 │ ├── ScriptTools.cs # GenerateScript — generates downloadable Azure CLI/PowerShell scripts from FinOps recommendations
 │ └── UploadedFileTools.cs # QueryUploadedFile — inspect/query files (CSV/TSV/JSON/TXT/XLSX/PDF/Parquet) the user dropped into chat (no Azure consent needed). Backed by AI/Tools/Resources/file_inspect.py (pandas/openpyxl/pyarrow/pdfminer).
 │ └── TokenContext.cs # UserTokens — per-user mutable token holder with volatile fields for concurrent access
-├── Dockerfile # Multi-stage Docker build (node:22 + dotnet/sdk:10.0 + dotnet/aspnet:10.0 + Python 3)
+├── Dockerfile # Multi-stage Docker build (node:22 + dotnet/sdk:10.0 + dotnet/aspnet:10.0 + Python 3 + OTel collector)
+├── entrypoint.sh # Container entrypoint — starts OTel collector in background, then exec dotnet
+├── otel-collector-config.yaml # OpenTelemetry Collector config — bridges OTLP from Copilot CLI → Application Insights via azuremonitor exporter
 ├── .dockerignore # Excludes bin/, obj/, node_modules/, wwwroot/ from Docker context
 ├── client/
 │ ├── index.html # SPA entry point
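
The tenant check described in the updated instructions can also be scripted. The following is a minimal bash sketch, not part of the commit: `tenant_matches` and `ensure_tenant` are hypothetical helper names, and the sketch assumes the Azure CLI (`az`) is installed. The IDs are the ones listed in the instructions.

```shell
#!/bin/bash
# Sketch only: helper names are hypothetical; IDs come from the instructions above.
EXPECTED_TENANT="51650aad-d085-4ecb-8b07-d7ed4f5355e0"
EXPECTED_SUBSCRIPTION="3f359915-1adb-4464-a4c0-8b0bc65c7959"

# tenant_matches CURRENT EXPECTED
# Succeeds when both IDs are equal, ignoring letter case.
tenant_matches() {
    local current expected
    current=$(printf '%s' "$1" | tr 'A-Z' 'a-z')
    expected=$(printf '%s' "$2" | tr 'A-Z' 'a-z')
    [ -n "$current" ] && [ "$current" = "$expected" ]
}

# ensure_tenant
# Logs in again only when the active account belongs to a different tenant
# (or when nobody is logged in), then selects the expected subscription.
ensure_tenant() {
    local current
    current=$(az account show --query tenantId --output tsv 2>/dev/null || true)
    if ! tenant_matches "$current" "$EXPECTED_TENANT"; then
        az login --tenant "$EXPECTED_TENANT"
    fi
    az account set --subscription "$EXPECTED_SUBSCRIPTION"
}
```

Calling `ensure_tenant` before any other `az` command would mirror the manual steps; the comparison is split into its own function so it can be exercised without a live Azure session.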

src/Dashboard/AI/CopilotSessionFactory.cs

Lines changed: 16 additions & 1 deletion
@@ -115,7 +115,22 @@ public static async Task<CopilotSessionFactory> CreateAsync(
         string azureOpenAIDeployment,
         ILoggerFactory loggerFactory)
     {
-        var copilotClient = new CopilotClient();
+        // Forward CLI telemetry (GenAI + MCP semantic conventions) to the local
+        // OTel collector when one is configured. The collector translates OTLP into
+        // Azure Monitor format and ships it to Application Insights so we get full
+        // tool-call and LLM-roundtrip visibility without any custom span wiring.
+        var otlpEndpoint = Environment.GetEnvironmentVariable("OTEL_EXPORTER_OTLP_ENDPOINT");
+        var clientOptions = new CopilotClientOptions();
+        if (!string.IsNullOrWhiteSpace(otlpEndpoint))
+        {
+            clientOptions.Telemetry = new TelemetryConfig
+            {
+                OtlpEndpoint = otlpEndpoint,
+                CaptureContent = true, // include prompts, tool args, results
+                SourceName = "AzureFinOps.AI.CLI",
+            };
+        }
+        var copilotClient = new CopilotClient(clientOptions);
         await copilotClient.StartAsync();
 
         var credential = new ClientSecretCredential(

src/Dashboard/Dockerfile

Lines changed: 12 additions & 1 deletion
@@ -31,8 +31,15 @@ RUN apt-get update -qq && \
     apt-get clean && \
     rm -rf /var/lib/apt/lists/* /tmp/* /root/.cache
 
+# OpenTelemetry Collector (contrib build — needed for the azuremonitor exporter).
+# Bridges OTLP from the Copilot CLI subprocess into Azure Application Insights.
+COPY --from=otel/opentelemetry-collector-contrib:0.108.0 /otelcol-contrib /usr/local/bin/otelcol
+COPY otel-collector-config.yaml /etc/otelcol/config.yaml
+
 WORKDIR /app
 COPY --from=build /app/publish .
+COPY entrypoint.sh /usr/local/bin/entrypoint.sh
+RUN chmod +x /usr/local/bin/entrypoint.sh
 
 # Build metadata baked into the image
 ARG BUILD_SHA=dev
@@ -42,6 +49,10 @@ ENV BUILD_NUMBER=${BUILD_NUMBER}
 
 # App Service expects port 8080 by default for containers
 ENV ASPNETCORE_URLS=http://+:8080
+# Tell the Copilot CLI (and any other OTLP-capable child process) to push
+# telemetry to the local collector.
+ENV OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
+ENV OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
 EXPOSE 8080
 
-ENTRYPOINT ["dotnet", "Dashboard.dll"]
+ENTRYPOINT ["/usr/local/bin/entrypoint.sh"]
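
`OTEL_EXPORTER_OTLP_ENDPOINT` is a base URL; for OTLP over HTTP, the per-signal paths (`/v1/traces`, `/v1/metrics`, `/v1/logs`) are appended to it. A minimal bash sketch of a smoke test against the local collector might look like the following. The helper names `otlp_signal_url` and `send_empty_trace` are hypothetical, and the sketch assumes `curl` is available in the image, which the Dockerfile above does not show.

```shell
#!/bin/bash
# Sketch only: helper names are hypothetical and curl is an assumed dependency.

# otlp_signal_url BASE SIGNAL
# Derives the per-signal OTLP/HTTP URL from a base endpoint,
# tolerating a single trailing slash on the base.
otlp_signal_url() {
    local base=${1%/} signal=$2
    printf '%s/v1/%s\n' "$base" "$signal"
}

# send_empty_trace
# Posts an empty OTLP/JSON trace export to the collector. A non-zero exit
# status means the endpoint was unreachable or returned an HTTP error.
send_empty_trace() {
    local url
    url=$(otlp_signal_url "${OTEL_EXPORTER_OTLP_ENDPOINT:-http://localhost:4318}" traces)
    curl --silent --show-error --fail \
        --header 'Content-Type: application/json' \
        --data '{"resourceSpans":[]}' \
        "$url"
}
```

Running `send_empty_trace` inside the container after startup would give a quick signal that the receiver on port 4318 is listening, without depending on the Copilot CLI producing spans first.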

src/Dashboard/Program.cs

Lines changed: 8 additions & 2 deletions
@@ -53,8 +53,14 @@
 {
     builder.Services.AddOpenTelemetry()
         .UseAzureMonitor(o => o.ConnectionString = appInsightsCs)
-        .WithTracing(t => t.AddSource("AzureFinOps.AI"))
-        .WithMetrics(m => m.AddMeter("AzureFinOps.AI"));
+        .WithTracing(t => t
+            .AddSource("AzureFinOps.AI")
+            // Copilot SDK W3C-propagated tool/LLM spans surface here when the
+            // SDK's TelemetryConfig.SourceName is set to "AzureFinOps.AI.CLI".
+            .AddSource("AzureFinOps.AI.CLI"))
+        .WithMetrics(m => m
+            .AddMeter("AzureFinOps.AI")
+            .AddMeter("AzureFinOps.AI.CLI"));
 }
 
 var telemetry = new AiTelemetry();

src/Dashboard/entrypoint.sh

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
+#!/bin/bash
+# Container entrypoint — starts the OTel collector in the background, then
+# launches the .NET app in the foreground so its stdout/stderr (and exit
+# status) drive container lifecycle.
+set -e
+
+if [ -n "$APPLICATIONINSIGHTS_CONNECTION_STRING" ] || [ -n "$ApplicationInsights__ConnectionString" ]; then
+    # Normalise both env-var spellings so the collector config picks one up.
+    export APPLICATIONINSIGHTS_CONNECTION_STRING="${APPLICATIONINSIGHTS_CONNECTION_STRING:-$ApplicationInsights__ConnectionString}"
+    echo "[entrypoint] starting OTel collector → Azure Monitor"
+    /usr/local/bin/otelcol --config /etc/otelcol/config.yaml &
+    COLLECTOR_PID=$!
+    # Forward signals so SIGTERM from App Service shuts both down cleanly.
+    trap "kill -TERM $COLLECTOR_PID 2>/dev/null || true" TERM INT
+else
+    echo "[entrypoint] APPLICATIONINSIGHTS_CONNECTION_STRING not set — skipping collector"
+fi
+
+exec dotnet Dashboard.dll
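
One caveat about the script as committed: `exec` replaces the shell process with `dotnet`, so a `trap` registered earlier in the shell no longer exists once the app is running, and the collector would not receive the forwarded SIGTERM. If clean collector shutdown matters, one possible alternative is to keep the shell as the parent process and relay the signal. The sketch below is an illustration under that assumption, not the author's implementation; `run_with_sidecar` is a hypothetical helper.

```shell
#!/bin/bash
# Sketch only: run_with_sidecar is a hypothetical helper, not part of the commit.
# The shell stays alive as the parent, so the trap below remains in effect.

SIDECAR_PID=""
_got_signal=0

# run_with_sidecar SIDECAR_CMD MAIN_CMD
# Starts both commands, relays TERM/INT to the main command, stops the sidecar
# once the main command has exited, and returns the main command's exit status.
run_with_sidecar() {
    local sidecar_cmd=$1 main_cmd=$2 main_pid status

    bash -c "$sidecar_cmd" &
    SIDECAR_PID=$!
    bash -c "$main_cmd" &
    main_pid=$!

    _got_signal=0
    trap "_got_signal=1; kill -TERM $main_pid 2>/dev/null || true" TERM INT

    wait "$main_pid" && status=0 || status=$?
    # A trapped signal interrupts wait early, so wait again for the real exit.
    if [ "$_got_signal" -eq 1 ]; then
        wait "$main_pid" && status=0 || status=$?
    fi
    trap - TERM INT

    # The main process is gone; ask the sidecar to stop and reap it.
    kill -TERM "$SIDECAR_PID" 2>/dev/null || true
    wait "$SIDECAR_PID" 2>/dev/null || true
    return "$status"
}

# Possible use in the entrypoint (paths as in the Dockerfile above):
#   run_with_sidecar "exec /usr/local/bin/otelcol --config /etc/otelcol/config.yaml" \
#                    "exec dotnet Dashboard.dll"
```

The trade-off is that the shell, rather than `dotnet`, becomes the container's main process, so the exit status has to be propagated explicitly, which the final `return` does.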

src/Dashboard/otel-collector-config.yaml

Lines changed: 60 additions & 0 deletions
@@ -0,0 +1,60 @@
+# OpenTelemetry Collector — bridges OTLP traces/metrics/logs from the Copilot
+# CLI subprocess (and any other OTLP source running locally) into Azure
+# Application Insights.
+#
+# Why a sidecar collector?
+# - The Copilot CLI subprocess emits OTLP, not Azure Monitor's proprietary
+#   wire format. App Insights does not accept raw OTLP, so we need a
+#   translator. The "azuremonitor" exporter in opentelemetry-collector-contrib
+#   is the supported bridge.
+# - It also acts as an out-of-process batch buffer so spans survive .NET
+#   SSE-stream cancellations and graceful container restarts.
+#
+# Connection string is picked up from the same env var the .NET app uses, so
+# there is exactly one place to configure it (App Service app setting
+# APPLICATIONINSIGHTS_CONNECTION_STRING or ApplicationInsights__ConnectionString).
+
+receivers:
+  otlp:
+    protocols:
+      http:
+        endpoint: 0.0.0.0:4318
+      grpc:
+        endpoint: 0.0.0.0:4317
+
+processors:
+  batch:
+    send_batch_size: 512
+    timeout: 5s
+  # Drop noisy /robots.txt and bot-crawler spans that bloat the bill.
+  filter/drop_noise:
+    error_mode: ignore
+    traces:
+      span:
+        - 'attributes["url.path"] == "/robots.txt"'
+        - 'attributes["url.path"] == "/wp-admin/install.php"'
+
+exporters:
+  azuremonitor:
+    connection_string: ${env:APPLICATIONINSIGHTS_CONNECTION_STRING}
+  # Tiny stdout exporter for local debugging (only enabled when OTEL_DEBUG=1).
+  debug:
+    verbosity: basic
+
+service:
+  telemetry:
+    logs:
+      level: warn
+  pipelines:
+    traces:
+      receivers: [otlp]
+      processors: [filter/drop_noise, batch]
+      exporters: [azuremonitor]
+    metrics:
+      receivers: [otlp]
+      processors: [batch]
+      exporters: [azuremonitor]
+    logs:
+      receivers: [otlp]
+      processors: [batch]
+      exporters: [azuremonitor]
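
A collector config like the one above can be checked before the container is deployed by using the collector's `validate` subcommand. The bash sketch below wraps that call; `check_collector_config` is a hypothetical helper, and the default path and binary name are the ones the Dockerfile in this commit installs. Because the config reads `APPLICATIONINSIGHTS_CONNECTION_STRING` from the environment, validation presumably needs that variable exported first.

```shell
#!/bin/bash
# Sketch only: check_collector_config is a hypothetical helper.

# check_collector_config [CONFIG] [BINARY]
# Returns 2 when the config file is missing, skips (returns 0) when the
# collector binary is not on PATH, otherwise returns the validate exit status.
check_collector_config() {
    local config=${1:-/etc/otelcol/config.yaml} binary=${2:-otelcol}
    if [ ! -f "$config" ]; then
        echo "config not found: $config" >&2
        return 2
    fi
    if ! command -v "$binary" >/dev/null 2>&1; then
        echo "collector binary '$binary' not on PATH; skipping validation" >&2
        return 0
    fi
    "$binary" validate --config "$config"
}
```

Note that the `debug` exporter defined in the config is not listed in any pipeline, so a successful validation does not mean debug output is active; it would need to be added to a pipeline's `exporters` list to take effect.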
