Project-scoped timeseries interact poorly with grouping

for project-scoped timeseries queries we append [` | filter silo_id == "<your silo>" && project_id == "<your project>"`](https://github.com/oxidecomputer/omicron/blob/289146b/nexus/src/app/metrics.rs#L165-L171) to the query, but a query over a metric with those fields may no longer have them by the time the appended filter runs.

i think this is most easily seen with a query like
```
get virtual_machine:vcpu_usage |
  filter timestamp >= @2025-02-11T00:59:44.938 && timestamp < @2025-02-11T01:20:24.938 &&
    instance_id == "cdffcf35-6ae3-488d-a03d-64cf45f88fb2" && state == "emulation" |
  align mean_within(20s) | group_by [instance_id], sum
```

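where the appended filter makes the query fail in a pretty confusing way: `group_by [instance_id]` has already dropped `silo_id` and `project_id`, so the filter refers to fields that no longer exist.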
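to make the failure concrete, here's a toy model in Python of how the set of available fields changes through the pipeline. this is only a sketch and not how OxQL is actually implemented: `fields_after`, `missing_scope_fields` and the `FIELDS` list are hypothetical names, and `FIELDS` contains only the fields mentioned in this issue, not the full schema.

```python
# Toy model only: NOT the real OxQL planner. All names here are hypothetical.
# FIELDS lists just the fields mentioned in this issue, not the full schema.
FIELDS = ["instance_id", "project_id", "silo_id", "state"]

# Fields referenced by the filter the endpoint appends to the query.
SCOPE_FIELDS = ("silo_id", "project_id")


def fields_after(ops, initial_fields):
    """Return the set of field names left after applying each op in order."""
    fields = set(initial_fields)
    for op, args in ops:
        if op == "group_by":
            # Grouping keeps only the fields named in the group_by clause.
            fields &= set(args)
        # filter and align are modelled as leaving the field set unchanged.
    return fields


def missing_scope_fields(ops, initial_fields):
    """Return the scoping fields the appended filter would fail to find."""
    remaining = fields_after(ops, initial_fields)
    return [f for f in SCOPE_FIELDS if f not in remaining]


failing = [("filter", None), ("align", None), ("group_by", ["instance_id"])]
working = [
    ("filter", None),
    ("align", None),
    ("group_by", ["instance_id", "project_id", "silo_id"]),
]

print(missing_scope_fields(failing, FIELDS))
print(missing_scope_fields(working, FIELDS))
```

in this model the first pipeline leaves the appended filter with nothing to match against, while the second keeps both scoping fields.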

of course, if your query ends up retaining silo and project IDs the whole way through, the extra filter is fine, and so

```
> ./target/debug/oxide --profile dogfood experimental timeseries query --project ixi --query "\
        get virtual_machine:vcpu_usage | \
          filter timestamp >= @2025-02-11T00:59:44.938 && timestamp < @2025-02-11T01:20:24.938 && \
            instance_id == \"ad5a6c89-2845-4c2e-b247-8ca034e10597\" && state == \"emulation\" | \
          align mean_within(20s) | group_by [instance_id, project_id, silo_id], sum"
```

or tool of choice works with no issue.
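as a client-side workaround, one could make sure every `group_by` keeps the scoping fields. a minimal sketch in Python, assuming you build the query string yourself; `group_by_keeping_scope` is a hypothetical helper, not part of the CLI or any SDK:

```python
# Hypothetical helper: not part of the oxide CLI or any SDK.
SCOPE_FIELDS = ("project_id", "silo_id")


def group_by_keeping_scope(fields, reducer="sum"):
    """Build a group_by clause that always retains the scoping fields."""
    kept = list(fields)
    for field in SCOPE_FIELDS:
        if field not in kept:
            kept.append(field)
    return f"group_by [{', '.join(kept)}], {reducer}"


print(group_by_keeping_scope(["instance_id"]))
```

this only papers over the problem from the client side; the error itself still comes from the filter the endpoint appends.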

<details><summary>how i got here, a moderately long adventure</summary>

included mostly because there are several things we could do better along the way, and i'm filing other issues out of this.

i'd noticed this from the CLI:
```
./target/debug/oxide --profile dogfood \
    experimental timeseries query \
    --project ixi \
    --query "\
        get virtual_machine:vcpu_usage | \
          filter timestamp >= @2025-02-11T00:59:44.938 && timestamp < @2025-02-11T01:20:24.938 && \
            instance_id == \"cdffcf35-6ae3-488d-a03d-64cf45f88fb2\" && state == \"emulation\" |
          align mean_within(20s) | group_by [instance_id], sum"
```
which got me...
```
Error Response: status: 400 Bad Request; headers: {"content-type":
"application/json", "x-request-id": "f96ba139-2229-4a39-8435-7f6b39d640fb",
"content-length": "551", "date": "Wed, 12 Feb 2025 20:15:45 GMT"}; value: Error
{ error_code: Some("InvalidRequest"), message: "The filter expression
\"(silo_id == \"7bd7623a-68ed-4636-8ecb-b59e3b068787\") && (project_id ==
\"9c4152f9-4317-4269-9018-66142964d21c\")\" is not valid, the following errors
were encountered\n  > The filter expression refers to identifiers that are not
valid for its input table \"virtual_machine:vcpu_usage\". Invalid identifiers:
[\"silo_id\", \"project_id\"], valid identifiers: [\"datum\", \"instance_id\",
\"start_time\", \"timestamp\"]", request_id:
"f96ba139-2229-4a39-8435-7f6b39d640fb" }
```
emphasis on 
```
The filter expression "(silo_id == "7bd7623a-68ed-4636-8ecb-b59e3b068787") && (project_id == "9c4152f9-4317-4269-9018-66142964d21c")" is not valid, 
```
... which i'd never written! unfortunately for the CLI or SDK, it's not obvious to end users that the extra filter expression is an implementation detail of the endpoint, rather than something about the query itself which is wrong. to rule that out i'd run the same query against the API directly:
```
curl --fail-with-body -v -X POST \
    -H 'content-type:application/json' \
    -H 'cookie: session=[snip]' \
    --data "{\"query\": \
        \"get virtual_machine:vcpu_usage | \
            filter timestamp >= @2025-02-11T00:59:44.938 && timestamp < @2025-02-11T01:20:24.938 && \
              instance_id == \\\"cdffcf35-6ae3-488d-a03d-64cf45f88fb2\\\" && state == \\\"emulation\\\" | \
            align mean_within(20s) | group_by [instance_id], sum\" \
    }" \
    'https://oxide.sys.rack2.eng.oxide.computer/v1/timeseries/query?project=ixi'
```
which got me the same error. on the Omicron side i pretty quickly found https://github.com/oxidecomputer/omicron/pull/6873 which explains where the extra filter expression came from. but the `group_by` in my query means that `virtual_machine:vcpu_usage` doesn't have all the other fields like `project_id` and `silo_id` anymore, so the project filter will just produce an invalid query.

and indeed, grouping by `[instance_id, project_id, silo_id]` yields output more like you'd expect:
```
> ./target/debug/oxide --profile dogfood     experimental timeseries query --project ixi --query "\
        get virtual_machine:vcpu_usage | \
          filter timestamp >= @2025-02-11T00:59:44.938 && timestamp < @2025-02-11T01:20:24.938 && \
            instance_id == \"ad5a6c89-2845-4c2e-b247-8ca034e10597\" && state == \"emulation\" |
          align mean_within(20s) | group_by [instance_id, project_id, silo_id], sum"
{
  "tables": [
    {
      "name": "virtual_machine:vcpu_usage",
      "timeseries": {
        "8769668217919957407": {
          "fields": {
            "instance_id": {
              "type": "uuid",
              "value": "cdffcf35-6ae3-488d-a03d-64cf45f88fb2"
            },
            "project_id": {
              "type": "uuid",
              "value": "9c4152f9-4317-4269-9018-66142964d21c"
            },
            "silo_id": {
              "type": "uuid",
              "value": "7bd7623a-68ed-4636-8ecb-b59e3b068787"
            }
          },
          "points": {
            "timestamps": [
              "2025-02-11T01:00:04.938Z",
... eliding all the lines of data but it's all there and reasonable ...
```
</details>

Project-scoped timeseries interact poorly with grouping #7532
