
wi: new endpoint for listing workload attached ACL policies #25588


Open
wants to merge 23 commits into
base: main
from

Conversation

pkazmierczak
Contributor

@pkazmierczak pkazmierczak commented Apr 2, 2025

This introduces a new HTTP endpoint (and an associated CLI command) for querying
the ACL policies associated with a workload identity (WI). It lets users inspect,
from within WI tasks, which ACL policies apply to them.

Fixes #24663
Internal ref: https://hashicorp.atlassian.net/browse/NMD-423

(reviewers: this requires #25547 to work)
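
For anyone who wants to hit the endpoint directly rather than through the CLI, a minimal Go sketch of the request might look like the following. The route `/v1/acl/policy/self` comes from this PR; the helper name `newPolicySelfRequest`, the example agent address, and sending the credential in the `X-Nomad-Token` header are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"os"
	"strings"
)

// newPolicySelfRequest builds a GET request for the endpoint added in this PR.
// The helper name is hypothetical; addr is the agent's HTTP address.
func newPolicySelfRequest(addr, token string) (*http.Request, error) {
	url := strings.TrimRight(addr, "/") + "/v1/acl/policy/self"
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	// Assumption: the credential (ACL secret or workload identity JWT)
	// is passed in the X-Nomad-Token header.
	req.Header.Set("X-Nomad-Token", token)
	return req, nil
}

func main() {
	req, err := newPolicySelfRequest("http://127.0.0.1:4646", os.Getenv("NOMAD_TOKEN"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Only print the request line; sending it needs a running agent.
	fmt.Println(req.Method, req.URL.String())
}
```

Passing the request to `http.DefaultClient.Do` against a locally built agent should return the JSON shape shown later in this thread.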

aimeeu
aimeeu previously approved these changes Apr 3, 2025
@aimeeu aimeeu added the theme/docs Documentation issues and enhancements label Apr 3, 2025
@arodd arodd left a comment

Would be good to make sure we present enough data to determine whether the policy is applied at the job/group/task level, and it looks like this is potentially covered.

@aimeeu
Contributor

aimeeu commented Apr 3, 2025

@pkazmierczak is this going into the 1.10 release?
Never mind, I saw the Slack thread saying that this will go into a dot release.

@pkazmierczak
Contributor Author

pkazmierczak commented Apr 3, 2025 via email

},
"ModifyIndex": 26,
"Name": "nomad-policy",
"Rules": "# Allow read only access to the default namespace\nnamespace \"default\" {\n policy = \"read\"\n}\n\n# Allow writing to the `foo` namespace\nnamespace \"foo\" {\n policy = \"write\"\n}\n\nagent {\n policy = \"read\"\n}\n\nnode {\n policy = \"read\"\n}\n\nquota {\n policy = \"read\"\n}\n",
Member

I find myself right away also wanting this for non-WI tokens. I know that I can do nomad acl token self, then look at the Policies/Roles, then go inspect those (if I have permission?), but seeing the rules this clearly in a single API call is quite nice (if perhaps a bit of a security concern...)

if the command name was more specific, like nomad acl workload-identity-policy self then I wouldn't have this expectation, but I feel this new command has the same issue as "why doesn't nomad acl token self tell me about my WI token?" only in reverse: "why doesn't nomad acl policy self tell me about my ACL policy?"

Member

Yeah, having nomad acl token self and nomad acl policy self both work for both WI and ACL Tokens really seems like the most intuitive behavior.

@gulducat
Member

gulducat commented Apr 4, 2025

Anyone wanting an easy way to try this out (after building locally):

wi-policy.hcl:

namespace "shared" {
  variables {
    path "*" {
      capabilities = ["read"]
    }
  }
}

wi-policy.nomad.hcl:

job "wi-policy" {
  type = "batch"
  group "g" {
    restart { attempts = 0 }
    reschedule { attempts = 0 }
    task "t" {
      driver = "raw_exec"
      config {
        command = "bash"
        args    =  ["-xc", "nomad acl policy self; nomad acl policy self -json"]
      }
      env {
        NOMAD_ADDR = "unix:${NOMAD_SECRETS_DIR}/api.sock"
      }
      identity {
        env = true
      }
    }
  }
}

$ nomad acl policy apply -namespace default -job wi-policy shared-policy ./wi-policy.hcl
Successfully wrote "shared-policy" ACL policy!

$ nomad run wi-policy.nomad.hcl
...

$ nomad logs -job wi-policy
Name           Job ID     Group Name       Task Name
shared-policy  wi-policy  <not specified>  <not specified>
{
    "shared-policy": {
        "CreateIndex": 470,
        "Description": "",
        "JobACL": {
            "Group": "",
            "JobID": "wi-policy",
            "Namespace": "default",
            "Task": ""
        },
        "ModifyIndex": 536,
        "Name": "shared-policy",
        "Rules": "namespace \"shared\" {\n  variables {\n    path \"*\" {\n      capabilities = [\"read\"]\n    }\n  }\n}\n\n"
    }
}
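
On the earlier question of telling job-, group-, and task-level attachment apart, the `JobACL` object in the JSON above appears to carry enough information. A rough Go sketch of decoding that output follows. The types `jobACL` and `selfPolicy` and the helper `attachmentLevel` are hypothetical and mirror only the fields visible in the log output; they are not Nomad's `api` package types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// jobACL mirrors the "JobACL" object in the output above (hypothetical type).
type jobACL struct {
	Namespace string
	JobID     string
	Group     string
	Task      string
}

// selfPolicy mirrors a subset of the per-policy fields in the output above.
type selfPolicy struct {
	Name   string
	JobACL *jobACL
	Rules  string
}

// attachmentLevel reports the most specific scope set on the JobACL.
func attachmentLevel(j *jobACL) string {
	switch {
	case j == nil:
		return "none"
	case j.Task != "":
		return "task"
	case j.Group != "":
		return "group"
	case j.JobID != "":
		return "job"
	case j.Namespace != "":
		return "namespace"
	default:
		return "none"
	}
}

func main() {
	// Trimmed copy of the JSON printed by `nomad acl policy self -json` above.
	raw := `{"shared-policy": {"Name": "shared-policy",
		"JobACL": {"Group": "", "JobID": "wi-policy", "Namespace": "default", "Task": ""}}}`

	policies := map[string]selfPolicy{}
	if err := json.Unmarshal([]byte(raw), &policies); err != nil {
		panic(err)
	}
	for name, p := range policies {
		fmt.Printf("%s is attached at the %s level\n", name, attachmentLevel(p.JobACL))
	}
}
```

For the sample above this prints `shared-policy is attached at the job level`, matching the `<not specified>` group and task columns in the table output.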

@@ -66,7 +66,17 @@ func (c *ACLTokenSelfCommand) Run(args []string) int {
// Get the specified token information
token, _, err := client.ACLTokens().Self(nil)
if err != nil {
c.Ui.Error(fmt.Sprintf("Error fetching self token: %s", err))
if !strings.Contains(err.Error(), "Unexpected response code: 404") {
Member

I think this bool check is reversed, because this is what happens when I use a non-existent token in current main:

$ NOMAD_TOKEN=10000000-0000-0000-0000-000000000000 nomad acl token self
Error fetching self token: Unexpected response code: 404 (ACL token not found)
Suggested change
- if !strings.Contains(err.Error(), "Unexpected response code: 404") {
+ if strings.Contains(err.Error(), "Unexpected response code: 404") {

Contributor Author

🤔 I don't follow, Daniel.

On main you will of course get 404 if you try to query token self with WI. What you're showing above is strange to me, prefixing nomad acl token self with a bogus token should give you 403 instead.

The conditional here is correct, though. For any error other than 404, we return error. In case we get 404, we further check job-attached policies, and only if we cannot find any, we say: no acl token and no policies attached.

Does that make sense?

Member

what exactly I was thinking back then is lost to the mists of time, alas.

but, while I'm here comparing the two error messages from a bad, but valid (uuid), ACL token:

  • nomad release 1.10 is clearly about one thing:

    Error fetching self token: Unexpected response code: 404 (ACL token not found)

  • this pr:

    No ACL tokens or ACL policies attached to a workload identity found.

the new one seems to me like acl token self is only for workload ID, because I don't parse "or ACL policies attached to a workload identity" as a separate item? I get lost grammatically somewhere along the way before hitting "found" and I parse it as "No ... workload identity found."

this might be resolved for me with a single character:

No ACL tokens nor ACL policies attached to a workload identity found.

for bonus clarity, if a bit choppy in rhythm:

No ACL tokens, nor ACL policies attached to a workload identity, found.

@pkazmierczak pkazmierczak requested a review from gulducat April 22, 2025 15:25
Base automatically changed from f-self-token-lookup-wi-jwt to main April 22, 2025 15:53
@pkazmierczak pkazmierczak dismissed aimeeu’s stale review April 22, 2025 15:53

The base branch was changed.

@pkazmierczak pkazmierczak added the backport/1.10.x backport to 1.10.x release line label Apr 23, 2025
@pkazmierczak pkazmierczak requested a review from jrasell April 24, 2025 15:09
mismithhisler
mismithhisler previously approved these changes May 6, 2025
Member

@mismithhisler mismithhisler left a comment

This is a really nice feature, and thanks @gulducat for quick local testing policy/job!

Comment on lines 198 to 202
// Resolve policies for workload identities
policyReply := structs.ACLPolicySetResponse{}
if err := s.agent.RPC("ACL.GetClaimPolicies", &policyArgs, &policyReply); err != nil {
	return nil, err
}
Member

@tgross tgross May 6, 2025

My objection to this API as it stands is that we're only presenting the ACL policies for Workload Identity claims but the API is /v1/acl/policy/self. Ignoring unifying the CLI output for the moment, either (a) this API should retrieve policies for any authenticated request or (b) it should be under a different route than /v1/acl/policy/self or (c) the RPC handler should be changed to cover both ACL tokens and WI claims. Unfortunately it looks like we shipped the RPC handler as-is and now it would be awkward to retrieve ACL token policies there. So I think (a) is my preferred option, but we'd need to split the API to send to different RPC handlers.
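
One way to picture the split proposed in option (a) is a dispatch on the kind of credential in the request. The Go sketch below is a rough illustration only: `looksLikeJWT` is a crude heuristic assumed for this example (it is not how the agent resolves identities), `rpcForCredential` is a hypothetical helper, and the two RPC names are the ones mentioned in this thread.

```go
package main

import (
	"fmt"
	"strings"
)

// looksLikeJWT is a rough heuristic assumed for this sketch: a compact JWT
// has three non-empty dot-separated segments, while ACL secret IDs are UUIDs.
func looksLikeJWT(token string) bool {
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return false
	}
	for _, p := range parts {
		if p == "" {
			return false
		}
	}
	return true
}

// rpcForCredential picks which RPC the HTTP handler would forward to.
func rpcForCredential(token string) string {
	if looksLikeJWT(token) {
		return "ACL.GetClaimPolicies" // workload identity claims
	}
	return "ACL.ListPolicies" // regular ACL tokens
}

func main() {
	fmt.Println(rpcForCredential("10000000-0000-0000-0000-000000000000"))
	fmt.Println(rpcForCredential("header.payload.signature"))
}
```

A real handler would presumably rely on the server's own identity resolution rather than inspecting the token string.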

Contributor Author

4053a48 implements option (a), but then we end up with a bit of a mess, because the GetClaimPolicies and ListPolicies RPCs return different types (a map and a slice, respectively). I thought perhaps the key in the map is the WI name, but it's just the policy name, so I'll follow up with a change that just returns a list of ACLPolicyStub to get a unified output.

Contributor Author

ah but then we'd lose some data that's useful for WI-attached policies if we only return ACLPolicyStub :/

// ACLPolicyListStub is used for listing ACL policies
type ACLPolicyListStub struct {
	Name        string
	Description string
	Hash        []byte
	CreateIndex uint64
	ModifyIndex uint64
}

This doesn't contain JobACLs or Rules.
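
One possible shape for a unified response, sketched in Go under stated assumptions: a single list element that carries `Rules` and `JobACL` when the data comes from workload identity claims and leaves them empty when it comes from list stubs. All types here (`claimPolicy`, `listStub`, `selfPolicy`, `jobACL`) and both helpers are hypothetical stand-ins, not the `structs` package definitions.

```go
package main

import (
	"fmt"
	"sort"
)

type jobACL struct{ Namespace, JobID, Group, Task string }

// claimPolicy stands in for the full policy returned for WI claims.
type claimPolicy struct {
	Name, Description, Rules string
	JobACL                   *jobACL
}

// listStub stands in for a subset of ACLPolicyListStub.
type listStub struct{ Name, Description string }

// selfPolicy is a hypothetical unified list element.
type selfPolicy struct {
	Name, Description string
	Rules             string  // empty when built from a stub
	JobACL            *jobACL // nil when built from a stub
}

// fromClaims flattens the map form into a slice sorted by policy name.
func fromClaims(m map[string]*claimPolicy) []selfPolicy {
	out := make([]selfPolicy, 0, len(m))
	for _, p := range m {
		if p == nil {
			continue
		}
		out = append(out, selfPolicy{Name: p.Name, Description: p.Description, Rules: p.Rules, JobACL: p.JobACL})
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out
}

// fromStubs converts list stubs, leaving Rules and JobACL unset.
func fromStubs(stubs []*listStub) []selfPolicy {
	out := make([]selfPolicy, 0, len(stubs))
	for _, s := range stubs {
		if s == nil {
			continue
		}
		out = append(out, selfPolicy{Name: s.Name, Description: s.Description})
	}
	return out
}

func main() {
	claims := map[string]*claimPolicy{
		"shared-policy": {Name: "shared-policy", JobACL: &jobACL{Namespace: "default", JobID: "wi-policy"}},
	}
	fmt.Println(len(fromClaims(claims)), len(fromStubs([]*listStub{{Name: "nomad-policy"}})))
}
```

The trade-off is that consumers must treat `Rules` and `JobACL` as optional, which keeps the WI-specific data without forcing the token path to fetch full policies.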

Contributor

@aimeeu aimeeu left a comment

Thanks for creating the docs content! I left a few style nits.

Comment on lines +10 to +11
The `acl policy self` command is used to fetch information about the currently
set ACL policies attached to the workload.
Contributor

Suggested change
- The `acl policy self` command is used to fetch information about the currently
- set ACL policies attached to the workload.
+ The `nomad acl policy self` command fetches information about the currently
+ set ACL policies attached to the workload.

Style nit: use active voice


## Examples

Fetch information about an existing ACL policies attached to the workload:
Contributor

Suggested change
- Fetch information about an existing ACL policies attached to the workload:
+ Fetch information about an existing ACL policies attached to the workload.

FYI the style guide wants us to introduce a code block with a descriptive, imperative sentence that ends with a period.

@@ -40,3 +40,13 @@ Modify Index = 8
Policies = n/a
Roles = n/a
```

The command will also detect if the current Nomad token is a workload identity
Contributor

Suggested change
- The command will also detect if the current Nomad token is a workload identity
+ The command also detects if the current Nomad token is a workload identity

style nit: use present tense

@@ -40,3 +40,13 @@ Modify Index = 8
Policies = n/a
Roles = n/a
```

The command will also detect if the current Nomad token is a workload identity
JWT and respond with a hint if that's the case:
Contributor

Suggested change
- JWT and respond with a hint if that's the case:
+ JWT and respond with a hint if that's the case.

|--------|-----------------------|--------------------|
| `GET` | `/v1/acl/policy/self` | `application/json` |

The table below shows this endpoint's support for
Contributor

Suggested change
- The table below shows this endpoint's support for
+ This table shows this endpoint's support for

style nit: we avoid using "below" or "above"

Labels
backport/1.10.x backport to 1.10.x release line theme/docs Documentation issues and enhancements theme/workload-identity
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow lookup of self ACL token when using workload identity
6 participants