Conversation
✅ Snyk checks have passed. No issues have been found so far.
💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse. |
pkg/commands/git_clone/params.go
Outdated
| "user-home": { | ||
| Name: "user-home", | ||
| EnvVarName: "KBC_GIT_CLONE_USER_HOME", | ||
| TypeKind: reflect.String, | ||
| DefaultValue: "/tekton/home", | ||
| Usage: "Absolute path to the user's home directory. Set this explicitly if you are running the image as a non-root user.", | ||
| }, |
There was a problem hiding this comment.
We should avoid making the CLI tekton-specific. Can this default to $HOME? In the Task, we can set /tekton/home
pkg/commands/git_clone/params.go
Outdated
| EnvVarName: "KBC_GIT_CLONE_MERGE_SOURCE_DEPTH", | ||
| TypeKind: reflect.Int, | ||
| DefaultValue: "0", | ||
| Usage: "Perform a shallow fetch of the target branch, fetching only the most recent N commits. If empty, fetches the full history of the target branch.", |
There was a problem hiding this comment.
Does 0 mean full history or shallow clone? If full history, then the description should mention that. If shallow clone, then it's inconsistent with the other depth option :harold:
pkg/commands/git_clone/clone.go
Outdated
| // Retry logic for network operations | ||
| var lastErr error | ||
| for attempt := 1; attempt <= maxAttempts; attempt++ { | ||
| if attempt > 1 { | ||
| l.Logger.Infof("Retrying fetch (attempt %d of %d)", attempt, maxAttempts) | ||
| } | ||
|
|
||
| err := c.CliWrappers.GitCli.FetchWithRefspec(checkoutDir, "origin", refspec, c.Params.Depth) | ||
| if err == nil { | ||
| return nil | ||
| } | ||
| lastErr = err | ||
| l.Logger.Warnf("Fetch attempt %d failed: %v", attempt, err) | ||
| } |
There was a problem hiding this comment.
Does our generic retryer API not work here?
| entries, err := os.ReadDir(checkoutDir) | ||
| if err != nil { | ||
| return fmt.Errorf("failed to read checkout directory: %w", err) | ||
| } | ||
|
|
||
| for _, entry := range entries { | ||
| entryPath := filepath.Join(checkoutDir, entry.Name()) | ||
| if err := os.RemoveAll(entryPath); err != nil { | ||
| return fmt.Errorf("failed to remove %s: %w", entryPath, err) | ||
| } | ||
| } |
There was a problem hiding this comment.
Could you add a comment about why it's done this way rather than os.RemoveAll(checkoutDir)?
I assume checkoutDir="." would cause problems otherwise
pkg/cliwrappers/git/git.go
Outdated
| RemoteAdd(workdir, name, url string) (string, error) | ||
| FetchWithRefspec(workdir, remote, refspec string, depth int) error | ||
| Checkout(workdir, ref string) error | ||
| SubmoduleUpdate(workdir string, init bool, paths string) error |
There was a problem hiding this comment.
In the API, paths should probably be a []string and the caller should decide how it splits CLI arguments into an array
pkg/cliwrappers/git/git.go
Outdated
| RemoteAdd(workdir, name, url string) (string, error) | ||
| FetchWithRefspec(workdir, remote, refspec string, depth int) error | ||
| Checkout(workdir, ref string) error | ||
| SetSparseCheckout(workdir, sparseCheckoutDirectories string) error |
| // setupBasicAuth sets up git credentials from a basic-auth workspace. | ||
| // Supports two formats: | ||
| // 1. .git-credentials and .gitconfig files (copied directly) | ||
| // 2. username and password files (kubernetes.io/basic-auth secret format) | ||
| func (c *GitClone) setupBasicAuth() error { |
There was a problem hiding this comment.
It would be nice if some parts of this function could be reused in the prefetch command, as it was apparently copied from the git-clone bash script:
There was a problem hiding this comment.
nevermind; let's refactor this later
pkg/commands/git_clone/params.go
Outdated
| EnvVarName: "KBC_GIT_CLONE_TARGET_BRANCH", | ||
| TypeKind: reflect.String, | ||
| DefaultValue: "main", | ||
| Usage: "The target branch to merge into the revision (if mergeTargetBranch is true).", |
There was a problem hiding this comment.
Nitpick: since we're not making the params camelCase anymore, seeing mergeTargetBranch here looks a little odd. There are other params that have the same issue in their usage notes.
pkg/commands/git_clone/params.go
Outdated
| Name: "subdirectory", | ||
| EnvVarName: "KBC_GIT_CLONE_SUBDIRECTORY", | ||
| TypeKind: reflect.String, | ||
| DefaultValue: "source", |
There was a problem hiding this comment.
I think not using a subdirectory by default is a more intuitive use case here, although I'm looking this from the perspective of a generic user, not only our current build pipelines use case.
There was a problem hiding this comment.
So, coupled with my other comment about delete-existing, I'd make it: not use a subdirectory by default, but also to not clean up the checkout dir by default.
pkg/commands/git_clone/params.go
Outdated
| Name: "delete-existing", | ||
| EnvVarName: "KBC_GIT_CLONE_DELETE_EXISTING", | ||
| TypeKind: reflect.Bool, | ||
| DefaultValue: "true", |
There was a problem hiding this comment.
This is also probably a dangerous default. If I used it on my home dir with a subdirectory "" , then I would be very sad.
There was a problem hiding this comment.
In addition, there seems a logic problem. It defaults to true, but there is no way to disable it.
pkg/commands/git_clone/params.go
Outdated
| DefaultValue: "", | ||
| Usage: "Opt out of proxying HTTP/HTTPS requests.", | ||
| }, | ||
| "verbose": { |
There was a problem hiding this comment.
Is this param being used for anything? Do we plan to use it for anything?
There was a problem hiding this comment.
+1, I'd keep only loglevel, if you want verbose, set debug level
|
Commit message for eca8c2e: I think you meant |
tkdchen
left a comment
There was a problem hiding this comment.
go format formats these files:
modified: pkg/commands/git_clone/git_clone_test.go
modified: pkg/commands/git_clone/mocks_test.go
modified: pkg/commands/git_clone/results.go
| "depth": { | ||
| Name: "depth", | ||
| EnvVarName: "KBC_GIT_CLONE_DEPTH", | ||
| TypeKind: reflect.Int, | ||
| DefaultValue: "1", | ||
| Usage: "Perform a shallow clone, fetching only the most recent N commits.", | ||
| }, |
There was a problem hiding this comment.
0 can be passed to parameter depth, it causes a clone with full commit history. It would be good to describe the behavior explicitly.
pkg/commands/git_clone/params.go
Outdated
| Name: "delete-existing", | ||
| EnvVarName: "KBC_GIT_CLONE_DELETE_EXISTING", | ||
| TypeKind: reflect.Bool, | ||
| DefaultValue: "true", |
There was a problem hiding this comment.
In addition, there seems a logic problem. It defaults to true, but there is no way to disable it.
pkg/commands/git_clone/symlink.go
Outdated
| l "github.com/konflux-ci/konflux-build-cli/pkg/logger" | ||
| ) | ||
|
|
||
| func (c *GitClone) checkSymlinks(checkoutDir string) error { |
There was a problem hiding this comment.
This function does not reference anything from GitClone. Let's make it a common function. I'm also think it should be possible to reuse it for SearchDockerfile.
pkg/commands/git_clone/merge.go
Outdated
| err = c.CliWrappers.GitCli.ConfigLocal(c.getCheckoutDir(), "user.email", "[email protected]") | ||
| if err != nil { | ||
| return err | ||
| } | ||
| err = c.CliWrappers.GitCli.ConfigLocal(c.getCheckoutDir(), "user.name", "Tekton Git Clone Task") | ||
| if err != nil { | ||
| return err | ||
| } |
There was a problem hiding this comment.
It should be good to have a different name to replace tekton in the email and name. Probably "Konflux CI Build". I'm not sure honestly. Any idea?
Nitpick: make constants for the email and user name.
| "ssl-verify": { | ||
| Name: "ssl-verify", | ||
| EnvVarName: "KBC_GIT_CLONE_SSL_VERIFY", | ||
| TypeKind: reflect.Bool, | ||
| DefaultValue: "true", | ||
| Usage: "Set the `http.sslVerify` global git config. Setting this to `false` is not advised unless you are sure that you trust your git remote.", | ||
| }, |
There was a problem hiding this comment.
Usage shows:
--ssl-verify http.sslVerify Set the http.sslVerify global git config. Setting this to `false` is not advised unless you are sure that you trust your git remote. (default true)
But other boolean argument like --submodules does not have the type info:
--submodules Initialize and fetch git submodules. (default true)
Another issue is, when I set --ssl-verify false, Params.SslVerify is still true:
INFO[0000] [param] SSL verify: true
| } | ||
|
|
||
| // setupGitConfig configures git settings for SSL verification and CA bundle. | ||
| func (c *GitClone) setupGitConfig() error { |
There was a problem hiding this comment.
setupGitConfig has side-effect by modifying the git config in host. ConfigLocal calls git config --local.
If the current working directory is not a git repository, git-clone will fail, for example of disabling http.sslVerify:
INFO[0000] Disabling SSL verification (http.sslVerify=false)
FATA[0000] failed to configure http.sslVerify: git config failed with exit code 128: exit status 128 (stderr: fatal: --local can only be used inside a git repository
)
git-clone supports passing config via --config, https://git-scm.com/docs/git-clone#Documentation/git-clone.txt---configkeyvalue
Please write a test to ensure git-clone does not modify the git config in host, or only makes the config available in the target directory.
I made the following manual change for this testing due to the issue mentioned in #45 (comment).
// setupGitConfig configures git settings for SSL verification and CA bundle.
func (c *GitClone) setupGitConfig() error {
+ c.Params.SslVerify = false| TypeKind: reflect.String, | ||
| DefaultValue: "/tekton/home", | ||
| Usage: "Absolute path to the user's home directory. Set this explicitly if you are running the image as a non-root user.", | ||
| }, |
There was a problem hiding this comment.
In addition to Adam's comment, I think we even do not need a user-home. That makes it possible for git-clone to run in host rather than only in container.
git supports reading credentials from custom location, e.g. https://git-scm.com/docs/git-credential-store#Documentation/git-credential-store.txt---filepath.
It is worth to have a dedicated directory to hold credentials and config without having side-effects of writing execution data into the standard git locations in host.
Related comment #45 (comment)
Also ensure this behavior in tests that do not run in container.
MartinBasti
left a comment
There was a problem hiding this comment.
IMHO recursive submodules aren't working
97a539a to
8a8f719
Compare
|
/agentic_review |
1 similar comment
|
/agentic_review |
|
/agentic_review |
1 similar comment
|
/agentic_review |
|
Persistent review updated to latest commit 9a3ae16 |
| // Validate subdirectory for path traversal | ||
| if c.Params.Subdirectory != "" { | ||
| if filepath.IsAbs(c.Params.Subdirectory) { | ||
| return fmt.Errorf("subdirectory must be a relative path, got absolute path: %s", c.Params.Subdirectory) | ||
| } | ||
| if strings.Contains(c.Params.Subdirectory, "..") { | ||
| return fmt.Errorf("subdirectory must not contain path traversal (..): %s", c.Params.Subdirectory) | ||
| } | ||
| // Verify the resolved path stays within OutputDir. | ||
| // When OutputDir is empty, filepath.Abs resolves relative to cwd, | ||
| // which is the correct base for validation. | ||
| baseDir := c.Params.OutputDir | ||
| if baseDir == "" { | ||
| baseDir = "." | ||
| } | ||
| absOutput, err := filepath.Abs(baseDir) | ||
| if err != nil { | ||
| return fmt.Errorf("failed to resolve output dir: %w", err) | ||
| } | ||
| absCheckout, err := filepath.Abs(filepath.Join(baseDir, c.Params.Subdirectory)) | ||
| if err != nil { | ||
| return fmt.Errorf("failed to resolve checkout dir: %w", err) | ||
| } | ||
| if absCheckout != absOutput && !strings.HasPrefix(absCheckout, absOutput+string(filepath.Separator)) { | ||
| return fmt.Errorf("subdirectory must not escape output directory") | ||
| } |
There was a problem hiding this comment.
1. Subdir symlink escape 🐞 Bug ⛨ Security
validateParams() verifies the checkout path stays within OutputDir using filepath.Abs(), but does not resolve symlinks, so an existing symlinked subdirectory (e.g., <output>/source -> /etc) can cause cleanCheckoutDir() and performClone() to operate outside the intended workspace. This can delete/init arbitrary filesystem locations before the later in-repo symlink check runs.
Agent Prompt
### Issue description
`validateParams()` ensures `Subdirectory` doesn't escape `OutputDir` using `filepath.Abs()` + `HasPrefix`, but this can be bypassed when any path component under `OutputDir` is an existing symlink (e.g. `OutputDir/source -> /etc`). Because `Run()` may call `cleanCheckoutDir()` (deleting files) and `performClone()` (git init/fetch/checkout) **before** `common.CheckSymlinks()`, a symlinked checkout path can cause operations outside the workspace.
### Issue Context
This is a checkout-directory *path* containment problem; the existing repo symlink scan (`common.CheckSymlinks`) validates symlinks *inside* the checked out tree and occurs later.
### Fix Focus Areas
- pkg/commands/git_clone/git_clone.go[190-229]
- pkg/commands/git_clone/git_clone.go[59-113]
- pkg/commands/git_clone/setup.go[18-47]
- pkg/commands/git_clone/clone.go[13-24]
### What to change
1. Add a robust containment check that resolves symlinks:
- Resolve `OutputDir` with `Abs` + `EvalSymlinks`.
- After ensuring/creating the checkout directory (or at least its parent), resolve `checkoutDir` with `Abs` + `EvalSymlinks` and confirm it is within the resolved output dir.
2. Before `cleanCheckoutDir()` removes entries, ensure the resolved checkout dir is within the resolved output dir and that the checkout dir itself is not a symlink (`os.Lstat`).
3. If `EvalSymlinks` can’t be applied because the path doesn’t exist yet, create the directory first (safely) and then resolve, or validate each existing path component with `Lstat` to ensure no symlink components are traversed.
### Acceptance criteria
- A pre-existing symlinked `Subdirectory` under `OutputDir` is rejected with a clear error.
- No filesystem deletion or git init can occur outside `OutputDir` due to symlink traversal.
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
There was a problem hiding this comment.
While I'm not sure if this would be an issue when in Konflux... since this is technically a standalone CLI that can be ran on baremetal, I guess it can be addressed though
| } | ||
|
|
||
| var minGitVersion = [3]int{2, 25, 0} | ||
| var gitVersionRegex = regexp.MustCompile(`git version (\d+)\.(\d+)\.(\d+)`) |
b3c3f16 to
570e4df
Compare
| c = &GitClone{ | ||
| CliWrappers: CliWrappers{GitCli: _mockGitCli}, | ||
| Params: &Params{ | ||
| URL: "https://github.com/user/repo.git", |
There was a problem hiding this comment.
Better to have the URL to non existing service unless strictly needed.
integration_tests/gitclone_test.go
Outdated
| defer func() { _ = os.RemoveAll(tempDir) }() | ||
|
|
||
| gitClone, err := newGitClone(&gitclone.Params{ | ||
| URL: "https://github.com/kelseyhightower/nocode", |
There was a problem hiding this comment.
I'd avoid using a repository we don't own. We might have tests failed very unexpectedly.
integration_tests/gitclone_test.go
Outdated
|
|
||
| // Clone only specific directories using sparse checkout | ||
| gitClone, err := newGitClone(&gitclone.Params{ | ||
| URL: "https://github.com/konflux-ci/build-definitions", |
There was a problem hiding this comment.
One day it will be deprecated. Will it cause problems in the future?
integration_tests/gitclone_test.go
Outdated
|
|
||
| // Test_GitClone_DoesNotWriteToHomeDirectory verifies that git-clone does not write | ||
| // .git-credentials, .gitconfig, or .ssh/ to the user's $HOME directory. | ||
| func Test_GitClone_DoesNotWriteToHomeDirectory(t *testing.T) { |
There was a problem hiding this comment.
If the test fails, it might mess up your real credentials.
integration_tests/gitclone_test.go
Outdated
| @@ -0,0 +1,776 @@ | |||
| package integration_tests | |||
There was a problem hiding this comment.
The simplicity of the approach for integration tests buys, but we have several issues with it:
- Very likely a containerized version of the clone command will be needed to do integration test of the whole pipeline
- Some tests (for example
Test_GitClone_DoesNotWriteToHomeDirectory) might be dangerous to run on host. - Usage of github.com and/or not owber repositories might bring problems, especially in air gapped environments.
In general, we have at least three options:
- Run git operations on host (current)
- Run git operations in a container like other integration tests do (safe, but still requires external resources).
- Setup local git server (e.g. Forgejo) and run tests in container (safest, but most heavy).
I lean towards option 3, but would like to hear your opinion @konflux-ci/build-maintainers .
There was a problem hiding this comment.
Setup local git server (e.g. Forgejo)
I guess it shouldn't be any harder than the Zot registry ¯\_(ツ)_/¯
There was a problem hiding this comment.
At least option 2 would be good. Option 3 seems like it could be cleanly added onto option 2 later, so if it's not required at the moment, I don't mind the external dependency on github for now.
I think option 3 would be needed if we want to set up git repos in very specific states (like that case where git fetch --tags failed when running the first time but worked the second time)
There was a problem hiding this comment.
Option 3 should not be much harder, there is code that creates commits. Also, we might to do it only once for all tests.
mmorhun
left a comment
There was a problem hiding this comment.
Approving code part, but integration tests must be reworked in a follow up PR.
Assisted-by: Claude Opus 4.5 Signed-off-by: Tomáš Nevrlka <[email protected]>
Assisted-by: Claude Opus 4.5 Signed-off-by: Tomáš Nevrlka <[email protected]>
Assisted-by: Claude Opus 4.5 Signed-off-by: Tomáš Nevrlka <[email protected]>
Clean existing checkout directory contents before cloning. Assisted-by: Claude Opus 4.5 Signed-off-by: Tomáš Nevrlka <[email protected]>
Assisted-by: Claude Opus 4.5 Signed-off-by: Tomáš Nevrlka <[email protected]>
Reject symlinks pointing outside the cloned repository. Assisted-by: Claude Opus 4.5 Signed-off-by: Tomáš Nevrlka <[email protected]>
Assisted-by: Claude Opus 4.5 Signed-off-by: Tomáš Nevrlka <[email protected]>
Support basic auth (git credentials or K8s secrets) and SSH keys. Assisted-by: Claude Opus 4.5 Signed-off-by: Tomáš Nevrlka <[email protected]>
Assisted-by: Claude Opus 4.5 Signed-off-by: Tomáš Nevrlka <[email protected]>
Signed-off-by: Tomáš Nevrlka <[email protected]>
| c.internalDir = internalDir | ||
|
|
||
| // Save and restore all environment variables that may be modified during Run(). | ||
| restoreEnv := createEnvRestoreFunc("GIT_CONFIG_GLOBAL", "GIT_SSH_COMMAND", "GIT_SSL_NO_VERIFY") |
There was a problem hiding this comment.
Would be good to define constants for these environment variables in order to not repeat these strings in the code.
| c.internalDir = internalDir | ||
|
|
||
| // Save and restore all environment variables that may be modified during Run(). | ||
| restoreEnv := createEnvRestoreFunc("GIT_CONFIG_GLOBAL", "GIT_SSH_COMMAND", "GIT_SSL_NO_VERIFY") |
There was a problem hiding this comment.
Passing these environment variables via Cmd.Env would save code, i.e. createEnvRestoreFunc, setup functions and tests. It does not modify host's environment variables.
Rewrite of Bash logic of the git-clone task in Go.
See individual commits, they should form logical chunks.
Reviewer notes: I'm sorry, I tried