|
| 1 | +# Processors |
| 2 | + |
| 3 | +**Processors** (also known as push/pull actions) represent operations that each push or pull must go through in order to get approved or rejected. |
| 4 | + |
| 5 | +Processors do not necessarily represent policies. Some processors are just operations that help fetch or process data. For example, [`pullRemote`](#pullremote) simply clones the remote repository from the Git host. |
| 6 | + |
| 7 | +## `parseAction` |
| 8 | + |
| 9 | +A pre-processor that classifies the request as a pull, a push, or "default" if it matches neither. This allows GitProxy to run the correct chain (`pushActionChain`, `pullActionChain` or `defaultActionChain`). It then creates an `Action` object, which is used by the selected chain. |
| 10 | + |
| 11 | +This action also handles fallbacks for v1 legacy proxy URLs. |
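
The classification step can be sketched as follows. This is a minimal illustration, not the GitProxy implementation: the function and type names are hypothetical, and it assumes the request is classified from the HTTP method and the standard Git smart HTTP service names (`git-receive-pack` for pushes, `git-upload-pack` for fetches and clones).

```typescript
// Hypothetical names; only the Git smart HTTP service names are standard.
type ActionType = "push" | "pull" | "default";

function classifyRequest(method: string, path: string): ActionType {
  // Drop any query string before inspecting the path.
  const queryStart = path.indexOf("?");
  const cleanPath = queryStart === -1 ? path : path.slice(0, queryStart);

  if (method === "POST" && cleanPath.endsWith("/git-receive-pack")) {
    return "push"; // the client is sending pack data
  }
  if (method === "POST" && cleanPath.endsWith("/git-upload-pack")) {
    return "pull"; // the client is fetching or cloning
  }
  // Assumption: everything else (for example ref discovery via GET) is "default".
  return "default";
}
```

Under these assumptions, a `GET` request for `/info/refs` falls through to `"default"`, which is then handled by `defaultActionChain`.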
| 12 | + |
| 13 | +## `checkRepoInAuthorisedList` |
| 14 | + |
| 15 | +Checks if the URL of the repo being pushed to is present in the GitProxy repo database. If no repo URL in the database matches, the push is blocked. |
| 16 | + |
| 17 | +Source: [/src/proxy/processors/push-action/checkRepoInAuthorisedList.ts](/src/proxy/processors/push-action/checkRepoInAuthorisedList.ts) |
| 18 | + |
| 19 | +## `parsePush` |
| 20 | + |
| 21 | +Parses the push request data which comes from the Git client as a buffer that contains packet line data. If anything unexpected happens during parsing, such as malformed pack data or multiple ref updates in a single push, the push will get rejected. |
| 22 | + |
| 23 | +Also handles extraction of push contents, such as the details of the individual commits contained in the push and the details of `committer` (the user attempting to push the commits through the proxy). |
| 24 | + |
| 25 | +Source: [/src/proxy/processors/push-action/parsePush.ts](/src/proxy/processors/push-action/parsePush.ts) |
| 26 | + |
| 27 | +## `checkEmptyBranch` |
| 28 | + |
| 29 | +Checks if the push contains any commit data, or is just an empty branch push (pushing a new branch without any additional commits). Empty branch pushes are blocked because subsequent processors require commit data to work correctly. |
| 30 | + |
| 31 | +Source: [/src/proxy/processors/push-action/checkEmptyBranch.ts](/src/proxy/processors/push-action/checkEmptyBranch.ts) |
| 32 | + |
| 33 | +## `checkCommitMessages` |
| 34 | + |
| 35 | +A **configurable** processor that blocks pushes containing commit messages that match the provided literals or patterns. These patterns can be configured in the `commitConfig.message` entry in `proxy.config.json` or the active configuration file: |
| 36 | + |
| 37 | +```json |
| 38 | +"commitConfig": { |
| 39 | + "author": { |
| 40 | + "email": { |
| 41 | + "local": { |
| 42 | + "block": "" |
| 43 | + }, |
| 44 | + "domain": { |
| 45 | + "allow": ".*" |
| 46 | + } |
| 47 | + } |
| 48 | + }, |
| 49 | + "message": { |
| 50 | + "block": { |
| 51 | + "literals": [], |
| 52 | + "patterns": [] |
| 53 | + } |
| 54 | + }, |
| 55 | + "diff": { |
| 56 | + "block": { |
| 57 | + "literals": [], |
| 58 | + "patterns": [], |
| 59 | + "providers": {} |
| 60 | + } |
| 61 | + } |
| 62 | +}, |
| 63 | +``` |
| 64 | + |
| 65 | +If the arrays are empty, the checks will pass and chain execution will continue. |
| 66 | + |
| 67 | +Note that invalid regex patterns will throw an error during proxy startup. These must be fixed in order to initialize GitProxy. |
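
As an illustration of how literals and patterns might be applied to a commit message, here is a minimal sketch. The names `MessageBlockConfig`, `compilePatterns` and `isMessageBlocked` are hypothetical, and case-sensitive matching is an assumption; the real processor may normalise messages differently.

```typescript
// Hypothetical shape mirroring commitConfig.message.block.
interface MessageBlockConfig {
  literals: string[];
  patterns: string[];
}

// `new RegExp` throws a SyntaxError for an invalid pattern, which is why
// a bad pattern surfaces as an error rather than being silently ignored.
function compilePatterns(patterns: string[]): RegExp[] {
  return patterns.map((pattern) => new RegExp(pattern));
}

function isMessageBlocked(message: string, config: MessageBlockConfig): boolean {
  const regexes = compilePatterns(config.patterns);
  // Assumption: literals are matched as case-sensitive substrings.
  const hitsLiteral = config.literals.some((literal) => message.includes(literal));
  const hitsPattern = regexes.some((regex) => regex.test(message));
  return hitsLiteral || hitsPattern;
}
```

With both arrays empty, `isMessageBlocked` returns `false` for every message, which matches the pass-through behaviour described above.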
| 68 | + |
| 69 | +Source: [/src/proxy/processors/push-action/checkCommitMessages.ts](/src/proxy/processors/push-action/checkCommitMessages.ts) |
| 70 | + |
| 71 | +## `checkAuthorEmails` |
| 72 | + |
| 73 | +Similar to [`checkCommitMessages`](#checkcommitmessages), this processor allows configuring allowed domains or blocked "locals" (the part of the address before "@domain.com"). If any commit author's email matches the `local.block` regex, the push is blocked. Likewise, if any email's domain does not match the `domain.allow` regex, the push is blocked. |
| 74 | + |
| 75 | +If neither of these is configured (both set to empty strings), then the checks will pass and chain execution will continue. |
| 76 | + |
| 77 | +Note that invalid regex patterns will throw an error during proxy startup. These must be fixed in order to initialize GitProxy. |
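
The two email rules can be sketched as below. This is an illustrative sketch only: `EmailRules` and `isAuthorEmailAllowed` are hypothetical names, and rejecting malformed addresses is an assumption rather than documented behaviour.

```typescript
// Hypothetical shape mirroring commitConfig.author.email.
interface EmailRules {
  localBlock: string; // local.block regex, "" when not configured
  domainAllow: string; // domain.allow regex, "" when not configured
}

function isAuthorEmailAllowed(email: string, rules: EmailRules): boolean {
  const at = email.lastIndexOf("@");
  if (at <= 0 || at === email.length - 1) {
    return false; // assumption: a malformed address is treated as not allowed
  }
  const local = email.slice(0, at);
  const domain = email.slice(at + 1);

  // An empty string means "not configured". It needs an explicit check,
  // because new RegExp("") matches every input.
  if (rules.localBlock !== "" && new RegExp(rules.localBlock).test(local)) {
    return false;
  }
  if (rules.domainAllow !== "" && !new RegExp(rules.domainAllow).test(domain)) {
    return false;
  }
  return true;
}
```

The explicit empty-string checks matter here: without them, an empty `local.block` would compile to a regex that matches, and therefore blocks, every address.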
| 78 | + |
| 79 | +Source: [/src/proxy/processors/push-action/checkAuthorEmails.ts](/src/proxy/processors/push-action/checkAuthorEmails.ts) |
| 80 | + |
| 81 | +## `checkUserPushPermission` |
| 82 | + |
| 83 | +Checks if the push has a valid user email associated with it (the email of the user making the push, **not the individual commit authors**), and if that user is allowed to push to that specific repo. |
| 84 | + |
| 85 | +This step fails in scenarios such as: |
| 86 | + |
| 87 | +- The push has no email associated with it (potentially a push parsing error) |
| 88 | +- The email associated with the push matches multiple GitProxy users |
| 89 | +- The user with the given email isn't in the repo's contributor list (`canPush`) |
| 90 | + |
| 91 | +Note: The _pusher_ can potentially be a different user from the _commit author(s)_. In order to filter the commit authors, you must use the `commitConfig.author` config entry. See [`checkAuthorEmails`](#checkauthoremails) for more details. |
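
The failure scenarios listed above can be sketched as a single decision function. Everything here is hypothetical (`ProxyUser`, `PermissionResult`, `checkPushPermission`), and two points are assumptions: emails are compared exactly, and an email that matches no user is also treated as a failure.

```typescript
interface ProxyUser {
  username: string;
  email: string;
}

type PermissionResult =
  | { allowed: true; username: string }
  | { allowed: false; reason: string };

function checkPushPermission(
  pusherEmail: string | undefined,
  users: ProxyUser[],
  canPush: string[], // usernames allowed to push to the target repo
): PermissionResult {
  if (!pusherEmail) {
    return { allowed: false, reason: "push has no associated email" };
  }
  // Assumption: exact, case-sensitive comparison of email addresses.
  const matches = users.filter((user) => user.email === pusherEmail);
  if (matches.length > 1) {
    return { allowed: false, reason: "email matches multiple users" };
  }
  const user = matches[0];
  if (user === undefined) {
    return { allowed: false, reason: "email matches no user" };
  }
  if (!canPush.includes(user.username)) {
    return { allowed: false, reason: "user is not in the canPush list" };
  }
  return { allowed: true, username: user.username };
}
```

Note that the commit authors never appear in this check; only the pusher's email is considered.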
| 92 | + |
| 93 | +Source: [/src/proxy/processors/push-action/checkUserPushPermission.ts](/src/proxy/processors/push-action/checkUserPushPermission.ts) |
| 94 | + |
| 95 | +## `pullRemote` |
| 96 | + |
| 97 | +Clones the repository and temporarily stores it locally in a subdirectory of the _.remote_ folder in the deployment. Each clone is named using the base and head SHA of the push, ensuring a unique clone for each different push. The path to the subdirectory is set in the action as the `proxyGitPath` property and is used in subsequent steps. |
| 98 | + |
| 99 | +For private repos, `pullRemote` takes the authorization headers from the push and uses them to authenticate the `git clone` operation. |
| 100 | + |
| 101 | +If the clone fails, `pullRemote` automatically deletes the _.remote/\*_ directory that it created, unless the failure was caused by a concurrent request for the same push (so that the earlier request can complete if it is going to). |
| 102 | + |
| 103 | +If the clone succeeds then the chain will schedule deletion of the clone by [`clearBareClone`](#clearbareclone) after processing of the chain completes. This ensures that disk space used is recovered, subsequent pushes of the same SHA don't conflict and that user credentials cached in the `git clone` are removed. |
| 104 | + |
| 105 | +Source: [/src/proxy/processors/push-action/pullRemote.ts](/src/proxy/processors/push-action/pullRemote.ts) |
| 106 | + |
| 107 | +## `writePack` |
| 108 | + |
| 109 | +Executes `git receive-pack` with the incoming pack data from the request body in order to receive the pushed data. It also identifies new `.idx` files in `.git/objects/pack` for other processors (such as [`checkHiddenCommits`](#checkhiddencommits)) to scan more efficiently. |
| 110 | + |
| 111 | +Note that `writePack` sets Git's `receive.unpackLimit` to `0`, which forces Git to always create pack files instead of unpacking objects individually. |
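
One way to identify the new `.idx` files is to compare the directory listing of `.git/objects/pack` before and after `git receive-pack` runs. The sketch below assumes that approach and uses a hypothetical helper name; it works on plain file-name arrays so that it stays independent of how the listing is obtained.

```typescript
// Hypothetical helper: returns the .idx files that appear only in the
// "after" listing of .git/objects/pack.
function findNewIdxFiles(before: string[], after: string[]): string[] {
  const known = new Set(before);
  return after.filter((name) => name.endsWith(".idx") && !known.has(name));
}
```

Because `receive.unpackLimit` is `0`, every push produces at least one pack file, so the pushed objects always show up in a new `.idx` file rather than as loose objects.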
| 112 | + |
| 113 | +Source: [/src/proxy/processors/push-action/writePack.ts](/src/proxy/processors/push-action/writePack.ts) |
| 114 | + |
| 115 | +## `checkHiddenCommits` |
| 116 | + |
| 117 | +Detects "hidden" commits in a push, which can occur if the pack file in the push was tampered with in some way. |
| 118 | + |
| 119 | +It calls `git verify-pack` on each of the new `.idx` files found in [`writePack`](#writepack). If any unreferenced commits are present, the push is blocked. |
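
The comparison can be sketched as below. This is a simplified illustration with hypothetical function names. It assumes the verbose output of `git verify-pack -v`, where each object line starts with the object ID followed by its type, and it assumes the list of commits the push is expected to introduce has been obtained separately (for example with `git rev-list`).

```typescript
// Extract the IDs of all commit objects from `git verify-pack -v` output.
// Object lines look like: "<object-id> commit <size> <size-in-pack> <offset>".
function commitsInPack(verifyPackOutput: string): string[] {
  const commits: string[] = [];
  for (const line of verifyPackOutput.split("\n")) {
    const match = /^([0-9a-f]{40,64})\s+commit\s/.exec(line);
    if (match && match[1] !== undefined) {
      commits.push(match[1]);
    }
  }
  return commits;
}

// Commits that are in the pack but not among those the push should introduce.
function findHiddenCommits(packCommits: string[], introducedCommits: string[]): string[] {
  const expected = new Set(introducedCommits);
  return packCommits.filter((sha) => !expected.has(sha));
}
```

In this sketch, a non-empty result from `findHiddenCommits` corresponds to the case where the push is blocked.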
| 120 | + |
| 121 | +Source: [/src/proxy/processors/push-action/checkHiddenCommits.ts](/src/proxy/processors/push-action/checkHiddenCommits.ts) |
| 122 | + |
| 123 | +## `checkIfWaitingAuth` |
| 124 | + |
| 125 | +Checks if the action has been authorised (approved by a reviewer). If so, it allows the push to continue to the remote. If the push hasn't been approved, chain execution simply continues. |
| 126 | + |
| 127 | +Source: [/src/proxy/processors/push-action/checkIfWaitingAuth.ts](/src/proxy/processors/push-action/checkIfWaitingAuth.ts) |
| 128 | + |
| 129 | +## `preReceive` |
| 130 | + |
| 131 | +Allows executing pre-receive hooks from `.sh` scripts located in the `./hooks` directory. **Also allows automating the approval process.** This enables admins to reuse GitHub Enterprise commit policies and provide a seamless experience for contributors, who no longer need to wait for manual approval or be aware of GitProxy intercepting their pushes. |
| 132 | + |
| 133 | +Pre-receive hooks are a feature that allows blocking or automatically approving commits based on rules described in `.sh` scripts. GitHub provides a set of [sample rules](https://github.com/github/platform-samples/blob/master/pre-receive-hooks) to get started. |
| 134 | + |
| 135 | +**Important**: The pre-receive hook does not bypass the other processors in the chain. All processors continue to execute normally, and any of them can still block the push. The pre-receive hook only determines whether the push will be auto-approved, auto-rejected, or require manual review after all processors have completed. |
| 136 | + |
| 137 | +This processor sets the outcome of the push depending on the exit status of the pre-receive hook: |
| 138 | + |
| 139 | +- Exit status `0`: Sets the push to `autoApproved`. If no other processors block the push, the contributor can immediately push again to the upstream repository without waiting for manual approval. |
| 140 | +- Exit status `1`: Sets the push to `autoRejected`, automatically rejecting the push after the chain completes, regardless of whether the other processors would have allowed it. |
| 141 | +- Exit status `2`: Requires subsequent manual approval as any regular push, even if all processors succeed. |
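
The mapping from exit status to outcome can be sketched as follows. The type and function names are hypothetical, and treating any other exit status as an error is an assumption, since only `0`, `1` and `2` are described above.

```typescript
type HookOutcome = "autoApproved" | "autoRejected" | "manualReview";

function outcomeForExitStatus(status: number): HookOutcome {
  switch (status) {
    case 0:
      return "autoApproved"; // no manual approval needed if nothing else blocks
    case 1:
      return "autoRejected"; // rejected after the chain completes
    case 2:
      return "manualReview"; // handled like any regular push
    default:
      // Assumption: statuses other than 0, 1 and 2 are treated as errors.
      throw new Error(`Unexpected pre-receive hook exit status: ${status}`);
  }
}
```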
| 142 | + |
| 143 | +For detailed setup instructions and examples, see the [Pre-Receive Hook configuration guide](https://git-proxy.finos.org/docs/configuration/pre-receive/). |
| 144 | + |
| 145 | +Source: [/src/proxy/processors/push-action/preReceive.ts](/src/proxy/processors/push-action/preReceive.ts) |
| 146 | + |
| 147 | +## `getDiff` |
| 148 | + |
| 149 | +Executes `git diff` to obtain the diff for the given revision range. If there are no commits (possibly due to a malformed push), the push is blocked. |
| 150 | + |
| 151 | +The data extracted in this step is later used in [`scanDiff`](#scandiff). |
| 152 | + |
| 153 | +Source: [/src/proxy/processors/push-action/getDiff.ts](/src/proxy/processors/push-action/getDiff.ts) |
| 154 | + |
| 155 | +## `gitleaks` |
| 156 | + |
| 157 | +Runs [Gitleaks](https://github.com/gitleaks/gitleaks) to detect sensitive information such as API keys and passwords in the commits being pushed to prevent credentials from leaking. |
| 158 | + |
| 159 | +The following parameters can be configured: |
| 160 | + |
| 161 | +- `enabled`: Whether scanning is active. `false` by default |
| 162 | +- `ignoreGitleaksAllow`: Forces scanning even if developers added `gitleaks:allow` comments |
| 163 | +- `noColor`: Controls color output formatting |
| 164 | +- `configPath`: Sets a custom Gitleaks rules file |
| 165 | + |
| 166 | +This processor runs the Gitleaks check starting from the root commit to the `commitFrom` value present in the push. If the Gitleaks check fails (nonzero exit code) or the Gitleaks process cannot be spawned, the push will be blocked. |
| 167 | + |
| 168 | +Source: [/src/proxy/processors/push-action/gitleaks.ts](/src/proxy/processors/push-action/gitleaks.ts) |
| 169 | + |
| 170 | +## `scanDiff` |
| 171 | + |
| 172 | +A **configurable** processor that blocks pushes containing diffs (changes) that match the provided literals or patterns. These patterns can be configured in the `commitConfig.diff` entry in `proxy.config.json` or the active configuration file: |
| 173 | + |
| 174 | +```json |
| 175 | +"commitConfig": { |
| 176 | + "author": { |
| 177 | + "email": { |
| 178 | + "local": { |
| 179 | + "block": "" |
| 180 | + }, |
| 181 | + "domain": { |
| 182 | + "allow": ".*" |
| 183 | + } |
| 184 | + } |
| 185 | + }, |
| 186 | + "message": { |
| 187 | + "block": { |
| 188 | + "literals": [], |
| 189 | + "patterns": [] |
| 190 | + } |
| 191 | + }, |
| 192 | + "diff": { |
| 193 | + "block": { |
| 194 | + "literals": [], |
| 195 | + "patterns": [], |
| 196 | + "providers": {} |
| 197 | + } |
| 198 | + } |
| 199 | +}, |
| 200 | +``` |
| 201 | + |
| 202 | +This will scan every file changed and try to match the configured literals, patterns or providers. If any diff violations are found, the push is blocked. |
| 203 | + |
| 204 | +Note that invalid regex patterns will throw an error during proxy startup. These must be fixed in order to initialize GitProxy. |
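
A simplified sketch of a diff scan is shown below. It is not the actual `scanDiff` logic: the names are hypothetical, `providers` are left out, and it assumes that only added lines (those starting with `+` in a unified diff) are checked.

```typescript
interface DiffViolation {
  file: string;
  line: string;
  rule: string;
}

function scanAddedLines(diff: string, literals: string[], patterns: string[]): DiffViolation[] {
  const regexes = patterns.map((pattern) => new RegExp(pattern));
  const violations: DiffViolation[] = [];
  let currentFile = "(unknown)";

  for (const raw of diff.split("\n")) {
    if (raw.startsWith("+++ ")) {
      // File header such as "+++ b/src/index.ts". Simplification: an added
      // line whose own content starts with "++ " would be misread as a header.
      currentFile = raw.slice(4).replace(/^b\//, "");
      continue;
    }
    if (!raw.startsWith("+")) {
      continue; // context and removed lines are ignored in this sketch
    }
    const added = raw.slice(1);
    for (const literal of literals) {
      if (added.includes(literal)) {
        violations.push({ file: currentFile, line: added, rule: literal });
      }
    }
    for (const regex of regexes) {
      if (regex.test(added)) {
        violations.push({ file: currentFile, line: added, rule: regex.source });
      }
    }
  }
  return violations;
}
```

Recording the file name with each violation makes it possible to tell the contributor where the offending change is, rather than only that the push was blocked.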
| 205 | + |
| 206 | +Source: [/src/proxy/processors/push-action/scanDiff.ts](/src/proxy/processors/push-action/scanDiff.ts) |
| 207 | + |
| 208 | +## `blockForAuth` |
| 209 | + |
| 210 | +This action appends a message to be displayed after all the processors have finished on a pre-approval push. |
| 211 | + |
| 212 | +Note that this message will show again even if the push had been previously rejected by a reviewer or cancelled and resubmitted by the committer. After a manual rejection, pushing again creates a new `action` object so that the push can be re-reviewed and approved. |
| 213 | + |
| 214 | + |
| 215 | + |
| 216 | +Source: [/src/proxy/processors/push-action/blockForAuth.ts](/src/proxy/processors/push-action/blockForAuth.ts) |
| 217 | + |
| 218 | +## `audit` |
| 219 | + |
| 220 | +This action runs after a chain has been executed. It stores in the database the entire `Action` object along with the list of `steps` that the action has gone through and their associated logs or error messages that occurred during processing of the chain. |
| 221 | + |
| 222 | +Note: **`audit` writes all actions** (push, pull, default/unclassified) to the DB. |
| 223 | + |
| 224 | +An action object (or entry in the pushes table) might look like this: |
| 225 | + |
| 226 | +```json |
| 227 | +{ |
| 228 | + "steps": [ |
| 229 | + { |
| 230 | + "logs": [ |
| 231 | + "checkRepoInAuthorisedList - repo https://github.com/finos/git-proxy.git is in the authorisedList" |
| 232 | + ], |
| 233 | + "id": "73d47899-b1f8-45f0-9fd5-ef2535a07bbd", |
| 234 | + "stepName": "checkRepoInAuthorisedList", |
| 235 | + "content": null, |
| 236 | + "error": false, |
| 237 | + "errorMessage": null, |
| 238 | + "blocked": false, |
| 239 | + "blockedMessage": null |
| 240 | + } |
| 241 | + ], |
| 242 | + "error": false, |
| 243 | + "blocked": false, |
| 244 | + "allowPush": false, |
| 245 | + "authorised": false, |
| 246 | + "canceled": false, |
| 247 | + "rejected": false, |
| 248 | + "autoApproved": false, |
| 249 | + "autoRejected": false, |
| 250 | + "commitData": [], |
| 251 | + "id": "1763522405484", |
| 252 | + "type": "default", |
| 253 | + "method": "GET", |
| 254 | + "timestamp": 1763522405484, |
| 255 | + "url": "https://github.com/finos/git-proxy.git", |
| 256 | + "repo": "https://github.com/finos/git-proxy.git", |
| 257 | + "project": "finos", |
| 258 | + "repoName": "git-proxy.git", |
| 259 | + "lastStep": { |
| 260 | + "logs": [ |
| 261 | + "checkRepoInAuthorisedList - repo https://github.com/finos/git-proxy.git is in the authorisedList" |
| 262 | + ], |
| 263 | + "id": "73d47899-b1f8-45f0-9fd5-ef2535a07bbd", |
| 264 | + "stepName": "checkRepoInAuthorisedList", |
| 265 | + "content": null, |
| 266 | + "error": false, |
| 267 | + "errorMessage": null, |
| 268 | + "blocked": false, |
| 269 | + "blockedMessage": null |
| 270 | + }, |
| 271 | + "_id": "h69TOxN1AMsxd0xr" |
| 272 | +} |
| 273 | +``` |
| 274 | + |
| 275 | +## `clearBareClone` |
| 276 | + |
| 277 | +Recursively removes the contents of the (modified) repository clone stored in `./.remote` by [`pullRemote`](#pullremote) and indicated by the `proxyGitPath` property of the `Action`. This clean-up is necessary for: |
| 278 | + |
| 279 | +- Security (cached credentials): |
| 280 | + - Since repositories require a git username and password or personal access token (PAT) on clone and these are cached in the clone, they must be removed to prevent leakage. |
| 281 | +- Managing disk space: |
| 282 | + - Without deletion, `./.remote` would grow indefinitely as new repository clones are added for each push (rather than each repository!) |
| 283 | + - Each action gets a unique directory for isolation in [`pullRemote`](#pullremote), which allows pushes to the same repository for multiple users to be processed concurrently without conflicts or confusion over credentials. |
| 284 | + |
| 285 | +`clearBareClone` runs only if `pullRemote` was successful. |
| 286 | + |
| 287 | +Source: [/src/proxy/processors/post-processor/clearBareClone.ts](/src/proxy/processors/post-processor/clearBareClone.ts) |