Commit 07da631

Merge branch 'main' into 1336-fix-invalid-regex-commitConfig
2 parents 3d977e7 + 588d7f3 commit 07da631

7 files changed: +1192 -1 lines changed

.github/workflows/ci.yml

Lines changed: 1 addition & 1 deletion
@@ -39,7 +39,7 @@ jobs:
           node-version: ${{ matrix.node-version }}

       - name: Start MongoDB
-        uses: supercharge/mongodb-github-action@90004df786821b6308fb02299e5835d0dae05d0d # 1.12.0
+        uses: supercharge/mongodb-github-action@315db7fe45ac2880b7758f1933e6e5d59afd5e94 # 1.12.1
         with:
           mongodb-version: ${{ matrix.mongodb-version }}

docs/Architecture.md

Lines changed: 579 additions & 0 deletions
Large diffs are not rendered by default.

docs/Processors.md

Lines changed: 287 additions & 0 deletions
@@ -0,0 +1,287 @@
# Processors

**Processors** (also known as push/pull actions) represent operations that each push or pull must go through in order to get approved or rejected.

Processors do not necessarily represent policies. Some processors are just operations that help fetch or process data. For example, [`pullRemote`](#pullremote) simply clones the remote repository from the Git host.

## `parseAction`

A pre-processor that classifies the request as a pull, a push, or "default" if it matches neither. This allows GitProxy to run the correct chain (`pushActionChain`, `pullActionChain` or `defaultActionChain`). It then creates an `Action` object, which is used by the selected chain.

This action also handles fallbacks for v1 legacy proxy URLs.
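
As a rough illustration of the classification step, the sketch below maps a request to `push`, `pull` or `default` based on the Git smart HTTP service it targets (`git-receive-pack` for pushes, `git-upload-pack` for fetches and clones). The `GitRequest` type, `classifyRequest` function and `chainFor` lookup are hypothetical names, and the matching rules are assumptions; the actual `parseAction` implementation may classify requests differently.

```typescript
type ActionType = 'push' | 'pull' | 'default';

interface GitRequest {
  method: string;
  url: string; // request path, optionally followed by a query string
}

// Hypothetical classifier: the rules below are assumptions based on the Git
// smart HTTP protocol, not GitProxy's actual matching logic.
function classifyRequest(req: GitRequest): ActionType {
  const path = req.url.split('?')[0] ?? '';
  if (req.method === 'POST' && path.endsWith('/git-receive-pack')) return 'push';
  if (req.method === 'POST' && path.endsWith('/git-upload-pack')) return 'pull';
  return 'default';
}

// Each action type selects its own chain of processors.
const chainFor: Record<ActionType, string> = {
  push: 'pushActionChain',
  pull: 'pullActionChain',
  default: 'defaultActionChain',
};
```

Keeping the type-to-chain mapping in a single lookup makes it harder for a new action type to be added without a chain being chosen for it.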

## `checkRepoInAuthorisedList`

Checks if the URL of the repo being pushed to is present in the GitProxy repo database. If no repo URL in the database matches, the push is blocked.

Source: [/src/proxy/processors/push-action/checkRepoInAuthorisedList.ts](/src/proxy/processors/push-action/checkRepoInAuthorisedList.ts)

## `parsePush`

Parses the push request data, which arrives from the Git client as a buffer containing packet line data. If anything unexpected happens during parsing, such as malformed pack data or multiple ref updates in a single push, the push is rejected.

Also handles extraction of push contents, such as the details of the individual commits contained in the push and the details of `committer` (the user attempting to push the commits through the proxy).
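
To make the packet line format more concrete, here is a minimal sketch of reading the ref update commands that precede the pack data. In Git's pkt-line framing, each line starts with four hex digits giving its total length (including those four characters), and `0000` is a flush packet that ends the command list. The function names are hypothetical, and the sketch works on an ASCII string rather than a buffer for brevity; it is not the actual `parsePush` code.

```typescript
interface RefUpdate {
  oldSha: string;
  newSha: string;
  ref: string;
}

// Encode one pkt-line: 4 hex digits of total length (payload + 4), then the payload.
function pktLine(payload: string): string {
  return (payload.length + 4).toString(16).padStart(4, '0') + payload;
}

// Read pkt-lines until the flush packet ("0000"). Assumes ASCII input, so that
// string length equals byte length.
function readPktLines(data: string): string[] {
  const lines: string[] = [];
  let offset = 0;
  while (offset < data.length) {
    const header = data.slice(offset, offset + 4);
    if (!/^[0-9a-f]{4}$/i.test(header)) throw new Error('Malformed pkt-line length');
    const length = parseInt(header, 16);
    if (length === 0) return lines; // flush packet: the ref updates end here
    if (length < 4 || offset + length > data.length) throw new Error('Truncated pkt-line');
    lines.push(data.slice(offset + 4, offset + length));
    offset += length;
  }
  throw new Error('Missing flush packet');
}

// Reject anything other than exactly one ref update, as described above.
function parseSingleRefUpdate(data: string): RefUpdate {
  const lines = readPktLines(data);
  if (lines.length !== 1) {
    throw new Error(`Expected exactly one ref update, got ${lines.length}`);
  }
  // Capabilities follow a NUL byte on the first command line; drop them.
  const command = (lines[0] ?? '').split('\0')[0] ?? '';
  const [oldSha, newSha, ref] = command.trim().split(' ');
  if (!oldSha || !newSha || !ref) throw new Error('Malformed ref update');
  return { oldSha, newSha, ref };
}
```

Throwing on every unexpected shape mirrors the behaviour described above, where any parsing surprise leads to the push being rejected rather than partially processed.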

Source: [/src/proxy/processors/push-action/parsePush.ts](/src/proxy/processors/push-action/parsePush.ts)

## `checkEmptyBranch`

Checks if the push contains any commit data, or is just an empty branch push (pushing a new branch without any additional commits). Empty branch pushes are blocked because subsequent processors require commit data to work correctly.

Source: [/src/proxy/processors/push-action/checkEmptyBranch.ts](/src/proxy/processors/push-action/checkEmptyBranch.ts)

## `checkCommitMessages`

A **configurable** processor that blocks pushes containing commit messages that match the provided literals or patterns. These can be configured in the `commitConfig.message` entry in `proxy.config.json` or the active configuration file:

```json
"commitConfig": {
  "author": {
    "email": {
      "local": {
        "block": ""
      },
      "domain": {
        "allow": ".*"
      }
    }
  },
  "message": {
    "block": {
      "literals": [],
      "patterns": []
    }
  },
  "diff": {
    "block": {
      "literals": [],
      "patterns": [],
      "providers": {}
    }
  }
},
```

If the arrays are empty, the checks will pass and chain execution will continue.

Note that invalid regex patterns will throw an error during proxy startup. These must be fixed in order to initialize GitProxy.
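
The behaviour described above can be sketched as two steps: compile the configured patterns once (so that an invalid regex fails early), then test each commit message against the literals and compiled patterns. The function names are hypothetical, and the case-insensitive matching is an assumption made for this sketch, not a documented property of `checkCommitMessages`.

```typescript
// Compile every pattern up front so an invalid regex fails at startup
// rather than in the middle of a push.
function compilePatterns(patterns: string[]): RegExp[] {
  return patterns.map((source) => {
    try {
      return new RegExp(source, 'i'); // case-insensitivity is an assumption
    } catch {
      throw new Error(`Invalid regex in commitConfig: ${source}`);
    }
  });
}

// A message is allowed when it contains no blocked literal and matches no
// blocked pattern. Empty arrays therefore allow every message.
function isMessageAllowed(message: string, literals: string[], patterns: RegExp[]): boolean {
  const lower = message.toLowerCase();
  const hasLiteral = literals.some((literal) => lower.includes(literal.toLowerCase()));
  const hasPattern = patterns.some((pattern) => pattern.test(message));
  return !hasLiteral && !hasPattern;
}
```

For example, with `literals: ["password"]` and `patterns: ["^WIP"]`, a commit titled "WIP: refactor" would be blocked under these assumptions, while "Fix typo in README" would pass.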

Source: [/src/proxy/processors/push-action/checkCommitMessages.ts](/src/proxy/processors/push-action/checkCommitMessages.ts)

## `checkAuthorEmails`

Similar to [`checkCommitMessages`](#checkcommitmessages), this processor allows configuring allowed domains or blocked "locals" (the part of the address before "@domain.com"). If any commit author's email matches the `local.block` regex, the push gets blocked. Likewise, if any email's domain does not match the `domain.allow` regex, the push gets blocked.

If neither of these is configured (both set to empty strings), the checks will pass and chain execution will continue.

Note that invalid regex patterns will throw an error during proxy startup. These must be fixed in order to initialize GitProxy.
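
A minimal sketch of the two checks, assuming the `commitConfig.author.email` shape shown in the configuration example: the local part is tested against `local.block` and the domain against `domain.allow`, and an empty string disables the corresponding check. The `isEmailAllowed` name and the rejection of addresses without a usable `@` are assumptions of this sketch.

```typescript
interface AuthorEmailConfig {
  local: { block: string };
  domain: { allow: string };
}

function isEmailAllowed(email: string, config: AuthorEmailConfig): boolean {
  const at = email.lastIndexOf('@');
  // Assumption: addresses with a missing local part or domain are rejected.
  if (at <= 0 || at === email.length - 1) return false;
  const local = email.slice(0, at);
  const domain = email.slice(at + 1);
  // An empty `block` string means no local part is blocked.
  if (config.local.block && new RegExp(config.local.block).test(local)) return false;
  // An empty `allow` string means every domain is allowed.
  if (config.domain.allow && !new RegExp(config.domain.allow).test(domain)) return false;
  return true;
}
```

A push would then be blocked if `isEmailAllowed` returns `false` for the author email of any commit it contains.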

Source: [/src/proxy/processors/push-action/checkAuthorEmails.ts](/src/proxy/processors/push-action/checkAuthorEmails.ts)

## `checkUserPushPermission`

Checks if the push has a valid user email associated with it (the email of the user making the push, **not the individual commit authors**), and if that user is allowed to push to that specific repo.

This step fails in scenarios such as:

- The push has no email associated with it (potentially a push parsing error)
- The email associated with the push matches multiple GitProxy users
- The user with the given email isn't in the repo's contributor list (`canPush`)

Note: The _pusher_ can potentially be a different user from the _commit author(s)_. In order to filter the commit authors, you must use the `commitConfig.author` config entry. See [`checkAuthorEmails`](#checkauthoremails) for more details.
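
The failure scenarios listed for `checkUserPushPermission` can be sketched as a single lookup over in-memory data. The `ProxyUser` shape, the `checkPushPermission` function, the case-insensitive email comparison and the handling of an unknown email are all assumptions of this sketch; the real processor queries the GitProxy database.

```typescript
interface ProxyUser {
  username: string;
  email: string;
}

type PermissionResult =
  | { allowed: true; user: ProxyUser }
  | { allowed: false; reason: string };

function checkPushPermission(
  pusherEmail: string | undefined,
  users: ProxyUser[],
  canPush: string[], // usernames in the repo's contributor list
): PermissionResult {
  if (!pusherEmail) {
    return { allowed: false, reason: 'Push has no email associated with it' };
  }
  const wanted = pusherEmail.toLowerCase();
  const matches = users.filter((user) => user.email.toLowerCase() === wanted);
  if (matches.length > 1) {
    return { allowed: false, reason: 'Email matches multiple GitProxy users' };
  }
  const user = matches[0];
  if (!user) {
    return { allowed: false, reason: 'No GitProxy user has this email' };
  }
  if (!canPush.includes(user.username)) {
    return { allowed: false, reason: 'User is not in the canPush list' };
  }
  return { allowed: true, user };
}
```

Returning a reason alongside the decision keeps the blocked message specific, which matters because the pusher and the commit authors may be different people.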

Source: [/src/proxy/processors/push-action/checkUserPushPermission.ts](/src/proxy/processors/push-action/checkUserPushPermission.ts)

## `pullRemote`

Clones the repository and temporarily stores it locally in a subdirectory of the _.remote_ folder in the deployment. Each clone is named using the base and head SHA of the push, ensuring a unique clone for each different push. The path to the subdirectory is set in the action as the `proxyGitPath` property and is used in subsequent steps.

For private repos, `pullRemote` uses the authorization headers from the push to authenticate the `git clone` operation.

If the clone fails, `pullRemote` automatically deletes the _.remote/\*_ directory that it created, unless the failure was caused by a concurrent request for the same push (so that the earlier request can still complete).

If the clone succeeds, the chain schedules deletion of the clone by [`clearBareClone`](#clearbareclone) after processing of the chain completes. This ensures that the disk space used is recovered, that subsequent pushes of the same SHA don't conflict, and that user credentials cached in the `git clone` are removed.

Source: [/src/proxy/processors/push-action/pullRemote.ts](/src/proxy/processors/push-action/pullRemote.ts)

## `writePack`

Executes `git receive-pack` with the incoming pack data from the request body in order to receive the pushed data. It also identifies new `.idx` files in `.git/objects/pack` for other processors (such as [`checkHiddenCommits`](#checkhiddencommits)) to scan more efficiently.

Note that `writePack` sets Git's `receive.unpackLimit` to `0`, which forces Git to always create pack files instead of unpacking objects individually.

Source: [/src/proxy/processors/push-action/writePack.ts](/src/proxy/processors/push-action/writePack.ts)

## `checkHiddenCommits`

Detects "hidden" commits in a push, which can be present if the pack file in the push was tampered with in some way.

It calls `git verify-pack` on each of the new `.idx` files found in [`writePack`](#writepack). If any unreferenced commits are present, the push is blocked.
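
The comparison at the heart of this check can be sketched as follows, assuming the verbose output of `git verify-pack -v` has already been captured as text. Each object line in that output has the form `<sha> <type> <size> <size-in-pack> <offset>`, so commit SHAs can be picked out and compared against the commits the push is expected to introduce. How GitProxy obtains that expected list is not covered here; the `introducedCommits` parameter and both function names are assumptions of this sketch.

```typescript
// Extract commit SHAs from `git verify-pack -v` output. Summary lines such as
// "non delta: 3 objects" do not match and are ignored. Assumes 40-character
// SHA-1 object names.
function commitsInPack(verifyPackOutput: string): string[] {
  const commits: string[] = [];
  for (const line of verifyPackOutput.split('\n')) {
    const match = /^([0-9a-f]{40}) commit\s/.exec(line);
    if (match && match[1]) commits.push(match[1]);
  }
  return commits;
}

// Any commit present in the pack but absent from the expected list is "hidden".
function findHiddenCommits(packCommits: string[], introducedCommits: string[]): string[] {
  const introduced = new Set(introducedCommits);
  return packCommits.filter((sha) => !introduced.has(sha));
}
```

A non-empty result from `findHiddenCommits` corresponds to the case where the push is blocked.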

Source: [/src/proxy/processors/push-action/checkHiddenCommits.ts](/src/proxy/processors/push-action/checkHiddenCommits.ts)

## `checkIfWaitingAuth`

Checks if the action has been authorised (approved by a reviewer). If so, the push is allowed to continue to the remote. If the push hasn't been approved, chain execution simply continues.

Source: [/src/proxy/processors/push-action/checkIfWaitingAuth.ts](/src/proxy/processors/push-action/checkIfWaitingAuth.ts)

## `preReceive`

Allows executing pre-receive hooks from `.sh` scripts located in the `./hooks` directory. **Also allows automating the approval process.** This enables admins to reuse GitHub Enterprise commit policies and provide a seamless experience for contributors, who no longer need to wait for manual approval or be aware of GitProxy intercepting their pushes.

Pre-receive hooks are a feature that allows blocking or automatically approving commits based on rules described in `.sh` scripts. GitHub provides a set of [sample rules](https://github.com/github/platform-samples/blob/master/pre-receive-hooks) to get started.

**Important**: The pre-receive hook does not bypass the other processors in the chain. All processors continue to execute normally, and any of them can still block the push. The pre-receive hook only determines whether the push will be auto-approved, auto-rejected, or require manual review after all processors have completed.

This processor sets the outcome of the push based on the exit status of the pre-receive hook:

- Exit status `0`: Sets the push to `autoApproved`. If no other processors block the push, the contributor can immediately push again to the upstream repository without waiting for manual approval.
- Exit status `1`: Sets the push to `autoRejected`, automatically rejecting the push after the chain completes, regardless of whether the other processors would have allowed it.
- Exit status `2`: Requires subsequent manual approval, as with any regular push, even if all processors succeed.
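
The exit status mapping above can be summarised in a few lines. The `HookOutcome` type and `outcomeForExitStatus` function are hypothetical names, and treating any other exit status as an error is an assumption of this sketch, since the behaviour for other values is not described here.

```typescript
type HookOutcome = 'autoApproved' | 'autoRejected' | 'manualReview';

function outcomeForExitStatus(status: number): HookOutcome {
  switch (status) {
    case 0:
      return 'autoApproved';
    case 1:
      return 'autoRejected';
    case 2:
      return 'manualReview';
    default:
      // Assumption: statuses other than 0, 1 and 2 are treated as failures.
      throw new Error(`Unexpected pre-receive hook exit status: ${status}`);
  }
}
```

Note that `autoApproved` only takes effect if no other processor in the chain blocks the push.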

For detailed setup instructions and examples, see the [Pre-Receive Hook configuration guide](https://git-proxy.finos.org/docs/configuration/pre-receive/).

Source: [/src/proxy/processors/push-action/preReceive.ts](/src/proxy/processors/push-action/preReceive.ts)

## `getDiff`

Executes `git diff` to obtain the diff for the given revision range. If there are no commits (possibly due to a malformed push), the push is blocked.

The data extracted in this step is later used in [`scanDiff`](#scandiff).

Source: [/src/proxy/processors/push-action/getDiff.ts](/src/proxy/processors/push-action/getDiff.ts)

## `gitleaks`

Runs [Gitleaks](https://github.com/gitleaks/gitleaks) to detect sensitive information, such as API keys and passwords, in the commits being pushed, preventing credentials from leaking.

The following parameters can be configured:

- `enabled`: Whether scanning is active. `false` by default
- `ignoreGitleaksAllow`: Forces scanning even if developers added `gitleaks:allow` comments
- `noColor`: Controls color output formatting
- `configPath`: Sets a custom Gitleaks rules file

This processor runs the Gitleaks check starting from the root commit to the `commitFrom` value present in the push. If the Gitleaks check fails (nonzero exit code) or Gitleaks cannot be spawned, the push is blocked.

Source: [/src/proxy/processors/push-action/gitleaks.ts](/src/proxy/processors/push-action/gitleaks.ts)

## `scanDiff`

A **configurable** processor that blocks pushes containing diffs (changes) that match the provided literals or patterns. These can be configured in the `commitConfig.diff` entry in `proxy.config.json` or the active configuration file:

```json
"commitConfig": {
  "author": {
    "email": {
      "local": {
        "block": ""
      },
      "domain": {
        "allow": ".*"
      }
    }
  },
  "message": {
    "block": {
      "literals": [],
      "patterns": []
    }
  },
  "diff": {
    "block": {
      "literals": [],
      "patterns": [],
      "providers": {}
    }
  }
},
```

This will scan every file changed and try to match the configured literals, patterns or providers. If any diff violations are found, the push is blocked.
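
As an illustration of this kind of scan, the sketch below walks a unified diff, tracks the current file from `+++` headers, and tests each added line against the configured literals and patterns. Restricting the scan to added lines, the `DiffViolation` shape and the `findDiffViolations` name are assumptions of this sketch, and `providers` are left out entirely.

```typescript
interface DiffViolation {
  file: string;
  line: string;
  rule: string;
}

// Patterns should not use the `g` flag, since `test` would then keep state
// between calls.
function findDiffViolations(diff: string, literals: string[], patterns: RegExp[]): DiffViolation[] {
  const violations: DiffViolation[] = [];
  let file = '(unknown)';
  for (const line of diff.split('\n')) {
    if (line.startsWith('+++ ')) {
      // File header, e.g. "+++ b/src/index.ts".
      file = line.slice(4).replace(/^b\//, '');
      continue;
    }
    if (!line.startsWith('+')) continue; // only added lines are scanned here
    const added = line.slice(1);
    for (const literal of literals) {
      if (added.includes(literal)) violations.push({ file, line: added, rule: literal });
    }
    for (const pattern of patterns) {
      if (pattern.test(added)) violations.push({ file, line: added, rule: pattern.source });
    }
  }
  return violations;
}
```

Under these assumptions, a push would be blocked whenever `findDiffViolations` returns a non-empty list, and the collected entries could be used to tell the pusher which file and rule triggered the block.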

Note that invalid regex patterns will throw an error during proxy startup. These must be fixed in order to initialize GitProxy.

Source: [/src/proxy/processors/push-action/scanDiff.ts](/src/proxy/processors/push-action/scanDiff.ts)

## `blockForAuth`

This action appends a message that is displayed after all the processors have finished on a push that has not yet been approved.

Note that this message will show again even if the push had been previously rejected by a reviewer or cancelled and resubmitted by the committer. After a manual rejection, pushing again creates a new `action` object so that the push can be re-reviewed and approved.

![blockForAuth output](./img/blockForAuth_output.png)

Source: [/src/proxy/processors/push-action/blockForAuth.ts](/src/proxy/processors/push-action/blockForAuth.ts)

## `audit`

This action runs after a chain has been executed. It stores the entire `Action` object in the database, along with the list of `steps` that the action has gone through and any associated logs or error messages from processing the chain.

Note: **`audit` writes all actions** (push, pull, default/unclassified) to the DB.

An action object (or entry in the pushes table) might look like this:

```json
{
  "steps": [
    {
      "logs": [
        "checkRepoInAuthorisedList - repo https://github.com/finos/git-proxy.git is in the authorisedList"
      ],
      "id": "73d47899-b1f8-45f0-9fd5-ef2535a07bbd",
      "stepName": "checkRepoInAuthorisedList",
      "content": null,
      "error": false,
      "errorMessage": null,
      "blocked": false,
      "blockedMessage": null
    }
  ],
  "error": false,
  "blocked": false,
  "allowPush": false,
  "authorised": false,
  "canceled": false,
  "rejected": false,
  "autoApproved": false,
  "autoRejected": false,
  "commitData": [],
  "id": "1763522405484",
  "type": "default",
  "method": "GET",
  "timestamp": 1763522405484,
  "url": "https://github.com/finos/git-proxy.git",
  "repo": "https://github.com/finos/git-proxy.git",
  "project": "finos",
  "repoName": "git-proxy.git",
  "lastStep": {
    "logs": [
      "checkRepoInAuthorisedList - repo https://github.com/finos/git-proxy.git is in the authorisedList"
    ],
    "id": "73d47899-b1f8-45f0-9fd5-ef2535a07bbd",
    "stepName": "checkRepoInAuthorisedList",
    "content": null,
    "error": false,
    "errorMessage": null,
    "blocked": false,
    "blockedMessage": null
  },
  "_id": "h69TOxN1AMsxd0xr"
}
```
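
For readers consuming these records, the sample entry can be described with TypeScript types. The interfaces below are inferred only from the fields visible in the sample, so they are an approximation rather than GitProxy's own `Action` and `Step` definitions; `AuditedAction`, `AuditedStep` and `failedSteps` are hypothetical names.

```typescript
// Inferred from the sample record; the real types may have more fields.
interface AuditedStep {
  id: string;
  stepName: string;
  logs: string[];
  content: unknown;
  error: boolean;
  errorMessage: string | null;
  blocked: boolean;
  blockedMessage: string | null;
}

interface AuditedAction {
  id: string;
  type: string; // "default" in the sample; pushes and pulls are also audited
  method: string;
  timestamp: number;
  url: string;
  repo: string;
  project: string;
  repoName: string;
  steps: AuditedStep[];
  lastStep?: AuditedStep;
  commitData: unknown[];
  error: boolean;
  blocked: boolean;
  allowPush: boolean;
  authorised: boolean;
  canceled: boolean;
  rejected: boolean;
  autoApproved: boolean;
  autoRejected: boolean;
}

// List the names of the steps that errored or blocked the action.
function failedSteps(action: Pick<AuditedAction, 'steps'>): string[] {
  return action.steps
    .filter((step) => step.error || step.blocked)
    .map((step) => step.stepName);
}
```

A helper like `failedSteps` is one way to answer "which processor stopped this push?" from a stored record without re-running the chain.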

## `clearBareClone`

Recursively removes the contents of the (modified) repository clone stored in `./.remote` by [`pullRemote`](#pullremote) and indicated by the `proxyGitPath` property of the `Action`. This clean-up is necessary for:

- Security (cached credentials):
  - Since repositories require a Git username and password or personal access token (PAT) on clone, and these are cached in the clone, they must be removed to prevent leakage.
- Managing disk space:
  - Without deletion, `./.remote` would grow indefinitely as a new repository clone is added for each push (rather than for each repository).
  - Each action gets a unique directory for isolation in [`pullRemote`](#pullremote), which allows pushes to the same repository by multiple users to be processed concurrently without conflicts or confusion over credentials.

`clearBareClone` runs only if `pullRemote` was successful.

Source: [/src/proxy/processors/post-processor/clearBareClone.ts](/src/proxy/processors/post-processor/clearBareClone.ts)
