Skip to content

[Fix]: correctly implement webhook secret for both github and gitlab #79

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

mashb1t
Copy link
Contributor

@mashb1t mashb1t commented Mar 29, 2025

Currently the webhook secret token is implemented as alternative to the access token env var(s).

Sending raw passwords or tokens in headers—even over HTTPS—poses risks because these headers can be logged or exposed during debugging and processing, e.g. in firewalls, logs, Man-in-the-middle attacks etc. HTTPS secures data in transit but doesn’t protect the endpoints, making any sensitive token vulnerable if misconfigured or mishandled. Additionally, using a raw password or full token violates the principle of least privilege, as it provides broader access than necessary, increasing the attack surface.

The new implementation mitigates these risks by separating the trusted secret stored in an environment variable from the token received in the request header and then verifying the data in a dedicated function. This approach not only limits the scope and lifetime of the token but also isolates the verification logic to prevent accidental logging or exposure, ultimately providing a more secure method for authenticating between systems.

I've also implemented the reference signature check from https://docs.github.com/en/webhooks/using-webhooks/validating-webhook-deliveries#python-example, which you can find now in webhook_secret_checker.py.

Happy to support on further questions!

@mashb1t mashb1t requested a review from sunmh207 as a code owner March 29, 2025 18:21
@mashb1t mashb1t marked this pull request as draft March 29, 2025 18:47
@mashb1t mashb1t marked this pull request as ready for review March 29, 2025 19:01
@sunmh207
Copy link
Owner

Thanks for your PR, but this will affect the normal operation of the "multiple access token" scenario:

  1. Multiple gitlab projects, each project uses its own project access token.
  2. Multiple gitlab servers (different gitlab_url), multiple project/personal access tokens.

Is there a solution for these situations?

@mashb1t
Copy link
Contributor Author

mashb1t commented Mar 30, 2025

Thanks for the feedback. I understand the problem and did want to confirm with you first of this makes sense, which is why i've put th PR in draft mode.

Currently the secret token functionality doesn ot serve its purpose if used as you've brought up in your previous comment, as there is no counterpart checking authenticity but it's a password transmission via header without validation on API side.

There are 2 solutions for these problems:

  1. already possible: use a reverse proxy and different domains, each corresponding to a separate API docker container (already possible), using one redis container for queueing and multiple workers (a bit fiddly i know, but doable)
  2. extend the logic for IP/domain based env vars to these secrets, so a different secret key can be used per host or project. This still is up for discussion.

What i've currently set up in our infrastrucure is:

  • 1 Gitlab instance
  • 1 system hook with secret token
  • 1 Gitlab user called "AI Review Bot" with a personal access token for API access
  • the project on another server using the personal access token as gitlab token

This allows:

  • project owners to add the user with limited permissions (reporter) to a project OR A GROUP (otherwise not possible without enterprise/ultimate Gitlab license) they manage
  • process only requests where the AI Review Bot user has access due to commit checks in this repository

But it's also possible to set up everything without any secret token env var and then just use the webhooks per project without secret tokens too. The user nevertheless is needed to be added to the project.

If the intention is to set up tokens for MR/commit review per project then this centralized approach might not be feasible, as you would have to separate access tokens on worker side per project and not per Gitlab instance/domain.

What are your thoughts on the matter? Is the usecase i'm describing also one of the intended situations or do you intend to use another approach?

Happy to hear your feedback.

@mashb1t mashb1t marked this pull request as draft March 30, 2025 10:12
@mashb1t
Copy link
Contributor Author

mashb1t commented Mar 30, 2025

Additionally it's a matter of security and access control:

With the current approach everybody can set up their project to use AI reviews, as there is no authentication on the endpoint. If you currently have access to the ai review projects URL, no admin can prevent you from setting up your project.
Power => maintainers of the repository

With the changes, admins can manage the secret and only projects with valid secret tokens in webhooks can access the API, making it way easier for administrators to manage who has access or who doesn't, as only they know the secret and the secret also isn't visible anymore after they've set up the webhook.
Power => admins / maintainers of the API

@mashb1t
Copy link
Contributor Author

mashb1t commented Mar 30, 2025

What just came to my mind: obviously we could add another env var to skip the secret token validation and use it as auth, then it would 100% be backwards compatible and up to the user to decide.

Looking forward to hearing your thoughts.

@sunmh207
Copy link
Owner

sunmh207 commented Apr 1, 2025

Could you please explain a little more about the "another env var to skip the secret token validation" scenario? Fo example, what variable is added and how to set it?

@sunmh207
Copy link
Owner

sunmh207 commented Apr 1, 2025

I did a questionnaire in the WeChat group, and one of the questions was: "Where is your code repository (GitLab) and this review tool deployed?"

So far, 22 feedbacks have been received, and the statistical results of this question are:

  1. All in the intranet (privatized deployment): 77%
  2. All in the public network: 9.1%
  3. Hybrid deployment: 13.6%

So,seems most users will not have security issues.

@mashb1t
Copy link
Contributor Author

mashb1t commented Apr 1, 2025

Could you please explain a little more about the "another env var to skip the secret token validation" scenario? Fo example, what variable is added and how to set it?

I'm referring to an env var which could act like a boolean, switching on and off the secret validation to be able to restore the previous behavior. This way both needs could be combined and security can be bumped for users who need the secret token to be validated.

Should i go ahead and implement the necessary adjustments?

@sunmh207
Copy link
Owner

sunmh207 commented Apr 1, 2025

I still don't quite understand the following code:

gitlab_webhook_secret_token_env = os.getenv('GITLAB_WEBHOOK_SECRET_TOKEN')
gitlab_webhook_secret_token_request = request.headers.get('X-Gitlab-Token')
if not verify_gitlab_webhook_secret_token(gitlab_webhook_secret_token_env, gitlab_webhook_secret_token_request):

Users need to configure Gitlab Access Token in two places: webhook and .env. After this configuration, the token will still be transmitted on the network, which does not seem to be more secure.

@mashb1t
Copy link
Contributor Author

mashb1t commented Apr 1, 2025

GITLAB_WEBHOOK_SECRET_TOKEN is not the Gitlab access token, but a secret shared to identify the webhook and check if the "signature" of the webhook matches the expected known secret in the app.

This has nothing to do with the Gitlab access token, which is completely private.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants