Hello,
We're running GHES 3.12 and are trying to set up a repository webhook.
The GARM server runs in Kubernetes, using the ghcr.io/cloudbase/garm:v0.1.5 image:
configMaps:
config.toml: |-
[default]
enable_webhook_management = true
[logging]
# If using nginx, you'll need to configure connection upgrade headers
# for the /api/v1/ws location. See the sample config in the testdata
# folder.
enable_log_streamer = true
# Set this to "json" if you want to consume these logs in something like
# Loki or ELK.
log_format = "text"
log_level = "debug"
log_source = false
[metrics]
enable = true
disable_auth = false
[jwt_auth]
secret = "awesome_secret_redacted"
time_to_live = "8760h"
[apiserver]
bind = "0.0.0.0"
port = 80
use_tls = false
[database]
backend = "sqlite3"
# This needs to be changed.
passphrase = "awesome_secret_redacted"
[database.sqlite3]
db_file = "/etc/garm/garm.db"
[[provider]]
name = "gcp"
provider_type = "external"
description = "gcp provider"
[provider.external]
provider_executable = "/opt/garm/providers.d/garm-provider-gcp"
config_file = "/etc/garm/garm-provider-gcp.toml"
# This is needed if you want GARM to pass this along to the provider.
environment_variables = ["GOOGLE_APPLICATION_CREDENTIALS"]
garm-provider-gcp.toml: |-
project_id = "project_id"
zone = "gcp_zone"
network_id = "network_self_link"
subnetwork_id = "subnetwork_self_link"
# The credentials file is optional.
# Leave this empty if you want to use the default credentials.
credentials_file = "/etc/garm/service-account-key.json/sa.key"
external_ip_access = false
The volumes are correctly mounted: GARM is up and running, and GCE VMs are correctly created and deleted and can reach both GHES and GARM.
I used these commands to create the repository and the pool in GARM:
/home/user/bin/garm-cli repository add \
--name github-actions \
--owner <The_Org> \
--credentials github-pat \
--install-webhook \
--pool-balancer-type roundrobin \
--random-webhook-secret
/home/user/bin/garm-cli pool create \
--os-type linux \
--os-arch amd64 \
--enabled=true \
--flavor e2-medium \
--image <GCE_IMAGE_SELF_LINK> \
--min-idle-runners 0 \
--repo <the_ID> \
--tags poc-garm \
--provider-name gcp
On the GHES side, the webhook exists and appears to be configured, but every event gets a 404 and the workflow hangs forever:
./garm-cli runner list --all (for 5 min)
+----+------+--------+---------------+---------+
| NR | NAME | STATUS | RUNNER STATUS | POOL ID |
+----+------+--------+---------------+---------+
+----+------+--------+---------------+---------+
./garm-cli repository list
+--------------------------------------+-------+----------------+-------------+------------------+--------------------+------------------+
| ID | OWNER | NAME | ENDPOINT | CREDENTIALS NAME | POOL BALANCER TYPE | POOL MGR RUNNING |
+--------------------------------------+-------+----------------+-------------+------------------+--------------------+------------------+
| <the_ID> | The_Org | github-actions | my-ghes | github-pat | roundrobin | true |
+--------------------------------------+-------+----------------+-------------+------------------+--------------------+------------------+
./garm-cli controller show
+-------------------------+---------------------------------------------------------------------------+
| FIELD | VALUE |
+-------------------------+---------------------------------------------------------------------------+
| Controller ID | ID |
| Hostname | garm-server-0 |
| Metadata URL | https://stg-garm.my.dns.zone/api/v1/metadata |
| Callback URL | https://stg-garm.my.dns.zone/api/v1/callbacks |
| Webhook Base URL | https://stg-garm.my.dns.zone/webhook |
| Controller Webhook URL | https://stg-garm.my.dns.zone/webhook/ID |
| Minimum Job Age Backoff | 30 |
| Version | v0.1.5 |
+-------------------------+---------------------------------------------------------------------------+
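One thing worth ruling out is a mismatch between the Controller Webhook URL shown above and the URL GHES actually delivers to. The following is a minimal Python sketch of that comparison, assuming the two-column table layout printed by `controller show`; the helper names `parse_cli_table` and `webhook_url_matches` are hypothetical and not part of garm-cli.

```python
def parse_cli_table(text: str) -> dict[str, str]:
    """Parse a two-column FIELD/VALUE ASCII table into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip border rows such as +-----+-----+ and blank lines.
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        # Skip the header row and anything that is not exactly two columns.
        if len(cells) != 2 or cells[0] == "FIELD":
            continue
        fields[cells[0]] = cells[1]
    return fields


def webhook_url_matches(controller_output: str, delivered_url: str) -> bool:
    """Compare the advertised webhook URL with the one configured in GHES."""
    expected = parse_cli_table(controller_output).get("Controller Webhook URL", "")
    # Ignore a trailing slash on either side; everything else must match.
    return bool(expected) and expected.rstrip("/") == delivered_url.rstrip("/")
```

Passing the output of `./garm-cli controller show` together with the payload URL copied from the GHES webhook settings page returns `True` only when the two agree.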
./garm-cli github endpoint list
+-------------+------------------------------+-------------------------+
| NAME | BASE URL | DESCRIPTION |
+-------------+------------------------------+-------------------------+
| github.com | https://github.com | The github.com endpoint |
+-------------+------------------------------+-------------------------+
| my-ghes | https://github.my.dns.zone | My GHES |
+-------------+------------------------------+-------------------------+
The job_count in the logs stays at 0, even after I trigger a redelivery of the failed webhook event.
Am I missing something?
With min-idle-runners set to 2, I do get runners on the GHES side and in GARM (via the garm-cli runner list --all command), but since the webhook doesn't work, the pool doesn't scale :/
Note: I'm using the GARM operator, but I created the repo and pool manually to rule it out as the cause.
I also confirmed that the network flow from GHES to GARM works:
admin@github:~$ curl https://stg-garm.my.dns.zone/
404 page not found
admin@github:~$ curl https://stg-garm.my.dns.zone/api/v1
404 page not found
admin@github:~$ curl https://stg-garm.my.dns.zone/api/v1/metadata
{"error":"Authentication failed","details":""}
admin@github:~$ curl https://stg-garm.my.dns.zone/webhook/84474b45-b8e7-4350-9be0-012531f22388
404 page not found
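The curl checks above are plain GET requests, whereas GitHub delivers webhooks as signed POST requests, so the last 404 may not reflect what a real delivery sees (I have not confirmed how GARM answers a GET on that path). Below is a minimal Python sketch that builds a request shaped like a GitHub delivery. The `X-GitHub-Event` and `X-Hub-Signature-256` headers and the HMAC-SHA256 signature are what GitHub sends; the secret, the payload, and the helper names are placeholders I made up for illustration, and the payload is not a complete `workflow_job` event.

```python
import hashlib
import hmac
import json
import urllib.request


def sign_payload(secret: str, body: bytes) -> str:
    """Return the X-Hub-Signature-256 value for this body and secret."""
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    return "sha256=" + digest


def build_delivery(url: str, secret: str, payload: dict,
                   event: str = "workflow_job") -> urllib.request.Request:
    """Build (but do not send) a POST shaped like a GitHub webhook delivery."""
    body = json.dumps(payload).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "X-GitHub-Event": event,
        "X-Hub-Signature-256": sign_payload(secret, body),
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Hypothetical usage (placeholder secret, minimal payload):
# req = build_delivery("https://stg-garm.my.dns.zone/webhook/ID",
#                      "placeholder-secret", {"action": "queued"})
# urllib.request.urlopen(req)  # raises urllib.error.HTTPError on 4xx/5xx
```

If the POST still returns 404, the route itself is probably not being served at that path; any other status would suggest the route exists and the problem lies elsewhere.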
The log is attached to this issue. Let me know if I can help troubleshoot further.
Thanks,

