[0.1.5] Repo webhook on GHES side : 404 page not found #332

@Fabiosilvero

Hello,

We're running GHES 3.12 and are trying to set up a repository webhook.

garm-server runs in Kubernetes, using the ghcr.io/cloudbase/garm:v0.1.5 image:

configMaps:
  config.toml: |-
    [default]
    enable_webhook_management = true
    
    [logging]
    # If using nginx, you'll need to configure connection upgrade headers
    # for the /api/v1/ws location. See the sample config in the testdata
    # folder.
    enable_log_streamer = true
    # Set this to "json" if you want to consume these logs in something like
    # Loki or ELK.
    log_format = "text"
    log_level = "debug"
    log_source = false

    [metrics]
      enable = true
      disable_auth = false

    [jwt_auth]
    secret = "awesome_secret_redacted"
    time_to_live = "8760h"

    [apiserver]
      bind = "0.0.0.0"
      port = 80
      use_tls = false

    [database]
      backend = "sqlite3"
      # This needs to be changed.
      passphrase =  "awesome_secret_redacted"
      [database.sqlite3]
        db_file = "/etc/garm/garm.db"
    
    [[provider]]
      name = "gcp"
      provider_type = "external"
      description = "gcp provider"
      [provider.external]
        provider_executable = "/opt/garm/providers.d/garm-provider-gcp"
        config_file = "/etc/garm/garm-provider-gcp.toml"
        # This is needed if you want GARM to pass this along to the provider.
        environment_variables = ["GOOGLE_APPLICATION_CREDENTIALS"]

  garm-provider-gcp.toml: |-
    project_id = "project_id"
    zone = "gcp_zone"
    network_id = "network_self_link"
    subnetwork_id = "subnetwork_self_link"
    # The credentials file is optional.
    # Leave this empty if you want to use the default credentials.
    credentials_file = "/etc/garm/service-account-key.json/sa.key"
    external_ip_access = false

The volumes are mounted correctly: GARM is up and running, GCE VMs are created and deleted as expected, and they can reach both GHES and GARM.

I used these commands to create the repository and the pool in GARM:

/home/user/bin/garm-cli repository add \
    --name github-actions \
    --owner <The_Org> \
    --credentials github-pat \
    --install-webhook \
    --pool-balancer-type roundrobin \
    --random-webhook-secret

/home/user/bin/garm-cli pool create \
    --os-type linux \
    --os-arch amd64 \
    --enabled=true \
    --flavor e2-medium \
    --image  <GCE_IMAGE_SELF_LINK> \
    --min-idle-runners 0 \
    --repo <the_ID> \
    --tags poc-garm \
    --provider-name gcp

On the GHES side the webhook exists and appears to be configured, but every event delivery gets a 404 and the workflow hangs forever:

./garm-cli runner list --all (for 5 min)
+----+------+--------+---------------+---------+
| NR | NAME | STATUS | RUNNER STATUS | POOL ID |
+----+------+--------+---------------+---------+
+----+------+--------+---------------+---------+

+--------------------------------------+---------+----------------+-------------+------------------+--------------------+------------------+
| ID                                   | OWNER   | NAME           | ENDPOINT    | CREDENTIALS NAME | POOL BALANCER TYPE | POOL MGR RUNNING |
+--------------------------------------+---------+----------------+-------------+------------------+--------------------+------------------+
| <the_ID>                             | The_Org | github-actions | my-ghes     | github-pat       | roundrobin         | true             |
+--------------------------------------+---------+----------------+-------------+------------------+--------------------+------------------+

./garm-cli controller show
+-------------------------+---------------------------------------------------------------------------+
| FIELD                   | VALUE                                                                     |
+-------------------------+---------------------------------------------------------------------------+
| Controller ID           | ID                                                                        |
| Hostname                | garm-server-0                                                             |
| Metadata URL            | https://stg-garm.my.dns.zone/api/v1/metadata                              |
| Callback URL            | https://stg-garm.my.dns.zone/api/v1/callbacks                             |
| Webhook Base URL        | https://stg-garm.my.dns.zone/webhook                                      |
| Controller Webhook URL  | https://stg-garm.my.dns.zone/webhook/ID                                   |
| Minimum Job Age Backoff | 30                                                                        |
| Version                 | v0.1.5                                                                    |
+-------------------------+---------------------------------------------------------------------------+
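
Since the 404 comes back on the controller webhook URL, one thing worth ruling out is a mismatch between the URL GARM reports here and the payload URL stored in the GHES hook. The following is only a minimal sketch for pulling one value out of the table printed above so it can be compared by eye or in a script. `extract_field` is a hypothetical helper name (not part of garm-cli), and it assumes the two-column `| FIELD | VALUE |` layout shown in this output.

```shell
#!/bin/sh
# Hypothetical helper (not part of garm-cli): print the VALUE column for a
# given FIELD name from a two-column "| FIELD | VALUE |" table on stdin.
extract_field() {
    awk -F'|' -v field="$1" '
        {
            name = $2
            value = $3
            # Trim the padding spaces around both cells.
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            gsub(/^[ \t]+|[ \t]+$/, "", value)
            if (name == field) print value
        }'
}

# Usage, assuming garm-cli sits in the current directory as in this report:
#   ./garm-cli controller show | extract_field "Controller Webhook URL"
```

The printed value can then be compared character by character with the payload URL shown in the webhook settings on the GHES side.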

./garm-cli github endpoint list
+-------------+------------------------------+-------------------------+
| NAME        | BASE URL                     | DESCRIPTION             |
+-------------+------------------------------+-------------------------+
| github.com  | https://github.com           | The github.com endpoint |
+-------------+------------------------------+-------------------------+
| my-ghes     | https://github.my.dns.zone   | My GHES                 |
+-------------+------------------------------+-------------------------+

The job_count in the logs stays at 0, even after I trigger a redelivery of the failed webhook event.

Am I missing something?

With min-idle-runners set to 2, runners show up both on the GHES side and in GARM (via the garm-cli runner list --all command), but since the webhook doesn't work, the pool doesn't scale :/

Note: I'm using the GARM operator, but I created the repo and pool manually to rule it out as the cause.

I also confirmed that the network path GHES => GARM works:

admin@github:~$ curl https://stg-garm.my.dns.zone/
404 page not found
admin@github:~$ curl https://stg-garm.my.dns.zone/api/v1
404 page not found
admin@github:~$ curl https://stg-garm.my.dns.zone/api/v1/metadata
{"error":"Authentication failed","details":""}
admin@github:~$ curl https://stg-garm.my.dns.zone/webhook/84474b45-b8e7-4350-9be0-012531f22388
404 page not found
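
The checks above are plain GET requests that print only the response body, whereas GHES delivers webhook events as POST requests. A minimal sketch for repeating the probe with an explicit method and printing the status code is below. `http_status` and `describe_status` are hypothetical helper names, and the status-to-meaning mapping is a rough guide rather than a statement about how GARM routes requests.

```shell
#!/bin/sh
# Print only the HTTP status code of a request.
# $1 = HTTP method, $2 = URL. curl reports 000 if no response was received.
http_status() {
    curl -s -o /dev/null -w '%{http_code}' -X "$1" "$2"
}

# Hypothetical helper: rough, generic reading of a status code.
# This is an assumption-laden guide, not GARM-specific behaviour.
describe_status() {
    case "$1" in
        404) echo "no route matched (path or proxy routing)" ;;
        405) echo "route exists, method not allowed" ;;
        401|403) echo "route exists, request rejected by auth" ;;
        2??) echo "route exists, request accepted" ;;
        000) echo "no HTTP response (DNS, TCP or TLS failure)" ;;
        *) echo "unclassified" ;;
    esac
}

# Usage, run from the GHES host; replace the URL with the
# Controller Webhook URL reported by garm-cli:
#   for method in GET POST; do
#       code=$(http_status "$method" "https://stg-garm.my.dns.zone/webhook/ID")
#       echo "$method -> $code: $(describe_status "$code")"
#   done
```

If POST and GET both return 404, the request is probably not matching any route on whatever answers for that path, which could be GARM itself or a proxy or ingress in front of it.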

The log is attached to this issue. Let me know if I can help troubleshoot further.

garm.log

Thanks,
