
Commit 44ba76b

Author: gaspardmoindrot

[RELENG-7422] 📝 Add documentation (#42)

1 parent: a95e58d

File tree: 6 files changed, +318 −22 lines
docs/getting-started/local-setup.md

Lines changed: 18 additions & 0 deletions

# Local environment

Setting up your local environment.

## Install Poetry

The Runner manager uses [Poetry](https://python-poetry.org/), a Python packaging
and dependency management tool.

To install and use this project, please make sure you have Poetry
installed. Follow the [Poetry](https://python-poetry.org/docs/#installation)
documentation for proper installation instructions.

## Install dependencies

```shell
poetry install
```
docs/getting-started/run-it-locally.md

Lines changed: 86 additions & 0 deletions

# Run it locally

Before starting this guide:

- Follow the [local setup](./local-setup.md) documentation.

## Run

Once everything is properly set up, you can launch the project
with the following command at root level:

```bash
poetry run start
```

The application is now launched and running on port 8000 of the machine.

## Webhook setup

### Ngrok setup

The GitHub Actions Exporter depends on webhooks coming from GitHub to work properly.
Ngrok can help you set up a public URL to be used with GitHub webhooks.

You can install Ngrok on your Linux machine using the following command:

```bash
curl -s https://ngrok-agent.s3.amazonaws.com/ngrok.asc | sudo tee /etc/apt/trusted.gpg.d/ngrok.asc >/dev/null && echo "deb https://ngrok-agent.s3.amazonaws.com buster main" | sudo tee /etc/apt/sources.list.d/ngrok.list && sudo apt update && sudo apt install ngrok
```

For more information, you can visit the Ngrok [website](https://ngrok.com/download).

Once installed, run the following command to listen on port 8000
of the machine and assign a public URL to it:

```shell
ngrok http 8000
```

### Setting up the webhook

Set up a webhook at the organization level; the settings page should be at a URL like the following:
`https://github.com/organizations/<your org>/settings/hooks`

- Click on `Add webhook`.
- In the payload URL field, enter your ngrok URL, like the following:
  `https://xxxxx.ngrok.io/webhook`
- Content type: `application/json`
- Click on `Let me select individual events.`
- Select `Workflow jobs` and `Workflow runs`.
- Save.
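To check that the exporter receives events, you can also deliver a test payload to the `/webhook` endpoint yourself. The snippet below is a sketch using only the standard library; the payload fields are a minimal stand-in for GitHub's real `workflow_job` payload, not the full schema:

```python
import json
import urllib.request

def build_webhook_request(event: str, payload: dict,
                          url: str = "http://localhost:8000/webhook") -> urllib.request.Request:
    """Build a POST request mimicking a GitHub webhook delivery."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "X-GitHub-Event": event,  # GitHub identifies the event type in this header
        },
        method="POST",
    )

req = build_webhook_request("workflow_job", {"action": "queued", "workflow_job": {}})
# urllib.request.urlopen(req) would deliver it to the locally running exporter.
```

Sending this while `poetry run start` is active lets you exercise the webhook handler without waiting for a real GitHub delivery.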
## Setting up your testing repo

Create a new repository in the organization where you have configured the runner manager,
and push a workflow to the repository. Here is an example:

```yaml
# .github/workflows/test-gh-actions-exporter.yaml
---
name: test-gh-actions-exporter
on:
  push:
  workflow_dispatch:
jobs:
  greet:
    strategy:
      matrix:
        person:
          - foo
          - bar
    runs-on:
      - ubuntu
      - focal
      - large
      - gcloud
    steps:
      - name: sleep
        run: sleep 120
      - name: Send greeting
        run: echo "Hello ${{ matrix.person }}!"
```

Trigger builds and enjoy :beers:

docs/index.md

Lines changed: 4 additions & 21 deletions

@@ -1,22 +1,5 @@
-# GitHub WebHook Exporter
+# GitHub Actions Exporter
 
-The idea of this exporter is to be able to expose this service to listen
-from WebHooks coming from GitHub.
-Then expose those metrics in OpenMetrics format for later usage.
-
-## Install
-
-To install and use this project, please make sure you have [poetry](https://python-poetry.org/) installed.
-
-Then run:
-```shell
-poetry install
-```
-
-## Start
-
-To start the API locally you can use the following command:
-
-```shell
-poetry run start
-```
+The GitHub Actions Exporter is a project used to retrieve information
+provided by GitHub, notably through Webhooks, process it, and store it
+via Prometheus.
docs/metrics-analysis-prometheus/collected-reported-metrics.md

Lines changed: 145 additions & 0 deletions

# Collected and reported metrics

First, it is important to differentiate the `workflow_run`
and the `workflow_job` webhook events.

The `workflow_run` event is triggered when a workflow run is `requested`,
`in_progress`, or `completed`. However, for this project, we are not
interested in the `cancelled` or `skipped` outcomes, so we ignore them.

On the other hand, the `workflow_job` event is triggered when a
workflow job is `queued`, `in_progress`, or `completed`. We also ignore
the `cancelled` or `skipped` outcomes for `workflow_job` in this project.

## Workflow run

Here are the different metrics collected by the GitHub Actions Exporter
project for workflow runs.

The number of workflow rebuilds: `github_actions_workflow_rebuild_count`.

The duration of a workflow in seconds: `github_actions_workflow_duration_seconds`.

The number of workflows counted for each state:

- `github_actions_workflow_failure_count`
- `github_actions_workflow_success_count`
- `github_actions_workflow_cancelled_count`
- `github_actions_workflow_inprogress_count`
- `github_actions_workflow_total_count`

## Workflow job

Here are the different metrics collected by the GitHub Actions
Exporter project for workflow jobs.

The duration of a job in seconds: `github_actions_job_duration_seconds`.

The time between when a job is requested and when it starts: `github_actions_job_start_duration_seconds`.

The number of jobs counted for each state:

- `github_actions_job_failure_count`
- `github_actions_job_success_count`
- `github_actions_job_cancelled_count`
- `github_actions_job_inprogress_count`
- `github_actions_job_queued_count`
- `github_actions_job_total_count`
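As a rough illustration of how such per-state counters accumulate (this is a standard-library sketch, not the project's actual implementation; the payload fields follow GitHub's `workflow_job` webhook shape):

```python
from collections import Counter

# In-memory tally keyed like the exported metric names.
job_counts: Counter = Counter()

def handle_workflow_job(event: dict) -> None:
    """Tally a `workflow_job` webhook payload into per-state counters."""
    action = event["action"]  # "queued", "in_progress", or "completed"
    job = event["workflow_job"]
    if action == "completed" and job.get("conclusion") in ("cancelled", "skipped"):
        return  # ignored outcomes, as described above
    state = job["conclusion"] if action == "completed" else action
    state = state.replace("_", "")  # "in_progress" -> "inprogress"
    job_counts[f"github_actions_job_{state}_count"] += 1
    job_counts["github_actions_job_total_count"] += 1

handle_workflow_job({"action": "queued", "workflow_job": {}})
handle_workflow_job({"action": "completed", "workflow_job": {"conclusion": "success"}})
```

A real exporter would expose these counters on `/metrics` instead of keeping them in a plain dictionary, but the bookkeeping is the same.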
## Cost metric

This is the last metric we collect, and it is one of the most important
ones. It allows us to determine the cost of our CI runs.

### Formula

Here is the formula to calculate the cost over a period of time:

```bash
cost = duration (in seconds) / 60 * cost (per minute)
```
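The formula above can be sketched as a one-line function (the function name is illustrative, not from the project):

```python
def job_cost(duration_seconds: float, cost_per_minute: float) -> float:
    """Cost of a run: seconds converted to minutes, times the per-minute rate."""
    return duration_seconds / 60 * cost_per_minute

# A 2-minute job on ubuntu-latest (0.008 $/min):
print(job_cost(120, 0.008))  # 0.016
```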
### How do we find the cost per minute?

#### GitHub

For GitHub, it is quite simple. They provide us with a fixed value, and
the price never varies. To give an example, for `ubuntu-latest`, we have a cost
of $0.008/min, that's it. Easy!

For larger GitHub-hosted runners, such as the high-performance options, the
pricing structure may differ. The exact details and costs associated with those
specific runner types can be obtained from
[GitHub's documentation](https://docs.github.com/en/billing/managing-billing-for-github-actions/about-billing-for-github-actions).

#### Self-Hosted

When it comes to the cost of self-hosted runners, it's a bit more complicated.

To calculate the costs of self-hosted runners, we can play the game of
calculating for the main providers, namely AWS and Google Cloud Platform (GCP).

The cost can be found based on the machine type in the Management Console
for AWS (when creating an EC2 instance) and on the
[Google Cloud website](https://cloud.google.com/compute/vm-instance-pricing)
for GCP.

Key points to consider for retrieving cost information:

!!! note "Costs for self-hosted runners are approximate"

    When retrieving the cost of each key point,
    calculating the exact cost per minute might not be possible,
    as it depends on the cloud provider's billing policy
    and each individual CI workload:

    - Internal cloud provider/lab with dedicated hardware.
    - Cloud provider billing policy for virtual machines is per hour or day only.
    - Price of an instance varies during the day, week, or month.
    - CI job that uploads a large amount of data.

- RAM and CPU costs: the cost per minute for RAM and CPU expenses can
  be found in the documentation of the respective cloud provider.
- Storage costs: the cost per minute for storage expenses can
  be found in the documentation of the respective cloud provider.
- Bandwidth cost: directly determining the cost per minute of bandwidth is
  not feasible.

Calculating the bandwidth cost per minute is up to the discretion of the
user and will vary depending on the workload. As an example, adding an
extra 30% is what we found by comparing the values in the documentation
of different cloud providers (for CPU, RAM, and storage) with the actual
values available on our invoices. Using this information,
the overall cost can be estimated using the following formula
(all costs are per minute):

```bash
cost = (cost_per_flavor + cost_per_storage) * (1 + percentage_cost_of_bandwidth)
```
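Plugging in illustrative numbers (the flavor and storage rates below are hypothetical, and the bandwidth share is treated as an extra percentage added on top, as in the 30% example):

```python
def self_hosted_cost_per_minute(cost_per_flavor: float,
                                cost_per_storage: float,
                                percentage_cost_of_bandwidth: float) -> float:
    """Estimated per-minute cost, with bandwidth as an extra percentage on top."""
    return (cost_per_flavor + cost_per_storage) * (1 + percentage_cost_of_bandwidth)

# A t3.large-like flavor (0.0025 $/min) plus storage (0.0005 $/min, assumed),
# with a 30% bandwidth overhead:
estimate = self_hosted_cost_per_minute(0.0025, 0.0005, 0.30)
```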
!!! note

    GCP and AWS costs are quite similar for equivalent flavors.

### The different tags and their associated cost

| Provider | Runner               | Cost ($ per min) |
| -------- | -------------------- | ---------------- |
| GitHub   | `ubuntu-latest`      | 0.008            |
| GitHub   | `ubuntu-18.04`       | 0.008            |
| GitHub   | `ubuntu-20.04`       | 0.008            |
| GitHub   | `ubuntu-22.04`       | 0.008            |
| GitHub   | `ubuntu-20.04-4core` | 0.016            |
| GitHub   | `ubuntu-22.04-4core` | 0.016            |
| GitHub   | `ubuntu-22.04-8core` | 0.032            |
| AWS      | `t3.small`           | 0.000625         |
| GCP      | `n2-standard-2`      | 0.0025           |
| AWS      | `t3.large`           | 0.0025           |
| GCP      | `n2-standard-4`      | 0.005            |
| GCP      | `n2-standard-8`      | 0.01             |

!!! note

    Please note that the names of large GitHub-hosted runners
    may not be exactly the same as shown above, but this is
    the naming convention recommended by GitHub.
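The table above can be turned into a simple lookup combined with the duration formula from earlier (a sketch; only a few rows are reproduced here, with rates taken directly from the table):

```python
# Cost per minute, keyed by runner label (values from the table above).
COST_PER_MINUTE = {
    "ubuntu-latest": 0.008,
    "ubuntu-22.04-8core": 0.032,
    "t3.large": 0.0025,
    "n2-standard-8": 0.01,
}

def job_cost_for_runner(runner: str, duration_seconds: float) -> float:
    """Cost of a job given its runner label and duration in seconds."""
    return duration_seconds / 60 * COST_PER_MINUTE[runner]

# A 10-minute job on an 8-core GitHub-hosted runner:
estimate = job_cost_for_runner("ubuntu-22.04-8core", 600)
```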
docs/metrics-analysis-prometheus/prometheus.md

Lines changed: 56 additions & 0 deletions

# Prometheus

## Introduction

Prometheus is a powerful open-source monitoring and alerting system that allows
users to collect, store, and analyze time-series data. In this guide, we will
explore how to effectively utilize Prometheus to analyze GitHub Actions.

To collect and analyze GitHub Actions metrics, you need an existing
Prometheus installation configured to pull metrics
from the `/metrics` endpoint of the exporter.
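For example, a minimal scrape job in `prometheus.yml` could look like the following (the target host and port are an assumption; point it at wherever the exporter runs):

```yaml
scrape_configs:
  - job_name: github-actions-exporter
    metrics_path: /metrics
    static_configs:
      - targets: ["localhost:8000"]
```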
## Understanding Prometheus Queries

The idea here is not to recreate the entire Prometheus documentation; we will
simply discuss the key points to get you started easily without getting lost in
the plethora of information available on the Internet.

To learn more about Prometheus itself, check out the official
[documentation](https://prometheus.io/docs/introduction/overview/),
as well as [querying Prometheus](https://prometheus.io/docs/prometheus/latest/querying/basics/).

To proceed, let's take a typical query and break it down.

Let's examine this example query:

```bash
topk(5, sum(increase(github_actions_job_cost_count_total{}[5m])) by (repository) > 0)
```

This query retrieves data related to GitHub Actions job costs and
provides the top 5 repositories with the highest cumulative cost
within a specified time range.

1. The query starts with the `topk(5, ...)` function, which returns the
   top 5 values based on a specified metric or condition.
2. The `sum(increase(...))` part of the query calculates the cumulative
   sum of the specified metric. In our example, it calculates the
   cumulative sum of the `github_actions_job_cost_count_total` metric,
   representing the total job cost count.
3. The `[5m]` part specifies the time range for the query.
4. The `by (repository)` clause groups the data by the `repository` label.
   This enables the query to calculate the cost sum for each repository individually.
5. The expression `> 0` filters the query results to only include
   repositories with a value greater than zero.
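Such queries are not limited to the Prometheus UI: Prometheus also exposes an HTTP API, and an instant query can be sent to its `/api/v1/query` endpoint. A standard-library sketch of building that request (the Prometheus address is an assumption):

```python
import urllib.parse

PROMETHEUS = "http://localhost:9090"  # assumed Prometheus address
QUERY = ('topk(5, sum(increase(github_actions_job_cost_count_total{}[5m]))'
         ' by (repository) > 0)')

# The instant-query endpoint takes the PromQL expression as a `query` parameter.
url = f"{PROMETHEUS}/api/v1/query?" + urllib.parse.urlencode({"query": QUERY})
# urllib.request.urlopen(url) would return a JSON body whose `data.result`
# field lists the matching series.
```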
!!! info

    Using Grafana enhances the visualization of Prometheus data and
    provides powerful querying capabilities. Within Grafana, you can apply filters,
    combine queries, and utilize variables for dynamic filtering. It's important
    to understand `__interval` (the time interval between data points) and `__range`
    (the selected time range) when working with Prometheus data in Grafana. This
    integration enables efficient data exploration and analysis for better
    insights and decision-making.

mkdocs.yml

Lines changed: 9 additions & 1 deletion

@@ -38,7 +38,15 @@ theme:
     code: Roboto Mono

 nav:
-  - Home: index.md
+  - Home: index.md
+
+  - Getting Started:
+      - Local Setup: getting-started/local-setup.md
+      - Run it Locally: getting-started/run-it-locally.md
+
+  - Metrics Analysis and Prometheus Monitoring for GitHub Actions:
+      - Collected and reported metrics: metrics-analysis-prometheus/collected-reported-metrics.md
+      - Prometheus Monitoring: metrics-analysis-prometheus/prometheus.md

 markdown_extensions:
   - pymdownx.highlight:
