Update github workflow setup #2728

Conversation
✅ Deploy Preview for dlt-hub-docs canceled.
Force-pushed: 71d42be → 5c1212c
Force-pushed: 227c99c → 7f4c04e
Force-pushed: a7a3aea → 85b9880
Force-pushed: 322fee4 → ca30439
Force-pushed: 20e02e3 → 2e31b94
```yaml
extras: "postgres postgis parquet duckdb cli filesystem"
needs_postgres: true

# Clickhouse OSS (TODO: test with minio s3)
```
clickhouse oss is disabled for now; it would be good to make it work without any remote services, for example by testing it with s3 using minio
hmmmm why? it is only using the local filesystem. I'm running tests locally on it with the network disconnected. we have a compose file for it:
```yaml
services:
  clickhouse:
    image: clickhouse/clickhouse-server
    ports:
      - "9000:9000"
      - "8123:8123"
    environment:
      - CLICKHOUSE_DB=dlt_data
      - CLICKHOUSE_USER=loader
      - CLICKHOUSE_PASSWORD=loader
      - CLICKHOUSE_DEFAULT_ACCESS_MANAGEMENT=1
    volumes:
      - clickhouse_data:/var/lib/clickhouse/
      - clickhouse_logs:/var/log/clickhouse-server/
    restart: unless-stopped
    healthcheck:
      test: [ "CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:8123/ping" ]
      interval: 3s
      timeout: 5s
      retries: 5

volumes:
  clickhouse_data:
  clickhouse_logs:
```
I think it uses the local clickhouse but depends on s3, az etc. Are you sure you can run the full test suite without a network connection? I will uncomment it again and have another look tomorrow, but I think it will not work without secrets.
```yaml
- name: postgres and redshift
  # TODO: check whether duckdb and all filesystem drivers need to be here; if so, explain why
  destinations: "[\"redshift\", \"postgres\", \"duckdb\", \"dummy\"]"
  filesystem_drivers: "[\"memory\", \"file\", \"r2\", \"s3\", \"gs\", \"az\", \"abfss\", \"gdrive\"]" # excludes sftp
```
I don't think we need all these drivers here. I took this from test_destinations.yml, so it is a pretty ancient part of this setup, and I was not sure if there are any combinations of the above destinations that need these bucket settings. But afaik filesystem_drivers only applies to filesystem tests, right?
those are left there after I extracted filesystem to a separate workflow. I think you can leave memory and file. let's see what happens
```yaml
extras: "postgres redshift postgis s3 gs az parquet duckdb"
with: ",adbc"

# Qdrant (disabled, TODO: explain why)
```
this workflow is currently disabled in the repo. I'd like to add a note here explaining exactly why (because I don't know)
I think we lost our test account. we only test local version now
Force-pushed: fead3e6 → 83978ba
thanks for doing this!
```yaml
branches: [ master, devel ]

concurrency:
  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
```
so this workflow will cancel all the workflows it started?
yes it will, it's pretty cool!
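As a side note: the `group` key alone only groups runs, while actual cancellation of superseded runs requires `cancel-in-progress` as well. A minimal sketch of the full block (the `cancel-in-progress` line is an assumption about this setup, not quoted from the diff):

```yaml
concurrency:
  # one group per PR (or per ref for pushes to master/devel)
  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
  # cancel still-running workflows from an earlier push to the same PR
  cancel-in-progress: true
```

Because the called reusable workflows run as jobs of this workflow, cancelling the parent run cancels them too.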
```yaml
authorize_run_from_fork:
  name: check if fork and if so whether secrets are available
  # run when label is assigned OR when we are not a fork
  if: ${{ github.event.label.name == 'ci from fork' || (github.event.pull_request.head.repo.full_name == github.repository && (github.event.action == 'opened' || github.event.action == 'synchronize')) }}
```
now that pull_request_target is deactivated, this authorization is defunct (github.event.pull_request.head.repo.full_name == github.repository is always true). we can enable PRs from forks in a separate ticket
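If fork PRs are re-enabled later, a label-gated pull_request_target setup typically looks like the following. This is a generic sketch, not code from this PR; the label name is taken from the condition above, everything else is an assumption:

```yaml
on:
  pull_request_target:
    types: [opened, synchronize, labeled]

jobs:
  authorize_run_from_fork:
    # same-repo PRs run automatically; fork PRs only run once a maintainer
    # applies the "ci from fork" label, since secrets are exposed to this event
    if: ${{ github.event.label.name == 'ci from fork' || github.event.pull_request.head.repo.full_name == github.repository }}
    runs-on: ubuntu-latest
    steps:
      - run: echo "authorized to run with secrets"
```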
```yaml
# Destination and Sources local tests, do not provide secrets
# Other tests that do not require remote connections
#
test_destinations_local:
```
those run in parallel right?
yes, all tests that have their conditions met start immediately, so you have a bunch of tests running in parallel, which also includes matrices.
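The fan-out described here amounts to a main workflow that calls reusable workflows, each gated only by its `needs:` dependencies. A sketch assuming job and file names from this PR (the exact wiring is an assumption):

```yaml
jobs:
  lint:
    uses: ./.github/workflows/lint.yml
  test_common:
    needs: lint
    uses: ./.github/workflows/test_common.yml
  # these start as soon as their dependencies succeed and run in parallel,
  # each expanding into its own matrix of destinations/drivers
  test_destinations_local:
    needs: test_common
    uses: ./.github/workflows/test_destinations_local.yml
  test_sources_local:
    needs: test_common
    uses: ./.github/workflows/test_sources_local.yml
```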
```yaml
needs: test_common
uses: ./.github/workflows/test_sources_local.yml

test_plus:
```
maybe those 3 workflows should only depend on lint?
I changed it
```yaml
post_install_commands: "poetry run pip install sqlalchemy==2.0.18" # minimum version required by `pyiceberg`
additional_tests: "poetry run pytest tests/cli"

- name: postgres, duckdb
```
we were also running the dummy destination
not in the local tests, but I moved it here now from the remote tests where it also was before (in test_destinations.yml)
```yaml
filesystem_drivers: "[\"memory\"]"
extras: "motherduck s3 gs az parquet"

# MSSQL
```
mssql was always running a full suite. this one is fast. we should do the same for remote postgres
```yaml
poetry run pytest tests/load --ignore tests/load/sources
poetry run pytest tests/cli
env:
  DESTINATION__POSTGRES__CREDENTIALS: postgresql://loader:loader@localhost:5432/dlt_data
```
where did those go?
to this new file, see comment above. there is one step in the workflow that copies these back in.
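For context, the "copies these back in" pattern the comment refers to usually looks like an env block on the test step. A sketch assuming the credential value from the hunk above (step name and layout are assumptions):

```yaml
- name: run local destination tests
  run: |
    poetry run pytest tests/load --ignore tests/load/sources
    poetry run pytest tests/cli
  env:
    # points at the local docker-compose postgres, so not a secret
    # in the GitHub sense; kept in the template secrets file
    DESTINATION__POSTGRES__CREDENTIALS: postgresql://loader:loader@localhost:5432/dlt_data
```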
Force-pushed: 4aa0388 → 1149593 (always run all mssql and postgres tests)
Force-pushed: e86fb4a → f0cffee
* use both pull request and pull request target on destination workflows
* remove additional triggers
* marks one test as smoke test and only runs this for the time being
* only run one test in common, needs to be reverted later
* run common tests only on linter success
* fix common workflow
* only start workflows on call (do not call them yet)
* test master workflow
* remove docs changes step from lint
* remove local destinations docs change
* rename master trigger workflows
* change concurrency key
* try other dependencies
* add destination tests with authorize step
* remove authorize and docs step from destination tests
* fix destination test
* rename main workflow
* test inherit secrets
* add more workflows to main file
* fix starting conditions for some workflows
* rename plus tests matrix job
* remove concurrency settings for now
* add first remote destinations workflow version
* move some more remote destinations
* remove pytest args
* try to fix extras string
* add more remote destination tests
* rename some workflows and add concurrency settings to main workflow
* move test_destinations
* fix link to called workflow
* add better main workflow labels
* move clickhouse remote tests
* create local destinations test
* disabled some workflows
* disable clickhouse oss for now
* split duckdb and postgres local tests into own matrix job
* copy ssh agent key
* move all local destination secrets into template secrets file
* small fixes
* enable all tests again
* fix local tests
* add missing openai dep
* try to fix qdrant creds
* fix qdrant server / local file differentiation
* fix cli test
* change workflow dependencies
* remove telemetry info and other small changes
* run dummy destination with the local tests
* remove duckdb from remote tests, always run all mssql and postgres tests
* enable clickhouse oss
* fix condition for always running all tests
* move cli commands to postgres tests
* rename clickhouse-compose to be inline with other services
* fix clickhouse local credentials and disable tests which require staging destinations
* adapt postgres to postgres example to new fixture
* fix clickhouse excluded configs
* update essential test handling
* skip gcs compat test for local clickhouse tests
Description
We need to be more economical with our CI minutes. Proposed changes are:
TODO before finalizing
Review checklist