27 changes: 16 additions & 11 deletions README.md
@@ -13,6 +13,9 @@ The way we differentiate jobs defined from code from the ones defined from the U

⚠️ Important: If you plan to use this tool but have existing jobs ending with `[[...]]` you should rename them before running any command.

+> [!NOTE]
+> If keeping job names clean in the dbt Cloud UI is a requirement, the `--use-desc-for-id` flag (or `DBT_JOBS_AS_CODE_USE_DESC_FOR_ID=True` env var) moves the `[[<identifier>]]` tag from the job name to the job description instead. This is an advanced option intended for specific cases — the default name-based approach is recommended for most users.
+
Below is a demonstration of how to use dbt-jobs-as-code as part of CI/CD, leveraging the new templating features.

[!<img src="screenshot.png" width="600">](https://www.loom.com/share/7c263c560d2044cea9fc82ac8ec125ea?sid=4c2fe693-0aa5-4021-9e94-69d826f3eac5)
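
To make the note on `--use-desc-for-id` concrete, here is a minimal Python sketch of where the `[[<identifier>]]` tag ends up in each mode. `place_identifier` is a hypothetical helper written for illustration, not a function of dbt-jobs-as-code, and the separator used when appending the tag to a description is an assumption.

```python
from typing import Tuple


def place_identifier(
    name: str, description: str, identifier: str, use_desc_for_id: bool = False
) -> Tuple[str, str]:
    """Return the (name, description) pair as it would be shown in dbt Cloud.

    Hypothetical helper: the exact spacing used by the real tool is an assumption.
    """
    tag = f"[[{identifier}]]"
    if use_desc_for_id:
        # The tag goes to the description, so the job name stays clean.
        new_description = f"{description} {tag}" if description else tag
        return name, new_description
    # Default behaviour: the tag is appended to the job name.
    return f"{name} {tag}", description


print(place_identifier("My daily job", "Runs at 6am", "my-daily-job"))
print(place_identifier("My daily job", "Runs at 6am", "my-daily-job", use_desc_for_id=True))
```

In the first call the job name carries the tag, and in the second call only the description does.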
@@ -25,7 +28,7 @@ Terraform is much more powerful but using it requires some knowledge about the t

With this package's approach, people don't need to learn another tool and can configure dbt Cloud using YAML, a language used across the dbt ecosystem:

-- **no state file required**: the link between the YAML jobs and the dbt Cloud jobs is stored in the jobs name, in the `[[<identifier>]]` part
+- **no state file required**: the link between the YAML jobs and the dbt Cloud jobs is stored in the jobs name, in the `[[<identifier>]]` part (or in the job description when using `--use-desc-for-id`)
- **YAML**: dbt users are familiar with YAML and we created a JSON schema allowing people to verify that their YAML files are correct
- by using filters like `--project-id`, `--environment-id` or `--limit-projects-envs-to-yml` people can limit the projects and environments checked by the tool, which can be used to "promote" jobs between different dbt Cloud environments

@@ -127,11 +130,13 @@ To do so, the program looks at the YAML file for the config `linked_id`.

Accepts a `--dry-run` flag to see what jobs would be changed, without actually changing them.

+When using `--use-desc-for-id`, the `[[ ... ]]` tag is stored in the job description instead of the job name.
+
#### `unlink`

Command: `dbt-jobs-as-code unlink --config <config_file_or_pattern.yml>` or `dbt-jobs-as-code unlink --account-id <account-id>`

-Unlinking jobs removes the `[[ ... ]]` part of the job name in dbt Cloud.
+Unlinking jobs removes the `[[ ... ]]` part of the job name (or description, when using `--use-desc-for-id`) in dbt Cloud.

⚠️ This can't be rolled back by the tool. Doing a `unlink` followed by a `sync` will create new instances of the jobs, with the `[[<identifier>]]` part

@@ -186,15 +191,15 @@ The tool will raise errors if:

### Summary of parameters

-| Command | `--project-id` / `-p` | `--environment-id` / `-e` | `--limit-projects-envs-to-yml` / `-l` | `--vars-yml` / `-v` | `--online` | `--job-id` / `-j` | `--identifier` / `-i` | `--dry-run` | `--include-linked-id` |
-| --------------- | :-------------------: | :-----------------------: | :-----------------------------------: | :-----------------: | :--------: | :---------------: | :-------------------: | :---------: | :-------------------: |
-| plan | ✅ | ✅ | ✅ | ✅ | | | | | |
-| sync | ✅ | ✅ | ✅ | ✅ | | | | | |
-| validate | | | | ✅ | ✅ | | | | |
-| import-jobs | ✅ | ✅ | | | | ✅ | | | ✅ |
-| link | | | | | | | | ✅ | |
-| unlink | | | | | | | ✅ | ✅ | |
-| deactivate-jobs | | | | | | ✅ | | | |
+| Command | `--project-id` / `-p` | `--environment-id` / `-e` | `--limit-projects-envs-to-yml` / `-l` | `--vars-yml` / `-v` | `--online` | `--job-id` / `-j` | `--identifier` / `-i` | `--dry-run` | `--include-linked-id` | `--use-desc-for-id` |
+| --------------- | :-------------------: | :-----------------------: | :-----------------------------------: | :-----------------: | :--------: | :---------------: | :-------------------: | :---------: | :-------------------: | :-----------------: |
+| plan | ✅ | ✅ | ✅ | ✅ | | | | | | ✅ |
+| sync | ✅ | ✅ | ✅ | ✅ | | | | | | ✅ |
+| validate | | | | ✅ | ✅ | | | | | ✅ |
+| import-jobs | ✅ | ✅ | | | | ✅ | | | ✅ | ✅ |
+| link | | | | | | | | ✅ | | ✅ |
+| unlink | | | | | | | ✅ | ✅ | | ✅ |
+| deactivate-jobs | | | | | | ✅ | | | | ✅ |

As a reminder, using `--project-id` and/or `--environment-id` is not compatible with using `--limit-projects-envs-to-yml`.
We can only restrict either by providing the IDs or by limiting the run to the environments and projects in the YML file.
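
The incompatibility between the ID filters and `--limit-projects-envs-to-yml` can be pictured as a simple guard. This is a sketch under assumptions: `check_filter_options` is a hypothetical name, and the error type and message raised by the real CLI are not shown in this diff.

```python
from typing import List, Optional


def check_filter_options(
    project_ids: Optional[List[int]],
    environment_ids: Optional[List[int]],
    limit_projects_envs_to_yml: bool,
) -> None:
    """Raise if explicit ID filters are combined with the YML-based restriction.

    Hypothetical helper: the real tool may report this differently.
    """
    if limit_projects_envs_to_yml and (project_ids or environment_ids):
        raise ValueError(
            "--project-id/--environment-id cannot be combined with "
            "--limit-projects-envs-to-yml"
        )


check_filter_options([123], [456], False)  # IDs only: accepted
check_filter_options(None, None, True)     # YML restriction only: accepted
```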
2 changes: 1 addition & 1 deletion docs/advanced_config/jobs_importing.md
@@ -112,7 +112,7 @@ To do, so the identifier of the job should be in the format `[[envs_filter:ident
- the jobs with an `envs_filter` that is empty are imported
- the jobs with an `envs_filter` equal to `*` are imported

-As an example, if a job is named `My daily job [[uat:my-daily-job]]` :
+As an example, if a job is named `My daily job [[uat:my-daily-job]]` (or has `[[uat:my-daily-job]]` in its description when using `--use-desc-for-id`) :

- `dbt-jobs-as-code import-jobs ... --filter uat` will import the job ✅
- `dbt-jobs-as-code import-jobs ...` without a filter will import the job ✅
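
The import rules listed in this hunk can be sketched as follows. `parse_tag` and `matches_filter` are hypothetical names, the regular expression is an assumption about the tag shape, and the behaviour for a filter that does not match (for example `--filter prod` against `uat`) is assumed to be "not imported", since that case is not visible in this hunk.

```python
import re
from typing import Optional, Tuple

# Assumed tag shape: [[identifier]] or [[envs_filter:identifier]]
TAG_PATTERN = re.compile(r"\[\[(?:(?P<envs>[^:\]]*):)?(?P<identifier>[^\]]+)\]\]")


def parse_tag(text: str) -> Optional[Tuple[str, str]]:
    """Return (envs_filter, identifier) found in a job name or description."""
    match = TAG_PATTERN.search(text)
    if match is None:
        return None
    return match.group("envs") or "", match.group("identifier")


def matches_filter(envs_filter: str, cli_filter: Optional[str]) -> bool:
    """Decide whether a tagged job is imported for a given --filter value."""
    if cli_filter is None:
        return True  # no --filter on the command line: the job is imported
    if envs_filter in ("", "*"):
        return True  # empty or wildcard envs_filter: always imported
    return envs_filter == cli_filter  # assumption: exact match required


envs, identifier = parse_tag("My daily job [[uat:my-daily-job]]")
print(envs, identifier, matches_filter(envs, "uat"), matches_filter(envs, None))
```

The same `parse_tag` call works on a description string when the tag is stored there instead of in the name.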
9 changes: 9 additions & 0 deletions docs/changelog.md
@@ -1,6 +1,15 @@

To see the details of all changes, head to the GitHub repo

+### 1.18
+
+- Add `--use-desc-for-id` flag (env var: `DBT_JOBS_AS_CODE_USE_DESC_FOR_ID`) to store the `[[<identifier>]]` tag in the job description instead of the job name. This is useful when keeping job names clean in the dbt Cloud UI is a requirement. Supported by all commands: `plan`, `sync`, `validate`, `import-jobs`, `link`, `unlink`, `deactivate-jobs`.
+
+### 1.17
+
+- Validate that job descriptions don't exceed the 255 character limit before sending to the API.
+- Drop Python 3.9 support (EOL) and update dependencies.
+
### 1.16

- Add support for `cost_optimization_features` in job definitions. Valid values are `state_aware_orchestration` and `efficient_testing`. This allows for dbt users on the Fusion engine to configure cost optimization natively in their YAML job definitions.
1 change: 1 addition & 0 deletions docs/getting_started.md
@@ -39,6 +39,7 @@ The following environment variables are used to run the code:

- `DBT_API_KEY`: [Mandatory] The dbt Cloud API key to interact with dbt Cloud. Can be a Service Token (preferred, would require the "job admin" scope) or the API token of a given user
- `DBT_BASE_URL`: [Optional] By default, the tool queries `https://cloud.getdbt.com`, if your dbt Cloud instance is hosted on another domain, define it in this env variable (e.g. `https://emea.dbt.com`)
+- `DBT_JOBS_AS_CODE_USE_DESC_FOR_ID`: [Optional] When set to `True`, stores the `[[<identifier>]]` tag in the job description instead of the job name. See `--use-desc-for-id` for details.
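
As an illustration of how a boolean environment variable such as `DBT_JOBS_AS_CODE_USE_DESC_FOR_ID` could be read, here is a minimal Python sketch. The documentation only mentions the value `True`; the other accepted spellings and the helper name `use_desc_for_id_from_env` are assumptions, and the real CLI may rely on its option parser instead.

```python
import os


def use_desc_for_id_from_env(default: bool = False) -> bool:
    """Read DBT_JOBS_AS_CODE_USE_DESC_FOR_ID as a boolean.

    Assumption: "true", "1" and "yes" (case-insensitive) count as enabled.
    """
    raw = os.environ.get("DBT_JOBS_AS_CODE_USE_DESC_FOR_ID")
    if raw is None:
        return default
    return raw.strip().lower() in {"true", "1", "yes"}


os.environ["DBT_JOBS_AS_CODE_USE_DESC_FOR_ID"] = "True"
print(use_desc_for_id_from_env())
```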

## How to use `dbt-jobs-as-code`

50 changes: 45 additions & 5 deletions src/dbt_jobs_as_code/client/__init__.py
@@ -1,4 +1,5 @@
 import os
+import re

 import requests
 from beartype.typing import Any, Dict, List, Optional
@@ -35,9 +36,11 @@ def __init__(
         api_key: Optional[str],
         base_url: str = "https://cloud.getdbt.com",
         disable_ssl_verification: bool = False,
+        use_desc_for_id: bool = False,
     ) -> None:
         self.account_id = account_id
         self._api_key = api_key
+        self._use_desc_for_id = use_desc_for_id
         self._environment_variable_cache: Dict[
             int, Dict[str, CustomEnvironmentVariablePayload]
         ] = {}
@@ -61,6 +64,30 @@ def _clear_env_var_cache(self, job_definition_id: Optional[int]) -> None:
         if job_definition_id in self._environment_variable_cache:
             del self._environment_variable_cache[job_definition_id]

+    @staticmethod
+    def _pre_process_job_data(data: dict) -> dict:
+        """Move [[identifier]] from description back to name for internal processing."""
+        description = data.get("description", "")
+        if not description:
+            return data
+
+        identifier_info = JobDefinition._extract_identifier_from_description(description)
+        if not identifier_info.identifier:
+            return data
+
+        data = dict(data)  # shallow copy to avoid mutating caller's dict
+        raw_id = identifier_info.raw_identifier
+        # Strip " [[raw_id]]" or "[[raw_id]]" from description (first occurrence only)
+        data["description"] = re.sub(
+            r" ?\[\[" + re.escape(raw_id) + r"\]\]",
+            "",
+            description,
+            count=1,
+        )
+        # Move identifier to name (where JobDefinition.__init__ expects it)
+        data["name"] = f"{data['name']} [[{raw_id}]]"
+        return data
+
     def _check_for_creds(self):
         """Confirm the presence of credentials"""
         if not self._api_key:
@@ -91,7 +118,7 @@ def update_job(self, job: JobDefinition) -> JobDefinition:
         response = self._session.post(  # Yes, it's actually a POST. Ew.
             url=f"{self.base_url}/api/v2/accounts/{self.account_id}/jobs/{job.id}/",
             headers=self._headers,
-            data=job.to_payload(),
+            data=job.to_payload(use_desc_for_id=self._use_desc_for_id),
             verify=self._verify,
         )

@@ -101,7 +128,11 @@ def update_job(self, job: JobDefinition) -> JobDefinition:
         else:
             logger.success("Job updated successfully.")

-        return JobDefinition(**(response.json()["data"]), identifier=job.identifier)
+        raw_data = response.json()["data"]
+        if self._use_desc_for_id:
+            raw_data = DBTCloud._pre_process_job_data(raw_data)
+            return JobDefinition(**raw_data)
+        return JobDefinition(**raw_data, identifier=job.identifier)

     def create_job(self, job: JobDefinition) -> Optional[JobDefinition]:
         """Create a dbt Cloud Job using a JobDefinition"""
@@ -111,7 +142,7 @@ def create_job(self, job: JobDefinition) -> Optional[JobDefinition]:
         response = self._session.post(
             url=f"{self.base_url}/api/v2/accounts/{self.account_id}/jobs/",
             headers=self._headers,
-            data=job.to_payload(),
+            data=job.to_payload(use_desc_for_id=self._use_desc_for_id),
             verify=self._verify,
         )

@@ -122,7 +153,11 @@ def create_job(self, job: JobDefinition) -> Optional[JobDefinition]:
         else:
             logger.success("Job created successfully.")

-        return JobDefinition(**(response.json()["data"]), identifier=job.identifier)
+        raw_data = response.json()["data"]
+        if self._use_desc_for_id:
+            raw_data = DBTCloud._pre_process_job_data(raw_data)
+            return JobDefinition(**raw_data)
+        return JobDefinition(**raw_data, identifier=job.identifier)

     def delete_job(self, job: JobDefinition) -> None:
         """Delete a dbt Cloud job."""
@@ -154,7 +189,10 @@ def get_job(self, job_id: int) -> JobDefinition:
         if response.status_code > 200:
             logger.error(f"Issue getting the job {job_id}")
             raise DBTCloudException(f"Error getting the job {job_id}")
-        return JobDefinition(**response.json()["data"])
+        raw_data = response.json()["data"]
+        if self._use_desc_for_id:
+            raw_data = DBTCloud._pre_process_job_data(raw_data)
+        return JobDefinition(**raw_data)

     def get_job_missing_fields(self, job_id: int) -> Optional[JobMissingFields]:
         """Generate a Job based on a dbt Cloud job."""
@@ -191,6 +229,8 @@ def get_jobs(
         else:
             jobs = self._fetch_jobs(project_ids, None)

+        if self._use_desc_for_id:
+            jobs = [DBTCloud._pre_process_job_data(job) for job in jobs]
         return [JobDefinition(**job) for job in jobs]

     def _fetch_jobs(self, project_ids: List[int], environment_id: Optional[int]) -> List[dict]:
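
For readers who want to try the description-to-name move outside the client class, here is a standalone sketch of the same logic as `_pre_process_job_data`. `JobDefinition._extract_identifier_from_description` is not shown in this diff, so the regex-based `extract_raw_identifier` below is a hypothetical stand-in and may not match the real extraction rules.

```python
import re
from typing import Optional

# Assumption: a tag is any "[[...]]" group. The real extraction lives in
# JobDefinition._extract_identifier_from_description, which is not shown here.
_TAG = re.compile(r"\[\[([^\]]+)\]\]")


def extract_raw_identifier(description: str) -> Optional[str]:
    """Hypothetical stand-in returning the text between [[ and ]]."""
    match = _TAG.search(description)
    return match.group(1) if match else None


def pre_process_job_data(data: dict) -> dict:
    """Move the [[identifier]] tag from the description back to the name."""
    description = data.get("description", "")
    if not description:
        return data
    raw_id = extract_raw_identifier(description)
    if not raw_id:
        return data
    data = dict(data)  # shallow copy so the caller's dict is left untouched
    # Remove the first " [[raw_id]]" or "[[raw_id]]" from the description
    data["description"] = re.sub(
        r" ?\[\[" + re.escape(raw_id) + r"\]\]", "", description, count=1
    )
    data["name"] = f"{data['name']} [[{raw_id}]]"
    return data


api_job = {"name": "My daily job", "description": "Runs at 6am [[uat:my-daily-job]]"}
processed = pre_process_job_data(api_job)
print(processed)
# {'name': 'My daily job [[uat:my-daily-job]]', 'description': 'Runs at 6am'}
print(api_job["name"])
# My daily job
```

Because the function returns a copy, the raw API payload stays available for logging or comparison after processing.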
2 changes: 2 additions & 0 deletions src/dbt_jobs_as_code/cloud_yaml_mapping/change_set.py
@@ -235,6 +235,7 @@ def build_change_set(
     limit_projects_envs_to_yml: bool = False,
     exclude_identifiers_matching: Optional[str] = None,
     output_json: bool = False,
+    use_desc_for_id: bool = False,
 ):
     """Compares the config of YML files versus dbt Cloud.
     Depending on the value of no_update, it will either update the dbt Cloud config or not.
@@ -284,6 +285,7 @@ def build_change_set(
         api_key=os.environ.get("DBT_API_KEY"),
         base_url=os.environ.get("DBT_BASE_URL", "https://cloud.getdbt.com"),
         disable_ssl_verification=disable_ssl_verification,
+        use_desc_for_id=use_desc_for_id,
     )

     cloud_jobs = dbt_cloud.get_jobs(project_ids=project_ids, environment_ids=environment_ids)