
Commit 1645dbc

Merge pull request #208 from dbt-labs/feature/use-desc-for-id
feat: store [[identifier]] in job description with --use-desc-for-id
2 parents 4856cfa + 95c07a9 commit 1645dbc

12 files changed

Lines changed: 805 additions & 26 deletions

File tree

README.md

Lines changed: 16 additions & 11 deletions
Original file line number | Diff line number | Diff line change
@@ -13,6 +13,9 @@ The way we differentiate jobs defined from code from the ones defined from the U
1313

1414
⚠️ Important: If you plan to use this tool but have existing jobs ending with `[[...]]` you should rename them before running any command.
1515

16+
> [!NOTE]
17+
> If keeping job names clean in the dbt Cloud UI is a requirement, the `--use-desc-for-id` flag (or `DBT_JOBS_AS_CODE_USE_DESC_FOR_ID=True` env var) moves the `[[<identifier>]]` tag from the job name to the job description instead. This is an advanced option intended for specific cases — the default name-based approach is recommended for most users.
18+
1619
Below is a demonstration of how to use dbt-jobs-as-code as part of CI/CD, leveraging the new templating features.
1720

1821
[!<img src="screenshot.png" width="600">](https://www.loom.com/share/7c263c560d2044cea9fc82ac8ec125ea?sid=4c2fe693-0aa5-4021-9e94-69d826f3eac5)
@@ -25,7 +28,7 @@ Terraform is much more powerful but using it requires some knowledge about the t
2528

2629
With this package's approach, people don't need to learn another tool and can configure dbt Cloud using YAML, a language used across the dbt ecosystem:
2730

28-
- **no state file required**: the link between the YAML jobs and the dbt Cloud jobs is stored in the jobs name, in the `[[<identifier>]]` part
31+
- **no state file required**: the link between the YAML jobs and the dbt Cloud jobs is stored in the jobs name, in the `[[<identifier>]]` part (or in the job description when using `--use-desc-for-id`)
2932
- **YAML**: dbt users are familiar with YAML and we created a JSON schema allowing people to verify that their YAML files are correct
3033
- by using filters like `--project-id`, `--environment-id` or `--limit-projects-envs-to-yml` people can limit the projects and environments checked by the tool, which can be used to "promote" jobs between different dbt Cloud environments
3134

@@ -127,11 +130,13 @@ To do so, the program looks at the YAML file for the config `linked_id`.
127130

128131
Accepts a `--dry-run` flag to see what jobs would be changed, without actually changing them.
129132

133+
When using `--use-desc-for-id`, the `[[ ... ]]` tag is stored in the job description instead of the job name.
134+
130135
#### `unlink`
131136

132137
Command: `dbt-jobs-as-code unlink --config <config_file_or_pattern.yml>` or `dbt-jobs-as-code unlink --account-id <account-id>`
133138

134-
Unlinking jobs removes the `[[ ... ]]` part of the job name in dbt Cloud.
139+
Unlinking jobs removes the `[[ ... ]]` part of the job name (or description, when using `--use-desc-for-id`) in dbt Cloud.
135140

136141
⚠️ This can't be rolled back by the tool. Doing a `unlink` followed by a `sync` will create new instances of the jobs, with the `[[<identifier>]]` part
137142

@@ -186,15 +191,15 @@ The tool will raise errors if:
186191

187192
### Summary of parameters
188193

189-
| Command | `--project-id` / `-p` | `--environment-id` / `-e` | `--limit-projects-envs-to-yml` / `-l` | `--vars-yml` / `-v` | `--online` | `--job-id` / `-j` | `--identifier` / `-i` | `--dry-run` | `--include-linked-id` |
190-
| --------------- | :-------------------: | :-----------------------: | :-----------------------------------: | :-----------------: | :--------: | :---------------: | :-------------------: | :---------: | :-------------------: |
191-
| plan | ✅ | ✅ | ✅ | ✅ | | | | | |
192-
| sync | ✅ | ✅ | ✅ | ✅ | | | | | |
193-
| validate | | | | ✅ | ✅ | | | | |
194-
| import-jobs | ✅ | ✅ | | | | ✅ | | | ✅ |
195-
| link | | | | | | | | ✅ | |
196-
| unlink | | | | | | | ✅ | ✅ | |
197-
| deactivate-jobs | | | | | | ✅ | | | |
194+
| Command | `--project-id` / `-p` | `--environment-id` / `-e` | `--limit-projects-envs-to-yml` / `-l` | `--vars-yml` / `-v` | `--online` | `--job-id` / `-j` | `--identifier` / `-i` | `--dry-run` | `--include-linked-id` | `--use-desc-for-id` |
195+
| --------------- | :-------------------: | :-----------------------: | :-----------------------------------: | :-----------------: | :--------: | :---------------: | :-------------------: | :---------: | :-------------------: | :-----------------: |
196+
| plan | ✅ | ✅ | ✅ | ✅ | | | | | | ✅ |
197+
| sync | ✅ | ✅ | ✅ | ✅ | | | | | | ✅ |
198+
| validate | | | | ✅ | ✅ | | | | | ✅ |
199+
| import-jobs | ✅ | ✅ | | | | ✅ | | | ✅ | ✅ |
200+
| link | | | | | | | | ✅ | | ✅ |
201+
| unlink | | | | | | | ✅ | ✅ | | ✅ |
202+
| deactivate-jobs | | | | | | ✅ | | | | ✅ |
198203

199204
As a reminder, using `--project-id` and/or `--environment-id` is not compatible with using `--limit-projects-envs-to-yml`.
200205
We can only restrict the scope either by providing the IDs or by limiting it to the environments and projects in the YML file.
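
As a rough illustration of the two modes described in this README diff, the sketch below shows where the `[[<identifier>]]` tag would end up with and without `--use-desc-for-id`. The helper `place_identifier` is hypothetical and is not part of dbt-jobs-as-code; the exact spacing used when appending the tag to an existing description is an assumption.

```python
def place_identifier(
    name: str,
    description: str,
    identifier: str,
    use_desc_for_id: bool = False,
) -> dict:
    """Hypothetical helper: return the name/description pair sent to dbt Cloud."""
    tag = f"[[{identifier}]]"
    if use_desc_for_id:
        # Assumption: the tag is appended after a space, or stands alone
        # when the description is empty.
        new_description = f"{description} {tag}" if description else tag
        return {"name": name, "description": new_description}
    # Default behaviour: the tag lives at the end of the job name.
    return {"name": f"{name} {tag}", "description": description}


print(place_identifier("My daily job", "Runs every morning", "my-daily-job"))
# {'name': 'My daily job [[my-daily-job]]', 'description': 'Runs every morning'}
print(place_identifier("My daily job", "Runs every morning", "my-daily-job", use_desc_for_id=True))
# {'name': 'My daily job', 'description': 'Runs every morning [[my-daily-job]]'}
```

Since the changelog for 1.17 mentions a 255-character limit on descriptions, the tag presumably counts toward that limit when it is stored in the description.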

docs/advanced_config/jobs_importing.md

Lines changed: 1 addition & 1 deletion
@@ -112,7 +112,7 @@ To do, so the identifier of the job should be in the format `[[envs_filter:ident
112112
- the jobs with an `envs_filter` that is empty are imported
113113
- the jobs with an `envs_filter` equal to `*` are imported
114114

115-
As an example, if a job is named `My daily job [[uat:my-daily-job]]` :
115+
As an example, if a job is named `My daily job [[uat:my-daily-job]]` (or has `[[uat:my-daily-job]]` in its description when using `--use-desc-for-id`) :
116116

117117
- `dbt-jobs-as-code import-jobs ... --filter uat` will import the job ✅
118118
- `dbt-jobs-as-code import-jobs ...` without a filter will import the job ✅
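
The filtering rules above can be sketched roughly as follows. This is only an illustration of the documented behaviour, not the tool's implementation: `parse_tag` and `matches_filter` are hypothetical names, the regular expression is an assumption about the tag format, and the sketch assumes `envs_filter` holds a single value compared by exact match.

```python
import re
from typing import Optional, Tuple

# Assumed tag format: [[identifier]] or [[envs_filter:identifier]]
TAG_PATTERN = re.compile(
    r"\[\[(?:(?P<envs_filter>[^:\]]*):)?(?P<identifier>[^\]]+)\]\]"
)


def parse_tag(text: str) -> Optional[Tuple[str, str]]:
    """Return (envs_filter, identifier) found in a job name or description."""
    match = TAG_PATTERN.search(text)
    if match is None:
        return None
    return (match.group("envs_filter") or "", match.group("identifier"))


def matches_filter(envs_filter: str, cli_filter: Optional[str]) -> bool:
    """Decide whether a tagged job passes the value given to --filter."""
    if cli_filter is None:
        return True  # no --filter: the job is imported
    if envs_filter in ("", "*"):
        return True  # empty or wildcard filters are always imported
    return envs_filter == cli_filter  # assumption: exact, single-value match


envs_filter, identifier = parse_tag("My daily job [[uat:my-daily-job]]")
print(envs_filter, identifier)             # uat my-daily-job
print(matches_filter(envs_filter, "uat"))  # True
print(matches_filter(envs_filter, None))   # True
```

The same parsing applies whether the tag is read from the job name or, with `--use-desc-for-id`, from the job description.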

docs/changelog.md

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,15 @@
11

22
To see the details of all changes, head to the GitHub repo
33

4+
### 1.18
5+
6+
- Add `--use-desc-for-id` flag (env var: `DBT_JOBS_AS_CODE_USE_DESC_FOR_ID`) to store the `[[<identifier>]]` tag in the job description instead of the job name. This is useful when keeping job names clean in the dbt Cloud UI is a requirement. Supported by all commands: `plan`, `sync`, `validate`, `import-jobs`, `link`, `unlink`, `deactivate-jobs`.
7+
8+
### 1.17
9+
10+
- Validate that job descriptions don't exceed the 255 character limit before sending to the API.
11+
- Drop Python 3.9 support (EOL) and update dependencies.
12+
413
### 1.16
514

615
- Add support for `cost_optimization_features` in job definitions. Valid values are `state_aware_orchestration` and `efficient_testing`. This allows for dbt users on the Fusion engine to configure cost optimization natively in their YAML job definitions.

docs/getting_started.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -39,6 +39,7 @@ The following environment variables are used to run the code:
3939

4040
- `DBT_API_KEY`: [Mandatory] The dbt Cloud API key to interact with dbt Cloud. Can be a Service Token (preferred, would require the "job admin" scope) or the API token of a given user
4141
- `DBT_BASE_URL`: [Optional] By default, the tool queries `https://cloud.getdbt.com`, if your dbt Cloud instance is hosted on another domain, define it in this env variable (e.g. `https://emea.dbt.com`)
42+
- `DBT_JOBS_AS_CODE_USE_DESC_FOR_ID`: [Optional] When set to `True`, stores the `[[<identifier>]]` tag in the job description instead of the job name. See `--use-desc-for-id` for details.
4243

4344
## How to use `dbt-jobs-as-code`
4445

src/dbt_jobs_as_code/client/__init__.py

Lines changed: 45 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,5 @@
11
import os
2+
import re
23

34
import requests
45
from beartype.typing import Any, Dict, List, Optional
@@ -35,9 +36,11 @@ def __init__(
3536
api_key: Optional[str],
3637
base_url: str = "https://cloud.getdbt.com",
3738
disable_ssl_verification: bool = False,
39+
use_desc_for_id: bool = False,
3840
) -> None:
3941
self.account_id = account_id
4042
self._api_key = api_key
43+
self._use_desc_for_id = use_desc_for_id
4144
self._environment_variable_cache: Dict[
4245
int, Dict[str, CustomEnvironmentVariablePayload]
4346
] = {}
@@ -61,6 +64,30 @@ def _clear_env_var_cache(self, job_definition_id: Optional[int]) -> None:
6164
if job_definition_id in self._environment_variable_cache:
6265
del self._environment_variable_cache[job_definition_id]
6366

67+
@staticmethod
68+
def _pre_process_job_data(data: dict) -> dict:
69+
"""Move [[identifier]] from description back to name for internal processing."""
70+
description = data.get("description", "")
71+
if not description:
72+
return data
73+
74+
identifier_info = JobDefinition._extract_identifier_from_description(description)
75+
if not identifier_info.identifier:
76+
return data
77+
78+
data = dict(data) # shallow copy to avoid mutating caller's dict
79+
raw_id = identifier_info.raw_identifier
80+
# Strip " [[raw_id]]" or "[[raw_id]]" from description (first occurrence only)
81+
data["description"] = re.sub(
82+
r" ?\[\[" + re.escape(raw_id) + r"\]\]",
83+
"",
84+
description,
85+
count=1,
86+
)
87+
# Move identifier to name (where JobDefinition.__init__ expects it)
88+
data["name"] = f"{data['name']} [[{raw_id}]]"
89+
return data
90+
6491
def _check_for_creds(self):
6592
"""Confirm the presence of credentials"""
6693
if not self._api_key:
@@ -91,7 +118,7 @@ def update_job(self, job: JobDefinition) -> JobDefinition:
91118
response = self._session.post( # Yes, it's actually a POST. Ew.
92119
url=f"{self.base_url}/api/v2/accounts/{self.account_id}/jobs/{job.id}/",
93120
headers=self._headers,
94-
data=job.to_payload(),
121+
data=job.to_payload(use_desc_for_id=self._use_desc_for_id),
95122
verify=self._verify,
96123
)
97124

@@ -101,7 +128,11 @@ def update_job(self, job: JobDefinition) -> JobDefinition:
101128
else:
102129
logger.success("Job updated successfully.")
103130

104-
return JobDefinition(**(response.json()["data"]), identifier=job.identifier)
131+
raw_data = response.json()["data"]
132+
if self._use_desc_for_id:
133+
raw_data = DBTCloud._pre_process_job_data(raw_data)
134+
return JobDefinition(**raw_data)
135+
return JobDefinition(**raw_data, identifier=job.identifier)
105136

106137
def create_job(self, job: JobDefinition) -> Optional[JobDefinition]:
107138
"""Create a dbt Cloud Job using a JobDefinition"""
@@ -111,7 +142,7 @@ def create_job(self, job: JobDefinition) -> Optional[JobDefinition]:
111142
response = self._session.post(
112143
url=f"{self.base_url}/api/v2/accounts/{self.account_id}/jobs/",
113144
headers=self._headers,
114-
data=job.to_payload(),
145+
data=job.to_payload(use_desc_for_id=self._use_desc_for_id),
115146
verify=self._verify,
116147
)
117148

@@ -122,7 +153,11 @@ def create_job(self, job: JobDefinition) -> Optional[JobDefinition]:
122153
else:
123154
logger.success("Job created successfully.")
124155

125-
return JobDefinition(**(response.json()["data"]), identifier=job.identifier)
156+
raw_data = response.json()["data"]
157+
if self._use_desc_for_id:
158+
raw_data = DBTCloud._pre_process_job_data(raw_data)
159+
return JobDefinition(**raw_data)
160+
return JobDefinition(**raw_data, identifier=job.identifier)
126161

127162
def delete_job(self, job: JobDefinition) -> None:
128163
"""Delete a dbt Cloud job."""
@@ -154,7 +189,10 @@ def get_job(self, job_id: int) -> JobDefinition:
154189
if response.status_code > 200:
155190
logger.error(f"Issue getting the job {job_id}")
156191
raise DBTCloudException(f"Error getting the job {job_id}")
157-
return JobDefinition(**response.json()["data"])
192+
raw_data = response.json()["data"]
193+
if self._use_desc_for_id:
194+
raw_data = DBTCloud._pre_process_job_data(raw_data)
195+
return JobDefinition(**raw_data)
158196

159197
def get_job_missing_fields(self, job_id: int) -> Optional[JobMissingFields]:
160198
"""Generate a Job based on a dbt Cloud job."""
@@ -191,6 +229,8 @@ def get_jobs(
191229
else:
192230
jobs = self._fetch_jobs(project_ids, None)
193231

232+
if self._use_desc_for_id:
233+
jobs = [DBTCloud._pre_process_job_data(job) for job in jobs]
194234
return [JobDefinition(**job) for job in jobs]
195235

196236
def _fetch_jobs(self, project_ids: List[int], environment_id: Optional[int]) -> List[dict]:
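
To make the effect of `_pre_process_job_data` in this file easier to follow, here is a standalone sketch of the same transformation. It is not the code from this commit: `JobDefinition._extract_identifier_from_description` is not shown in the diff, so the sketch substitutes a plain regular expression (`extract_raw_identifier`, a hypothetical stand-in) that takes the first `[[...]]` tag it finds.

```python
import re
from typing import Optional

_TAG = re.compile(r"\[\[([^\]]+)\]\]")


def extract_raw_identifier(description: str) -> Optional[str]:
    """Stand-in for the real extractor: return the text inside the first [[...]]."""
    match = _TAG.search(description)
    return match.group(1) if match else None


def pre_process_job_data(data: dict) -> dict:
    """Move the [[identifier]] tag from the description back to the name."""
    description = data.get("description", "")
    if not description:
        return data
    raw_id = extract_raw_identifier(description)
    if not raw_id:
        return data
    data = dict(data)  # shallow copy so the caller's dict is not mutated
    # Remove the first " [[raw_id]]" (or "[[raw_id]]") from the description.
    data["description"] = re.sub(
        r" ?\[\[" + re.escape(raw_id) + r"\]\]", "", description, count=1
    )
    data["name"] = f"{data['name']} [[{raw_id}]]"
    return data


api_job = {"name": "My daily job", "description": "Runs daily [[uat:my-daily-job]]"}
print(pre_process_job_data(api_job))
# {'name': 'My daily job [[uat:my-daily-job]]', 'description': 'Runs daily'}
```

After this step the job data looks the same as in the default name-based mode, which is presumably why the rest of the client can build `JobDefinition` objects without further changes.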

src/dbt_jobs_as_code/cloud_yaml_mapping/change_set.py

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -235,6 +235,7 @@ def build_change_set(
235235
limit_projects_envs_to_yml: bool = False,
236236
exclude_identifiers_matching: Optional[str] = None,
237237
output_json: bool = False,
238+
use_desc_for_id: bool = False,
238239
):
239240
"""Compares the config of YML files versus dbt Cloud.
240241
Depending on the value of no_update, it will either update the dbt Cloud config or not.
@@ -284,6 +285,7 @@ def build_change_set(
284285
api_key=os.environ.get("DBT_API_KEY"),
285286
base_url=os.environ.get("DBT_BASE_URL", "https://cloud.getdbt.com"),
286287
disable_ssl_verification=disable_ssl_verification,
288+
use_desc_for_id=use_desc_for_id,
287289
)
288290

289291
cloud_jobs = dbt_cloud.get_jobs(project_ids=project_ids, environment_ids=environment_ids)
