diff --git a/README.md b/README.md
index 52af3d8..1cca9ab 100644
--- a/README.md
+++ b/README.md
@@ -13,6 +13,9 @@ The way we differentiate jobs defined from code from the ones defined from the U

 ⚠️ Important: If you plan to use this tool but have existing jobs ending with `[[...]]` you should rename them before running any command.

+> [!NOTE]
+> If keeping job names clean in the dbt Cloud UI is a requirement, the `--use-desc-for-id` flag (or `DBT_JOBS_AS_CODE_USE_DESC_FOR_ID=True` env var) moves the `[[]]` tag from the job name to the job description instead. This is an advanced option intended for specific cases — the default name-based approach is recommended for most users.
+
 Below is a demonstration of how to use dbt-jobs-as-code as part of CI/CD, leveraging the new templating features.

 [!](https://www.loom.com/share/7c263c560d2044cea9fc82ac8ec125ea?sid=4c2fe693-0aa5-4021-9e94-69d826f3eac5)
@@ -25,7 +28,7 @@ Terraform is much more powerful but using it requires some knowledge about the t

 With this package's approach, people don't need to learn another tool and can configure dbt Cloud using YAML, a language used across the dbt ecosystem:

-- **no state file required**: the link between the YAML jobs and the dbt Cloud jobs is stored in the jobs name, in the `[[]]` part
+- **no state file required**: the link between the YAML jobs and the dbt Cloud jobs is stored in the jobs name, in the `[[]]` part (or in the job description when using `--use-desc-for-id`)
 - **YAML**: dbt users are familiar with YAML and we created a JSON schema allowing people to verify that their YAML files are correct
 - by using filters like `--project-id`, `--environment-id` or `--limit-projects-envs-to-yml` people can limit the projects and environments checked by the tool, which can be used to "promote" jobs between different dbt Cloud environments

@@ -127,11 +130,13 @@ To do so, the program looks at the YAML file for the config `linked_id`.

 Accepts a `--dry-run` flag to see what jobs would be changed, without actually changing them.

+When using `--use-desc-for-id`, the `[[ ... ]]` tag is stored in the job description instead of the job name.
+
 #### `unlink`

 Command: `dbt-jobs-as-code unlink --config ` or `dbt-jobs-as-code unlink --account-id `

-Unlinking jobs removes the `[[ ... ]]` part of the job name in dbt Cloud.
+Unlinking jobs removes the `[[ ... ]]` part of the job name (or description, when using `--use-desc-for-id`) in dbt Cloud.

 ⚠️ This can't be rolled back by the tool. Doing a `unlink` followed by a `sync` will create new instances of the jobs, with the `[[]]` part
@@ -186,15 +191,15 @@ The tool will raise errors if:

 ### Summary of parameters

-| Command | `--project-id` / `-p` | `--environment-id` / `-e` | `--limit-projects-envs-to-yml` / `-l` | `--vars-yml` / `-v` | `--online` | `--job-id` / `-j` | `--identifier` / `-i` | `--dry-run` | `--include-linked-id` |
-| --------------- | :-------------------: | :-----------------------: | :-----------------------------------: | :-----------------: | :--------: | :---------------: | :-------------------: | :---------: | :-------------------: |
-| plan | ✅ | ✅ | ✅ | ✅ | | | | | |
-| sync | ✅ | ✅ | ✅ | ✅ | | | | | |
-| validate | | | | ✅ | ✅ | | | | |
-| import-jobs | ✅ | ✅ | | | | ✅ | | | ✅ |
-| link | | | | | | | | ✅ | |
-| unlink | | | | | | | ✅ | ✅ | |
-| deactivate-jobs | | | | | | ✅ | | | |
+| Command | `--project-id` / `-p` | `--environment-id` / `-e` | `--limit-projects-envs-to-yml` / `-l` | `--vars-yml` / `-v` | `--online` | `--job-id` / `-j` | `--identifier` / `-i` | `--dry-run` | `--include-linked-id` | `--use-desc-for-id` |
+| --------------- | :-------------------: | :-----------------------: | :-----------------------------------: | :-----------------: | :--------: | :---------------: | :-------------------: | :---------: | :-------------------: | :-----------------: |
+| plan | ✅ | ✅ | ✅ | ✅ | | | | | | ✅ |
+| sync | ✅ | ✅ | ✅ | ✅ | | | | | | ✅ |
+| validate | | | | ✅ | ✅ | | | | | ✅ |
+| import-jobs | ✅ | ✅ | | | | ✅ | | | ✅ | ✅ |
+| link | | | | | | | | ✅ | | ✅ |
+| unlink | | | | | | | ✅ | ✅ | | ✅ |
+| deactivate-jobs | | | | | | ✅ | | | | ✅ |

 As a reminder using `--project-id` and/or `--environment-id` is not compatible with using `--limit-projects-envs-to-yml`. We can only restricts by providing the IDs or by forcing to restrict on the environments and projects in the YML file.
diff --git a/docs/advanced_config/jobs_importing.md b/docs/advanced_config/jobs_importing.md
index 789a6af..4c55dd5 100644
--- a/docs/advanced_config/jobs_importing.md
+++ b/docs/advanced_config/jobs_importing.md
@@ -112,7 +112,7 @@ To do, so the identifier of the job should be in the format `[[envs_filter:ident
 - the jobs with an `envs_filter` that is empty are imported
 - the jobs with an `envs_filter` equal to `*` are imported

-As an example, if a job is named `My daily job [[uat:my-daily-job]]` :
+As an example, if a job is named `My daily job [[uat:my-daily-job]]` (or has `[[uat:my-daily-job]]` in its description when using `--use-desc-for-id`) :

 - `dbt-jobs-as-code import-jobs ... --filter uat` will import the job ✅
 - `dbt-jobs-as-code import-jobs ...` without a filter will import the job ✅
diff --git a/docs/changelog.md b/docs/changelog.md
index b5576b4..1e5e7fe 100644
--- a/docs/changelog.md
+++ b/docs/changelog.md
@@ -1,6 +1,15 @@
 To see the details of all changes, head to the GitHub repo

+### 1.18
+
+- Add `--use-desc-for-id` flag (env var: `DBT_JOBS_AS_CODE_USE_DESC_FOR_ID`) to store the `[[]]` tag in the job description instead of the job name. This is useful when keeping job names clean in the dbt Cloud UI is a requirement. Supported by all commands: `plan`, `sync`, `validate`, `import-jobs`, `link`, `unlink`, `deactivate-jobs`.
+
+### 1.17
+
+- Validate that job descriptions don't exceed the 255 character limit before sending to the API.
+- Drop Python 3.9 support (EOL) and update dependencies.
+
 ### 1.16

 - Add support for `cost_optimization_features` in job definitions. Valid values are `state_aware_orchestration` and `efficient_testing`. This allows for dbt users on the Fusion engine to configure cost optimization natively in their YAML job definitions.
diff --git a/docs/getting_started.md b/docs/getting_started.md
index 003e6ec..2370885 100644
--- a/docs/getting_started.md
+++ b/docs/getting_started.md
@@ -39,6 +39,7 @@ The following environment variables are used to run the code:

 - `DBT_API_KEY`: [Mandatory] The dbt Cloud API key to interact with dbt Cloud. Can be a Service Token (preferred, would require the "job admin" scope) or the API token of a given user
 - `DBT_BASE_URL`: [Optional] By default, the tool queries `https://cloud.getdbt.com`, if your dbt Cloud instance is hosted on another domain, define it in this env variable (e.g. `https://emea.dbt.com`)
+- `DBT_JOBS_AS_CODE_USE_DESC_FOR_ID`: [Optional] When set to `True`, stores the `[[]]` tag in the job description instead of the job name. See `--use-desc-for-id` for details.

 ## How to use `dbt-jobs-as-code`

diff --git a/src/dbt_jobs_as_code/client/__init__.py b/src/dbt_jobs_as_code/client/__init__.py
index 2c887ee..ac6f100 100644
--- a/src/dbt_jobs_as_code/client/__init__.py
+++ b/src/dbt_jobs_as_code/client/__init__.py
@@ -1,4 +1,5 @@
 import os
+import re

 import requests
 from beartype.typing import Any, Dict, List, Optional
@@ -35,9 +36,11 @@ def __init__(
         api_key: Optional[str],
         base_url: str = "https://cloud.getdbt.com",
         disable_ssl_verification: bool = False,
+        use_desc_for_id: bool = False,
     ) -> None:
         self.account_id = account_id
         self._api_key = api_key
+        self._use_desc_for_id = use_desc_for_id
         self._environment_variable_cache: Dict[
             int, Dict[str, CustomEnvironmentVariablePayload]
         ] = {}
@@ -61,6 +64,30 @@ def _clear_env_var_cache(self, job_definition_id: Optional[int]) -> None:
         if job_definition_id in self._environment_variable_cache:
             del self._environment_variable_cache[job_definition_id]

+    @staticmethod
+    def _pre_process_job_data(data: dict) -> dict:
+        """Move [[identifier]] from description back to name for internal processing."""
+        description = data.get("description", "")
+        if not description:
+            return data
+
+        identifier_info = JobDefinition._extract_identifier_from_description(description)
+        if not identifier_info.identifier:
+            return data
+
+        data = dict(data)  # shallow copy to avoid mutating caller's dict
+        raw_id = identifier_info.raw_identifier
+        # Strip " [[raw_id]]" or "[[raw_id]]" from description (first occurrence only)
+        data["description"] = re.sub(
+            r" ?\[\[" + re.escape(raw_id) + r"\]\]",
+            "",
+            description,
+            count=1,
+        )
+        # Move identifier to name (where JobDefinition.__init__ expects it)
+        data["name"] = f"{data['name']} [[{raw_id}]]"
+        return data
+
     def _check_for_creds(self):
         """Confirm the presence of credentials"""
         if not self._api_key:
@@ -91,7 +118,7 @@ def update_job(self, job: JobDefinition) -> JobDefinition:
         response = self._session.post(  # Yes, it's actually a POST. Ew.
             url=f"{self.base_url}/api/v2/accounts/{self.account_id}/jobs/{job.id}/",
             headers=self._headers,
-            data=job.to_payload(),
+            data=job.to_payload(use_desc_for_id=self._use_desc_for_id),
             verify=self._verify,
         )

@@ -101,7 +128,11 @@ def update_job(self, job: JobDefinition) -> JobDefinition:
         else:
             logger.success("Job updated successfully.")

-        return JobDefinition(**(response.json()["data"]), identifier=job.identifier)
+        raw_data = response.json()["data"]
+        if self._use_desc_for_id:
+            raw_data = DBTCloud._pre_process_job_data(raw_data)
+            return JobDefinition(**raw_data)
+        return JobDefinition(**raw_data, identifier=job.identifier)

     def create_job(self, job: JobDefinition) -> Optional[JobDefinition]:
         """Create a dbt Cloud Job using a JobDefinition"""
@@ -111,7 +142,7 @@ def create_job(self, job: JobDefinition) -> Optional[JobDefinition]:
         response = self._session.post(
             url=f"{self.base_url}/api/v2/accounts/{self.account_id}/jobs/",
             headers=self._headers,
-            data=job.to_payload(),
+            data=job.to_payload(use_desc_for_id=self._use_desc_for_id),
             verify=self._verify,
         )

@@ -122,7 +153,11 @@ def create_job(self, job: JobDefinition) -> Optional[JobDefinition]:
         else:
             logger.success("Job created successfully.")

-        return JobDefinition(**(response.json()["data"]), identifier=job.identifier)
+        raw_data = response.json()["data"]
+        if self._use_desc_for_id:
+            raw_data = DBTCloud._pre_process_job_data(raw_data)
+            return JobDefinition(**raw_data)
+        return JobDefinition(**raw_data, identifier=job.identifier)

     def delete_job(self, job: JobDefinition) -> None:
         """Delete a dbt Cloud job."""
@@ -154,7 +189,10 @@ def get_job(self, job_id: int) -> JobDefinition:
         if response.status_code > 200:
             logger.error(f"Issue getting the job {job_id}")
             raise DBTCloudException(f"Error getting the job {job_id}")
-        return JobDefinition(**response.json()["data"])
+        raw_data = response.json()["data"]
+        if self._use_desc_for_id:
+            raw_data = DBTCloud._pre_process_job_data(raw_data)
+        return JobDefinition(**raw_data)

     def get_job_missing_fields(self, job_id: int) -> Optional[JobMissingFields]:
         """Generate a Job based on a dbt Cloud job."""
@@ -191,6 +229,8 @@ def get_jobs(
         else:
             jobs = self._fetch_jobs(project_ids, None)

+        if self._use_desc_for_id:
+            jobs = [DBTCloud._pre_process_job_data(job) for job in jobs]
         return [JobDefinition(**job) for job in jobs]

     def _fetch_jobs(self, project_ids: List[int], environment_id: Optional[int]) -> List[dict]:
diff --git a/src/dbt_jobs_as_code/cloud_yaml_mapping/change_set.py b/src/dbt_jobs_as_code/cloud_yaml_mapping/change_set.py
index 2c3b458..be528ac 100644
--- a/src/dbt_jobs_as_code/cloud_yaml_mapping/change_set.py
+++ b/src/dbt_jobs_as_code/cloud_yaml_mapping/change_set.py
@@ -235,6 +235,7 @@ def build_change_set(
     limit_projects_envs_to_yml: bool = False,
     exclude_identifiers_matching: Optional[str] = None,
     output_json: bool = False,
+    use_desc_for_id: bool = False,
 ):
     """Compares the config of YML files versus dbt Cloud.
     Depending on the value of no_update, it will either update the dbt Cloud config or not.
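
The client changes in `client/__init__.py` amount to a round trip: the tag is appended to the description when writing to dbt Cloud and moved back to the name when reading. The following is a minimal standalone sketch of that round trip, not the project's code. `TAG_RE`, `embed_in_description` and `move_tag_to_name` are hypothetical names, and the regex is an assumption standing in for `JobDefinition._extract_identifier_from_name`, whose real pattern is not shown in this diff.

```python
import re

# Assumed tag pattern for illustration only; the real extraction lives in
# JobDefinition._extract_identifier_from_name and may accept a different set.
TAG_RE = re.compile(r"\[\[([a-zA-Z0-9_:*-]+)\]\]")


def embed_in_description(description: str, raw_id: str) -> str:
    """Append the [[raw_id]] tag to a description (write path, description mode)."""
    return f"{description} [[{raw_id}]]" if description else f"[[{raw_id}]]"


def move_tag_to_name(data: dict) -> dict:
    """Move the tag from the description to the name (read path)."""
    description = data.get("description", "")
    match = TAG_RE.search(description) if description else None
    if match is None:
        return data  # nothing to move, return the input untouched
    raw_id = match.group(1)
    result = dict(data)  # shallow copy so the caller's dict is not mutated
    # Remove the first " [[raw_id]]" or "[[raw_id]]" occurrence only
    result["description"] = re.sub(
        r" ?\[\[" + re.escape(raw_id) + r"\]\]", "", description, count=1
    )
    result["name"] = f"{data['name']} [[{raw_id}]]"
    return result


stored = {
    "name": "Daily Job",
    "description": embed_in_description("Runs nightly", "prod:daily_job"),
}
restored = move_tag_to_name(stored)
print(stored)
print(restored)
```

Returning a copy rather than editing in place matches the intent stated in the diff's own comment ("shallow copy to avoid mutating caller's dict"), which matters because `get_jobs` maps the function over a list it does not own.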
@@ -284,6 +285,7 @@ def build_change_set(
         api_key=os.environ.get("DBT_API_KEY"),
         base_url=os.environ.get("DBT_BASE_URL", "https://cloud.getdbt.com"),
         disable_ssl_verification=disable_ssl_verification,
+        use_desc_for_id=use_desc_for_id,
     )
     cloud_jobs = dbt_cloud.get_jobs(project_ids=project_ids, environment_ids=environment_ids)

diff --git a/src/dbt_jobs_as_code/main.py b/src/dbt_jobs_as_code/main.py
index 8265a12..18208fe 100644
--- a/src/dbt_jobs_as_code/main.py
+++ b/src/dbt_jobs_as_code/main.py
@@ -73,6 +73,15 @@
     help="Exclude jobs from dbt Cloud if their identifiers match this regex pattern.",
 )

+option_use_desc_for_id = click.option(
+    "--use-desc-for-id",
+    is_flag=True,
+    envvar="DBT_JOBS_AS_CODE_USE_DESC_FOR_ID",
+    show_envvar=True,
+    default=False,
+    help="Store the [[identifier]] tag in the job description instead of the job name.",
+)
+

 @click.group(
     help=f"dbt-jobs-as-code {VERSION}\n\nA CLI to allow defining dbt Cloud jobs as code",
@@ -92,6 +101,7 @@ def cli() -> None:
 @option_limit_projects_envs_to_yml
 @option_json_output
 @option_exclude_identifiers_matching
+@option_use_desc_for_id
 @click.option(
     "--fail-fast",
     is_flag=True,
@@ -106,6 +116,7 @@ def sync(
     disable_ssl_verification,
     output_json: bool,
     exclude_identifiers_matching: str,
+    use_desc_for_id: bool,
     fail_fast: bool,
 ):
     """Synchronize a dbt Cloud job config file against dbt Cloud.
@@ -138,6 +149,7 @@ def sync(
         limit_projects_envs_to_yml,
         exclude_identifiers_matching,
         output_json=output_json,
+        use_desc_for_id=use_desc_for_id,
     )
     plan_json = (
         change_set.to_json()
@@ -178,6 +190,7 @@ def sync(
 @option_limit_projects_envs_to_yml
 @option_json_output
 @option_exclude_identifiers_matching
+@option_use_desc_for_id
 def plan(
     config: str,
     vars_yml: str,
@@ -187,6 +200,7 @@ def plan(
     disable_ssl_verification: bool,
     output_json: bool,
     exclude_identifiers_matching: str,
+    use_desc_for_id: bool,
 ):
     """Check the difference between a local file and dbt Cloud without updating dbt Cloud.

     This command will not update dbt Cloud.
@@ -217,6 +231,7 @@ def plan(
         limit_projects_envs_to_yml,
         exclude_identifiers_matching,
         output_json=output_json,
+        use_desc_for_id=use_desc_for_id,
     )
     if len(change_set) == 0:
         if output_json:
@@ -237,7 +252,8 @@ def plan(
 @click.argument("config", type=str)
 @option_vars_yml
 @click.option("--online", is_flag=True, help="Connect to dbt Cloud to check that IDs are correct.")
-def validate(config, vars_yml, online, disable_ssl_verification):
+@option_use_desc_for_id
+def validate(config, vars_yml, online, disable_ssl_verification, use_desc_for_id):
     """Check that the config file is valid

     CONFIG is the path to your YML jobs config file (also supports glob patterns for those files or a directory).
@@ -262,6 +278,7 @@ def validate(config, vars_yml, online, disable_ssl_verification):
             api_key=os.environ.get("DBT_API_KEY"),
             base_url=os.environ.get("DBT_BASE_URL", "https://cloud.getdbt.com"),
             disable_ssl_verification=disable_ssl_verification,
+            use_desc_for_id=use_desc_for_id,
         )
         all_environments = dbt_cloud.get_environments(project_ids=list(config_project_ids))
         cloud_project_ids = set([env["project_id"] for env in all_environments])
@@ -372,6 +389,7 @@ def validate(config, vars_yml, online, disable_ssl_verification):
     type=str,
     help="Only import jobs where the identifier prefix, before `:` contains this value, is empty or is '*'.",
 )
+@option_use_desc_for_id
 def import_jobs(
     config,
     account_id,
@@ -384,6 +402,7 @@ def import_jobs(
     managed_only=False,
     templated_fields=None,
     filter=None,
+    use_desc_for_id: bool = False,
 ):
     """
     Generate YML file for import.
@@ -412,6 +431,7 @@ def import_jobs(
         api_key=os.environ.get("DBT_API_KEY"),
         base_url=os.environ.get("DBT_BASE_URL", "https://cloud.getdbt.com"),
         disable_ssl_verification=disable_ssl_verification,
+        use_desc_for_id=use_desc_for_id,
     )

     if check_missing_fields:
@@ -450,7 +470,8 @@ def import_jobs(
 @option_project_ids
 @option_environment_ids
 @click.option("--dry-run", is_flag=True, help="In dry run mode we don't update dbt Cloud.")
-def link(config, project_id, environment_id, dry_run, disable_ssl_verification):
+@option_use_desc_for_id
+def link(config, project_id, environment_id, dry_run, disable_ssl_verification, use_desc_for_id):
     """
     Link the YML file to dbt Cloud by adding the identifier to the job name.
     All relevant jobs get the part [[...]] added to their name
@@ -467,6 +488,7 @@ def link(config, project_id, environment_id, dry_run, disable_ssl_verification):
         api_key=os.environ.get("DBT_API_KEY"),
         base_url=os.environ.get("DBT_BASE_URL", "https://cloud.getdbt.com"),
         disable_ssl_verification=disable_ssl_verification,
+        use_desc_for_id=use_desc_for_id,
     )

     # Filter jobs based on project_id and environment_id if provided
@@ -528,8 +550,16 @@ def link(config, project_id, environment_id, dry_run, disable_ssl_verification):
     multiple=True,
     help="[Optional] The identifiers we want to unlink. If not provided, all jobs are unlinked.",
 )
+@option_use_desc_for_id
 def unlink(
-    config, account_id, project_id, environment_id, dry_run, identifier, disable_ssl_verification
+    config,
+    account_id,
+    project_id,
+    environment_id,
+    dry_run,
+    identifier,
+    disable_ssl_verification,
+    use_desc_for_id,
 ):
     """
     Unlink the YML file to dbt Cloud.
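
For `link` and `unlink`, where the tag ends up is decided by the payload builder rather than by the CLI wiring: a job with an identifier gets the tag in its name or description, and a job without one gets no tag at all, which is how `to_payload` describes unlinking in its own comments. Below is a hedged sketch of that decision using only the standard library. `SketchJob` and `build_payload` are hypothetical stand-ins for `JobDefinition` and `to_payload`; the 255 limit is taken from `DESCRIPTION_MAX_LENGTH` in this diff.

```python
import json
from dataclasses import dataclass
from typing import Optional

DESCRIPTION_MAX_LENGTH = 255  # same value as the constant added in schemas/job.py


@dataclass
class SketchJob:
    """Hypothetical, reduced stand-in for JobDefinition."""

    name: str
    description: str = ""
    identifier: Optional[str] = None


def build_payload(job: SketchJob, use_desc_for_id: bool = False) -> str:
    """Serialize a job, embedding the [[identifier]] tag in name or description."""
    name, description = job.name, job.description
    if job.identifier:
        tag = f"[[{job.identifier}]]"
        if use_desc_for_id:
            description = f"{description} {tag}" if description else tag
            # The tag counts against the description limit, so check after appending
            if len(description) > DESCRIPTION_MAX_LENGTH:
                raise ValueError(
                    f"Job description too long: {len(description)} chars "
                    f"(max {DESCRIPTION_MAX_LENGTH})"
                )
        else:
            name = f"{name} {tag}"
    # No identifier means no tag anywhere: the job is sent as unlinked
    return json.dumps({"name": name, "description": description})


linked = SketchJob("Daily Job", "Runs nightly", identifier="daily_job")
unlinked = SketchJob("Daily Job", "Runs nightly")
print(build_payload(linked, use_desc_for_id=True))
print(build_payload(linked))
print(build_payload(unlinked, use_desc_for_id=True))
```

One consequence worth noting when reviewing: in description mode a description that is valid on its own can still be rejected once the tag is appended, so the usable description length shrinks by the tag length plus one space.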
@@ -562,6 +592,7 @@ def unlink(
         api_key=os.environ.get("DBT_API_KEY"),
         base_url=os.environ.get("DBT_BASE_URL", "https://cloud.getdbt.com"),
         disable_ssl_verification=disable_ssl_verification,
+        use_desc_for_id=use_desc_for_id,
     )
     cloud_jobs = dbt_cloud.get_jobs(project_ids=project_ids, environment_ids=environment_ids)
     selected_jobs = [job for job in cloud_jobs if job.identifier is not None]
@@ -614,8 +645,15 @@ def unlink(
     multiple=True,
     help="The ID of the job to deactivate.",
 )
+@option_use_desc_for_id
 def deactivate_jobs(
-    config, account_id, project_id, environment_id, job_id, disable_ssl_verification
+    config,
+    account_id,
+    project_id,
+    environment_id,
+    job_id,
+    disable_ssl_verification,
+    use_desc_for_id,
 ):
     """
     Deactivate jobs triggers in dbt Cloud (schedule and CI/CI triggers) without remoing the jobs.
@@ -638,6 +676,7 @@ def deactivate_jobs(
         api_key=os.environ.get("DBT_API_KEY"),
         base_url=os.environ.get("DBT_BASE_URL", "https://cloud.getdbt.com"),
         disable_ssl_verification=disable_ssl_verification,
+        use_desc_for_id=use_desc_for_id,
     )
     cloud_jobs = dbt_cloud.get_jobs()

diff --git a/src/dbt_jobs_as_code/schemas/job.py b/src/dbt_jobs_as_code/schemas/job.py
index 58de069..b270813 100644
--- a/src/dbt_jobs_as_code/schemas/job.py
+++ b/src/dbt_jobs_as_code/schemas/job.py
@@ -25,6 +25,7 @@
 # Characters allowed in a YAML identifier key (embedded as [[identifier]] in job names).
 # Must match the character class used when extracting identifiers from job names.
 VALID_IDENTIFIER_RE = re.compile(r"^[a-zA-Z0-9_-]+$")
+DESCRIPTION_MAX_LENGTH = 255


 @dataclass
@@ -105,7 +106,7 @@ class JobDefinition(BaseModel):
     generate_docs: bool
     schedule: Optional[Schedule] = None
     triggers: Triggers
-    description: str = Field(default="", max_length=255)
+    description: str = Field(default="", max_length=DESCRIPTION_MAX_LENGTH)
     state: int = 1
     run_compare_changes: bool = False
     compare_changes_flags: str = "--select state:modified"
@@ -201,15 +202,29 @@ def _extract_identifier_from_name(name: str) -> IdentifierInfo:
         else:
             raise ValueError(f"Invalid job identifier - More than 1 colon: '{raw_identifier}'")

-    def to_payload(self):
+    _extract_identifier_from_description = _extract_identifier_from_name
+
+    def to_payload(self, use_desc_for_id: bool = False):
         """Create a dbt Cloud API payload for a JobDefinition."""

-        # Rewrite the job name to embed the job ID from job.yml
+        # Rewrite the job name (or description) to embed the job ID from job.yml
         payload = self.model_copy()
-        # if there is an identifier, add it to the name
+        # if there is an identifier, add it to the name or description
         # otherwise, it means that we are "unlinking" the job from the job.yml
         if self.identifier:
-            payload.name = f"{self.name} [[{self.identifier}]]"
+            if use_desc_for_id:
+                stored_desc = (
+                    f"{self.description} [[{self.identifier}]]"
+                    if self.description
+                    else f"[[{self.identifier}]]"
+                )
+                if len(stored_desc) > DESCRIPTION_MAX_LENGTH:
+                    raise ValueError(
+                        f"Job description too long: '{stored_desc[:50]}...' is {len(stored_desc)} chars (max {DESCRIPTION_MAX_LENGTH})"
+                    )
+                payload.description = stored_desc
+            else:
+                payload.name = f"{self.name} [[{self.identifier}]]"
         return payload.model_dump_json(
             exclude={"linked_id", "identifier", "custom_environment_variables"}
         )
diff --git a/tests/client/test_use_desc_for_id.py b/tests/client/test_use_desc_for_id.py
new file mode 100644
index 0000000..340e3b6
--- /dev/null
+++ b/tests/client/test_use_desc_for_id.py
@@ -0,0 +1,326 @@
+import json
+from unittest.mock import MagicMock
+
+from dbt_jobs_as_code.client import DBTCloud
+from dbt_jobs_as_code.schemas.job import JobDefinition
+
+
+class TestPreProcessJobData:
+    """Tests for DBTCloud._pre_process_job_data."""
+
+    def _make_client(self):
+        return DBTCloud(
+            account_id=1,
+            api_key="test-key",
+            use_desc_for_id=True,
+        )
+
+    def test_pre_process_extracts_identifier_from_description(self):
+        """Extracts identifier from description and sets it as the job identifier."""
+        client = self._make_client()
+        data = {"name": "Daily Job", "description": "Runs nightly [[daily_job]]"}
+        result = client._pre_process_job_data(data)
+        assert result["name"] == "Daily Job [[daily_job]]"
+        assert result["description"] == "Runs nightly"
+
+    def test_pre_process_strips_identifier_from_description_empty(self):
+        """When description is only the tag, result is empty string."""
+        client = self._make_client()
+        data = {"name": "Daily Job", "description": "[[daily_job]]"}
+        result = client._pre_process_job_data(data)
+        assert result["name"] == "Daily Job [[daily_job]]"
+        assert result["description"] == ""
+
+    def test_pre_process_no_identifier_in_description(self):
+        """When description has no identifier, data is returned unchanged."""
+        client = self._make_client()
+        data = {"name": "Daily Job", "description": "No identifier here"}
+        result = client._pre_process_job_data(data)
+        assert result["name"] == "Daily Job"
+        assert result["description"] == "No identifier here"
+
+    def test_pre_process_no_description_field(self):
+        """When description key is missing, data is returned unchanged."""
+        client = self._make_client()
+        data = {"name": "Daily Job"}
+        result = client._pre_process_job_data(data)
+        assert result == {"name": "Daily Job"}
+
+    def test_pre_process_with_filter_in_identifier(self):
+        """Handles [[filter:id]] format correctly."""
+        client = self._make_client()
+        data = {"name": "Daily Job", "description": "Runs nightly [[prod:daily_job]]"}
+        result = client._pre_process_job_data(data)
+        assert result["name"] == "Daily Job [[prod:daily_job]]"
+        assert result["description"] == "Runs nightly"
+
+    def test_pre_process_does_not_mutate_caller_dict(self):
+        """_pre_process_job_data must not mutate the original dict."""
+        client = self._make_client()
+        original = {"name": "Daily Job", "description": "Runs nightly [[daily_job]]"}
+        original_description = original["description"]
+        original_name = original["name"]
+        client._pre_process_job_data(original)
+        assert original["description"] == original_description
+        assert original["name"] == original_name
+
+    def test_client_stores_use_desc_for_id_flag(self):
+        """DBTCloud stores use_desc_for_id on the instance."""
+        client = DBTCloud(account_id=1, api_key="test-key", use_desc_for_id=True)
+        assert client._use_desc_for_id is True
+
+    def test_client_defaults_use_desc_for_id_to_false(self):
+        """use_desc_for_id defaults to False."""
+        client = DBTCloud(account_id=1, api_key="test-key")
+        assert client._use_desc_for_id is False
+
+
+class TestGetJobsDescMode:
+    """Integration tests: get_job/get_jobs call _pre_process_job_data when use_desc_for_id=True."""
+
+    def _make_client(self, use_desc_for_id=True):
+        return DBTCloud(
+            account_id=1,
+            api_key="test-key",
+            use_desc_for_id=use_desc_for_id,
+        )
+
+    def _raw_job(self, name="Daily Job", description="Runs nightly [[daily_job]]"):
+        """Minimal API response dict for a job."""
+        return {
+            "id": 42,
+            "name": name,
+            "description": description,
+            "account_id": 1,
+            "project_id": 100,
+            "environment_id": 200,
+            "settings": {},
+            "triggers": {},
+            "execute_steps": ["dbt build"],
+            "run_generate_sources": False,
+            "generate_docs": False,
+            "schedule": {"cron": "0 0 * * *"},
+            "state": 1,
+        }
+
+    def test_get_job_extracts_identifier_from_description(self):
+        """get_job pre-processes API response to move [[id]] from description to name."""
+        client = self._make_client(use_desc_for_id=True)
+        raw = self._raw_job()
+
+        mock_resp = MagicMock()
+        mock_resp.status_code = 200
+        mock_resp.json.return_value = {"data": raw}
+        client._session.get = MagicMock(return_value=mock_resp)
+
+        job = client.get_job(job_id=42)
+
+        assert job.identifier == "daily_job"
+        assert job.name == "Daily Job"
+        assert job.description == "Runs nightly"
+
+    def test_get_job_no_preprocessing_when_flag_off(self):
+        """get_job does NOT pre-process when use_desc_for_id=False; identifier stays in description."""
+        client = self._make_client(use_desc_for_id=False)
+        # Raw API form: identifier is in description, not in name
+        raw = self._raw_job(name="Daily Job", description="Runs nightly [[daily_job]]")
+
+        mock_resp = MagicMock()
+        mock_resp.status_code = 200
+        mock_resp.json.return_value = {"data": raw}
+        client._session.get = MagicMock(return_value=mock_resp)
+
+        job = client.get_job(job_id=42)
+
+        # Without preprocessing, name has no [[id]], so identifier is None
+        assert job.identifier is None
+        assert job.name == "Daily Job"
+        # Description is untouched — still contains the tag
+        assert job.description == "Runs nightly [[daily_job]]"
+
+    def test_get_jobs_extracts_identifiers_from_descriptions(self):
+        """get_jobs pre-processes all jobs in the API response."""
+        client = self._make_client(use_desc_for_id=True)
+        raw_jobs = [
+            self._raw_job(name="Job A", description="Desc A [[job_a]]"),
+            self._raw_job(name="Job B", description="Desc B [[job_b]]"),
+        ]
+
+        mock_resp = MagicMock()
+        mock_resp.status_code = 200
+        mock_resp.json.return_value = {
+            "data": raw_jobs,
+            "extra": {
+                "filters": {"limit": 100, "offset": 0},
+                "pagination": {"total_count": 2},
+            },
+        }
+        client._session.get = MagicMock(return_value=mock_resp)
+
+        jobs = client.get_jobs(project_ids=[100])
+
+        assert len(jobs) == 2
+        jobs_by_id = {j.identifier: j for j in jobs}
+        assert jobs_by_id["job_a"].identifier == "job_a"
+        assert jobs_by_id["job_a"].description == "Desc A"
+        assert jobs_by_id["job_b"].identifier == "job_b"
+        assert jobs_by_id["job_b"].description == "Desc B"
+
+    def test_get_jobs_no_preprocessing_when_flag_off(self):
+        """get_jobs does NOT pre-process when use_desc_for_id=False."""
+        client = self._make_client(use_desc_for_id=False)
+        raw_jobs = [
+            self._raw_job(name="Job A", description="Desc A [[job_a]]"),
+        ]
+
+        mock_resp = MagicMock()
+        mock_resp.status_code = 200
+        mock_resp.json.return_value = {
+            "data": raw_jobs,
+            "extra": {
+                "filters": {"limit": 100, "offset": 0},
+                "pagination": {"total_count": 1},
+            },
+        }
+        client._session.get = MagicMock(return_value=mock_resp)
+
+        jobs = client.get_jobs(project_ids=[100])
+
+        assert len(jobs) == 1
+        job = jobs[0]
+        # Without preprocessing, identifier is not extracted
+        assert job.identifier is None
+        assert job.name == "Job A"
+        assert job.description == "Desc A [[job_a]]"
+
+
+class TestUpdateCreateDescMode:
+    """Tests that update_job/create_job pass use_desc_for_id to to_payload."""
+
+    def _make_job(self, identifier="daily_job", description="Runs nightly"):
+        return JobDefinition(
+            id=42,
+            name=f"Daily Job [[{identifier}]]",
+            description=description,
+            account_id=1,
+            project_id=100,
+            environment_id=200,
+            settings={},
+            triggers={},
+            execute_steps=["dbt build"],
+            run_generate_sources=False,
+            generate_docs=False,
+            schedule={"cron": "0 0 * * *"},
+        )
+
+    def _make_mock_response(self, job: JobDefinition, use_desc_for_id: bool = False):
+        """Build a MagicMock response that looks like a successful API response."""
+        mock_resp = MagicMock()
+        mock_resp.status_code = 200
+        raw = json.loads(job.to_payload(use_desc_for_id=use_desc_for_id))
+        raw["id"] = job.id
+        raw["state"] = 1
+        mock_resp.json.return_value = {"data": raw}
+        return mock_resp
+
+    def test_update_job_uses_desc_for_id_when_flag_on(self):
+        """update_job sends [[id]] in description when use_desc_for_id=True."""
+        client = DBTCloud(account_id=1, api_key="test-key", use_desc_for_id=True)
+        job = self._make_job()
+
+        captured = {}
+
+        def capture_post(**kwargs):
+            captured["data"] = kwargs.get("data") or kwargs.get("json")
+            return self._make_mock_response(job, use_desc_for_id=True)
+
+        client._session.post = capture_post
+        client.update_job(job)
+
+        payload = json.loads(captured["data"])
+        assert "[[daily_job]]" in payload["description"]
+        assert "[[daily_job]]" not in payload["name"]
+
+    def test_create_job_uses_desc_for_id_when_flag_on(self):
+        """create_job sends [[id]] in description when use_desc_for_id=True."""
+        client = DBTCloud(account_id=1, api_key="test-key", use_desc_for_id=True)
+        job = self._make_job()
+
+        captured = {}
+
+        def capture_post(**kwargs):
+            captured["data"] = kwargs.get("data") or kwargs.get("json")
+            return self._make_mock_response(job, use_desc_for_id=True)
+
+        client._session.post = capture_post
+        client.create_job(job)
+
+        payload = json.loads(captured["data"])
+        assert "[[daily_job]]" in payload["description"]
+        assert "[[daily_job]]" not in payload["name"]
+
+    def test_update_job_uses_name_for_id_when_flag_off(self):
+        """update_job sends [[id]] in name when use_desc_for_id=False (default)."""
+        client = DBTCloud(account_id=1, api_key="test-key", use_desc_for_id=False)
+        job = self._make_job()
+
+        captured = {}
+
+        def capture_post(**kwargs):
+            captured["data"] = kwargs.get("data") or kwargs.get("json")
+            return self._make_mock_response(job, use_desc_for_id=False)
+
+        client._session.post = capture_post
+        client.update_job(job)
+
+        payload = json.loads(captured["data"])
+        assert "[[daily_job]]" in payload["name"]
+        assert "[[daily_job]]" not in payload["description"]
+
+    def test_create_job_uses_name_for_id_when_flag_off(self):
+        """create_job sends [[id]] in name when use_desc_for_id=False (default)."""
+        client = DBTCloud(account_id=1, api_key="test-key", use_desc_for_id=False)
+        job = self._make_job()
+
+        captured = {}
+
+        def capture_post(**kwargs):
+            captured["data"] = kwargs.get("data") or kwargs.get("json")
+            return self._make_mock_response(job, use_desc_for_id=False)
+
+        client._session.post = capture_post
+        client.create_job(job)
+
+        payload = json.loads(captured["data"])
+        assert "[[daily_job]]" in payload["name"]
+        assert "[[daily_job]]" not in payload["description"]
+
+    def test_update_job_return_value_has_identifier_in_desc_mode(self):
+        """update_job pre-processes the API response so the returned JobDefinition has a clean identifier and description."""
+        client = DBTCloud(account_id=1, api_key="test-key", use_desc_for_id=True)
+        job = self._make_job(description="Runs nightly")
+
+        def mock_post(**kwargs):
+            return self._make_mock_response(job, use_desc_for_id=True)
+
+        client._session.post = mock_post
+        result = client.update_job(job)
+
+        assert result.identifier == "daily_job"
+        assert result.name == "Daily Job"
+        assert result.description == "Runs nightly"
+
+    def test_create_job_return_value_has_identifier_in_desc_mode(self):
+        """create_job pre-processes the API response so the returned JobDefinition has a clean identifier and description."""
+        client = DBTCloud(account_id=1, api_key="test-key", use_desc_for_id=True)
+        job = self._make_job(description="Runs nightly")
+
+        def mock_post(**kwargs):
+            return self._make_mock_response(job, use_desc_for_id=True)
+
+        client._session.post = mock_post
+        result = client.create_job(job)
+
+        assert result.identifier == "daily_job"
+        assert result.name == "Daily Job"
+        assert result.description == "Runs nightly"
diff --git a/tests/cloud_yaml_mapping/test_exclude_identifiers.py b/tests/cloud_yaml_mapping/test_exclude_identifiers.py
index e68c4b3..714fc11 100644
--- a/tests/cloud_yaml_mapping/test_exclude_identifiers.py
+++ b/tests/cloud_yaml_mapping/test_exclude_identifiers.py
@@ -344,3 +344,70 @@ def test_exclude_identifiers_matching_with_json_output(
     # Verify the result is a valid ChangeSet
     assert isinstance(result, ChangeSet)
     # JSON output should not affect the filtering behavior
+
+
+@patch("dbt_jobs_as_code.cloud_yaml_mapping.change_set.load_job_configuration")
+@patch("dbt_jobs_as_code.cloud_yaml_mapping.change_set.DBTCloud")
+@patch("dbt_jobs_as_code.cloud_yaml_mapping.change_set.glob.glob")
+def test_build_change_set_passes_use_desc_for_id_true(
+    mock_glob, mock_dbt_cloud_class, mock_load_config, sample_jobs
+):
+    """Test that build_change_set passes use_desc_for_id=True to DBTCloud constructor"""
+    sample_job = sample_jobs[0]
+
+    mock_config = Mock()
+    mock_config.jobs = {"test-job": sample_job}
+    mock_load_config.return_value = mock_config
+    mock_glob.return_value = ["test.yml"]
+
+    mock_dbt_cloud = Mock()
+    mock_dbt_cloud_class.return_value = mock_dbt_cloud
+    mock_dbt_cloud.get_jobs.return_value = []
+    mock_dbt_cloud.build_mapping_job_identifier_job_id.return_value = {}
+
+    build_change_set(
+        config="test.yml",
+        yml_vars=None,
+        disable_ssl_verification=False,
+        project_ids=[],
+        environment_ids=[],
+        use_desc_for_id=True,
+    )
+
+    # Verify DBTCloud was constructed with use_desc_for_id=True
+    mock_dbt_cloud_class.assert_called_once()
+    _, kwargs = mock_dbt_cloud_class.call_args
+    assert kwargs.get("use_desc_for_id") is True
+
+
+@patch("dbt_jobs_as_code.cloud_yaml_mapping.change_set.load_job_configuration")
+@patch("dbt_jobs_as_code.cloud_yaml_mapping.change_set.DBTCloud")
+@patch("dbt_jobs_as_code.cloud_yaml_mapping.change_set.glob.glob")
+def test_build_change_set_use_desc_for_id_defaults_to_false(
+    mock_glob, mock_dbt_cloud_class, mock_load_config, sample_jobs
+):
+    """Test that build_change_set defaults use_desc_for_id to False"""
+    sample_job = sample_jobs[0]
+
+    mock_config = Mock()
+    mock_config.jobs = {"test-job": sample_job}
+    mock_load_config.return_value = mock_config
+    mock_glob.return_value = ["test.yml"]
+
+    mock_dbt_cloud = Mock()
+    mock_dbt_cloud_class.return_value = mock_dbt_cloud
+    mock_dbt_cloud.get_jobs.return_value = []
+    mock_dbt_cloud.build_mapping_job_identifier_job_id.return_value = {}
+
+    build_change_set(
+        config="test.yml",
+        yml_vars=None,
+        disable_ssl_verification=False,
+        project_ids=[],
+        environment_ids=[],
+    )
+
+    # Verify DBTCloud was constructed with use_desc_for_id=False (default)
+    mock_dbt_cloud_class.assert_called_once()
+    _, kwargs = mock_dbt_cloud_class.call_args
+    assert kwargs.get("use_desc_for_id") is False
diff --git a/tests/schemas/test_job.py b/tests/schemas/test_job.py
index 8f5d92e..13a54e7 100644
--- a/tests/schemas/test_job.py
+++ b/tests/schemas/test_job.py
@@ -112,6 +112,39 @@ def test_empty_identifier(self):
         result = JobDefinition._extract_identifier_from_name(name)
         assert result == IdentifierInfo(identifier=None, import_filter="", raw_identifier="")

+    def test_extract_identifier_from_description_simple(self):
+        """Test extracting simple identifier from job description."""
+        result = JobDefinition._extract_identifier_from_description("Runs nightly [[daily_job]]")
+        assert result == IdentifierInfo(
+            identifier="daily_job", import_filter="", raw_identifier="daily_job"
+        )
+
+    def test_extract_identifier_from_description_with_filter(self):
+        """Test extracting identifier with filter from job description."""
+        result = JobDefinition._extract_identifier_from_description(
+            "Runs nightly [[prod:daily_job]]"
+        )
+        assert result == IdentifierInfo(
+            identifier="daily_job", import_filter="prod", raw_identifier="prod:daily_job"
+        )
+
+    def test_extract_identifier_from_description_no_identifier(self):
+        """Test when description has no identifier."""
+        result = JobDefinition._extract_identifier_from_description("Runs nightly")
+        assert
result == IdentifierInfo(identifier=None, import_filter="", raw_identifier="") + + def test_extract_identifier_from_description_empty(self): + """Test when description is empty.""" + result = JobDefinition._extract_identifier_from_description("") + assert result == IdentifierInfo(identifier=None, import_filter="", raw_identifier="") + + def test_extract_identifier_from_description_only_tag(self): + """Test when description contains only the identifier tag.""" + result = JobDefinition._extract_identifier_from_description("[[daily_job]]") + assert result == IdentifierInfo( + identifier="daily_job", import_filter="", raw_identifier="daily_job" + ) + class TestJobFiltering: """Tests for the filter_jobs_by_import_filter function.""" @@ -365,3 +398,69 @@ def test_json_schema_rejects_description_exceeding_limit(self, json_schema): } with pytest.raises(JsonSchemaValidationError, match="description"): validate(instance=instance, schema=json_schema) + + +class TestToPayloadDescMode: + """Tests for to_payload() with use_desc_for_id=True.""" + + def _make_job(self, name="Test Job", description="", identifier=None): + job = JobDefinition( + **{ + **BASE_JOB_DATA, + "schedule": {"cron": "0 0 * * *"}, + "name": f"{name} [[{identifier}]]" if identifier else name, + "description": description, + } + ) + return job + + def test_to_payload_use_desc_for_id(self): + """Identifier goes to description, name is clean.""" + job = self._make_job(description="Runs nightly", identifier="daily_job") + payload = json.loads(job.to_payload(use_desc_for_id=True)) + assert payload["name"] == "Test Job" + assert payload["description"] == "Runs nightly [[daily_job]]" + + def test_to_payload_use_desc_for_id_empty_description(self): + """Empty description stores [[id]] without leading space.""" + job = self._make_job(description="", identifier="daily_job") + payload = json.loads(job.to_payload(use_desc_for_id=True)) + assert payload["name"] == "Test Job" + assert payload["description"] == 
"[[daily_job]]" + + def test_to_payload_use_desc_for_id_no_identifier(self): + """No identifier: both fields remain clean.""" + job = self._make_job(description="Runs nightly") + payload = json.loads(job.to_payload(use_desc_for_id=True)) + assert payload["name"] == "Test Job" + assert payload["description"] == "Runs nightly" + + def test_to_payload_default_mode_unchanged(self): + """use_desc_for_id=False (default): identifier still goes in name.""" + job = self._make_job(description="Runs nightly", identifier="daily_job") + payload = json.loads(job.to_payload()) + assert payload["name"] == "Test Job [[daily_job]]" + assert payload["description"] == "Runs nightly" + + def test_to_payload_description_at_limit(self): + """Description + [[identifier]] totalling 254 chars (just under the 255-char limit) is accepted.""" + # "x" * 240 + " [[daily_job]]" = 240 + 14 = 254 chars, one below the 255-char limit + long_desc = "x" * 240 + job = self._make_job(description=long_desc, identifier="daily_job") + payload = json.loads(job.to_payload(use_desc_for_id=True)) + assert len(payload["description"]) == 254 + + def test_to_payload_description_over_limit(self): + """ValueError when description + [[identifier]] exceeds 255 chars.""" + # "x" * 242 + " [[daily_job]]" = 242 + 14 = 256 chars — over limit + too_long_desc = "x" * 242 + job = self._make_job(description=too_long_desc, identifier="daily_job") + with pytest.raises(ValueError, match="description"): + job.to_payload(use_desc_for_id=True) + + def test_to_payload_description_barely_over_with_long_base(self): + """ValueError when a nearly-full base description pushes the stored string over 255.""" + # "x" * 250 + " [[id]]" = 257 chars — should fail + job = self._make_job(description="x" * 250, identifier="id") + with pytest.raises(ValueError, match="description"): + job.to_payload(use_desc_for_id=True) diff --git a/tests/test_main.py b/tests/test_main.py index 4484988..d1f5d39 100644 --- a/tests/test_main.py +++ b/tests/test_main.py @@ -486,3 +486,179 @@ def
test_sync_command_with_json_and_exclude_pattern(mock_build_change_set, mock_ assert call_args[0][6] == "temp:.*" # exclude_identifiers_matching # Check that output_json is True assert call_args.kwargs.get("output_json") is True + + +# ============= use_desc_for_id Option Tests ============= + + +@patch("dbt_jobs_as_code.main.build_change_set") +def test_use_desc_for_id_option_sync(mock_build_change_set, mock_empty_change_set): + """Test that sync command accepts --use-desc-for-id and passes it to build_change_set""" + mock_build_change_set.return_value = mock_empty_change_set + + runner = CliRunner() + result = runner.invoke(cli, ["sync", "--use-desc-for-id", "config.yml"]) + + assert result.exit_code == 0 + + mock_build_change_set.assert_called_once() + call_args = mock_build_change_set.call_args + assert call_args.kwargs.get("use_desc_for_id") is True + + +@patch("dbt_jobs_as_code.main.build_change_set") +def test_use_desc_for_id_option_plan(mock_build_change_set, mock_empty_change_set): + """Test that plan command accepts --use-desc-for-id and passes it to build_change_set""" + mock_build_change_set.return_value = mock_empty_change_set + + runner = CliRunner() + result = runner.invoke(cli, ["plan", "--use-desc-for-id", "config.yml"]) + + assert result.exit_code == 0 + + mock_build_change_set.assert_called_once() + call_args = mock_build_change_set.call_args + assert call_args.kwargs.get("use_desc_for_id") is True + + +@patch("dbt_jobs_as_code.main.build_change_set") +def test_use_desc_for_id_default_false(mock_build_change_set, mock_empty_change_set): + """Test that omitting --use-desc-for-id defaults to False""" + mock_build_change_set.return_value = mock_empty_change_set + + runner = CliRunner() + result = runner.invoke(cli, ["plan", "config.yml"]) + + assert result.exit_code == 0 + + mock_build_change_set.assert_called_once() + call_args = mock_build_change_set.call_args + assert call_args.kwargs.get("use_desc_for_id") is False + + 
+@patch("dbt_jobs_as_code.main.DBTCloud") +@patch("dbt_jobs_as_code.main.load_job_configuration") +@patch("dbt_jobs_as_code.main.resolve_file_paths") +def test_use_desc_for_id_option_validate( + mock_resolve_file_paths, mock_load_job_configuration, mock_DBTCloud +): + """Test that validate --online passes use_desc_for_id=True to DBTCloud""" + from dbt_jobs_as_code.schemas.common_types import Settings, Triggers + from dbt_jobs_as_code.schemas.job import JobDefinition + + mock_resolve_file_paths.return_value = (["config.yml"], []) + + job = JobDefinition( + project_id=123, + environment_id=456, + account_id=789, + name="Test Job", + settings=Settings(threads=4), + run_generate_sources=False, + execute_steps=["dbt run"], + generate_docs=False, + schedule={"cron": "0 * * * *"}, + triggers=Triggers(schedule=True), + ) + mock_config = Mock() + mock_config.jobs = {"test-job": job} + mock_load_job_configuration.return_value = mock_config + + instance = mock_DBTCloud.return_value + instance.get_environments.return_value = [{"id": 456, "project_id": 123}] + instance.get_jobs.return_value = [] + + runner = CliRunner() + result = runner.invoke(cli, ["validate", "--online", "--use-desc-for-id", "config.yml"]) + + assert result.exit_code == 0 + mock_DBTCloud.assert_called_once() + assert mock_DBTCloud.call_args.kwargs["use_desc_for_id"] is True + + +@patch("dbt_jobs_as_code.main.DBTCloud") +def test_use_desc_for_id_option_import_jobs(mock_DBTCloud): + """Test that import-jobs passes use_desc_for_id=True to DBTCloud""" + instance = mock_DBTCloud.return_value + instance.get_jobs.return_value = [] + instance.get_env_vars.return_value = {} + + runner = CliRunner() + result = runner.invoke( + cli, + ["import-jobs", "--account-id", "789", "--use-desc-for-id"], + ) + + assert result.exit_code == 0 + mock_DBTCloud.assert_called_once() + assert mock_DBTCloud.call_args.kwargs["use_desc_for_id"] is True + + +@patch("dbt_jobs_as_code.main.DBTCloud") 
+@patch("dbt_jobs_as_code.main.load_job_configuration") +@patch("dbt_jobs_as_code.main.resolve_file_paths") +def test_use_desc_for_id_option_link( + mock_resolve_file_paths, mock_load_job_configuration, mock_DBTCloud +): + """Test that link passes use_desc_for_id=True to DBTCloud""" + from dbt_jobs_as_code.schemas.common_types import Settings, Triggers + from dbt_jobs_as_code.schemas.job import JobDefinition + + mock_resolve_file_paths.return_value = (["config.yml"], []) + + job = JobDefinition( + project_id=123, + environment_id=456, + account_id=789, + name="Test Job", + settings=Settings(threads=4), + run_generate_sources=False, + execute_steps=["dbt run"], + generate_docs=False, + schedule={"cron": "0 * * * *"}, + triggers=Triggers(schedule=True), + ) + mock_config = Mock() + mock_config.jobs = {"test-job": job} + mock_load_job_configuration.return_value = mock_config + + runner = CliRunner() + result = runner.invoke(cli, ["link", "--dry-run", "--use-desc-for-id", "config.yml"]) + + assert result.exit_code == 0 + mock_DBTCloud.assert_called_once() + assert mock_DBTCloud.call_args.kwargs["use_desc_for_id"] is True + + +@patch("dbt_jobs_as_code.main.DBTCloud") +def test_use_desc_for_id_option_unlink(mock_DBTCloud): + """Test that unlink passes use_desc_for_id=True to DBTCloud""" + instance = mock_DBTCloud.return_value + instance.get_jobs.return_value = [] + + runner = CliRunner() + result = runner.invoke( + cli, + ["unlink", "--account-id", "789", "--use-desc-for-id"], + ) + + assert result.exit_code == 0 + mock_DBTCloud.assert_called_once() + assert mock_DBTCloud.call_args.kwargs["use_desc_for_id"] is True + + +@patch("dbt_jobs_as_code.main.DBTCloud") +def test_use_desc_for_id_option_deactivate_jobs(mock_DBTCloud): + """Test that deactivate-jobs passes use_desc_for_id=True to DBTCloud""" + instance = mock_DBTCloud.return_value + instance.get_jobs.return_value = [] + + runner = CliRunner() + result = runner.invoke( + cli, + ["deactivate-jobs", "--account-id", 
"789", "--use-desc-for-id"], + ) + + assert result.exit_code == 0 + mock_DBTCloud.assert_called_once() + assert mock_DBTCloud.call_args.kwargs["use_desc_for_id"] is True