Description
Expected Behavior
A deployment with dbx deploy [...] --write-specs-to-file=spec.json with a named_parameters definition like so:
named_parameters:
  conf-file: "file:fuse://path/to/some-file.yaml"

should result in a specs file with the following named_parameters object:

"named_parameters": {
  "conf-file": "/dbfs/[...]/artifacts/path/to/some-file.yaml"
}

Only conf-file should be created as a keyword argument parameter in the workflow task on the web GUI.
Current Behavior
Instead, I get this:
"named_parameters": {
  "conf-file": "/dbfs/[...]/artifacts/path/to/some-file.yaml",
  "named_parameters": "/dbfs/[...]/artifacts/path/to/some-file.yaml"
}

The workflow task on the web GUI then ends up with two keyword argument parameters: conf-file and named_parameters.
Steps to Reproduce (for bugs)
Run a dbx deploy with a workflow that has python_wheel_task tasks with named_parameters, at least one of which has a value that starts with file:// or file:fuse://. Then check the keyword argument parameters of the task in the web GUI.
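The check can also be done against the file written by --write-specs-to-file instead of the web GUI. The following is a minimal sketch, not part of dbx: the helper name find_stray_keys and the shape of the sample fragment are assumptions made for illustration, and the real specs file may nest its tasks differently. It walks any JSON structure and reports every named_parameters object that itself contains a named_parameters key.

```python
import json


def find_stray_keys(node, path=""):
    """Return the paths of named_parameters objects that contain a
    'named_parameters' key of their own (the symptom described above)."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}/{key}"
            if (
                key == "named_parameters"
                and isinstance(value, dict)
                and "named_parameters" in value
            ):
                hits.append(child)
            hits.extend(find_stray_keys(value, child))
    elif isinstance(node, list):
        for i, value in enumerate(node):
            hits.extend(find_stray_keys(value, f"{path}/{i}"))
    return hits


# Hypothetical fragment; the nesting is illustrative, not the real specs layout.
sample = json.loads("""
{
  "tasks": [
    {
      "python_wheel_task": {
        "named_parameters": {
          "conf-file": "/dbfs/EXAMPLE/path/to/some-file.yaml",
          "named_parameters": "/dbfs/EXAMPLE/path/to/some-file.yaml"
        }
      }
    }
  ]
}
""")

print(find_stray_keys(sample))
# → ['/tasks/0/python_wheel_task/named_parameters']
```

To run it against a real deployment, replace the sample with json.load(open("spec.json")), assuming spec.json is the file passed to --write-specs-to-file.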
Context
I've traced the problem to this function:
dbx/dbx/api/adjuster/adjuster.py
Lines 165 to 169 in 34bd186
def file_traverse(self, workflows, file_adjuster: FileReferenceAdjuster):
    for element, parent, index in self.traverse(workflows):
        if isinstance(element, str):
            if element.startswith("file://") or element.startswith("file:fuse://"):
                file_adjuster.adjust_file_ref(element, parent, index)
And this part in PropertyAdjuster.traverse():
dbx/dbx/api/adjuster/adjuster.py
Lines 43 to 48 in 34bd186
if isinstance(_object, dict):
    for key in list(_object.keys()):
        item = _object[key]
        yield item, _object, key
        for _out in self.traverse(item, _object, index_in_parent):
            yield _out
After yielding the correct tuple for conf-file:

('file:fuse://path/to/config.yaml', {'conf-file': 'file:fuse://path/to/config.yaml'}, 'conf-file')

it then recurses into item, passing index_in_parent (which at this point is named_parameters) rather than key. Since item is a string, traverse falls through to this line and terminates:
dbx/dbx/api/adjuster/adjuster.py
Line 66 in 34bd186
yield _object, parent, index_in_parent
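This yields what is essentially a duplicate tuple, except with the wrong index_in_parent:

('file:fuse://path/to/config.yaml', {'conf-file': 'file:fuse://path/to/config.yaml'}, 'named_parameters')

file_traverse then passes that tuple to adjust_file_ref, and the adjusted path ends up under the extra named_parameters key in the same dict.

Your Environment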
- dbx version used: 0.8.18
- Databricks Runtime version: 12.2.x-scala2.12
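The duplicate yield described under Context can be reproduced outside dbx. The sketch below is a simplified stand-in, not the actual dbx code: traverse handles only dicts and scalars, the recurse_with_key flag and the adjust helper are hypothetical names, and the /dbfs/EXAMPLE/ prefix stands in for whatever path dbx really resolves. It starts from the state described above, where index_in_parent is already named_parameters, and compares the reported behavior with one possible change, which is to forward key into the recursive call.

```python
def traverse(obj, parent=None, index_in_parent=None, *, recurse_with_key=False):
    """Simplified stand-in for PropertyAdjuster.traverse (dicts and scalars only)."""
    if isinstance(obj, dict):
        for key in list(obj.keys()):
            item = obj[key]
            yield item, obj, key
            # Reported behavior forwards the outer index_in_parent;
            # the candidate change forwards the current key instead.
            child_index = key if recurse_with_key else index_in_parent
            yield from traverse(item, obj, child_index, recurse_with_key=recurse_with_key)
    else:
        yield obj, parent, index_in_parent


def adjust(params, *, recurse_with_key):
    """Mimic file_traverse: rewrite each file reference at parent[index]."""
    for element, parent, index in traverse(
        params, None, "named_parameters", recurse_with_key=recurse_with_key
    ):
        if isinstance(element, str) and element.startswith(("file://", "file:fuse://")):
            # Hypothetical resolution; the real target path is built by dbx.
            parent[index] = element.replace("file:fuse://", "/dbfs/EXAMPLE/")
    return params


print(adjust({"conf-file": "file:fuse://path/to/some-file.yaml"}, recurse_with_key=False))
# → {'conf-file': '/dbfs/EXAMPLE/path/to/some-file.yaml', 'named_parameters': '/dbfs/EXAMPLE/path/to/some-file.yaml'}

print(adjust({"conf-file": "file:fuse://path/to/some-file.yaml"}, recurse_with_key=True))
# → {'conf-file': '/dbfs/EXAMPLE/path/to/some-file.yaml'}
```

Even with key forwarded, the string is still yielded twice (once from the loop and once from the scalar branch), so the adjustment runs twice on the same key. Recursing only into container values might be a cleaner change, but that depends on parts of traverse not shown in this issue.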