
Export fails with trace callbacks and incorrect task counts #126


Description

@MohnJadden

I am running a dbsync export. It runs for a while and appears to be working, but it fails after it processes the Databricks user accounts:

2023-04-25 10:55:45 [INFO] Writing to path C:\Repos\Infra_Azure_Terraform_Source\Infra_Azure_Terraform_Source\exports\identity\databricks_group_admins_members.tf.json
2023-04-25 10:55:46 [INFO] Processing: databricks_group with name: databricks_scim_groups and id: databricks_scim_groups
2023-04-25 10:55:46 [INFO] Writing to path C:\Repos\Infra_Azure_Terraform_Source\Infra_Azure_Terraform_Source\exports\identity\databricks_scim_groups.tf.json
Traceback (most recent call last):
  File "C:\Users\myaccount\AppData\Roaming\Python\Python311\site-packages\databricks_sync\sdk\sync\export.py", line 74, in export
    exp.run()
  File "C:\Users\myaccount\AppData\Roaming\Python\Python311\site-packages\databricks_sync\sdk\pipeline.py", line 595, in run
    self.__generate_all()
  File "C:\Users\myaccount\AppData\Roaming\Python\Python311\site-packages\databricks_sync\sdk\pipeline.py", line 584, in __generate_all
    loop.run_until_complete(groups)
  File "C:\Program Files\Python311\Lib\asyncio\base_events.py", line 650, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "C:\Users\myaccount\AppData\Roaming\Python\Python311\site-packages\databricks_sync\sdk\pipeline.py", line 87, in trigger
    async for item in self.generate():
  File "C:\Users\myaccount\AppData\Roaming\Python\Python311\site-packages\databricks_sync\sdk\pipeline.py", line 92, in generate
    async for item in self._generate():
  File "C:\Users\myaccount\AppData\Roaming\Python\Python311\site-packages\databricks_sync\sdk\generators\notebook.py", line 251, in _generate
    object_data = self.__create_notebook_data(notebook)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\myaccount\AppData\Roaming\Python\Python311\site-packages\databricks_sync\sdk\generators\notebook.py", line 184, in __create_notebook_data
    return self._create_data(ResourceCatalog.NOTEBOOK_RESOURCE,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\myaccount\AppData\Roaming\Python\Python311\site-packages\databricks_sync\sdk\pipeline.py", line 145, in _create_data
    self.get_local_hcl_path(custom_file_name or identifier, custom_folder_path),
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\myaccount\AppData\Roaming\Python\Python311\site-packages\databricks_sync\sdk\pipeline.py", line 62, in get_local_hcl_path
    return ExportFileUtils.make_local_path(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\myaccount\AppData\Roaming\Python\Python311\site-packages\databricks_sync\sdk\pipeline.py", line 261, in make_local_path
    ExportFileUtils.__ensure_parent_dirs(dir_path)
  File "C:\Users\myaccount\AppData\Roaming\Python\Python311\site-packages\databricks_sync\sdk\pipeline.py", line 226, in __ensure_parent_dirs
    Path(dir_path).mkdir(parents=True, exist_ok=True)
  File "C:\Program Files\Python311\Lib\pathlib.py", line 1116, in mkdir
    os.mkdir(self, mode)
OSError: [WinError 123] The filename, directory name, or volume label syntax is incorrect: 'C:\\Repos\\Infra_Azure_Terraform_Source\\Infra_Azure_Terraform_Source\\exports\\notebook\\hcl\\Users\\[email protected]\\databricks_automl\\23-04-20-19:25-AF_ NotebookName\\trials'
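The notebook folder it is trying to create contains a colon in the timestamp segment (23-04-20-19:25-AF_ ...), and ':' is not a legal character in a Windows path component, which appears to be what triggers WinError 123 here. A minimal sketch that reproduces the same OSError outside of dbsync on Windows (the folder name below is made up for illustration; only the colon matters):

import tempfile
from pathlib import Path

# Illustrative folder name: the colon in "19:25" mirrors the timestamp segment
# of the failing databricks_automl folder; Windows rejects ':' in path components.
base = Path(tempfile.mkdtemp())
bad_dir = base / "23-04-20-19:25-AF_Example" / "trials"

try:
    # The same call databricks_sync ends up making in ExportFileUtils.__ensure_parent_dirs
    bad_dir.mkdir(parents=True, exist_ok=True)
except OSError as err:
    # On Windows this raises:
    # [WinError 123] The filename, directory name, or volume label syntax is incorrect
    print(err)

On Linux/macOS the same call succeeds because ':' is an ordinary path character there, which would explain why the export only breaks when run from a Windows machine.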

It also writes an error log containing 75 errors, all of which are JobWorkflowTaskCountError entries about invalid task counts:

│ # │ Type         │ Error                                                                                                                                                                              │ Count │
│ 0 │ EXPORT ERROR │ <class 'databricks_sync.sdk.generators.jobs.JobWorkflowTaskCountError'>: Task count for job id: 1003097168641501 and name: Name1 is invalid. Must be exactly 1. Currently: 3 tasks │ 1 │
│ 1 │ EXPORT ERROR │ <class 'databricks_sync.sdk.generators.jobs.JobWorkflowTaskCountError'>: Task count for job id: 1010694613246177 and name: Name2 is invalid. Must be exactly 1. Currently: 7 tasks │ 1 │
│ 2 │ EXPORT ERROR │ <class 'databricks_sync.sdk.generators.jobs.JobWorkflowTaskCountError'>: Task count for job id: 1022171816180374 and name: Name3 is invalid. Must be exactly 1. Currently: 3 tasks │ 1 │
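These all look like multi-task jobs, which the exporter rejects because it requires exactly one task per job. A rough pre-check sketch for listing the offending jobs ahead of an export, assuming the Jobs API 2.1 jobs/list endpoint with expand_tasks and token auth via environment variables (only one page is fetched here for brevity):

import os
import requests

# Assumed authentication; adjust to however you normally reach the workspace.
host = os.environ["DATABRICKS_HOST"].rstrip("/")
token = os.environ["DATABRICKS_TOKEN"]

resp = requests.get(
    f"{host}/api/2.1/jobs/list",
    headers={"Authorization": f"Bearer {token}"},
    params={"expand_tasks": "true", "limit": 25},
)
resp.raise_for_status()

# Flag every job whose task count is not exactly 1 -- these are the ones the
# exporter reports as JobWorkflowTaskCountError.
for job in resp.json().get("jobs", []):
    settings = job.get("settings", {})
    tasks = settings.get("tasks", [])
    if len(tasks) != 1:
        print(job["job_id"], settings.get("name"), f"{len(tasks)} tasks")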
