delete_dataset does not completely delete the specified dataset #1449

@ai-ignatyev

Description

Consider the following code, which saves a dataset, deletes it, and then saves it again under the same name.

import datachain as dc

chain = dc.read_values(x=[0, 1, 2, 3])
chain = chain.persist()

chain.save('tmp')
dc.delete_dataset('tmp', force=True)
chain.save('tmp')

Running it raises the following exception when the dataset is saved again after the deletion.

Traceback (most recent call last):
  File "/home/aignatyev/main.py", line 14, in <module>
    chain.save('tmp')
  File "/home/aignatyev/.venv/lib/python3.12/site-packages/datachain/lib/dc/datachain.py", line 653, in save
    query=self._query.save(
          ^^^^^^^^^^^^^^^^^
  File "/home/aignatyev/.venv/lib/python3.12/site-packages/datachain/query/dataset.py", line 1907, in save
    dataset = self.catalog.create_dataset(
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/aignatyev/.venv/lib/python3.12/site-packages/datachain/catalog/catalog.py", line 862, in create_dataset
    return self.create_new_dataset_version(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/aignatyev/.venv/lib/python3.12/site-packages/datachain/catalog/catalog.py", line 916, in create_new_dataset_version
    self.warehouse.create_dataset_rows_table(table_name, columns=columns)
  File "/home/aignatyev/.venv/lib/python3.12/site-packages/datachain/data_storage/sqlite.py", line 673, in create_dataset_rows_table
    table = self.schema.dataset_row_cls.new_table(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/aignatyev/.venv/lib/python3.12/site-packages/datachain/data_storage/schema.py", line 212, in new_table
    return sa.Table(name, metadata, *columns)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<string>", line 2, in __new__
  File "/home/aignatyev/.venv/lib/python3.12/site-packages/sqlalchemy/util/deprecations.py", line 281, in warned
    return fn(*args, **kwargs)  # type: ignore[no-any-return]
           ^^^^^^^^^^^^^^^^^^^
  File "/home/aignatyev/.venv/lib/python3.12/site-packages/sqlalchemy/sql/schema.py", line 429, in __new__
    return cls._new(*args, **kw)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/aignatyev/.venv/lib/python3.12/site-packages/sqlalchemy/sql/schema.py", line 461, in _new
    raise exc.InvalidRequestError(
sqlalchemy.exc.InvalidRequestError: Table 'ds_local_local_tmp_1_0_0' is already defined for this MetaData instance.  Specify 'extend_existing=True' to redefine options and columns on an existing Table object.
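
The final error is raised by SQLAlchemy's in-memory MetaData registry, not by the database. This suggests that a Table object named ds_local_local_tmp_1_0_0 is still registered in the process after delete_dataset returns. That reading is an assumption based on the error message only; it has not been checked against datachain's code. The sketch below uses plain SQLAlchemy, with no datachain involved, to show the same error and how removing the Table from the MetaData frees the name. The helper define_rows_table and the single column x are made up for illustration.

```python
import sqlalchemy as sa
from sqlalchemy.exc import InvalidRequestError

# One shared MetaData instance, standing in for a long-lived in-process registry.
metadata = sa.MetaData()
table_name = "ds_local_local_tmp_1_0_0"  # table name copied from the traceback


def define_rows_table(name: str) -> sa.Table:
    """Hypothetical helper: register a Table under `name` in the shared MetaData."""
    return sa.Table(name, metadata, sa.Column("x", sa.Integer))


first = define_rows_table(table_name)

# Defining the same name again with columns on the same MetaData raises
# InvalidRequestError, regardless of whether the table exists in any database.
try:
    define_rows_table(table_name)
    second_definition_failed = False
except InvalidRequestError:
    second_definition_failed = True

# Removing the Table object from the MetaData frees the name for reuse.
metadata.remove(first)
redefined = define_rows_table(table_name)
```

If the assumption holds, dropping the rows table in the warehouse would not be enough on its own; the matching in-memory Table would also need to be removed from the MetaData, or the Table looked up instead of redefined, before the same name and version is created again in the same process.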

Version Info

datachain -V
0.37.7

python -V
Python 3.12.9

Labels

bug (Something isn't working)
