Feature: Seed data for DbContainer #541

Open
@OverkillGuy

Description

What are you trying to do?

In order to run meaningful tests against a fresh database, I want to load arbitrary SQL scripts into my DbContainer instances.

I can do (and have done) this per-project, in test folders, but believe there's value in allowing an arbitrary list of SQL scripts to be run before the testcontainer is yielded, and making this a pytest fixture to reuse.

I usually do this by passing a path to a folder of scripts, volume-mounting it at the /seeds path in the database container, and also passing a list of script files in that folder to be executed. Then, before yielding the ready db-container, I loop over each script filename and run container.exec_run(["mysql", "-e", f"source /seeds/{script}"]).

The list of scripts means we can feed both schema first, then arbitrary sample data files.
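The loop described above can be sketched roughly as follows. This is only an illustration, not the code from #542: the helper name run_seed_scripts and its mount_point parameter are hypothetical, and exec_run is assumed to be any callable that takes a command list and returns an (exit_code, output) pair, as docker-py's Container.exec_run does.

```python
from typing import Callable, List, Sequence, Tuple

# Assumption: same shape as docker-py's Container.exec_run result.
ExecRun = Callable[[List[str]], Tuple[int, bytes]]


def run_seed_scripts(
    exec_run: ExecRun,
    scripts: Sequence[str],
    mount_point: str = "/seeds",  # hypothetical default, matching the mount path above
) -> None:
    """Run each script, in the order given, through the mysql CLI."""
    for script in scripts:
        command = ["mysql", "-e", f"source {mount_point}/{script}"]
        exit_code, output = exec_run(command)
        if exit_code != 0:
            # Stop at the first failure so later scripts never run on a broken schema.
            raise RuntimeError(f"Seeding failed on {script}: {output!r}")
```

Because the scripts run in list order, passing ["schema.sql", "data.sql"] creates the schema before the sample data is loaded.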

I suggest a new DbContainer._seed() method that raises NotImplementedError by default and is overridden per database, together with a new optional constructor parameter, seed.

Each database then just has to define its own _seed() method with the specific way to exec_run the scripts (psql for Postgres, the mysql CLI for MySQL, and so on).
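To make the proposed shape concrete, here is a minimal sketch of the override pattern. It does not use the real testcontainers classes: SeedableDbContainer, MySqlSeedable and PostgresSeedable are hypothetical stand-ins, the exec method only records commands instead of talking to Docker, and the exact CLI arguments (credentials, database name) are left out and would need to be filled in per database.

```python
from typing import List, Optional, Sequence, Tuple

# (host folder with scripts, ordered script file names), as in the seed tuple proposed here
SeedSpec = Tuple[str, Sequence[str]]


class SeedableDbContainer:
    """Hypothetical stand-in for DbContainer, not the real class."""

    seed_mountpoint = "/seeds"

    def __init__(self, seed: Optional[SeedSpec] = None) -> None:
        self.seed = seed
        self.executed: List[List[str]] = []

    def exec(self, command: List[str]) -> Tuple[int, bytes]:
        # Stub: record the command rather than running it in a container.
        self.executed.append(list(command))
        return 0, b""

    def _seed(self) -> None:
        # Default: databases that have not opted in fail loudly.
        raise NotImplementedError

    def start(self) -> "SeedableDbContainer":
        # Seeding is optional and only happens when a seed was passed.
        if self.seed is not None:
            self._seed()
        return self


class MySqlSeedable(SeedableDbContainer):
    def _seed(self) -> None:
        _, scripts = self.seed
        for script in scripts:
            self.exec(["mysql", "-e", f"source {self.seed_mountpoint}/{script}"])


class PostgresSeedable(SeedableDbContainer):
    def _seed(self) -> None:
        _, scripts = self.seed
        for script in scripts:
            self.exec(["psql", "-f", f"{self.seed_mountpoint}/{script}"])
```

With this layout the base class owns the "seed only if requested" logic, and each database contributes just the command line for its own client.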

Example from #542 implementation draft:

>>> import sqlalchemy
>>> from testcontainers.mysql import MySqlContainer
>>> seed_data = ("db/", ["schema.sql", "data.sql"])
>>> with MySqlContainer(seed=seed_data) as mysql:
...     engine = sqlalchemy.create_engine(mysql.get_connection_url())
...     with engine.begin() as connection:
...         query = "select * from stuff"  # Can now rely on schema/data
...         result = connection.execute(sqlalchemy.text(query))
...         first_stuff, = result.fetchone()

Why should it be done this way?

I believe most users of testcontainers, beyond just testing connectivity with a database, want a simple-to-reproduce test environment with realistic data structures and mock data. This is currently annoying to set up in each project, and it can be upstreamed as a simple-looking seed parameter. Putting it in DbContainer itself could enable the feature for most databases.

See #542 for a draft implementation using MySQL as the target (generalizable to other databases, of course).
