Merged
44 commits
f02352d ci(datasets): Migrate Flower Datasets to uv (danieljanes, Feb 4, 2026)
2067715 Split build and publish into two scripts (danieljanes, Feb 4, 2026)
dd96a4f Improve rm-caches.sh (danieljanes, Feb 4, 2026)
98a451c Make Flower Datasets compatible with the latest version of HF Datasets (danieljanes, Feb 4, 2026)
04fa80c Merge branch 'main' into uv-migration-datasets (danieljanes, Feb 4, 2026)
341ad19 Merge remote-tracking branch 'refs/remotes/origin/uv-migration-datase… (danieljanes, Feb 4, 2026)
902ba7a Exclude .venv from taplo fmt (danieljanes, Feb 4, 2026)
a5ee3bb Merge branch 'main' into uv-migration-datasets (danieljanes, Feb 9, 2026)
d2a7113 Fix docs build (danieljanes, Feb 9, 2026)
e019eaf Merge branch 'main' into uv-migration-datasets (danieljanes, Feb 10, 2026)
2387641 Merge branch 'main' into uv-migration-datasets (danieljanes, Feb 10, 2026)
0562c33 Merge branch 'main' into uv-migration-datasets (danieljanes, Feb 14, 2026)
440df66 Merge branch 'main' into uv-migration-datasets (danieljanes, Feb 14, 2026)
269f701 Update uv (danieljanes, Feb 14, 2026)
8adc585 ci(datasets): Migrate datasets E2E to uv (danieljanes, Feb 14, 2026)
f01494e Restore deleted tests (danieljanes, Feb 14, 2026)
e8a5d8e ci(datasets): Migrate Flower Datasets to uv (danieljanes, Feb 4, 2026)
689ce38 Split build and publish into two scripts (danieljanes, Feb 4, 2026)
5805884 Make Flower Datasets compatible with the latest version of HF Datasets (danieljanes, Feb 4, 2026)
753ef96 Exclude .venv from taplo fmt (danieljanes, Feb 4, 2026)
cb278f0 Fix docs build (danieljanes, Feb 9, 2026)
32f54ce Update uv (danieljanes, Feb 14, 2026)
00ed667 Merge remote-tracking branch 'refs/remotes/origin/uv-migration-datase… (danieljanes, Feb 14, 2026)
d60283b Undo (danieljanes, Feb 14, 2026)
898d6e6 ci(datasets): Migrate Flower Datasets to uv (danieljanes, Feb 4, 2026)
aaf2dd1 Make Flower Datasets compatible with the latest version of HF Datasets (danieljanes, Feb 4, 2026)
8621f4f Revert (danieljanes, Feb 14, 2026)
6e73989 Remove new docs page (danieljanes, Feb 14, 2026)
76da4cc Reorder pyproject.toml (danieljanes, Feb 14, 2026)
5747591 Reorder pyproject.toml (danieljanes, Feb 14, 2026)
2e1dc65 Add comment (danieljanes, Feb 14, 2026)
f3ed784 Reorder dependencies (danieljanes, Feb 14, 2026)
5adf2dd Lower-case keys (danieljanes, Feb 14, 2026)
7bcad6e Add docs (danieljanes, Feb 14, 2026)
9840590 Merge branch 'main' into uv-migration-datasets (danieljanes, Feb 16, 2026)
b187bcb Merge branch 'main' into uv-migration-datasets (danieljanes, Feb 16, 2026)
b726ba4 Merge branch 'main' into uv-migration-datasets (danieljanes, Feb 16, 2026)
15bf78c Lock dependencies (danieljanes, Feb 16, 2026)
0eb280f Bump pillow (danieljanes, Feb 16, 2026)
42745e8 Merge branch 'main' into uv-migration-datasets (danieljanes, Feb 16, 2026)
b0449df Merge branch 'main' into uv-migration-datasets (danieljanes, Feb 28, 2026)
b4c0928 Apply suggestion from @jafermarq (jafermarq, Feb 28, 2026)
75073b8 Merge branch 'main' into uv-migration-datasets (danieljanes, Feb 28, 2026)
d21789a Update datasets/docs/source/contributor-how-to-develop-flwr-datasets.rst (danieljanes, Feb 28, 2026)
11 changes: 9 additions & 2 deletions .github/workflows/datasets-e2e.yml
@@ -18,6 +18,8 @@ concurrency:

env:
FLWR_TELEMETRY_ENABLED: 0
+ UV_NO_MANAGED_PYTHON: 1
+ UV_PYTHON_DOWNLOADS: never

jobs:
frameworks:
@@ -46,6 +48,11 @@ jobs:
uses: ./.github/actions/bootstrap
with:
python-version: 3.10.19
+ poetry-skip: "true"
+ - name: Set up uv
+ uses: astral-sh/setup-uv@v4
+ with:
+ version: "0.9.11"
- name: Free Disk Space (Ubuntu)
uses: jlumbroso/free-disk-space@v1.3.1
with:
@@ -61,6 +68,6 @@ jobs:
sudo apt-get update
sudo apt-get install -y ffmpeg
  - name: Install dependencies
- run: python -m poetry install
+ run: uv sync --frozen
  - name: Run tests
- run: python -m unittest discover -p '*_test.py'
+ run: uv run python -m unittest discover -p '*_test.py'
18 changes: 14 additions & 4 deletions .github/workflows/datasets.yml
@@ -22,6 +22,8 @@ concurrency:

env:
FLWR_TELEMETRY_ENABLED: 0
+ UV_NO_MANAGED_PYTHON: 1
+ UV_PYTHON_DOWNLOADS: never

jobs:
test_core:
@@ -44,6 +46,11 @@ jobs:
uses: ./.github/actions/bootstrap
with:
python-version: ${{ matrix.python }}
+ poetry-skip: "true"
+ - name: Set up uv
+ uses: astral-sh/setup-uv@v4
+ with:
+ version: "0.9.11"
- name: Free Disk Space (Ubuntu)
uses: jlumbroso/free-disk-space@v1.3.1
with:
@@ -58,10 +65,10 @@ jobs:
run: |
sudo apt-get update
sudo apt-get install -y ffmpeg
- - name: Install dependencies (mandatory only)
+ - name: Install dependencies
  run: |
  cd datasets
- python -m poetry install --all-extras
+ uv sync --frozen --all-extras
- name: Cache Hugging Face datasets
uses: actions/cache@v3
with:
@@ -70,11 +77,14 @@ jobs:
restore-keys: hf-datasets-
  - name: Set Hugging Face token
  run: |
+ cd datasets
  if [ -n "${{ secrets.HF_TOKEN }}" ]; then
  echo "Logging into Hugging Face..."
- hf auth login --token ${{ secrets.HF_TOKEN }}
+ uv run hf auth login --token ${{ secrets.HF_TOKEN }}
  else
  echo "Skipping Hugging Face login stage (HF_TOKEN not set)"
  fi
  - name: Test (formatting + unit tests)
- run: ./datasets/dev/test.sh
+ run: |
+ cd datasets
+ uv run ./dev/test.sh
2 changes: 1 addition & 1 deletion .github/workflows/framework-docs.yml
@@ -32,8 +32,8 @@ jobs:
- name: Install Flower Framework and Flower Datasets with dependencies
run: |
cd framework
python -m poetry add ../datasets
python -m poetry install
python -m pip install -e ../datasets
- name: Build docs
run: ./dev/build-docs.sh ${{ github.ref == 'refs/heads/main' && github.repository == 'adap/flower' && !github.event.pull_request.head.repo.fork }}
- name: Deploy docs
67 changes: 67 additions & 0 deletions datasets/README.md
@@ -20,6 +20,73 @@ For a complete installation guide visit the [Flower Datasets Documentation](http
pip install flwr-datasets[vision]
```

> [!NOTE]
> The `audio` extra currently supports Python `<3.12` due to upstream dependency
> constraints.

## Development (uv)

Flower Datasets uses `uv` for development and CI.

### Setup

```bash
cd datasets
uv sync --all-extras
```

> [!TIP]
> Use `uv sync --frozen --all-extras` to ensure `uv.lock` is not modified.

### Run checks (formatting + unit tests)

```bash
cd datasets
uv run ./dev/test.sh
```

### Format

```bash
cd datasets
uv run ./dev/format.sh
```

### Build docs

```bash
cd datasets
uv run ./dev/build-flwr-datasets-docs.sh
```

### Run E2E tests

```bash
cd datasets/e2e/pytorch
uv sync --frozen
uv run python -m unittest discover -p '*_test.py'
```

Repeat for `datasets/e2e/scikit-learn` and `datasets/e2e/tensorflow`.
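
The three E2E projects use the same two commands, so the repetition can be scripted. The following is a minimal sketch and not part of the repository: `e2e_commands` and `run_all` are hypothetical helper names, and it assumes `uv` is on `PATH` and that the repository root is passed in.

```python
"""Sketch: run the E2E suite in every framework directory."""
import subprocess
from pathlib import Path

# Directory names taken from the E2E section above.
FRAMEWORKS = ("pytorch", "scikit-learn", "tensorflow")


def e2e_commands(repo_root, frameworks=FRAMEWORKS):
    """Return (working_dir, argv) pairs mirroring the manual steps."""
    commands = []
    for name in frameworks:
        project_dir = Path(repo_root) / "datasets" / "e2e" / name
        # Install exactly what the lockfile pins, without touching uv.lock.
        commands.append((project_dir, ["uv", "sync", "--frozen"]))
        # Discover and run the unit tests inside the project environment.
        commands.append(
            (
                project_dir,
                ["uv", "run", "python", "-m", "unittest", "discover", "-p", "*_test.py"],
            )
        )
    return commands


def run_all(repo_root="."):
    """Execute each command in order and stop at the first failure."""
    for cwd, argv in e2e_commands(repo_root):
        subprocess.run(argv, cwd=cwd, check=True)
```

Calling `run_all(".")` from the repository root would run the commands for each framework in turn; `check=True` makes the first failing command raise instead of continuing silently.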

### Dependency management (no `uv pip`)

```bash
cd datasets

# Add a runtime dependency
uv add <package>

# Add a dev dependency
uv add --dev <package>

# Add a dependency to an extra (e.g. "vision")
uv add --optional vision <package>

# Update lockfile (commit the result)
uv lock
```

## Overview

Flower Datasets library supports:
21 changes: 21 additions & 0 deletions datasets/dev/build.sh
@@ -0,0 +1,21 @@
#!/bin/bash

# Copyright 2024 Flower Labs GmbH. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

set -e
cd "$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"/../

uv build --clear
2 changes: 1 addition & 1 deletion datasets/dev/publish.sh
@@ -18,4 +18,4 @@
set -e
cd "$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"/../

- python -m poetry publish -u __token__ -p ${PYPI_TOKEN}
+ uv publish --token "${PYPI_TOKEN}"
82 changes: 82 additions & 0 deletions datasets/docs/source/contributor-how-to-develop.rst
@@ -0,0 +1,82 @@
How to develop Flower Datasets
==============================

Flower Datasets uses `uv <https://docs.astral.sh/uv/>`_ for development and CI.

Setup
-----

Install dependencies (including all extras):

.. code-block:: bash

    cd datasets
    uv sync --all-extras

.. tip::

    Use ``uv sync --frozen --all-extras`` to ensure ``uv.lock`` is not modified.

Run checks
----------

Run formatting and unit tests:

.. code-block:: bash

    cd datasets
    uv run ./dev/test.sh

Format
------

.. code-block:: bash

    cd datasets
    uv run ./dev/format.sh

Build docs
----------

.. code-block:: bash

    cd datasets
    uv run ./dev/build-flwr-datasets-docs.sh

Run E2E tests
-------------

Each E2E directory is a standalone `uv` project.

.. code-block:: bash

    cd datasets/e2e/pytorch
    uv sync --frozen
    uv run python -m unittest discover -p '*_test.py'

Repeat for ``datasets/e2e/scikit-learn`` and ``datasets/e2e/tensorflow``.

Dependency management
---------------------

Avoid using ``uv pip``.

.. code-block:: bash

    cd datasets

    # Add a runtime dependency
    uv add <package>

    # Add a dev dependency
    uv add --dev <package>

    # Add a dependency to an extra (e.g. "vision")
    uv add --optional vision <package>

After changing dependencies, update the lockfile and commit it:

.. code-block:: bash

    cd datasets
    uv lock
8 changes: 6 additions & 2 deletions datasets/docs/source/how-to-install-flwr-datasets.rst
@@ -4,7 +4,7 @@ Installation
Python Version
--------------

- Flower Datasets requires `Python 3.9 <https://docs.python.org/3.9/>`_ or above.
+ Flower Datasets requires `Python 3.10 <https://docs.python.org/3.10/>`_ or above.


Install stable release (pip)
@@ -28,6 +28,11 @@ For audio datasets (e.g. Speech Command) ``flwr-datasets`` should be installed w

python -m pip install "flwr-datasets[audio]"

.. note::

    The ``audio`` extra currently supports Python ``<3.12`` due to upstream dependency
    constraints.

Install directly from GitHub (pip)
----------------------------------

@@ -70,4 +75,3 @@ If everything works, it should print the version of Flower Datasets to the comma
.. code-block:: none

0.5.0

1 change: 1 addition & 0 deletions datasets/docs/source/index.rst
@@ -70,6 +70,7 @@ Information-oriented API reference and other reference material.
:maxdepth: 1
:caption: Contributor tutorials

contributor-how-to-develop
contributor-how-to-contribute-dataset


23 changes: 10 additions & 13 deletions datasets/e2e/pytorch/pyproject.toml
@@ -1,17 +1,14 @@
- [build-system]
- requires = ["poetry-core>=2.1.3"]
- build-backend = "poetry.core.masonry.api"
-
- [tool.poetry]
+ [project]
  name = "fds-e2e-pytorch"
  version = "0.1.0"
  description = "Flower Datasets with PyTorch"
- authors = ["The Flower Authors <hello@flower.ai>"]
- package-mode = false
+ requires-python = ">=3.10"
+ dependencies = [
+ "flwr-datasets[vision]",
+ "torch>=1.12.0,<3.0.0",
+ "torchvision>=0.19.0,<1.0.0",
+ "parameterized==0.9.0",
+ ]

- [tool.poetry.dependencies]
- python = "^3.10"
- flwr-datasets = { path = "./../../", extras = ["vision"] }
- torch = ">=1.12.0,<3.0.0"
- torchvision = ">=0.19.0,<1.0.0"
- parameterized = "==0.9.0"
+ [tool.uv.sources]
+ flwr-datasets = { path = "../..", editable = true }