Standardized checks #685

Open
@dsmedia

The vega-datasets repository currently lacks a straightforward, documented way to run code quality checks locally that match the CI checks (.github/workflows/test.yml). In contrast, projects like Altair (@dangotbanned) use uv and taskipy for an efficient developer workflow, which in turn supports a very user-friendly CONTRIBUTING.md.

Current Situation

Currently, developers can run each CI check manually using separate commands:

# TOML formatting check
uvx taplo fmt --check --diff
# Python linting
uvx ruff check
# Python formatting check
uvx ruff format --check

While this works, running three separate commands is cumbersome and inconsistent with related projects like Altair, which provide a single command for all checks. vega-datasets has no documented way to run all checks with one command.

Proposed Solutions

Option 1: Shell Script

A simple, direct approach:

  1. Create run_checks.sh:
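#!/bin/bash
# Run the same checks as CI; stop at the first failure.
set -e
# TOML formatting check
uvx taplo fmt --check --diff
# Python linting
uvx ruff check
# Python formatting check
uvx ruff format --check
echo "All checks passed!"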
  2. Usage: ./run_checks.sh

Advantages:

  • No additional dependencies needed
  • Simple to understand and maintain
  • Directly mirrors CI commands
  • Quick to implement
  • Easy to modify or extend
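
One consequence of `set -e` in the script above is that it stops at the first failing check, so a contributor sees only one failure per run. As an illustration of the "easy to modify or extend" point, here is a minimal sketch of a variant that runs all three checks and then summarizes the failures. It assumes bash is available; the names `run_check`, `main`, and `failed` are hypothetical and not part of the repository.

```shell
#!/usr/bin/env bash
# Hypothetical variant of run_checks.sh: run every check, then report all failures.

# Labels of the checks that exited non-zero.
failed=()

# Run one labelled command; remember the label if it exits non-zero.
run_check() {
  local label="$1"
  shift
  echo "==> ${label}"
  if ! "$@"; then
    failed+=("${label}")
  fi
}

main() {
  # Same three commands as the CI checks listed under "Current Situation".
  run_check "TOML formatting" uvx taplo fmt --check --diff
  run_check "Python linting" uvx ruff check
  run_check "Python formatting" uvx ruff format --check

  if [ "${#failed[@]}" -gt 0 ]; then
    # printf repeats the format once per failed label.
    printf 'Failed: %s\n' "${failed[@]}" >&2
    return 1
  fi
  echo "All checks passed!"
}

# Only run the checks when executed directly, not when sourced.
if [[ "${BASH_SOURCE[0]}" == "${0}" ]]; then
  main "$@"
fi
```

The trade-off is a slightly longer script in exchange for a complete failure report in a single run; the exit status is still non-zero when any check fails.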

Option 2: Full taskipy Implementation

Adopt an Altair-like taskipy setup for better structure and extensibility:

  1. Add to pyproject.toml dependency groups:
[dependency-groups]
dev = ["taskipy>=1.14.1"]
  2. Define individual tasks:
[tool.taskipy.tasks]
check-toml = "taplo fmt --check --diff"
check-lint = "ruff check"
check-format = "ruff format --check"
checks = "task check-toml && task check-lint && task check-format"
  3. Usage: uv run task checks

Advantages:

  • Structured approach matching Altair
  • Centralized task definitions
  • Extensible for future additions
  • Familiar interface for Altair contributors (uv run task ...)

Disadvantages:

  • Slightly more complex initial setup (though minimal)

Which approach would work best?
