Commit f6419be

Merge pull request spotify#111 from elbersb/master
Modernize infrastructure
2 parents 08f1827 + 3884c88 commit f6419be

24 files changed: +391 -199 lines changed

.flake8

Lines changed: 0 additions & 3 deletions
This file was deleted.

.github/workflows/confidence.yml

Lines changed: 10 additions & 6 deletions
@@ -13,18 +13,22 @@ jobs:
     runs-on: ubuntu-latest
     strategy:
       matrix:
-        python-version: ['3.9', '3.10', '3.11']
+        python-version: ['3.9', '3.10', '3.11', '3.12']
 
     steps:
-    - uses: actions/checkout@v1
+    - uses: actions/checkout@v4
     - name: Set up Python ${{ matrix.python-version }}
-      uses: actions/setup-python@v2
+      uses: actions/setup-python@v5
       with:
         python-version: ${{ matrix.python-version }}
+    - name: Install uv
+      uses: astral-sh/setup-uv@v5
+      with:
+        enable-cache: true
+        cache-dependency-glob: "**/pyproject.toml"
     - name: Install dependencies
       run: |
-        python -m pip install --upgrade pip
-        if [ -f requirements_dev.txt ]; then pip install -r requirements_dev.txt; fi
-        python -m pip install tox tox-gh-actions
+        uv pip install --system -e ".[dev]"
+        uv pip install --system tox tox-gh-actions
     - name: Test with tox
       run: tox

.github/workflows/python-publish.yml

Lines changed: 3 additions & 3 deletions
@@ -18,11 +18,11 @@ jobs:
     runs-on: ubuntu-latest
 
     steps:
-    - uses: actions/checkout@v2
+    - uses: actions/checkout@v4
     - name: Set up Python
-      uses: actions/setup-python@v2
+      uses: actions/setup-python@v5
       with:
-        python-version: '3.9'
+        python-version: '3.11'
     - name: Install dependencies
       run: |
         python -m pip install --upgrade pip

.gitignore

Lines changed: 5 additions & 1 deletion
@@ -90,4 +90,8 @@ ENV/
 
 .DS_store
 
-.idea/
+.idea/
+
+# uv
+uv.lock
+.venv/

CLAUDE.md

Lines changed: 157 additions & 0 deletions
@@ -0,0 +1,157 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## Project Overview
+
+Spotify Confidence is a Python library for A/B test analysis. It provides convenience wrappers around statsmodel's functions for computing p-values and confidence intervals. The library supports both frequentist (Z-test, Student's T-test, Chi-squared) and Bayesian (BetaBinomial) statistical methods, with features for variance reduction, sequential testing, and sample size calculations.
+
+## Development Commands
+
+### Setup
+```bash
+# Install with development dependencies (including tox-uv)
+uv pip install -e ".[dev]"
+```
+
+### Testing
+```bash
+# Run all tests with coverage
+uv run pytest
+
+# Run tests without coverage reports
+uv run pytest --no-cov
+
+# Run specific test file
+uv run pytest tests/frequentist/test_z_test.py
+
+# Run specific test
+uv run pytest tests/frequentist/test_z_test.py::test_name
+
+# Run all tests across Python versions
+uv run tox
+```
+
+### Code Quality
+```bash
+# Format code with black (line length: 119)
+uv run black spotify_confidence tests
+
+# Check formatting without making changes
+uv run black --check --diff spotify_confidence tests
+
+# Lint with flake8 (max line length: 120)
+uv run flake8 spotify_confidence tests
+
+# Run all quality checks (as done in CI)
+uv run black --check --diff spotify_confidence tests && uv run flake8 spotify_confidence tests && uv run pytest
+```
+
+### Build
+```bash
+# Build distribution packages
+uv run python -m build
+```
+
+## Architecture
+
+### Core Design Pattern
+
+The library follows an object-oriented design with separation of concerns:
+
+1. **Statistical Test Classes**: High-level APIs (`ZTest`, `StudentsTTest`, `ChiSquared`, `BetaBinomial`, `ZTestLinreg`)
+2. **Experiment Class**: Base class containing shared analysis methods for frequentist tests
+3. **Computer Classes**: Perform the actual statistical computations
+4. **Grapher Classes**: Generate visualizations using Chartify
+
+All main test classes inherit from abstract base classes in `spotify_confidence/analysis/abstract_base_classes/`:
+- `ConfidenceABC`: Base for all statistical test classes
+- `ConfidenceComputerABC`: Base for computation logic
+- `ConfidenceGrapherABC`: Base for visualization logic
+
+### Module Structure
+
+```
+spotify_confidence/
+├── analysis/
+│   ├── abstract_base_classes/         # ABC definitions for the framework
+│   ├── frequentist/                   # Frequentist statistical methods
+│   │   ├── confidence_computers/      # Statistical computation logic
+│   │   ├── experiment.py              # Base class for frequentist tests
+│   │   ├── z_test.py                  # Z-test implementation
+│   │   ├── t_test.py                  # Student's T-test implementation
+│   │   ├── chi_squared.py             # Chi-squared test
+│   │   ├── z_test_linreg.py           # Z-test with linear regression variance reduction
+│   │   ├── sequential_bound_solver.py # Group sequential testing
+│   │   ├── multiple_comparison.py     # Multiple testing correction
+│   │   └── sample_size_calculator.py
+│   ├── bayesian/                      # Bayesian methods
+│   │   └── bayesian_models.py         # BetaBinomial implementation
+│   ├── constants.py                   # Shared constants
+│   └── confidence_utils.py            # Shared utility functions
+├── samplesize/                        # Sample size calculations
+├── examples.py                        # Example data generators
+├── chartgrid.py                       # Chart grid utilities
+└── options.py                         # Global configuration
+```
+
+### Key Classes and Their Relationships
+
+- **Experiment** (in `frequentist/experiment.py`): The core base class for frequentist tests. Provides methods like:
+  - `summary()`: Overall metric summaries
+  - `difference()`: Pairwise comparisons
+  - `multiple_difference()`: Multiple comparisons with correction
+  - `difference_plot()`, `summary_plot()`, etc.: Visualization methods
+  - `sample_size()`: Required sample size calculations
+  - `statistical_power()`: Power analysis
+
+- **ZTest, StudentsTTest, ChiSquared**: Thin wrappers that initialize `Experiment` with the appropriate computer and method
+
+- **Computer Classes** (in `frequentist/confidence_computers/`): Handle the statistical calculations
+  - `ZTestComputer`, `TTestComputer`, `ChiSquaredComputer`: Specific computation implementations
+  - All inherit from `ConfidenceComputerABC`
+
+- **ChartifyGrapher**: Implements visualization using the Chartify library
+
+### Data Model
+
+The library works with DataFrames containing sufficient statistics:
+- `numerator_column`: Sum or count (e.g., sum of conversions)
+- `denominator_column`: Total observations (e.g., total users)
+- `numerator_sum_squares_column`: Sum of squares (optional, for variance calculations)
+- `categorical_group_columns`: Treatment/control groups and other dimensions
+- `ordinal_group_column`: Time-based grouping for sequential analysis
+
+### Important Conventions
+
+1. **Method Column**: Tests add a `METHOD_COLUMN_NAME` to data indicating the test type (e.g., "z-test", "t-test")
+
+2. **Multiple Comparison Correction**: Supported methods defined in `constants.py`:
+   - Standard: bonferroni, holm, hommel, sidak, FDR methods
+   - SPOT-1 variants: Custom Spotify methods for specific use cases
+
+3. **Non-Inferiority Margins (NIMs)**: Can be specified as absolute values or relative percentages
+
+4. **Sequential Testing**: The `sequential_bound_solver.py` module implements group sequential designs with spending functions
+
+5. **Variance Reduction**: `ZTestLinreg` uses pre-exposure data to fit a linear model and reduce variance (CUPED method)
+
+## Testing Guidelines
+
+- Tests are organized to mirror the source structure under `tests/`
+- Use pytest fixtures for common test data
+- Tests check both DataFrame outputs and chart generation
+- Coverage target is configured in `pyproject.toml`
+
+## Python Version Support
+
+Supports Python 3.9, 3.10, 3.11, and 3.12. The `tox.ini` includes a `py39-min` environment that tests with minimum dependency versions.
+
+The project uses `tox-uv` to leverage uv's fast package installation and environment management in tox, significantly speeding up multi-environment testing. The GitHub Actions CI workflow also uses uv for faster dependency installation.
+
+## Code Style
+
+- Black formatting with 119 character line length
+- Flake8 linting with max line length 120
+- Ignored flake8 rules: E203, E231, W503
+- Excluded from linting: `.venv`, `.tox`, `dist`, `build`, `scratch.py`, `confidence_dev`
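
Reviewer note on the "Data Model" section of the new CLAUDE.md: the sufficient statistics it lists (a sum, a count, and a sum of squares per group) are enough to run a two-sample Z-test without access to raw observations. The following is a minimal sketch of that idea using only the Python standard library. It is not the library's implementation; the function names `summarize` and `z_test_difference` and the example numbers are hypothetical and chosen only for illustration, and the dictionary keys merely echo the column roles named in CLAUDE.md.

```python
from math import sqrt
from statistics import NormalDist


def summarize(numerator, denominator, numerator_sum_squares):
    """Return (mean, variance of the mean) from sufficient statistics.

    Hypothetical helper: numerator is the sum of the metric, denominator the
    number of observations, numerator_sum_squares the sum of squared values.
    """
    mean = numerator / denominator
    # Unbiased sample variance recovered from the sum and the sum of squares.
    variance = (numerator_sum_squares - denominator * mean**2) / (denominator - 1)
    return mean, variance / denominator


def z_test_difference(control, treatment, alpha=0.05):
    """Two-sided Z-test for the difference in means (treatment - control)."""
    mean_c, var_c = summarize(**control)
    mean_t, var_t = summarize(**treatment)
    diff = mean_t - mean_c
    std_err = sqrt(var_c + var_t)
    normal = NormalDist()
    z = diff / std_err
    p_value = 2 * (1 - normal.cdf(abs(z)))
    margin = normal.inv_cdf(1 - alpha / 2) * std_err
    return {"difference": diff, "p_value": p_value, "ci": (diff - margin, diff + margin)}


# Made-up conversion data: for a 0/1 metric the sum of squares equals the sum.
control = {"numerator": 100, "denominator": 1000, "numerator_sum_squares": 100}
treatment = {"numerator": 130, "denominator": 1000, "numerator_sum_squares": 130}
result = z_test_difference(control, treatment)
print(result)
```

Because only aggregates are needed, one row per group (and optionally per time period) is sufficient input, which is presumably why the library's DataFrame interface is organized around these columns.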

CONTRIBUTING.rst

Lines changed: 48 additions & 21 deletions
@@ -57,41 +57,55 @@ Get Started!
 
 Ready to contribute? Here's how to set up `confidence` for local development.
 
+**Prerequisites:**
+
+* `uv <https://docs.astral.sh/uv/>`_ - Fast Python package installer (recommended)
+* Python 3.9 or later
+
 1. Fork the `confidence` repo on GitHub.
 2. Clone your fork locally::
 
-    $ git clone https://github.com/spotify/confidence
+    $ git clone git@github.com:your_username/confidence.git
+    $ cd confidence
+
+3. Set up your development environment using uv::
+
+    $ uv venv
+    $ uv pip install -e ".[dev]"
 
-3. Install your local copy into a virtualenv. Assuming you have virtualenvwrapper installed, this is how you set up your fork for local development::
+   This creates a virtual environment and installs the package in editable mode with all development dependencies.
 
-    $ mkvirtualenv confidence_dev
-    $ cd confidence/
-    $ tox
+4. Verify your setup by running the tests::
 
-   The tox command will install the dev requirements in requirements_dev.txt and run all tests.
+    $ uv run pytest
 
-4. Create a branch for local development::
+   This should run all tests and show they pass.
+
+5. Create a branch for local development::
 
     $ git checkout -b name-of-your-bugfix-or-feature
 
    Now you can make your changes locally.
 
-5. When you're done making changes, format using `make black`, check that your changes pass flake8 and the tests, including testing other Python versions with tox::
+6. When you're done making changes, check that your changes pass all quality checks::
+
+    $ uv run black spotify_confidence tests --line-length 119  # Format code
+    $ uv run flake8 spotify_confidence tests                   # Lint code
+    $ uv run pytest                                            # Run tests
+
+   To test across all supported Python versions (3.9, 3.10, 3.11, 3.12)::
 
-    $ make black
-    $ flake8 confidence tests
-    $ python setup.py test or py.test
-    $ tox
+    $ uv run tox -p auto
 
-   To get flake8 and tox, just pip install them into your virtualenv.
+   Note: tox requires all Python versions to be installed on your system.
 
-6. Commit your changes and push your branch to GitHub::
+7. Commit your changes and push your branch to GitHub::
 
     $ git add .
     $ git commit -m "Your detailed description of your changes."
     $ git push origin name-of-your-bugfix-or-feature
 
-7. Submit a pull request through the GitHub website.
+8. Submit a pull request through the GitHub website.
 
 Pull Request Guidelines
 -----------------------
@@ -101,23 +115,36 @@ Before you submit a pull request, check that it meets these guidelines:
 1. The pull request should include tests.
 2. If the pull request adds functionality, the docs should be updated. Put
    your new functionality into a function with a docstring, and add the
-   feature to the list in README.rst.
-3. The pull request should work for Python 3.6 and 3.7. Check
-   and make sure that the tests pass for all supported Python versions.
+   feature to the list in README.md.
+3. The pull request should work for Python 3.9, 3.10, 3.11, and 3.12. The CI
+   pipeline will automatically test all supported Python versions.
 
 Tips
 ----
 
 To run a subset of tests::
 
-    $ py.test tests.test_confidence
+    $ uv run pytest tests/frequentist/test_ttest.py
+
+To run a specific test::
+
+    $ uv run pytest tests/frequentist/test_ttest.py::TestCategorical::test_summary
+
+To run tests with verbose output::
+
+    $ uv run pytest -v
+
+To see test coverage::
+
+    $ uv run pytest --cov=spotify_confidence --cov-report=html
+    $ open htmlcov/index.html
 
 
 Release Process
 -----------------------
 
 While commits and pull requests are welcome from any contributor, we try to
-simplify the distribution process for everyone by managing the release 
+simplify the distribution process for everyone by managing the release
 process with specific contributors serving in the role of Release Managers.
 
 Release Managers are responsible for:
@@ -142,7 +169,7 @@ PATCH version when you make backwards-compatible bug fixes.
 
 Release Stategy
 ~~~~~~~~~~~~~~~~
-Each new release will be made on its own branch, with the branch Master 
+Each new release will be made on its own branch, with the branch Master
 representing the most recent, furthest release. Releases are published to PyPi
 automatically once a new release branch is merged to Master. Additionally,
 rew releases are also tracked manually on `github

MANIFEST.in

Lines changed: 0 additions & 10 deletions
This file was deleted.

Makefile

Lines changed: 7 additions & 6 deletions
@@ -47,14 +47,17 @@ clean-test: ## remove test and coverage artifacts
 	rm -f .coverage
 	rm -fr htmlcov/
 
+format: ## format code with black
+	black spotify_confidence tests --line-length 119
+
 lint: ## check style with flake8
-	flake8 confidence tests
+	flake8 spotify_confidence tests
 
 test: ## run tests quickly with the default Python
 	python3 -m pytest
 
 coverage: ## check code coverage quickly with the default Python
-	coverage run --source confidence -m pytest
+	coverage run --source spotify_confidence -m pytest
 	coverage report -m
 	coverage html
 	$(BROWSER) htmlcov/index.html
@@ -86,10 +89,8 @@ install: clean ## install the package to the active Python's site-packages
 	pip install -e .
 
 install-test: clean
-	pip3 install --index-url https://test.pypi.org/simple/ confidence-spotify
+	pip3 install --index-url https://test.pypi.org/simple/ spotify-confidence
 
 install-prod: clean
-	pip3 install confidence-spotify
+	pip3 install spotify-confidence
 
-black:
-	black spotify_confidence tests --line-length 119
