Commit b2f1c06

tonybaloney, Anthony Shaw, and Copilot authored
[WIP] Rewrite backends in Rust using Ruff's parser use parquet for storage and faster indexing (#238)
* start rust conversion
* Update CI
* simplify CI and reformat
* ignore artifacts
* further simplify
* rust function returns tuple collection
* use rich progress bars
* fix variable name reuse
* Simplify rust module
* working prototype before crate denormalization
* Use Ruff's API
* remove another radon dependency
* cyclomatic harvestor
* remove use of builtin exit
* halstead metrics
* use stdlib mode function
* Implement Halstead harvesters
* update assertions
* MI harvester
* Create a new file iterator and remove radon
* update lockfile
* Format tests
* happier syntax with 3.10
* tidy up
* cleanup deps
* better naming
* Use rust backend for processing
* ruff fixes
* Diff uses new parallel function
* remove unused import
* Update mocks in tests
* Improve halstead
* Enhance Maintainability Index calculation with improved Halstead and Cyclomatic complexity metrics
* align halstead
* update tests. update versions
* cleanup redundant code
* build wheels in CI
* rust cleanup
* simplify halstead cases
* add instructions for arch
* Use ujson for cache builds
* resort imports
* force arm on windows
* Use rich tables, don't archive shebang files by default (it's very slow)
* formatting
* don't assert on color
* remove redundant function
* formatting
* remove redundant tests
* baseline metrics
* BREAKING: Only store full metrics for the seed by default
* gen import lib
* build package as v2
* show commit/sec speed
* run metrics in parallel
* Render progress and speed together
* formatting
* iterate through git revisions in rust
* refactor the build process to call out the first pass as a separate step
* formatting updates
* Use rich logging
* run all tests on windows
* update test for new design
* include comma in loc stat
* use conversion traits
* cleaner conversions
* fix silly log assertions in tests
* cleanup
* Use traits for conversion
* Use unix-style paths across the index to remove a lot of the switching and make the index platform independent
* lint
* run aggregates in rust
* (temp) move the JSON serialization to rust before refactoring the whole thing into parquet
* parquet 1
* remove all the old json code
* Don't store the mega index
* use lz4 compression
* hold open the index as a context manager
* Don't look for indexes
* Start migrating the index and cache code to rust using the arrow libraries in rust
* move more of the index code out of Python
* Continue removing Python indexer
* linting
* we don't need ujson anymore
* move report command to new index format
* Update diff command
* cleanup old APIs
* lint updates
* ipynb support
* fix a bug with granular reports
* fix unit tests
* Compare each file against the last time it was indexed
* Move around imports
* fix argument
* we ended up reversing twice
* start refactor git
* Do all the filtering on Python files inside rust to minimise the data going back and forth. Use iterators in revisions to yield through them more efficiently
* Formatting
* fix import issues
* Simplify file indexer and work through broken tests
* remove noisy build debug logs
* Don't return details in index getitem, fix diff
* remove rank threshold
* format update
* remove the test for the deprecated flag
* rust linting
* Put sorting in the results for getitem
* Improve test
* Remove multi argument
* add diag
* add zstd compression
* drop diagnostic
* setup benchmarks
* Add extra timing stats
* why
* hashmap allocations
* big refactor to single parse
* update benchmarks
* Use compact strings for operand hashmaps
* Patch out diff for now
* Use some constr
* fix primitive sequence
* cleanup old test
* only add rank when needed in diff
* fix diff bug
* fully implement --no-detail flag
* allow diff against specific revision
* don't allow diff for non-indexed revisions
* refactor diff
* Add cognitive complexity metric and update related components
* Refactor CI workflow for release process and update versioning to 2.0.0-alpha.1; enhance cognitive complexity metrics and improve documentation in HISTORY.md
* Clean up code by removing unnecessary blank lines and optimizing list comprehensions for better readability
* Add before-script to install dependencies for Linux builds in CI
* Clean up crappy tests
* Update index tests to assert minimum occurrences of "An author" and clean up report test
* Enhance before-script for Linux builds to support multiple package managers
* Update CI before-script for Linux and add OpenSSL dependency in Cargo files
* Fix Linux wheel build by installing OpenSSL dev headers and missing Perl modules

  The manylinux2014 container was missing openssl-devel and perl-Time-Piece, causing openssl-sys to fail when building OpenSSL from source.

  🐨 Generated with Crush
  Assisted-by: Claude Opus 4.6 via Crush <crush@charm.land>
* Drop macos x64 since it doesnt' exist anymore
* Update to new sort command
* Apply suggestions from code review

  Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Anthony Shaw <anthonyshaw@microsoft.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
1 parent 2d9685a commit b2f1c06
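
One item in the commit message says the index now uses unix-style paths everywhere so that it is platform independent. The commit does this in Rust and that code is not shown here, so the following is only a minimal Python sketch of the idea. The helper name `to_index_key` is hypothetical and is not part of wily's API.

```python
from pathlib import PureWindowsPath


def to_index_key(path: str) -> str:
    """Return a forward-slash form of a relative path (hypothetical helper).

    PureWindowsPath treats both "\\" and "/" as separators, so the same
    call normalizes paths recorded on Windows and on POSIX systems.
    Assumption: file names never contain a literal backslash.
    """
    return PureWindowsPath(path).as_posix()


if __name__ == "__main__":
    # Both spellings map to the same key, so lookups do not depend on the OS
    # that built the index.
    print(to_index_key(r"src\wily\cache.py"))
    print(to_index_key("src/wily/cache.py"))
```

Storing one canonical spelling means no code that reads the index has to switch on the host platform, which is the simplification the commit message describes.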

73 files changed

Lines changed: 11983 additions & 4910 deletions

.github/workflows/ci.yml

Lines changed: 181 additions & 35 deletions
@@ -2,13 +2,20 @@
 name: CI
 on:
   push:
-    branches:
+    branches:
       - master
+      - v2
+    tags:
+      - "**"
   pull_request:
     branches:
       - master
+      - v2
   workflow_dispatch: {}

+env:
+  COLUMNS: 150
+
 jobs:
   test:
     strategy:
strategy:
@@ -19,56 +26,195 @@ jobs:
           - windows-latest
           - macos-latest
         python-version:
-          - '3.7'
-          - '3.8'
-          - '3.9'
           - '3.10'
           - '3.11'
           - '3.12'
+          - '3.13'
+          - '3.14'
     runs-on: "${{ matrix.os }}"
     continue-on-error: false
     steps:
-      - uses: actions/checkout@v3
-      - uses: actions/setup-python@v4
-        with:
-          python-version: "${{ matrix.python-version }}"
-      - name: install dependencies
-        run: |
-          python -m pip install --upgrade pip setuptools wheel
-          pip install flit
-          flit install --extras=all
-      - name: lint and test
-        run: make ci
-      - name: Inclusiveness Analyzer
-        uses: microsoft/InclusivenessAnalyzer@main
-        with:
-          excludeTerms: master
+      - uses: actions/checkout@v4
+      - name: Install Rust toolchain
+        uses: dtolnay/rust-toolchain@stable
+      - name: Cache Rust
+        uses: Swatinem/rust-cache@v2
+      - name: Install uv
+        uses: astral-sh/setup-uv@v2
+      - name: Sync project environment
+        run: uv sync --all-groups --python "${{ matrix.python-version }}"
+      - name: test
+        run: uv run pytest
       - name: upload coverage to codecov
         uses: codecov/codecov-action@v3
         with:
           # TODO(tonybaloney): move token to `secrets.CODECOV_TOKEN`
           token: 48f9ff3a-6358-4607-aa5d-9cb7cada539c
           files: .tests-reports/coverage.xml
           fail_ci_if_error: true
+
   ruff:
     runs-on: ubuntu-latest
     steps:
-      - uses: actions/checkout@v3
-      - run: pip install --user ruff
-      - run: ruff --format=github .
+      - uses: actions/checkout@v4
+      - uses: astral-sh/setup-uv@v2
+      - run: uv tool run ruff check .
+
+  # Rust linting
+  clippy:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - name: Install Rust toolchain
+        uses: dtolnay/rust-toolchain@stable
+        with:
+          components: clippy
+      - name: Cache Rust
+        uses: Swatinem/rust-cache@v2
+      - name: Run clippy
+        run: cargo clippy --manifest-path backend/Cargo.toml -- -D warnings
+
+  rustfmt:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - name: Install Rust toolchain
+        uses: dtolnay/rust-toolchain@stable
+        with:
+          components: rustfmt
+      - name: Check formatting
+        run: cargo fmt --manifest-path backend/Cargo.toml --check
+
+  # Build source distribution
+  build-sdist:
+    name: build sdist
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
+        with:
+          python-version: "3.13"
+      - uses: PyO3/maturin-action@v1
+        with:
+          command: sdist
+          args: --out dist
+          rust-toolchain: stable
+      - uses: actions/upload-artifact@v4
+        with:
+          name: pypi_files_sdist
+          path: dist
+
+  # Build wheels for all supported platforms
+  build:
+    name: build on ${{ matrix.os }} (${{ matrix.target }})
+    strategy:
+      fail-fast: false
+      matrix:
+        include:
+          # Linux x86_64
+          - os: linux
+            target: x86_64
+            runs-on: ubuntu-latest
+            manylinux: auto
+          # Linux aarch64
+          - os: linux
+            target: aarch64
+            runs-on: ubuntu-latest
+            manylinux: auto
+          # macOS aarch64 (Apple Silicon)
+          - os: macos
+            target: aarch64
+            runs-on: macos-latest
+          # Windows x86_64
+          - os: windows
+            target: x86_64
+            runs-on: windows-latest
+          # Windows aarch64
+          - os: windows
+            target: aarch64
+            python-architecture: arm64
+            runs-on: windows-11-arm
+
+    runs-on: ${{ matrix.runs-on }}
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Set up Python
+        uses: actions/setup-python@v6
+        with:
+          python-version: "3.13"
+          architecture: ${{ matrix.python-architecture || 'x64' }}
+
+      - name: Install twine
+        run: pip install -U twine
+
+      - name: Build wheels
+        uses: PyO3/maturin-action@v1
+        with:
+          target: ${{ matrix.target }}
+          manylinux: ${{ matrix.manylinux || 'auto' }}
+          args: --release --out dist --interpreter 3.10 3.11 3.12 3.13 3.14
+          rust-toolchain: stable
+          docker-options: -e CI
+          before-script-linux: |
+            if command -v yum &> /dev/null; then
+              yum install -y openssl-devel cmake3 perl-IPC-Cmd perl-Time-Piece
+              which cmake3 && ln -sf $(which cmake3) /usr/local/bin/cmake || true
+            elif command -v apk &> /dev/null; then
+              apk add --no-cache openssl-dev cmake make perl
+            elif command -v apt-get &> /dev/null; then
+              apt-get update && apt-get install -y libssl-dev cmake perl
+            fi
+
+      - name: List dist files
+        run: ls -lh dist/
+        shell: bash
+
+      - name: Check wheels
+        run: twine check --strict dist/*

-  pyright:
+      - uses: actions/upload-artifact@v4
+        with:
+          name: pypi_files_${{ matrix.os }}_${{ matrix.target }}
+          path: dist
+
+  # Publish to PyPI on tagged releases
+  release:
+    needs: [test, ruff, clippy, rustfmt, build-sdist, build]
+    if: startsWith(github.ref, 'refs/tags/')
     runs-on: ubuntu-latest
+    environment:
+      name: release
+    permissions:
+      id-token: write
+      contents: write
     steps:
-      - uses: actions/checkout@v3
-      - uses: actions/setup-python@v4
-        with:
-          python-version: 3.11
-      - name: install dependencies
-        run: |
-          python -m pip install --upgrade pip setuptools wheel
-          pip install flit
-          flit install --extras=all
-      - uses: jakebailey/pyright-action@v1
-        with:
-          working-directory: .
+      - uses: actions/checkout@v4
+
+      - name: Get dist artifacts
+        uses: actions/download-artifact@v4
+        with:
+          pattern: pypi_files_*
+          merge-multiple: true
+          path: dist
+
+      - name: List dist files
+        run: |
+          ls -lh dist/
+          echo "Total files: $(ls dist | wc -l)"
+
+      - name: Test wheels integrity
+        run: for whl in dist/*.whl; do unzip -qt "$whl"; done
+
+      - name: Install uv
+        uses: astral-sh/setup-uv@v2
+
+      - name: Publish to PyPI
+        run: uv publish --trusted-publishing always
+
+      - name: Upload to GitHub Release
+        uses: softprops/action-gh-release@v2
+        with:
+          files: |
+            dist/*
+          prerelease: ${{ contains(github.ref, 'alpha') || contains(github.ref, 'beta') }}
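
The release job is gated by two workflow expressions: it runs only for tag refs, and it marks the GitHub release as a prerelease when the ref mentions `alpha` or `beta`. The sketch below mirrors that logic in Python so it can be checked against sample refs. The function names are hypothetical. The lower-casing reflects that GitHub's `startsWith` and `contains` expression functions ignore case.

```python
def is_tag_ref(ref: str) -> bool:
    """Mirror `startsWith(github.ref, 'refs/tags/')` (hypothetical helper)."""
    return ref.lower().startswith("refs/tags/")


def is_prerelease(ref: str) -> bool:
    """Mirror `contains(ref, 'alpha') || contains(ref, 'beta')` (hypothetical helper)."""
    lowered = ref.lower()
    return "alpha" in lowered or "beta" in lowered


if __name__ == "__main__":
    # Sample refs; 2.0.0-alpha.1 is the version named in the commit message,
    # the others are made up for illustration.
    for ref in ("refs/tags/2.0.0-alpha.1", "refs/tags/2.0.0", "refs/heads/v2"):
        print(ref, is_tag_ref(ref), is_prerelease(ref))
```

Under this rule a tag such as `2.0.0-rc.1` would be published as a regular release, because only the substrings `alpha` and `beta` are checked.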

.gitignore

Lines changed: 9 additions & 1 deletion
@@ -65,6 +65,9 @@ docs/_build/
 # PyBuilder
 target/

+# Rust builds
+rust/target/
+
 # Jupyter Notebook
 .ipynb_checkpoints

@@ -107,9 +110,14 @@ test_granular.html
 .idea/
 .vscode/
 docs/build/
-
+.vs/
 # OSX
 .DS_Store

 # tests
 .tests-reports/
+
+# test artifacts
+foo/
+plotly.min.js
+wily_report/
