add support for conda packages #204

Merged
merged 57 commits into main from conda-packages
Feb 7, 2024


jameslamb
Owner

@jameslamb jameslamb commented Jan 24, 2024

Closes #200.

Adds support for checking conda packages with pydistcheck 🥳

It supports both formats:

  • .tar.bz2 = the format conda-build has supported for most of its existence
  • .conda = the format introduced in conda 4.7 (4ish years ago) that's gaining broader acceptance across different conda tools and channels
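
The practical difference between the two formats shows up as soon as you try to open one. The following is a minimal sketch using only the Python standard library, not pydistcheck's actual implementation. The function name `list_conda_package_members` is hypothetical. It relies on the publicly documented layout of the formats: a `.tar.bz2` package is a single bzip2-compressed tarball, while a `.conda` package is a zip archive wrapping `metadata.json` plus zstd-compressed inner tarballs.

```python
import tarfile
import zipfile
from pathlib import Path


def list_conda_package_members(path):
    """Return sorted member names of a conda package's outer archive.

    Hypothetical helper for illustration; not part of pydistcheck.
    """
    name = Path(path).name
    if name.endswith(".tar.bz2"):
        # Older format: one bzip2-compressed tarball holding every file.
        with tarfile.open(path, mode="r:bz2") as tf:
            return sorted(m.name for m in tf.getmembers() if m.isfile())
    if name.endswith(".conda"):
        # Newer format: a zip archive whose members are metadata.json and
        # inner tarballs (info-*.tar.zst, pkg-*.tar.zst). Only the outer
        # names are listed here; unpacking the inner tarballs needs a
        # zstd decompressor, which this sketch deliberately leaves out.
        with zipfile.ZipFile(path) as zf:
            return sorted(zf.namelist())
    raise ValueError(f"not a recognized conda package: {name}")
```

Listing the real file contents of a `.conda` package takes one more step (decompressing the inner `.tar.zst` members), which is where a library such as python-zstandard comes in.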

An example

What's in the pytorch conda packages?

# install pydistcheck
pip install .

# download some pytorch conda packages
mkdir -p ./downloads

python ./bin/get-conda-release-files.py \
        'pytorch' \
        'pytorch' \
        ./downloads

# check them
pydistcheck --inspect ./downloads/*

Lots of interesting stuff to see!

For example, for linux-64-pytorch-2.2.0-py3.10_cpu_0.tar.bz2:

  • it's 79.4MB compressed and 0.4GB uncompressed
  • it contains 12,000+ files
  • it looks like it's bundling a copy of the cpython shared library (maybe?), and that library was compiled with debug symbols included

==================== running pydistcheck ====================

checking './downloads/linux-64-pytorch-2.2.0-py3.10_cpu_0.tar.bz2'
----- package inspection summary -----
file size
  * compressed size: 79.4M
  * uncompressed size: 0.4G
  * compression space saving: 79.3%
contents
  * directories: 0
  * files: 12076 (11 compiled)
size by extension
  * .so - 298893.6K (76.1%)
  * .py - 25532.7K (6.5%)
  * .h - 25511.6K (6.5%)
  * .pyc - 19139.5K (4.9%)
  * .txt - 6748.1K (1.7%)
  * no-extension - 6236.5K (1.6%)
  * .0 - 5213.9K (1.3%)
  * .json - 3005.1K (0.8%)
  * .yaml - 742.5K (0.2%)
  * .pyi - 626.9K (0.2%)
  * .cuh - 600.7K (0.2%)
  * .cmake - 254.5K (0.1%)
  * .hpp - 241.5K (0.1%)
  * .cpp - 101.8K (0.0%)
  * .js - 18.8K (0.0%)
  * .mjs - 11.0K (0.0%)
  * .sh - 5.9K (0.0%)
  * .bat - 3.9K (0.0%)
  * .template - 2.9K (0.0%)
  * .cu - 0.5K (0.0%)
  * .ini - 0.5K (0.0%)
  * .html - 0.4K (0.0%)
  * .bzl - 0.3K (0.0%)
  * .in - 0.2K (0.0%)
  * .md - 0.1K (0.0%)
  * .bazel - 0.1K (0.0%)
  * .typed - 0.0K (0.0%)
largest files
  * (0.3G) lib/python3.10/site-packages/torch/lib/libtorch_cpu.so
  * (22.1M) lib/python3.10/site-packages/torch/lib/libtorch_python.so
  * (6.6M) lib/python3.10/site-packages/torch-2.2.0-py3.10.egg-info/SOURCES.txt
  * (5.1M) lib/python3.10/site-packages/torch/bin/protoc-3.13.0.0
  * (5.1M) lib/python3.10/site-packages/torch/bin/protoc
------------ check results -----------
1. [compiled-objects-have-debug-symbols] Found compiled object containing debug symbols. For details, extract the distribution contents and run 'objdump --all-headers "lib/python3.10/site-packages/torch/_C.cpython-310-x86_64-linux-gnu.so"'.
2. [distro-too-large-compressed] Compressed size 79.4M is larger than the allowed size (50.0M).
3. [distro-too-large-uncompressed] Uncompressed size 0.4G is larger than the allowed size (75.0M).
4. [too-many-files] Found 12076 files. Only 2000 allowed.
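
To make the summary numbers above easier to interpret, here is a rough sketch of how figures like "size by extension" and "compression space saving" can be computed. It is an illustration under stated assumptions, not pydistcheck's own code: the function names are hypothetical, the extension is taken to be the last dot-suffix of the file name (which would explain a `.0` row coming from a file like `protoc-3.13.0.0`), and space saving is taken to be `1 - compressed / uncompressed`.

```python
import tarfile
from collections import defaultdict
from pathlib import PurePosixPath


def size_by_extension(tar_path):
    """Sum uncompressed file sizes (in bytes) per file extension.

    Hypothetical helper; files without a suffix are grouped under
    "no-extension".
    """
    totals = defaultdict(int)
    # mode="r:*" lets tarfile detect the compression (bz2, gz, xz, none).
    with tarfile.open(tar_path, mode="r:*") as tf:
        for member in tf:
            if member.isfile():
                ext = PurePosixPath(member.name).suffix or "no-extension"
                totals[ext] += member.size
    return dict(totals)


def space_saving(compressed_size, uncompressed_size):
    """Fraction of space saved by compression (both sizes in the same unit)."""
    return 1 - compressed_size / uncompressed_size
```

As a sanity check against the output above: if the `.so` row (298893.6K, 76.1%) is used to estimate the uncompressed total at roughly 383.6M (an estimate, assuming 1024-based units), then `space_saving(79.4, 383.6)` comes out at about 0.793, which agrees with the reported 79.3%.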

Hopefully this addition to pydistcheck will help project maintainers detect unexpected package changes and identify areas for improvement.


@jameslamb jameslamb added the enhancement label Jan 24, 2024
@jameslamb jameslamb changed the title WIP: add support for conda packages add support for conda packages Feb 7, 2024
@jameslamb jameslamb marked this pull request as ready for review February 7, 2024 04:42
@jameslamb jameslamb merged commit da55044 into main Feb 7, 2024
21 checks passed
@jameslamb jameslamb deleted the conda-packages branch February 7, 2024 04:47
@jameslamb
Owner Author

Thanks very much @indygreg for making and maintaining python-zstandard! And especially for publishing wheels for so many platforms (https://pypi.org/project/zstandard/#files).

Without python-zstandard, adding this feature would have taken a lot longer, and whatever I came up with would probably have been less performant and less portable.
