rsp2k commented Jun 12, 2025

📦 Add Professional Packaging Infrastructure and CI/CD Pipeline

This PR packages the project as an installable Python distribution, with packaging tests and a CI/CD pipeline, so that it is ready for publication on PyPI.

🎯 Overview

This PR adds a complete packaging and testing ecosystem that enables:

  • Professional Distribution: Package can be built, installed, and distributed via PyPI
  • Cross-Platform Testing: Automated testing on Ubuntu, Windows, and macOS
  • Modern Tooling: Uses UV for fast, reliable dependency management
  • Quality Assurance: Comprehensive packaging tests and validation
  • Instant Usage: Once the package is published on PyPI, users can run uvx mcp-crawl4ai-rag without a separate install step

📋 Key Changes Made

1. Entry Point Configuration

  • ✅ Added [project.scripts] section to pyproject.toml
  • ✅ Configured "mcp-crawl4ai-rag" entry point to launch crawl4ai_mcp:main
  • ✅ Users can now run mcp-crawl4ai-rag command after installation
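
As a rough illustration of what the "module:function" entry point string means, the sketch below resolves such a string to a callable with importlib. The helper name resolve_entry_point is hypothetical and not part of this PR; the launcher scripts that installers generate differ in detail, so treat this as a conceptual model only.

```python
import importlib
from typing import Any


def resolve_entry_point(spec: str) -> Any:
    """Resolve a 'module:attribute' entry point string to the object it names."""
    module_name, sep, attr_path = spec.partition(":")
    if not sep or not module_name or not attr_path:
        raise ValueError(f"expected 'module:attribute', got {spec!r}")
    target = importlib.import_module(module_name)
    # Dotted attribute paths such as "module:Class.method" are walked step by step.
    for part in attr_path.split("."):
        target = getattr(target, part)
    return target


# With the package installed, resolve_entry_point("crawl4ai_mcp:main") would be
# expected to return the callable behind the mcp-crawl4ai-rag command.
# A standard-library target keeps this sketch runnable anywhere:
dumps = resolve_entry_point("json:dumps")
print(dumps({"ok": True}))
```

A malformed string such as "crawl4ai_mcp" (no colon) raises ValueError here, which is the kind of mistake a configuration check can catch before a build.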

2. Professional Package Structure

  • ✅ Reorganized code into proper src/crawl4ai_mcp/ package structure
  • ✅ Added __init__.py with proper imports and version information
  • ✅ Fixed relative imports to use package-qualified imports
  • ✅ Maintains backward compatibility while enabling proper packaging

3. Comprehensive CI/CD Pipeline (.github/workflows/)

  • Multi-platform testing: Ubuntu, Windows, macOS
  • Multi-version testing: Python 3.12 and 3.13
  • Modern tooling: UV for dependency management and package installation
  • Three specialized jobs:
    • test-packaging: Full packaging lifecycle testing
    • test-import: Minimal import verification
    • validate-config: Configuration validation
  • PyPI Deployment: Automated publishing on tag creation

4. Professional Testing Suite (tests/test_packaging.py)

  • Package Building: Tests wheel generation with uv run python -m build
  • Import Validation: Verifies main function is importable
  • Installation Testing: Tests package installation and import
  • Configuration Validation: Validates pyproject.toml structure
  • Cross-platform compatibility: Windows and Unix-specific approaches

5. Build System Configuration

  • ✅ Added [build-system] configuration for setuptools
  • ✅ Proper [tool.setuptools] package discovery configuration
  • ✅ Package data inclusion for all Python files
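
The configuration checks described above could look roughly like the following. This is a minimal sketch, not the validation code from this PR: check_pyproject and load_pyproject are hypothetical names, the example mapping uses assumed values, and the loader uses the standard-library tomllib (Python 3.11+) rather than the tomli package mentioned in the CI commits.

```python
from typing import Any


def check_pyproject(config: dict[str, Any]) -> list[str]:
    """Return a list of problems found in a parsed pyproject.toml mapping."""
    problems: list[str] = []

    build_system = config.get("build-system", {})
    if not build_system.get("requires"):
        problems.append("[build-system] requires is missing or empty")
    if "build-backend" not in build_system:
        problems.append("[build-system] build-backend is missing")

    scripts = config.get("project", {}).get("scripts", {})
    target = scripts.get("mcp-crawl4ai-rag")
    if target is None:
        problems.append("[project.scripts] has no mcp-crawl4ai-rag entry")
    elif ":" not in target:
        problems.append(f"entry point {target!r} is not in 'module:function' form")

    return problems


def load_pyproject(path: str) -> dict[str, Any]:
    """Parse a pyproject.toml file from disk."""
    import tomllib  # imported lazily so check_pyproject works without it

    with open(path, "rb") as handle:
        return tomllib.load(handle)


# Hypothetical parsed configuration, shaped like the sections described above.
example = {
    "build-system": {
        "requires": ["setuptools"],
        "build-backend": "setuptools.build_meta",
    },
    "project": {"scripts": {"mcp-crawl4ai-rag": "crawl4ai_mcp:main"}},
}
print(check_pyproject(example))
```

Returning a list of problems instead of raising on the first one lets a CI job report every configuration issue in a single run.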

🚀 User Benefits

⚡ Instant Run (Recommended)

# Run directly without installation - UV handles everything!
uvx mcp-crawl4ai-rag

Perfect for: Testing, one-time usage, or keeping your system clean

📦 Traditional Installation

# Install globally
pip install mcp-crawl4ai-rag

# Then run anywhere
mcp-crawl4ai-rag

Perfect for: Regular usage, integration with other tools

🏃‍♂️ UV Installation (Fastest)

# Install with UV (faster than pip)
uv pip install mcp-crawl4ai-rag

# Run the server
mcp-crawl4ai-rag

Perfect for: UV users who want the fastest installation

🔧 Technical Highlights

UV-First Approach

  • Modern Python package manager for speed and reliability
  • Consistent across development, testing, and CI environments
  • Proper virtual environment handling and caching
  • uvx support: Users can run the server instantly without installation

Cross-Platform Compatibility

  • Windows: PowerShell shell, UTF-8 encoding, platform-specific file handling
  • macOS/Linux: Bash shell with native wildcard expansion
  • Platform Detection: Smart handling of OS-specific requirements
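
One way to sidestep the difference between Bash and PowerShell wildcard expansion is to locate the built wheel from Python instead of the shell. The sketch below is an assumption about how that could be done, not the approach taken in the workflow (which uses Get-ChildItem on Windows); find_wheel and the placeholder wheel name are hypothetical.

```python
import tempfile
from pathlib import Path


def find_wheel(dist_dir: Path) -> Path:
    """Return the most recently modified wheel in dist_dir."""
    wheels = sorted(dist_dir.glob("*.whl"), key=lambda p: p.stat().st_mtime)
    if not wheels:
        # Fail loudly instead of passing an unexpanded pattern to the installer.
        raise FileNotFoundError(f"no wheel files found in {dist_dir}")
    return wheels[-1]


# Demonstration against a throwaway directory with a placeholder wheel name.
with tempfile.TemporaryDirectory() as tmp:
    dist = Path(tmp)
    (dist / "example_pkg-0.1.0-py3-none-any.whl").write_bytes(b"")
    print(find_wheel(dist).name)
```

Because pathlib performs the glob itself, the same code path runs on Windows, macOS, and Linux, and a missing wheel produces an explicit error.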

Comprehensive Test Coverage

  1. Configuration Validation: pyproject.toml structure and entry points
  2. Build Testing: Wheel generation and integrity
  3. Installation Testing: Package installation across platforms
  4. Import Testing: Module importability and function availability
  5. Entry Point Testing: Command availability in PATH
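
The import check in item 4 can be sketched as a subprocess test that starts a fresh interpreter, which avoids picking up modules already loaded by the test runner. This is an illustrative assumption rather than the code in tests/test_packaging.py; can_import is a hypothetical helper. It sets PYTHONIOENCODING to utf-8 so that the emoji printed by the child process does not raise UnicodeEncodeError on Windows consoles.

```python
import os
import subprocess
import sys


def can_import(module: str, attribute: str) -> bool:
    """Check in a fresh interpreter that module.attribute exists and is callable."""
    code = (
        "import importlib, sys\n"
        "mod = importlib.import_module(sys.argv[1])\n"
        "assert callable(getattr(mod, sys.argv[2]))\n"
        "print('\\u2705 import ok')\n"
    )
    # Force UTF-8 for the child's stdout/stderr regardless of the console code page.
    env = dict(os.environ, PYTHONIOENCODING="utf-8")
    result = subprocess.run(
        [sys.executable, "-c", code, module, attribute],
        env=env,
        capture_output=True,
        encoding="utf-8",
    )
    return result.returncode == 0


# For the installed package the call would be can_import("crawl4ai_mcp", "main");
# a standard-library module keeps the sketch runnable anywhere.
print(can_import("json", "dumps"))
```

A missing module or a missing attribute both make the child exit with a non-zero status, so the helper returns False in either case.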

📊 Test Matrix

Platform         Python 3.12   Python 3.13
Ubuntu Latest    ✅            ✅
Windows Latest   ✅            ✅
macOS Latest     ✅            ✅

Total: 6 test combinations + specialized validation jobs

🛠️ Files Added/Modified

  • pyproject.toml - Added build system and entry point configuration
  • .github/workflows/build-test.yml - Complete CI/CD pipeline
  • .github/workflows/deploy-pypi.yml - PyPI deployment workflow
  • tests/test_packaging.py - Comprehensive packaging test suite
  • test_packaging_local.sh - Local testing script
  • src/crawl4ai_mcp/__init__.py - Package initialization
  • src/crawl4ai_mcp/crawl4ai_mcp.py - Main module (restructured)
  • src/crawl4ai_mcp/utils.py - Utilities module (moved)
  • DEPLOYMENT.md - Deployment and usage instructions

🎉 Ready for Production

This PR makes the project ready for:

  • PyPI Distribution: Professional packaging standards
  • Instant Execution: Single-command usage with uvx
  • User Installation: Simple pip install workflow
  • CI/CD Integration: Automated quality assurance
  • Cross-platform Deployment: Reliable operation everywhere

🔌 MCP Client Integration

The uvx approach makes integration incredibly simple:

Claude Desktop

{
  "mcpServers": {
    "crawl4ai-rag": {
      "command": "uvx",
      "args": ["mcp-crawl4ai-rag"],
      "env": {
        "TRANSPORT": "stdio"
      }
    }
  }
}

Benefits of uvx Approach

  • Zero Installation Friction - Try immediately without setup
  • Latest Version on First Run - uvx resolves the newest release when it first runs the tool; use uvx mcp-crawl4ai-rag@latest to refresh a cached copy
  • Isolated Execution - No global package pollution
  • Consistent Across Platforms - Same command works everywhere

🧪 Testing

All tests can be run locally with:

# Quick local test
./test_packaging_local.sh

# Or run the full test suite
uv run python tests/test_packaging.py

The MCP server can now be distributed, installed, and used as a professional Python package with confidence in its reliability across all major platforms. Users can start using it immediately with a single uvx command!


Note: This PR follows current PyPI packaging standards and practices for modern Python distribution, so the package is ready for publication to PyPI.

rsp2k and others added 5 commits June 12, 2025 16:29
* Add project.scripts entry point for mcp-crawl4ai-rag

- Add [project.scripts] section to pyproject.toml
- Configure "mcp-crawl4ai-rag" entry point to launch crawl4ai_mcp:main
- This allows the package to be installed and run as a command line script

* Add basic packaging tests

- Create test_packaging.py with comprehensive packaging validation
- Tests package building, installation, and entry point functionality
- Validates pyproject.toml configuration
- Provides clear pass/fail reporting

* Add comprehensive GitHub Actions workflow for build/package testing

- Multi-OS testing (Ubuntu, Windows, macOS)
- Multi-Python version testing (3.12, 3.13)
- Uses uv for dependency management (matching project setup)
- Tests package building, installation, and entry point functionality
- Validates pyproject.toml configuration
- Uploads build artifacts for inspection
- Includes import testing and config validation jobs

* Add local test script for quick packaging validation

- Bash script to run packaging tests locally before pushing
- Validates pyproject.toml configuration
- Runs the Python packaging tests
- Tests the build process with either uv or pip
- Provides clear feedback and early validation

* Update pyproject.toml with proper build system and src layout configuration

- Add [build-system] section with setuptools backend
- Configure [tool.setuptools] for src layout 
- Specify package directory mapping
- Ensure proper package discovery for entry point resolution

* Create package __init__.py to expose main function

- Add __init__.py to create proper Python package structure
- Import and expose main function for entry point access
- Set package version and exports

* Move utils.py into crawl4ai_mcp package directory

- Move utils.py to src/crawl4ai_mcp/utils.py for proper package structure
- Required for the entry point to work correctly with setuptools

* finish moving source

* Update workflow to use uv consistently across all jobs

- Add uv setup to test-import and validate-config jobs
- Replace pip install with uv pip install in validation tools
- Add consistent caching configuration
- Use uv run python for executing validation scripts

* Fix UV virtual environment error in validate-config job

- Add --system flag to uv pip install command
- This allows installation into the system Python environment in CI
- Fixes error: No virtual environment found

* finish moving source

* finish moving source

* finish moving source

* Fix ModuleNotFoundError in validate-config job

- Change from 'uv run python -c' to 'python -c' for validation scripts
- uv run creates isolated environment without access to --system installed packages
- Regular python command uses system Python with tomli available

* Update test_packaging.py to use uv instead of pip

- Replace subprocess call to pip with uv pip install
- Change python -m build to uv run python -m build
- Maintain all existing test functionality
- Fix inconsistent dependency management in tests

* Add arch-latest to test-packaging job matrix

- Include Arch Linux testing alongside Ubuntu, Windows, and macOS
- Provides broader Linux distribution coverage for packaging tests
- Ensures compatibility across different Linux environments

* Fix Windows Unicode encoding issue by using PowerShell

- Set default shell to PowerShell for Windows runners to handle Unicode properly
- Update Windows entry point test to use PowerShell commands (Get-Command)
- Prevents UnicodeEncodeError with emoji characters in test output
- Maintains bash for Unix systems (Ubuntu, macOS, Arch)

* Fix Python Unicode encoding by setting PYTHONIOENCODING

- Add PYTHONIOENCODING=utf-8 environment variable to force UTF-8 encoding
- This ensures Python can properly handle Unicode characters on Windows
- Fixes UnicodeEncodeError when printing emoji characters in test output

* Fix Windows file locking issue in test_entry_point_after_install

- Use platform-specific approach: subprocess test on Windows, temp directory on Unix
- Avoids Windows file locking issues with .pyd files in temporary directories
- Maintains thorough testing on Unix systems while working around Windows limitations
- Uses uv run python subprocess to test import after installation on Windows

* Fix PowerShell wildcard expansion for wheel installation

- Split package installation into separate Unix/Windows steps
- Use PowerShell Get-ChildItem to properly find wheel files on Windows
- Add error handling for missing wheel files
- Maintains original wildcard behavior on Unix systems

* Remove arch-latest runner (not officially supported)

- Remove arch-latest from test matrix as it's not an official GitHub Actions runner
- Keep the three official runners: ubuntu-latest, windows-latest, macos-latest
- This prevents hanging jobs waiting for unavailable runners
- Still provides comprehensive cross-platform testing coverage

---------

Co-authored-by: Ryan Malloy <[email protected]>

rsp2k (Author) commented Jun 12, 2025

@coleam00 Thank you for your work! I started looking at the code because I wanted to run it with uvx from PyPI, then saw there were several PRs and no tests. Since the PRs are idle, I'm going to try to pull them into this new testable environment so they can be reviewed further once they pass basic tests.

coleam00 (Owner) commented

Thank you so much for this PR, @rsp2k! This MCP started as something more experimental that I later want to turn into Archon V2 (a lot of work is happening for that behind the scenes), hence quite a few pending PRs and no tests.

The plan is to move this into Archon at some point and there I'll have a comprehensive test suite and will certainly be incorporating what you have done here, so thank you!
