anything-to-ai Development Guidelines

Auto-generated from all feature plans. Last updated: 2025-09-28

Active Technologies

  • Python 3.13 (per project requirements) + lightning-whisper-mlx (audio transcription with MLX optimization), existing audio_processor modules (models, processor, cli, formatters) (014-timestamp-support-for)

  • N/A (in-memory processing, no persistent storage) (014-timestamp-support-for)

  • Python 3.11+ (project requires >=3.11) + pdfplumber (PDF), mlx-vlm (VLM), Pillow (images), lightning-whisper-mlx (audio), pydantic (text_summarizer validation), httpx (LLM client) (015-extend-all-result)

  • N/A (in-memory processing only, no persistent storage) (015-extend-all-result)

  • Python 3.11+ (per clarifications) + setuptools, wheel, build (packaging tools); existing module dependencies (pdfplumber, mlx-vlm, lightning-whisper-mlx, etc.) (012-prepare-this-repository)

  • File system only (package distribution files) (012-prepare-this-repository)

  • Python 3.13 + alive-progress (CLI rendering), asyncio (async support), dataclasses (models) (010-unify-progress-bars)

  • N/A (in-memory state only) (010-unify-progress-bars)

  • Python 3.13 + pdfplumber (PDF), mlx-vlm (VLM), Pillow (images), lightning-whisper-mlx (audio), alive-progress (CLI), httpx (LLM client) (011-mkdn-markdown-output)

  • File system only (no persistent storage) (011-mkdn-markdown-output)

  • Python 3.8+ (for compatibility with standard library features) + PyPDF2 or pdfplumber for PDF parsing (minimal external dependencies per constitution) (001-a-simple-python)

  • Python 3.13 (per project requirements) + mlx-vlm (VLM processing), PIL/Pillow (image handling) (002-implement-a-module)

  • File system (image files), no persistent storage required (002-implement-a-module)

  • Python 3.13 (per project requirements) + mlx-vlm (VLM processing), PIL/Pillow (image handling), existing image_processor module (003-real-vlm-insegration)

  • Python 3.13 (per project requirements) + pdfplumber (PDF parsing), mlx-vlm (VLM processing), PIL/Pillow (image handling), existing pdf_extractor and image_processor modules (005-augment-pdf-extraction)

  • File system (PDF and image files), no persistent storage required (005-augment-pdf-extraction)

  • Python 3.13 (per project requirements) + lightning-whisper-mlx (MLX-optimized Whisper for audio transcription) (006-audio-to-text)

  • File system (audio files: mp3, wav, m4a), no persistent storage required (006-audio-to-text)

  • Python 3.13 + pre-commit (hook framework), ruff (linting/formatting), pytest (testing), pytest-cov (coverage measurement) (007-add-linting-and)

  • N/A (configuration files only) (007-add-linting-and)

  • Python 3.13 + Standard library (urllib, json), OpenAI-compatible client libraries (to be researched - potentially openai SDK or httpx for direct API calls) (008-utility-module-to)

  • In-memory caching for model listings, no persistent storage (008-utility-module-to)

  • Python 3.13 + llm_client module (OpenAI-compatible client), standard library (json, argparse, sys) (009-summarizer-module-this)

  • N/A (in-memory processing) (009-summarizer-module-this)

Project Structure

src/
tests/

Commands

uv run pytest
uv run ruff check .
uv run python check_file_lengths.py
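The contents of check_file_lengths.py are not shown in this guide. As a rough illustration only, a file-length check of this kind could be sketched as below. The function name `find_long_files`, the restriction to `.py` files, and the `max_lines` parameter are all assumptions, and no particular limit is implied.

```python
from pathlib import Path


def find_long_files(root: Path, max_lines: int) -> list[tuple[Path, int]]:
    """Return (path, line_count) for each .py file under root longer than max_lines.

    Hypothetical helper: the real check_file_lengths.py may use different
    rules, directories, or thresholds.
    """
    offenders = []
    for path in sorted(root.rglob("*.py")):
        # Count lines lazily so large files are not loaded into memory at once.
        with path.open(encoding="utf-8") as handle:
            count = sum(1 for _ in handle)
        if count > max_lines:
            offenders.append((path, count))
    return offenders
```

A caller might run it over `Path("src")` with whatever limit the project enforces and exit non-zero when the returned list is not empty.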

Pre-commit hooks

uv run pre-commit install
uv run pre-commit run --all-files
git commit --no-verify

PDF Extraction CLI

uv run python -m pdf_extractor extract <pdf_file> [--format plain|json|csv|markdown] [--stream] [--progress]

Image Processing CLI

uv run python -m image_processor <image_files> [--style brief|detailed|technical] [--format json|csv|plain|markdown]

Audio Transcription CLI

uv run python -m audio_processor <audio_files> [--format plain|json|markdown] [--model tiny|small|base|medium|large|large-v3] [--quantization none|4bit|8bit] [--language LANG] [--output FILE] [--verbose]
uv run python -m audio_processor <audio_files> --timestamps --format markdown
uv run python -m audio_processor <audio_files> --timestamps --format json

Note: Default quantization is 'none' due to MLX compatibility. Use --quantization 4bit/8bit only if your MLX version supports it.
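For orientation, the flags listed above map naturally onto an argparse definition. The following is a minimal sketch, not the actual audio_processor parser: `build_parser` is a hypothetical name, and every default except `--quantization none` (which the note above states) is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented audio_processor flags."""
    parser = argparse.ArgumentParser(prog="audio_processor")
    parser.add_argument("audio_files", nargs="+", help="one or more audio files")
    # Default output format is an assumption; the guide does not state it.
    parser.add_argument("--format", choices=["plain", "json", "markdown"], default="plain")
    parser.add_argument(
        "--model",
        choices=["tiny", "small", "base", "medium", "large", "large-v3"],
        default=None,  # assumption: leave model selection to the library
    )
    # Default of "none" comes from the note on MLX compatibility.
    parser.add_argument("--quantization", choices=["none", "4bit", "8bit"], default="none")
    parser.add_argument("--language", metavar="LANG", default=None)
    parser.add_argument("--output", metavar="FILE", default=None)
    parser.add_argument("--timestamps", action="store_true")
    parser.add_argument("--verbose", action="store_true")
    return parser
```

Parsing `["podcast.mp3", "--timestamps", "--format", "markdown"]` with this sketch yields `timestamps=True` and leaves quantization at `none`.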

Text Summarization CLI

uv run python -m text_summarizer <text_file> [--format json|plain|markdown] [--output FILE] [--no-metadata] [--verbose] [--model MODEL] [--provider PROVIDER]
uv run python -m text_summarizer --stdin [--format json|plain|markdown] [--output FILE] [--no-metadata] [--model MODEL] [--provider PROVIDER]

Use different models

uv run python -m text_summarizer document.txt --model llama2
uv run python -m text_summarizer document.txt --model mistral:latest

Use different providers (ollama, lmstudio, mlx)

uv run python -m text_summarizer document.txt --provider lmstudio --model mistral
uv run python -m text_summarizer document.txt --provider mlx --model mlx-community/llama-3
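When these invocations are generated from a script, it can help to validate the provider name before launching anything. The helper below is a hedged sketch: `summarizer_command` and `PROVIDERS` are hypothetical names, and the provider list is taken only from the heading above, so the real module may accept others.

```python
from typing import Optional

# Provider names as listed in this guide; the actual set may differ.
PROVIDERS = ("ollama", "lmstudio", "mlx")


def summarizer_command(
    text_file: str,
    model: Optional[str] = None,
    provider: Optional[str] = None,
    fmt: Optional[str] = None,
) -> list[str]:
    """Build an argv list for the documented text_summarizer CLI."""
    if provider is not None and provider not in PROVIDERS:
        raise ValueError(f"unknown provider {provider!r}; expected one of {PROVIDERS}")
    cmd = ["uv", "run", "python", "-m", "text_summarizer", text_file]
    if provider is not None:
        cmd += ["--provider", provider]
    if model is not None:
        cmd += ["--model", model]
    if fmt is not None:
        cmd += ["--format", fmt]
    return cmd
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting issues with model names such as `mistral:latest`.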

Pipeline examples

uv run python -m pdf_extractor extract document.pdf --format plain | python -m text_summarizer --stdin
uv run python -m audio_processor audio.mp3 --format plain | python -m text_summarizer --stdin --model mistral --provider ollama
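The same pipelines can be driven from Python without a shell by connecting one process's stdout to the next one's stdin. This is a minimal sketch using only the standard library; `run_pipeline`, `EXTRACT`, and `SUMMARIZE` are hypothetical names introduced here, and the command lists simply repeat the first example above.

```python
import subprocess


def run_pipeline(producer: list[str], consumer: list[str]) -> str:
    """Run the equivalent of `producer | consumer` and return the consumer's stdout."""
    first = subprocess.Popen(producer, stdout=subprocess.PIPE)
    second = subprocess.Popen(
        consumer, stdin=first.stdout, stdout=subprocess.PIPE, text=True
    )
    # Close our copy of the pipe so the producer sees a broken pipe
    # if the consumer exits early.
    first.stdout.close()
    output, _ = second.communicate()
    first.wait()
    if first.returncode != 0 or second.returncode != 0:
        raise RuntimeError(
            f"pipeline failed: producer={first.returncode}, consumer={second.returncode}"
        )
    return output


# Command lists mirroring the first pipeline example (not executed here).
EXTRACT = ["uv", "run", "python", "-m", "pdf_extractor", "extract",
           "document.pdf", "--format", "plain"]
SUMMARIZE = ["uv", "run", "python", "-m", "text_summarizer", "--stdin"]
```

Calling `run_pipeline(EXTRACT, SUMMARIZE)` would then return the summary text, provided both modules are installed in the uv environment.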

Markdown format examples

uv run python -m pdf_extractor extract document.pdf --format markdown > output.md
uv run python -m image_processor image.jpg --format markdown
uv run python -m audio_processor podcast.mp3 --format markdown
uv run python -m text_summarizer article.txt --format markdown

Code Style

Python 3.11+ (project requires >=3.11; this supersedes the 3.8+ target recorded for 001-a-simple-python): Follow standard conventions

Recent Changes

  • 015-extend-all-result: Added Python 3.11+ (project requires >=3.11) + pdfplumber (PDF), mlx-vlm (VLM), Pillow (images), lightning-whisper-mlx (audio), pydantic (text_summarizer validation), httpx (LLM client)

  • 014-timestamp-support-for: Added Python 3.13 (per project requirements) + lightning-whisper-mlx (audio transcription with MLX optimization), existing audio_processor modules (models, processor, cli, formatters)

  • 012-prepare-this-repository: Added Python 3.11+ (per clarifications) + setuptools, wheel, build (packaging tools); existing module dependencies (pdfplumber, mlx-vlm, lightning-whisper-mlx, etc.)