Releases: flamehaven01/Dir2md
[1.2.2] - 2026-01-06
Security & Compatibility Patch
This release hardens symlink handling, preserves secret masking for large files, and removes pathspec deprecation warnings.
Fixed
- CRITICAL: Prevented symlinked directory traversal from escaping root when `--follow-symlinks` is enabled
- CRITICAL: Large inputs are now masked in chunks instead of bypassing masking entirely
- COMPAT: PathSpec matching now uses the `gitignore` engine to avoid deprecation warnings
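The two critical fixes could look roughly like the sketch below. It assumes a `pathlib`-based walker and a regex-based masker; `is_within_root`, `mask_in_chunks`, the `[MASKED]` placeholder, and the chunk size are hypothetical names and values, not Dir2md's actual API.

```python
import re
from pathlib import Path


def is_within_root(candidate: Path, root: Path) -> bool:
    """Return True if `candidate`, with symlinks resolved, stays under `root`."""
    resolved_root = root.resolve()
    resolved = candidate.resolve()
    return resolved == resolved_root or resolved_root in resolved.parents


def mask_in_chunks(text: str, pattern: re.Pattern, chunk_chars: int = 64_000) -> str:
    """Mask `text` piece by piece so large inputs are never left unmasked.

    Chunks are cut on line boundaries, so a secret that fits on one line
    is never split across two chunks.
    """
    out, buf, size = [], [], 0
    for line in text.splitlines(keepends=True):
        buf.append(line)
        size += len(line)
        if size >= chunk_chars:
            out.append(pattern.sub("[MASKED]", "".join(buf)))
            buf, size = [], 0
    if buf:  # flush the final partial chunk
        out.append(pattern.sub("[MASKED]", "".join(buf)))
    return "".join(out)
```

A walker would call `is_within_root` on each symlinked directory before descending into it. Cutting on line boundaries is a simplification: a pattern that spans several lines could still straddle two chunks.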
Changed
- Updated internal version markers to `1.2.2`
[1.2.1] - 2025-12-18
Security & Reliability Patch
This release addresses critical security and reliability issues identified in the SIDRCE Spicy Audit.
Fixed
CRITICAL: Markdown Fence Injection (Issue #5)
- Problem: File content containing triple backticks (```) could break markdown output and potentially inject misleading content
- Solution: Implemented dynamic fence escaping that counts consecutive backticks in content and uses N+1 backticks for outer fence
- Impact: Prevents markdown structure corruption and potential injection attacks
- Files: `src/dir2md/markdown.py`
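A minimal sketch of the N+1 fence rule described above. The function names `fence_for` and `render_block` are illustrative and may not match what `markdown.py` actually contains.

```python
import re


def fence_for(content: str, minimum: int = 3) -> str:
    """Return a backtick fence one longer than the longest backtick run in `content`."""
    longest = max((len(run) for run in re.findall(r"`+", content)), default=0)
    return "`" * max(minimum, longest + 1)


def render_block(content: str, lang: str = "") -> str:
    """Wrap `content` in a fence that the content itself cannot close early."""
    fence = fence_for(content)
    return f"{fence}{lang}\n{content}\n{fence}"
```

Content with no backticks still gets the usual three-backtick fence, while content that itself contains a three-backtick run is wrapped in four.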
HIGH: Subprocess RCE Vector (Issue #1)
- Problem: The `vulture` subprocess call created an RCE vector if a malicious binary was on PATH, and it added an unreliable dependency on an external tool
- Solution: Removed the `subprocess.run()` call entirely; documented a future approach using AST or a library API
- Impact: Eliminates security risk and removes unreliable external dependency
- Files: `src/dir2md/spicy.py`
MEDIUM: Silent Exception Failures (Issue #2)
- Problem: `try-except-pass` blocks silently ignored .env loading failures, making debugging impossible
- Solution: Replaced catch-all `except Exception: pass` with specific exception handling (OSError, UnicodeDecodeError) and logging.warning()
- Impact: Users now see warnings when configuration loading fails
- Files: `src/dir2md/cli.py`
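The pattern behind this fix might look like the following sketch. `load_env_file` and its simple KEY=VALUE parsing are assumptions for illustration; only the exception types and the use of `logging.warning()` come from the note above.

```python
import logging
from pathlib import Path


def load_env_file(path: Path) -> dict[str, str]:
    """Parse KEY=VALUE lines; warn instead of failing silently on read errors."""
    try:
        text = path.read_text(encoding="utf-8")
    except (OSError, UnicodeDecodeError) as exc:
        # Only the expected failure modes are caught, and each one is reported.
        logging.warning("Could not load %s: %s", path, exc)
        return {}
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("\"'")
    return values
```

Any other exception type now propagates, so a genuine bug in the loader is no longer hidden.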
LOW: Aggressive Glob Expansion (Issue #3)
- Problem: Auto-expansion of patterns (e.g., `foo/` → `[foo/, foo/**, **/foo, **/foo/**]`) violated user intent and caused performance issues in large repos
- Solution: Removed automatic expansion; now respects gitignore standard (`foo/` means `foo/`, `**/foo` for recursive)
- Impact: Better performance, predictable behavior, respects principle of least surprise
- Files: `src/dir2md/walker.py`
LOW: Hardcoded DEFAULT_EXCLUDES (Issue #4)
- Problem: 20+ hardcoded exclude patterns with personal preferences (`.pytest_cache_local`) and no easy way to customize
- Solution: Moved to external `defaults.json` file, removed personal preferences, added graceful fallback
- Impact: Easier maintenance, user can modify excludes by editing JSON file, cleaner codebase
- Files: `src/dir2md/cli.py`, `src/dir2md/defaults.json` (new)
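A sketch of what "graceful fallback" could mean here. It assumes `defaults.json` is a flat JSON array of pattern strings, which the release note does not confirm, and `FALLBACK_EXCLUDES` is a made-up built-in list.

```python
import json
import logging
from pathlib import Path

# Hypothetical built-in list, used only when the JSON file is unusable.
FALLBACK_EXCLUDES = [".git/", "__pycache__/", "node_modules/"]


def load_default_excludes(path: Path) -> list[str]:
    """Load exclude patterns from a JSON array, falling back to built-ins."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, UnicodeDecodeError, json.JSONDecodeError) as exc:
        logging.warning("Could not load %s (%s); using built-in excludes", path, exc)
        return list(FALLBACK_EXCLUDES)
    if not isinstance(data, list) or not all(isinstance(p, str) for p in data):
        logging.warning("%s is not a JSON array of strings; using built-in excludes", path)
        return list(FALLBACK_EXCLUDES)
    return data
```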
Added
- Configuration File: `src/dir2md/defaults.json` - External default exclusion patterns
- Package Data: Configured `pyproject.toml` to include `defaults.json` in distribution
- Priority System: Three-tier exclusion pattern priority (user > project > system)
  - System defaults: `defaults.json` or custom file via `--defaults-file`
  - Project config: `pyproject.toml` `[tool.dir2md]` `excludes = [...]`
  - User CLI: `--exclude-glob` (highest priority)
- CLI Argument: `--defaults-file` - Specify custom defaults.json path
- pyproject.toml Support: `excludes` key under `[tool.dir2md]` - Project-level default excludes
Removed
- External tool dependency: `vulture` subprocess execution (security risk)
- Unused imports: `subprocess`, `shutil` from `spicy.py`
- Personal preferences: `.pytest_cache_local` from default excludes
- Aggressive glob expansion: Non-glob patterns no longer auto-expanded
Changed
- Glob pattern handling now respects user intent per gitignore standard
- Default excludes now loaded from JSON file with graceful fallback
- Exclusion pattern system redesigned with three-tier priority (backwards compatible)
Notes
- All fixes maintain backward compatibility
- No breaking changes to CLI interface
- Users who relied on auto-glob expansion should explicitly use `**/pattern` for recursive matching
- Files containing ``` now render correctly (markdown fence escaping)
Usage Examples
Priority System:
# System defaults only (defaults.json)
dir2md .
# Custom system defaults
dir2md . --defaults-file my-defaults.json
# Project + system defaults (pyproject.toml overrides system)
# In pyproject.toml:
# [tool.dir2md]
# excludes = ["*.log", "temp/"]
# User overrides everything (highest priority)
dir2md . --exclude-glob "secret-data/"
pyproject.toml Configuration:
[tool.dir2md]
excludes = [
"*.log",
"temp/",
"cache/",
"*.tmp"
]
# These patterns will be added to system defaults
# User CLI args (--exclude-glob) take precedence over these
[1.2.0] - 2025-12-15
Philosophy: Intelligence Without Complexity
This release removes configuration overhead while adding sophisticated optimizations. All features auto-activate based on preset choice - zero flags, maximum intelligence.
Added
Phase 1: Gravitas Compression (SAIQL-Inspired)
- Symbolic Compression: Unicode symbol substitution reduces tokens by 30-50%
  - Basic level: Common metadata patterns (`File:` → `§`, `Lines:` → `⊞`)
  - Medium level: + File type symbols (`.py` → `🐍`, `.js` → `📜`)
  - Full level: + Code patterns (`function` → `ƒ`, `class` → `©`)
- Auto-Activation: `pro` preset = basic, `ai` preset = medium
- Stats Reporting: Compression statistics embedded as HTML comments in output
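The level scheme above can be illustrated with a small sketch. It uses only the six substitutions listed in this note; the real tables are presumably larger, and `compress`, `LEVELS`, and the stats comment format are hypothetical.

```python
import re

# Substitution tables per level; each level includes the ones before it.
BASIC = [(r"File:", "§"), (r"Lines:", "⊞")]
MEDIUM = BASIC + [(r"\.py\b", "🐍"), (r"\.js\b", "📜")]
FULL = MEDIUM + [(r"\bfunction\b", "ƒ"), (r"\bclass\b", "©")]
LEVELS = {"none": [], "basic": BASIC, "medium": MEDIUM, "full": FULL}


def compress(text: str, level: str = "basic") -> tuple[str, str]:
    """Apply symbol substitutions and return (compressed text, stats comment)."""
    out = text
    for pattern, symbol in LEVELS[level]:
        out = re.sub(pattern, symbol, out)
    saved = len(text) - len(out)
    stats = f"<!-- gravitas level={level} chars_saved={saved} -->"
    return out, stats
```

Characters saved is only a rough proxy here. Actual token savings depend on how a given tokenizer encodes the replacement symbols.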
Phase 2: Smart Query Processing
- Typo Auto-Correction: Levenshtein distance-based correction
- 80+ programming term dictionary (auth, payment, database, API, etc.)
- Automatic suggestions: "atuh" → "auth", "databse" → "database"
- Zero-dependency implementation
- Query Expansion: Pattern-based synonym expansion
- 50+ domain-specific patterns
- Accuracy improvement: 60% → 90%
- Example: "auth" → "auth OR login OR signin OR session OR token"
- Query Suggestions: File-based keyword extraction
- Analyzes matched files for related terms
- Directory-grouped suggestions
- Query history tracking
- Auto-Activation: Enabled when `--query` is provided
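A dependency-free sketch of Levenshtein-based correction followed by synonym expansion. `TERMS` and `EXPANSIONS` are tiny stand-ins for the 80+ term dictionary and 50+ patterns mentioned above, and the `max_distance=2` cutoff is an assumption.

```python
TERMS = ["auth", "payment", "database", "api"]  # stand-in dictionary
EXPANSIONS = {"auth": ["login", "signin", "session", "token"]}  # stand-in patterns


def levenshtein(a: str, b: str) -> int:
    """Classic edit distance (insert, delete, substitute), using two rows."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost))
        previous = current
    return previous[-1]


def correct(term: str, max_distance: int = 2) -> str:
    """Return the closest known term, or `term` unchanged if nothing is close."""
    if term in TERMS:
        return term
    best = min(TERMS, key=lambda known: levenshtein(term, known))
    return best if levenshtein(term, best) <= max_distance else term


def expand(term: str) -> str:
    """Join a term with its synonyms into an OR query."""
    return " OR ".join([term] + EXPANSIONS.get(term, []))
```

A transposition such as "atuh" counts as two edits under plain Levenshtein distance, which is why the cutoff in this sketch is 2 rather than 1.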
Phase 3: AST Semantic Sampling
- Python Structure Extraction: Intelligent code sampling using AST analysis
- Priority-based extraction: Classes > Functions > Implementation
- Preserves: Class definitions, function signatures, docstrings
- Reduces: Implementation details, private methods
- Additional 30-40% token reduction
- NodePriority System:
- CRITICAL: Public classes, main/entry functions
- HIGH: Public functions, class methods
- MEDIUM: Private functions
- LOW: Implementation details
- Auto-Activation: Enabled for .py files in `pro`/`ai` presets
- Fallback: Gracefully handles unparseable files
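The idea of keeping structure and dropping implementation can be sketched with the standard `ast` module. This is a simplified illustration: `sample_python` is a hypothetical name, private functions are dropped outright instead of being ranked by a priority system, and `ast.unparse` needs Python 3.9 or newer.

```python
import ast


def sample_python(source: str) -> str:
    """Keep class headers, public function signatures and docstrings; drop bodies."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return source  # fallback: leave unparseable files untouched

    lines: list[str] = []

    def emit_docstring(node, indent: str) -> None:
        doc = ast.get_docstring(node)
        if doc:
            # Keep only the first docstring line as a summary.
            lines.append(f'{indent}"""{doc.splitlines()[0]}"""')

    def visit(body: list, indent: str) -> None:
        for node in body:
            if isinstance(node, ast.ClassDef):
                lines.append(f"{indent}class {node.name}:")
                emit_docstring(node, indent + "    ")
                visit(node.body, indent + "    ")
            elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                if node.name.startswith("_"):
                    continue  # simplification: private functions are skipped
                keyword = "async def" if isinstance(node, ast.AsyncFunctionDef) else "def"
                lines.append(f"{indent}{keyword} {node.name}({ast.unparse(node.args)}): ...")
                emit_docstring(node, indent + "    ")

    visit(tree.body, "")
    return "\n".join(lines)
```

The output is a structural summary for a reader or an LLM, not valid Python meant to be executed.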
Changed
Radical Simplification
- REMOVED: `--gravitas` flag (now preset-based)
- REMOVED: `--expand` flag (now auto-enabled with queries)
- Preset Behaviors:
  - `raw`: No optimizations (pure original)
  - `fast`: No optimizations (minimal metadata)
  - `pro`: gravitas=basic + query expansion + AST sampling
  - `ai`: gravitas=medium + query expansion + AST sampling
Architecture
- New Modules:
  - `src/dir2md/query/corrector.py` (180 lines) - Typo correction engine
  - `src/dir2md/query/suggester.py` (180 lines) - Query suggestion engine
  - `src/dir2md/samplers/semantic.py` (320 lines) - AST semantic sampler
- Pipeline Integration: Semantic sampling integrated into `selector.py` file processing
- Zero Dependencies: All features implemented without external LLM or NLP libraries
Performance
Combined Optimizations
- Token Reduction: Up to 60-70% total savings
- Gravitas: 30-50% (preset-dependent)
- AST Sampling: 30-40% (for Python files)
- Cumulative effect on Python codebases
- Query Accuracy: 60% → 90% (pattern expansion)
- User Experience: 2 flags instead of 7 for common use cases
Documentation
- README: Completely rewritten to emphasize zero-configuration intelligence
- Examples: Simplified from 7-flag commands to 2-flag commands
- Migration Guide: Clear before/after examples for v1.1.3 → v1.2.0
Testing
- Phase 1: 4/4 preset configurations validated
- Phase 2: Typo correction tested with 7+ common mistakes
- Phase 3: AST sampling validated on real Python modules (31% reduction achieved)
- Integration: All phases tested together in production scenarios
Breaking Changes
None - All changes are additive or simplify existing behavior. Users on raw preset see no changes.
[1.1.2] - 2025-12-09
Security
- Masking now pre-compiles basic/advanced regexes and skips processing when input exceeds a safe threshold to reduce ReDoS risk.
- Large individual files are skipped before read when they exceed 1MB, preventing OOM/hangs while still noting the skip.
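A sketch of the size check described above. The idea is to consult the file's on-disk size before opening it, so an oversized file is never loaded into memory. `read_or_skip`, the result dictionary, and the exact byte value used for "1MB" are assumptions.

```python
import os

MAX_FILE_BYTES = 1_000_000  # "1MB" per the note; the exact constant is an assumption


def read_or_skip(path: str, limit: int = MAX_FILE_BYTES) -> dict:
    """Read a file unless its on-disk size exceeds `limit`."""
    size = os.path.getsize(path)
    if size > limit:
        # Record the skip so it stays visible in the output.
        return {
            "path": path,
            "skipped": True,
            "summary": f"[skipped: {size} bytes > {limit}]",
            "content": None,
        }
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return {"path": path, "skipped": False, "summary": None, "content": handle.read()}
```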
Performance
- Token estimation is cached with LRU (maxsize 2048) and keeps a minimum of one token for empty strings.
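In Python this kind of caching is typically done with `functools.lru_cache`. In the sketch below, the four-characters-per-token heuristic is an assumption for illustration; only the cache size and the one-token minimum come from the note.

```python
from functools import lru_cache


@lru_cache(maxsize=2048)
def estimate_tokens(text: str) -> int:
    """Rough estimate: about 4 characters per token, never less than 1."""
    return max(1, len(text) // 4)
```

The cache is keyed on the full string, so repeated estimates for identical content are served without recomputation.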
Behavior/UX
- Skipped oversized files now record placeholder hash/summary so the skip is visible in outputs and manifests.
- Custom masking patterns are compiled before use; invalid patterns emit warnings and are ignored.
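Compiling user-supplied patterns up front, with a warning for each invalid one, might look like the following sketch. `compile_patterns` is a hypothetical helper name.

```python
import re
import warnings


def compile_patterns(patterns: list[str]) -> list[re.Pattern]:
    """Compile each pattern once; warn about and skip the ones that are invalid."""
    compiled = []
    for raw in patterns:
        try:
            compiled.append(re.compile(raw))
        except re.error as exc:
            warnings.warn(f"Ignoring invalid mask pattern {raw!r}: {exc}")
    return compiled
```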
Tests
- Pytest suite: 22 passed, 2 skipped.
[1.1.1] - 2025-12-04
Removed
- Pro/license gating entirely: deleted `license.py`, removed license checks from masking/parallel/CLI/tests; `.env.example` no longer includes license keys.
- Blueprint workflow retired; CI now runs ruff + pytest only.
Fixed
- HF demo import path includes `src` so `dir2md` imports resolve in Spaces.
- Lint/test cleanup after refactor (ruff + pytest green).
Documentation
- Updated README/demo metadata (Spicy branding, HF front matter); pending: scrub remaining Pro/license mentions in docs.
[1.1.0] - 2025-12-02
AI-Friendly Release · Spicy Risk Engine · Modular Architecture
Added
AI-Friendly Query & Output
- `--query` filters/sorts by match score and injects contextual snippets.
- `--output-format md|json|jsonl` supports human readers and CLI/LLM pipelines.
New Presets & Modes
- `--ai-mode`: reference-mode defaults, capped budgets, stats + manifest enabled.
- `--fast`: tree + manifest only (skips file content reads).
Spicy Risk Reporting
- `--spicy`: severity counts, scores, and findings across all output formats.
- `--spicy-strict`: exits with non-zero status when high/critical findings are detected.
CLI Enhancements
- Unified `[INFO]` status messages.
- `--progress` verbosity selector.
- Added execution plan summary line.
Safety & Performance Hardening
- Symlink escape guard.
- Streaming file reads with full hashing.
- Manifest reuse for repeated runs.
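Streaming reads with full hashing usually mean feeding the file to the hash in fixed-size chunks, so the digest covers every byte without holding the whole file in memory. The sketch below assumes SHA-256 and a 64 KiB chunk size; neither is stated in these notes.

```python
import hashlib


def sha256_streaming(path: str, chunk_size: int = 65536) -> str:
    """Hash a file of any size using constant memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```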
Packaging & Distribution
- Default output: Markdown + JSONL (human + LLM workflows).
- Added GitHub Actions Release workflow (PyPI + TestPyPI).
- Updated Dockerfile to install the package automatically.
Architecture Decoupling
Introduced new modules to avoid a single god-object:
- `walker.py`
- `selector.py`
- `renderer.py`
- `orchestrator.py`
Test Suite Expansion
- Added coverage for masking, search, token logic, spicy mode, JSONL output, CLI defaults.
- Added symlink and streaming hash tests.
- 22 tests passed, 2 skipped.
Changed
- Default preset remains `pro`.
- Removed iceberg references from documentation.
Fixed
- `--fast` now properly skips content reads.
- Plan summary reflects effective post-preset settings.
- Fixed `NameError` in `--include-glob`/`--exclude-glob` pathspec compilation.
Release v1.0.4
Release Notes — v1.0.4
2025-10-09
🚀 New Features
Enhanced Security Masking
Expanded default masking coverage to include:
- GitHub Personal Access Tokens: `ghp_*`, `gho_*`, `ghu_*`, `ghs_*`, `ghr_*`
- Generic API Keys: `api_key=`, `apikey=`, `api-key=`
- Database connection strings (PostgreSQL, MySQL, MongoDB)
- JSON Web Tokens (JWTs): Base64 tokens beginning with `eyJ`
- OAuth client secrets: `client_secret=`, `oauth_secret=`
Automatic Environment Configuration
- Automatically loads the nearest `.env` file at CLI startup
- Enables zero-configuration workflows for teams using `.env` files
- Supports seamless Pro license activation through environment variables
User-Defined Masking Patterns
Three flexible pattern configuration methods:
- Inline: `--mask-pattern "regex"`
- Pattern files: JSON arrays or newline-delimited text (`file://` URIs supported)
- Project configuration via `pyproject.toml`: `[tool.dir2md.masking] patterns = ["regex1", "regex2"]`
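Accepting both pattern-file formats could be handled along these lines. `parse_pattern_file` is a hypothetical helper, and treating anything that is not a JSON array of strings as newline-delimited text is an assumption about how ambiguous input should be resolved.

```python
import json


def parse_pattern_file(text: str) -> list[str]:
    """Return patterns from a JSON array of strings, else one pattern per line."""
    stripped = text.strip()
    if stripped.startswith("["):
        try:
            data = json.loads(stripped)
        except json.JSONDecodeError:
            data = None  # e.g. a regex such as "[A-Z]+" that merely starts with "["
        if isinstance(data, list) and all(isinstance(item, str) for item in data):
            return data
    return [line.strip() for line in text.splitlines() if line.strip()]
```

The JSON attempt comes first because a regex character class can also start with `[`. Input that fails to parse as a string array is treated as plain lines.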
⚙️ Improvements
Expanded Default Exclusion Rules
Automatically excludes common secret-bearing files:
- `.env`, `.env.local`, `.env.*.local`
- `*.pem`, `*.key`, `*.p12`, `*.pfx`, `*.crt`, `*.cer`, `*.der`
Windows Compatibility Enhancements
- Documentation updated to ASCII-only for Windows cp949 compatibility
- Prevents illegal multibyte sequence errors
- Replaced emojis with ASCII-safe symbols (`[#]`, `[!]`, `[+]`, `[*]`, `[T]`)
Documentation Updates
- Clarified relationship with IsaacBreen’s original `dir2md`
- Added Acknowledgments section
- Updated installation instructions for GitHub-only distribution
- Removed outdated PyPI “coming soon” references
🛠️ Fixes
[CRITICAL] Windows file:// URI Path Parsing
- Fixed incorrect parsing of `file:///C:/...` resulting in invalid `\C:...` paths
- Correctly strips leading slash for Windows absolute paths
- Custom mask-pattern files now load properly on Windows
[SECURITY] GitHub PAT Masking Tier Correction
- GitHub PAT regex was incorrectly available in basic masking
- Moved to advanced (Pro) masking rules
- Restores proper separation between OSS and Pro feature tiers
Advanced Masking Notice Deduplication
- Advanced-mode upgrade notice now prints only once per CLI session
🧪 Testing & Verification
Automated Tests
- 12 tests passing (1 skipped on systems without symlink support)
- Full pytest suite coverage for core functionality
- Regression tests for Windows `file://` path handling
Manual Verification
- Custom masking patterns load correctly from JSON and text files
- Windows `file://` URIs resolve properly
- Basic vs. advanced masking separation verified
- `.env` auto-discovery validated across directory structures
- ASCII-safe documentation confirmed to render correctly on Windows cp949