
Commit d65f962

jdmonaco and claude committed
chore: Release version 0.2.0
Add comprehensive release notes documenting:
- Core architecture and unique innovations
- Complete feature list
- API support and extensibility
- Target users and use cases
- Pain points solved
- Competitive differentiators

Version bump from 0.1.0 to 0.2.0 to reflect major feature additions:
- PDF document support (bc12145)
- Microsoft Office file support (5023475)
- Comprehensive documentation (45cac86, 8af897e)

This release marks architectural maturity with comprehensive document processing, workflow chaining, and cost optimization features.

🤖 Generated with Claude Code

Co-Authored-By: Claude <[email protected]>
1 parent 454e8ac commit d65f962

3 files changed: +343 -2 lines changed


README.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
# A CLI Tool for AI Workflows in the Terminal
22

3-
**Version:** 0.1.0 (pre-release)
3+
**Version:** 0.2.0 (pre-release)
44

55
A flexible, configurable CLI tool for building and managing AI workflows for research and project development using the Anthropic API.
66

RELEASE-NOTES.md

Lines changed: 341 additions & 0 deletions
@@ -0,0 +1,341 @@
1+
# Release Notes
2+
3+
## Version 0.2.0 (2025-01-20)
4+
5+
This release marks a significant milestone with comprehensive document processing capabilities and architectural maturity.
6+
7+
### Major Features
8+
9+
- **PDF Document Support:** Native processing via Claude API with joint text and visual analysis (32MB limit)
10+
- **Microsoft Office Support:** Automatic conversion of .docx and .pptx files with intelligent caching
11+
- **Image Processing:** Vision API integration with automatic resizing and validation
12+
- **Enhanced Documentation:** Complete user guides for document types and LibreOffice setup
13+
14+
### Core Architecture & Unique Innovations
15+
16+
#### 1. Git-like Project Discovery with Configuration Cascade
17+
18+
The tool implements a sophisticated multi-tier configuration cascade (global → ancestors → project → workflow → CLI) with **pass-through inheritance**. Empty values automatically inherit from parent tiers, while explicit values override and become decoupled. This enables centralized defaults that cascade down but can be overridden at any level. Nested projects automatically inherit ALL ancestor configurations in the hierarchy.
19+
20+
**Innovation:** Unlike most tools with simple config hierarchies, this provides transparent inheritance where changing a global default automatically affects all empty configs downstream.
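
The pass-through rule described above can be illustrated with a minimal sketch. It assumes each tier contributes one value per setting, ordered from most general (global) to most specific (CLI); the function name `resolve_setting` and the example values are hypothetical and not taken from the tool's source.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of pass-through inheritance (not the tool's own code).
# Arguments are the values of one setting in tier order:
#   global, ancestor(s), project, workflow, CLI
# An empty value passes through (inherits); a non-empty value overrides.
resolve_setting() {
    local resolved="" value
    for value in "$@"; do
        if [[ -n "$value" ]]; then
            resolved="$value"
        fi
    done
    printf '%s\n' "$resolved"
}

# The project tier overrides the global default; the empty workflow and
# CLI tiers inherit the project value.
resolve_setting "global-default" "" "project-value" "" ""
```

Because empty tiers never store a copy of the inherited value, changing the first argument changes the result for every configuration that left the setting empty.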
21+
22+
#### 2. Semantic Content Separation
23+
24+
Distinguishes between **INPUT documents** (primary materials to analyze/transform) and **CONTEXT materials** (supporting information) with three aggregation methods: glob patterns, explicit file lists, and workflow dependencies.
25+
26+
**Innovation:** This semantic separation enables precise control over what the AI analyzes vs what provides background, with automatic ordering optimization.
27+
28+
#### 3. Workflow Chaining and Dependencies
29+
30+
Workflows can declare dependencies on other workflow outputs via `DEPENDS_ON`. Outputs are managed via hardlinks for efficient storage and atomic updates. Cross-format dependencies work seamlessly (JSON → Markdown → HTML pipelines).
31+
32+
**Innovation:** Creates a DAG of processing stages where outputs automatically feed as context into dependent workflows, enabling complex multi-stage pipelines.
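
One way to derive a run order from such dependency declarations is a topological sort. The sketch below uses the standard `tsort` utility; the `name:dep1 dep2` argument format and the function name `execution_order` are assumptions made for illustration, not the tool's actual `DEPENDS_ON` syntax or implementation.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: order workflows so every dependency runs first.
# Each argument has the assumed form "workflow:dep1 dep2 ...".
execution_order() {
    local spec workflow deps dep
    for spec in "$@"; do
        workflow="${spec%%:*}"
        deps="${spec#*:}"
        # A self-pair registers the node even when it has no dependencies.
        printf '%s %s\n' "$workflow" "$workflow"
        # Intentionally unquoted: split the dependency list on whitespace.
        for dep in $deps; do
            printf '%s %s\n' "$dep" "$workflow"
        done
    done | tsort
}

# A three-stage chain declared out of order.
execution_order "report:summarize" "summarize:extract" "extract:"
```

With these inputs the only valid order is `extract`, `summarize`, `report`. If the declarations contain a cycle, `tsort` reports it on stderr instead of silently producing a partial order.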
33+
34+
#### 4. Dual Execution Modes
35+
36+
- **Run mode:** Persistent workflows with configuration, context, dependencies, outputs
37+
- **Task mode:** Lightweight one-off execution without workflow directories
38+
39+
Both modes share execution logic but are optimized for different use cases.
40+
41+
**Innovation:** One tool handles both persistent iterative development and quick ad-hoc queries.
42+
43+
#### 5. Advanced Document Processing
44+
45+
- Automatic detection and processing of PDFs (32MB limit, ~2000 tokens/page)
46+
- Office file conversion (.docx, .pptx) via LibreOffice with smart caching
47+
- Vision API support for images (5MB limit, automatic resizing, base64 encoding)
48+
- Text files with embedded metadata
49+
50+
**Innovation:** Unified handling of text, PDFs, Office files, and images with automatic format detection, conversion, caching, and optimal ordering.
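
Format detection of this kind can be sketched with an extension lookup plus a size check. The 32MB and 5MB limits come from the list above; the function names, the set of image extensions, and the choice to detect by extension rather than by file content are assumptions for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of extension-based type detection.
classify_file() {
    local ext
    # Lowercase the extension with tr so the sketch does not need Bash 4 syntax.
    ext="$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')"
    case "$ext" in
        pdf)                   echo pdf ;;
        docx|pptx)             echo office ;;
        png|jpg|jpeg|gif|webp) echo image ;;   # assumed image extensions
        *)                     echo text ;;
    esac
}

# Succeeds when the file is within the limit stated for its type.
within_size_limit() {
    local file="$1" limit_mb bytes
    case "$(classify_file "$file")" in
        pdf)   limit_mb=32 ;;
        image) limit_mb=5 ;;
        *)     return 0 ;;   # no limit is stated for other types
    esac
    bytes=$(wc -c < "$file")
    (( bytes <= limit_mb * 1024 * 1024 ))
}

classify_file "Paper.PDF"
classify_file "slides.pptx"
```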
51+
52+
#### 6. Prompt Caching Architecture
53+
54+
JSON-first content block architecture with strategic cache breakpoint placement (max 4) at semantic boundaries. Content is ordered from stable to volatile: system prompts → project descriptions → PDFs → text → images → task. Timestamps are date-only (not datetime) to prevent minute-by-minute cache invalidation.
55+
56+
**Innovation:** A caching strategy that can cut the cost of cached input by up to 90% by carefully ordering content from most stable (system prompts) to most volatile (task), with PDFs placed before text per Anthropic optimization guidelines.
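
The stable-to-volatile ordering can be sketched as a stable sort over ranked content blocks. The `kind:name` argument format and the function names below are hypothetical; only the order of the kinds is taken from the description above.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: rank each content block by how stable it is,
# then sort so the most stable blocks come first in the request.
block_rank() {
    case "$1" in
        system)  echo 0 ;;
        project) echo 1 ;;
        pdf)     echo 2 ;;
        text)    echo 3 ;;
        image)   echo 4 ;;
        task)    echo 5 ;;
        *)       echo 9 ;;   # unknown kinds go last
    esac
}

order_blocks() {
    local spec
    for spec in "$@"; do
        printf '%s\t%s\n' "$(block_rank "${spec%%:*}")" "$spec"
    done | sort -s -n -t "$(printf '\t')" -k1,1 | cut -f2-
}

order_blocks "task:summarize" "image:fig1.png" "text:notes.md" "pdf:paper.pdf" "system:base"

# A date-only stamp changes once per day, so it does not invalidate
# a cached prefix on every run.
date +%Y-%m-%d
```

The `-s` flag keeps blocks of the same kind in their original relative order, which matters because any reordering of a cached prefix would change the cache key.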
57+
58+
#### 7. Citations Support
59+
60+
Optional Anthropic citations API support via `--enable-citations` flag. Generates document map for citable sources (text and PDFs, not images). Parses citation responses and formats them appropriately. Creates sidecar citations files for reference.
61+
62+
**Innovation:** Enables AI-generated content with proper source attribution.
63+
64+
### Complete Feature List
65+
66+
#### Core Workflow Management
67+
68+
- `init` - Initialize project with .workflow/ structure
69+
- `new` - Create workflows with XML task skeleton
70+
- `edit` - Open workflow/project files in editor
71+
- `config` - View configuration cascade with source tracking
72+
- `run` - Execute workflows with full context aggregation
73+
- `task` - Lightweight one-off task execution
74+
- `cat` - Display output to stdout
75+
- `open` - Open output in default application (macOS)
76+
- `list` - List all workflows in project
77+
78+
#### Configuration System
79+
80+
- Multi-tier cascade: global → ancestors → project → workflow → CLI
81+
- Pass-through inheritance (empty values inherit, non-empty override)
82+
- Nested project support with automatic ancestor discovery
83+
- Subshell isolation for safe config sourcing
84+
- Source tracking (shows where each value comes from)
85+
- Cross-platform editor detection
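
The subshell isolation listed above can be sketched as follows. Because config files are executable bash, sourcing one inside a subshell keeps its variables and side effects out of the calling shell. The function name `read_config_value` and the variable names in the usage example are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: read one variable from a bash config file
# without letting the file modify the caller's environment.
read_config_value() {
    local config_file="$1" var_name="$2"
    (
        # Everything in these parentheses runs in a subshell.
        source "$config_file" >/dev/null 2>&1 || exit 1
        # Indirect expansion; prints nothing if the variable is unset.
        printf '%s' "${!var_name-}"
    )
}

# Usage with a throwaway config file.
demo_config="$(mktemp)"
printf 'EXAMPLE_MODEL="demo-model"\n' > "$demo_config"
read_config_value "$demo_config" EXAMPLE_MODEL
rm -f "$demo_config"
```

An empty result is what lets a higher tier's value pass through, so a helper like this pairs naturally with the cascade described earlier.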
86+
87+
#### Context Aggregation
88+
89+
- Three methods: glob patterns, explicit file lists, workflow dependencies
90+
- Semantic separation: INPUT vs CONTEXT materials
91+
- Project-relative paths in configs, PWD-relative in CLI
92+
- Brace expansion and recursive glob patterns
93+
- Automatic file type detection (text, PDF, Office, images)
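
Pattern-based aggregation along these lines can be sketched with Bash's `globstar` and `nullglob` options. The function name `expand_patterns` is hypothetical, and resolving patterns relative to a base directory is an assumption modeled on the project-relative paths mentioned above.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: expand glob patterns relative to a base directory.
# Requires Bash 4+ for globstar (recursive ** patterns).
expand_patterns() {
    local base="$1"
    shift
    (
        cd "$base" || exit 1
        shopt -s globstar nullglob   # ** recurses; unmatched patterns vanish
        for pattern in "$@"; do
            # Intentionally unquoted so the shell performs pathname expansion.
            # Patterns containing spaces would need different handling.
            for file in $pattern; do
                if [[ -f "$file" ]]; then
                    printf '%s\n' "$file"
                fi
            done
        done
    )
}

# Usage with a throwaway tree. The unquoted braces are expanded by the
# calling shell; the quoted ** pattern is expanded inside the function.
demo_root="$(mktemp -d)"
mkdir -p "$demo_root/docs" "$demo_root/src/lib"
touch "$demo_root/docs/intro.md" "$demo_root/docs/setup.md" \
      "$demo_root/src/main.sh" "$demo_root/src/lib/util.sh"
expand_patterns "$demo_root" docs/{intro,setup}.md 'src/**/*.sh'
rm -rf "$demo_root"
```

Brace expansion happens before variable expansion in Bash, so braces stored inside a variable are not expanded. A tool that accepts brace patterns from config files has to handle that case separately.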
94+
95+
#### Document Processing
96+
97+
- PDFs: Native support via Claude PDF API (32MB limit)
98+
- Office files: Automatic conversion via LibreOffice (.docx, .pptx)
99+
- Images: Vision API support with validation, resizing, caching
100+
- Text files: Multiple formats with metadata embedding
101+
- Smart caching with mtime validation
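
An mtime-based cache check can be sketched with Bash's `-nt` (newer than) file test. The function name `cache_is_fresh` is hypothetical, and the tool may validate its cache differently.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: a cached conversion is reusable when it exists
# and the source file has not been modified since it was written.
cache_is_fresh() {
    local source_file="$1" cached_file="$2"
    [[ -f "$cached_file" && ! "$source_file" -nt "$cached_file" ]]
}

# Usage with throwaway files and explicit timestamps.
demo_dir="$(mktemp -d)"
touch -t 202001010000 "$demo_dir/slides.pptx"
touch -t 202101010000 "$demo_dir/slides.pdf"
if cache_is_fresh "$demo_dir/slides.pptx" "$demo_dir/slides.pdf"; then
    echo "reuse cached conversion"
else
    echo "convert again"
fi
rm -rf "$demo_dir"
```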
102+
103+
#### API Integration
104+
105+
- Anthropic Messages API with streaming and batch (single-request) modes
106+
- Prompt caching with strategic breakpoint placement
107+
- Token estimation: dual approach (heuristic + exact API count)
108+
- Citations support with document mapping
109+
- Dry-run mode for prompt inspection
110+
- Large payload handling via jq --slurpfile
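
The heuristic half of the token estimation listed above could look like the sketch below. The four-bytes-per-token ratio is a common rule of thumb for English text and is an assumption here, as is the function name `estimate_tokens`. The exact count would come from the API instead.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: rough token estimate from file size,
# assuming about 4 bytes per token and rounding up.
estimate_tokens() {
    local file="$1" bytes
    bytes=$(wc -c < "$file")
    echo $(( (bytes + 3) / 4 ))
}

# Usage with a throwaway file.
demo_file="$(mktemp)"
printf 'A short example sentence.' > "$demo_file"
estimate_tokens "$demo_file"
rm -f "$demo_file"
```

A local estimate like this is free and instant, which makes it useful as a pre-flight check before spending an API call on an exact count.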
111+
112+
#### Output Management
113+
114+
- Automatic timestamped backups
115+
- Hardlinked copies for convenient access
116+
- Format-specific post-processing (mdformat, jq)
117+
- Multiple output formats (md, json, txt, html, etc.)
118+
- Atomic updates with trap-based cleanup
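
The backup, atomic update, and hardlink steps listed above can be combined as in the following sketch. The function name `save_output`, the `.bak` naming scheme, and the explicit cleanup (in place of the trap-based cleanup the tool uses) are assumptions for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: back up the old output, replace it atomically,
# then refresh a hardlink that gives convenient access to it.
save_output() {
    local new_file="$1" output="$2" link="$3"
    local tmp stamp
    stamp="$(date +%Y%m%d-%H%M%S)"
    # Keep a timestamped copy of the previous version, if any.
    if [[ -f "$output" ]]; then
        cp -p "$output" "$output.$stamp.bak" || return 1
    fi
    # Write next to the destination so the final rename stays on one filesystem.
    tmp="$(mktemp "$output.tmp.XXXXXX")" || return 1
    if ! cp "$new_file" "$tmp"; then
        rm -f "$tmp"
        return 1
    fi
    # A rename within one filesystem is atomic: readers see old or new, never partial.
    mv -f "$tmp" "$output" || return 1
    # The rename gave the output a new inode, so the hardlink must be recreated.
    ln -f "$output" "$link"
}

# Usage with throwaway files.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/output"
printf 'first draft\n' > "$demo_dir/new.md"
save_output "$demo_dir/new.md" "$demo_dir/output/report.md" "$demo_dir/report.md"
cat "$demo_dir/report.md"
rm -rf "$demo_dir"
```

A hardlink is a second name for the same data rather than a copy, so it costs no extra storage. It does require both paths to be on the same filesystem.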
119+
120+
#### Safety and Robustness
121+
122+
- Automatic backups before overwriting
123+
- Atomic file operations
124+
- Trap-based cleanup on exit
125+
- Subshell isolation for config extraction
126+
- Git-like project boundary detection (stops at $HOME)
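
Git-like discovery of this kind can be sketched as an upward directory walk. The `.workflow/` marker comes from the `init` description above; the function name `find_project_root` and the choice not to treat the stop directory itself as a project are assumptions.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: walk up from a starting directory until a
# .workflow/ directory is found, stopping at a boundary (default: $HOME).
find_project_root() {
    local dir="${1:-$PWD}" stop="${2:-$HOME}"
    while [[ -n "$dir" && "$dir" != "/" && "$dir" != "$stop" ]]; do
        if [[ -d "$dir/.workflow" ]]; then
            printf '%s\n' "$dir"
            return 0
        fi
        dir="$(dirname "$dir")"
    done
    return 1
}

# Usage with a throwaway tree, using its top as the boundary.
demo_root="$(mktemp -d)"
mkdir -p "$demo_root/paper/.workflow" "$demo_root/paper/figures/raw"
find_project_root "$demo_root/paper/figures/raw" "$demo_root"
rm -rf "$demo_root"
```

Collecting every match instead of returning at the first one would yield the list of ancestor projects that the configuration cascade needs.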
127+
128+
#### Developer Experience
129+
130+
- XML task skeleton with structured sections
131+
- Named task templates (reusable task definitions)
132+
- Token estimation before API calls
133+
- Comprehensive help system (git-style)
134+
- Test suite of 205+ tests using Bats
135+
- Detailed documentation with MkDocs
136+
137+
### Current API Support & Extensibility
138+
139+
#### Anthropic API Support
140+
141+
- Messages API (single and streaming)
142+
- Prompt Caching API (ephemeral cache control)
143+
- Token Counting API (exact token estimation)
144+
- PDF API (native document support)
145+
- Vision API (image processing)
146+
- Citations API (source attribution)
147+
148+
#### Extensibility Architecture
149+
150+
**Modular library structure** (lib/):
151+
- `api.sh` - API interaction
152+
- `config.sh` - Configuration loading
153+
- `core.sh` - Subcommand implementations
154+
- `execute.sh` - Shared execution logic
155+
- `utils.sh` - File processing utilities
156+
- `task.sh` - Task mode logic
157+
- `help.sh` - Help text
158+
- `edit.sh` - Editor selection
159+
160+
**Configuration as code:**
161+
- Config files are bash scripts (can include logic)
162+
- Easy to extend with new variables
163+
- Pass-through mechanism scales automatically
164+
165+
**Content block architecture:**
166+
- JSON-first design
167+
- Each file becomes a separate content block
168+
- Enables future format support (e.g., more document types)
169+
- Custom converter for pseudo-XML convenience views
170+
171+
**Custom system prompts:**
172+
- User-definable prompts in ~/.config/workflow/prompts/
173+
- Composable via SYSTEM_PROMPTS array
174+
- Project and workflow overrides
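
Composing a system prompt from named parts can be sketched as below. The `SYSTEM_PROMPTS` array name comes from the list above; the `.txt` extension, the function name `build_system_prompt`, and the warning behavior are assumptions for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: concatenate named prompt files in array order.
build_system_prompt() {
    local prompt_dir="$1" name
    shift
    for name in "$@"; do
        if [[ -f "$prompt_dir/$name.txt" ]]; then
            cat "$prompt_dir/$name.txt"
            printf '\n'
        else
            printf 'warning: prompt "%s" not found in %s\n' "$name" "$prompt_dir" >&2
        fi
    done
}

# Usage with a throwaway prompt directory.
demo_prompts="$(mktemp -d)"
printf 'You are a careful assistant.' > "$demo_prompts/base.txt"
printf 'Cite sources for every claim.' > "$demo_prompts/research.txt"
SYSTEM_PROMPTS=(base research)
build_system_prompt "$demo_prompts" "${SYSTEM_PROMPTS[@]}"
rm -rf "$demo_prompts"
```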
175+
176+
**Named task templates:**
177+
- Reusable task definitions in ~/.config/workflow/tasks/
178+
- Shareable across projects
179+
180+
### Target Users & Use Cases
181+
182+
#### Primary Audience
183+
184+
Technical users with high AI/LLM familiarity and shell proficiency who need:
185+
- Reproducible AI workflows
186+
- Complex multi-stage processing pipelines
187+
- Project-aware context management
188+
- Cost-effective prompt caching
189+
- Integration with existing tools
190+
191+
#### Key User Profiles
192+
193+
**Research Scientists:**
194+
- Multi-stage analysis pipelines
195+
- Paper writing with context management
196+
- Data analysis and visualization workflows
197+
- Citation tracking for AI-generated content
198+
199+
**Software Developers:**
200+
- Code review workflows
201+
- Documentation generation
202+
- Multi-file refactoring analysis
203+
- Architecture and design assistance
204+
205+
**Technical Writers:**
206+
- Structured content generation
207+
- Document transformation (Office → PDF → Markdown)
208+
- Multi-stage editing workflows
209+
- Cross-referencing and citations
210+
211+
**Data Scientists:**
212+
- Exploratory data analysis
213+
- Report generation from datasets
214+
- Multi-format output (JSON, Markdown, HTML)
215+
- Pipeline orchestration
216+
217+
#### Enabled Workflows
218+
219+
**Research Pipelines:**
220+
Context gathering from PDFs/papers → Outline generation → Section drafting → Review/refinement
221+
222+
**Code Analysis:**
223+
Codebase exploration → Architecture analysis → Documentation generation → Review
224+
225+
**Data Processing:**
226+
Data ingestion → Exploratory analysis → Statistical testing → Report generation → Visualization
227+
228+
**Content Creation:**
229+
Research/context → Outline → Draft → Edit → Format conversion
230+
231+
**Iterative Development:**
232+
Initial attempt → Review output → Refine task → Re-run (with automatic backup)
233+
234+
### Pain Points Solved
235+
236+
**Context Management Complexity:**
237+
- **Problem:** Managing large codebases/datasets as context for AI
238+
- **Solution:** Glob patterns, explicit files, and workflow dependencies with semantic separation
239+
240+
**Cost of Repetitive Prompts:**
241+
- **Problem:** Paying for same system prompts and context repeatedly
242+
- **Solution:** Prompt caching, with up to 90% cost reduction on cached content
243+
244+
**Configuration Sprawl:**
245+
- **Problem:** Repeating settings across projects and workflows
246+
- **Solution:** Pass-through cascade enables change-once, affect-many
247+
248+
**Multi-Stage Processing:**
249+
- **Problem:** Manually copying outputs between stages
250+
- **Solution:** Workflow dependencies with automatic context passing
251+
252+
**Output Management:**
253+
- **Problem:** Losing previous versions when iterating
254+
- **Solution:** Automatic timestamped backups with hardlinked access
255+
256+
**Format Fragmentation:**
257+
- **Problem:** Different tools for PDFs, images, Office files, text
258+
- **Solution:** Unified document processing with automatic detection and conversion
259+
260+
**One-Off vs Persistent Workflows:**
261+
- **Problem:** Need both quick queries and persistent workflows
262+
- **Solution:** Dual execution modes optimized for each use case
263+
264+
**Source Attribution:**
265+
- **Problem:** AI-generated content lacks proper citations
266+
- **Solution:** Citations API integration with document mapping
267+
268+
### Unique Selling Points vs Other AI CLI Tools
269+
270+
**vs aider/cursor/copilot:**
271+
- Not code-focused; document and workflow-focused
272+
- Multi-stage pipelines with dependencies
273+
- Sophisticated context aggregation beyond current file
274+
- Prompt caching for cost optimization
275+
276+
**vs chatgpt-cli/claude-cli:**
277+
- Persistent workflows with configuration
278+
- Project-aware with automatic discovery
279+
- Multi-tier config cascade
280+
- Workflow chaining and dependencies
281+
- Native document processing (PDFs, Office, images)
282+
283+
**vs custom scripts:**
284+
- Structured configuration system
285+
- Built-in prompt caching
286+
- Safe output management
287+
- Cross-platform compatibility
288+
- Comprehensive documentation and testing
289+
290+
#### Key Differentiators
291+
292+
1. **Configuration sophistication:** Pass-through cascade is unique
293+
2. **Workflow chaining:** DAG-based pipeline orchestration
294+
3. **Document processing breadth:** Unified handling of text, PDFs, Office, images
295+
4. **Cost optimization:** Strategic prompt caching architecture
296+
5. **Git-like UX:** Familiar project discovery and structure
297+
6. **Dual execution modes:** Both persistent and ad-hoc in one tool
298+
7. **Citations support:** Source attribution for generated content
299+
300+
### Technical Highlights
301+
302+
- **Comprehensive suite of 205+ tests** using the Bats testing framework
303+
- **Modular architecture** with 8 separate library modules
304+
- **Cross-platform support** (macOS, Linux, WSL)
305+
- **Safe execution** with atomic operations and automatic backups
306+
- **Efficient caching** for images and Office file conversions
307+
- **Robust error handling** with graceful degradation
308+
309+
### Installation & Requirements
310+
311+
**Required:**
312+
- Bash 4.0+
313+
- curl
314+
- jq
315+
- Anthropic API key
316+
317+
**Optional:**
318+
- LibreOffice (for .docx/.pptx support)
319+
- ImageMagick (for optimal image resizing)
320+
- mdformat (for Markdown formatting)
321+
322+
### What's Next
323+
324+
This pre-release (0.2.0) represents a mature foundation with comprehensive document processing. The tool is production-ready for personal and research use. Future directions may include:
325+
326+
- Additional API provider support (OpenAI-compatible endpoints)
327+
- Batch API integration for cost-effective processing
328+
- Enhanced citation formatting options
329+
- Additional document format support
330+
- Web-based workflow visualization
331+
332+
### Commits in This Release
333+
334+
- bc12145: feat: Add PDF document support with optimized API ordering
335+
- 5023475: feat: Add Microsoft Office file support with PDF conversion
336+
- 45cac86: docs: Add comprehensive document type support documentation
337+
- 8af897e: fix: Correct context aggregation order in execution guide
338+
339+
### Acknowledgments
340+
341+
This tool represents a sophisticated approach to AI workflow management, built with attention to cost optimization, reproducibility, and developer experience. Special thanks to the Anthropic team for their excellent API documentation and the LibreOffice project for enabling Office file conversion.

workflow.sh

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
22
set -e
33

44
# Version
5-
WORKFLOW_VERSION="0.1.0"
5+
WORKFLOW_VERSION="0.2.0"
66

77
# =============================================================================
88
# Workflow - AI-Assisted Research and Project Development Tool
