Implement pre/post processing framework for multi-target workflows #86

Merged
merged 18 commits into from
May 29, 2025
2 changes: 2 additions & 0 deletions .github/workflows/ci.yaml
@@ -23,6 +23,8 @@ jobs:
BUNDLE_GEMFILE: ${{ github.workspace }}/gemfiles/${{ matrix.gemfile }}.gemfile
steps:
- uses: actions/checkout@v4
- name: Install ripgrep
Review comment (Contributor): In a future PR, we probably want to adopt the dev way here: have a key users can use to verify that a requirement has been met, so this command does not need to run every time.

run: sudo apt-get update && sudo apt-get install -y ripgrep
- uses: ruby/setup-ruby@v1
with:
ruby-version: ${{ matrix.ruby }}
3 changes: 2 additions & 1 deletion .gitignore
@@ -12,6 +12,8 @@

**/CLAUDE.local.md
.roast/

bin/_guard-core
bin/bundle
bin/coderay
bin/dotenv
@@ -36,6 +38,5 @@ bin/ruby-lsp-test-exec
bin/ruby-parse
bin/ruby-rewrite
bin/thor
bin/_guard-core

gemfiles/*.lock
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -8,6 +8,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## [0.2.2] - 2025-05-29

### Added
- Pre/post processing framework for workflows with `pre_processing` and `post_processing` sections (#86)
- Support for `output.txt` ERB templates in post-processing phase for custom output formatting
- Pre/post processing support for single-target workflows (not just multi-target)
- Simplified access to pre-processing data in target workflows (removed `output` intermediary level)
- Verbose mode improvements for better debugging experience (#98)
- Command outputs are now displayed when using the `--verbose` flag
- Commands executed within conditional branches also show output in verbose mode
5 changes: 5 additions & 0 deletions CLAUDE.md
@@ -50,6 +50,11 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
- IterationExecutor handles iterations (each, repeat)
- ConditionalExecutor handles conditionals (if, unless)
- Don't combine different responsibilities in one class
- **Do not implement prompts "inline" using a `prompt:` attribute nested under step names; that violates the primary design architecture of Roast**

## Guidance and Expectations

- Do not decide unilaterally to leave code for the sake of "backwards compatibility"... always run those decisions by me first.

## Git Workflow Practices

151 changes: 151 additions & 0 deletions README.md
@@ -754,6 +754,157 @@ your-project/
└── ...
```

### Pre/Post Processing Framework

Roast supports pre-processing and post-processing phases for workflows. Use them when a workflow needs one-time setup or teardown, or needs to aggregate results across all processed files.

#### Overview

- **Pre-processing**: Steps executed once before any targets are processed
- **Post-processing**: Steps executed once after all targets have been processed
- **Shared state**: Pre-processing results are available to all subsequent steps
- **Result aggregation**: Post-processing has access to all workflow execution results
- **Single-target support**: Pre/post processing works with single-target workflows too
- **Output templates**: Post-processing supports `output.txt` templates for custom formatting

#### Configuration

```yaml
name: optimize_tests
model: gpt-4o
target: "test/**/*_test.rb"

# Pre-processing steps run once before any test files
pre_processing:
- gather_baseline_metrics
- setup_test_environment

# Main workflow steps run for each test file
steps:
- analyze_test
- improve_coverage
- optimize_performance

# Post-processing steps run once after all test files
post_processing:
- aggregate_results
- generate_report
- cleanup_environment
```
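
The ordering this configuration implies can be sketched in plain Ruby. This is a minimal illustration, not Roast's actual executor: `run_order` is a hypothetical helper, and the target list is passed in by hand instead of being expanded from the `target` glob.

```ruby
require "yaml"

# Hypothetical helper (not part of Roast): returns the order in which
# steps would run for a parsed workflow config and a list of targets.
def run_order(config, targets)
  order = []
  # Pre-processing steps run once, before any target.
  config.fetch("pre_processing", []).each { |step| order << "pre:#{step}" }
  # Main steps run once per target file.
  targets.each do |target|
    config.fetch("steps", []).each { |step| order << "#{target}:#{step}" }
  end
  # Post-processing steps run once, after every target has finished.
  config.fetch("post_processing", []).each { |step| order << "post:#{step}" }
  order
end

workflow_yaml = <<~YAML
  name: optimize_tests
  target: "test/**/*_test.rb"
  pre_processing:
    - gather_baseline_metrics
  steps:
    - analyze_test
    - improve_coverage
  post_processing:
    - generate_report
YAML

config = YAML.safe_load(workflow_yaml)
# Sample target paths, made up for illustration.
puts run_order(config, ["test/a_test.rb", "test/b_test.rb"])
```

With one pre-processing step, two main steps, two targets, and one post-processing step, the sketch yields six entries: the pre-processing step first, the per-target steps in the middle, and the post-processing step last.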

#### Directory Structure

Pre- and post-processing steps follow the same conventions as regular steps but live in their own directories:

```
workflow.yml
pre_processing/
  ├── gather_baseline_metrics/
  │   └── prompt.md
  └── setup_test_environment/
      └── prompt.md
analyze_test/
  └── prompt.md
improve_coverage/
  └── prompt.md
optimize_performance/
  └── prompt.md
post_processing/
  ├── output.txt
  ├── aggregate_results/
  │   └── prompt.md
  ├── generate_report/
  │   └── prompt.md
  └── cleanup_environment/
      └── prompt.md
```
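
As a rough illustration of this layout, the prompt file for a step can be derived from the step name and its phase. `prompt_path` below is a hypothetical helper written for this example; Roast's own lookup logic may differ.

```ruby
# Hypothetical helper (not Roast's lookup code): maps a step name and an
# optional phase to the prompt file location in the layout shown above.
def prompt_path(workflow_dir, step, phase: nil)
  # Regular steps sit directly under the workflow directory; pre- and
  # post-processing steps sit under their phase directory.
  parts = [workflow_dir, phase&.to_s, step, "prompt.md"].compact
  File.join(*parts)
end

puts prompt_path(".", "gather_baseline_metrics", phase: :pre_processing)
puts prompt_path(".", "analyze_test")
puts prompt_path(".", "generate_report", phase: :post_processing)
```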

#### Data Access

**Pre-processing results in target workflows:**

Target workflows have access to pre-processing results through the `pre_processing_data` variable with dot notation:

```erb
# In a target workflow step prompt
The baseline metrics from pre-processing:
<%= pre_processing_data.gather_baseline_metrics %>

Environment setup details:
<%= pre_processing_data.setup_test_environment %>
```
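
To see how such a prompt resolves, the sketch below renders a similar template with Ruby's standard `ERB` class. The `PreProcessingData` struct and the sample values are assumptions made for this example; the object Roast actually exposes as `pre_processing_data` may be implemented differently.

```ruby
require "erb"

# Illustrative stand-in for the `pre_processing_data` object; one member
# per pre-processing step gives the dot-notation access used in prompts.
PreProcessingData = Struct.new(
  :gather_baseline_metrics,
  :setup_test_environment,
  keyword_init: true
)

# Render a prompt template with `pre_processing_data` available as a local.
def render_prompt(template, pre_processing_data)
  ERB.new(template).result_with_hash(pre_processing_data: pre_processing_data)
end

template = <<~PROMPT
  The baseline metrics from pre-processing:
  <%= pre_processing_data.gather_baseline_metrics %>
PROMPT

data = PreProcessingData.new(
  gather_baseline_metrics: "42 tests, 3.1s total", # made-up sample value
  setup_test_environment: "ruby 3.3, minitest"     # made-up sample value
)
puts render_prompt(template, data)
```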

**Post-processing data access:**

Post-processing steps have access to:

- `pre_processing`: Direct access to pre-processing results with dot notation
- `targets`: Hash of all target workflow results, keyed by file paths

Example post-processing prompt:
```markdown
# Generate Summary Report

Based on the baseline metrics:
<%= pre_processing.gather_baseline_metrics %>

Environment configuration:
<%= pre_processing.setup_test_environment %>

And the results from processing all files:
<% targets.each do |file, target| %>
File: <%= file %>
Analysis results: <%= target.output.analyze_test %>
Coverage improvements: <%= target.output.improve_coverage %>
Performance optimizations: <%= target.output.optimize_performance %>
<% end %>

Please generate a comprehensive summary report showing:
1. Overall improvements achieved
2. Files with the most significant changes
3. Recommendations for further optimization
```

#### Output Templates

Post-processing supports custom output formatting using ERB templates. Create an `output.txt` file in your `post_processing` directory to format the final workflow output:

```erb
# post_processing/output.txt
=== Workflow Summary Report ===
Generated at: <%= Time.now.strftime("%Y-%m-%d %H:%M:%S") %>

Environment: <%= pre_processing.setup_test_environment %>

Files Processed: <%= targets.size %>

<% targets.each do |file, target| %>
- <%= file %>: <%= target.output.analyze_test %>
<% end %>

<%= output.generate_report %>
===============================
```

The template has access to:
- `pre_processing`: All pre-processing step outputs with dot notation
- `targets`: Hash of all target workflow results with dot notation (each target has `.output` and `.final_output`)
- `output`: Post-processing step outputs with dot notation
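
The sketch below shows how a template with these three names could be rendered outside of Roast, using Ruby's standard `ERB` class. The `Target`, `StepOutputs`, and `PostOutputs` structs and all sample values are hypothetical stand-ins; this example also uses ERB's `-` trim mode so that loop tags do not leave blank lines.

```ruby
require "erb"

# Hypothetical data shapes mirroring the names available to the template.
Target = Struct.new(:output, :final_output, keyword_init: true)
StepOutputs = Struct.new(:analyze_test, keyword_init: true)
PostOutputs = Struct.new(:generate_report, keyword_init: true)

OUTPUT_TEMPLATE = <<~ERB_TEMPLATE
  Files Processed: <%= targets.size %>
  <%- targets.each do |file, target| -%>
  - <%= file %>: <%= target.output.analyze_test %>
  <%- end -%>
  <%= output.generate_report %>
ERB_TEMPLATE

# Render the template with `targets` and `output` exposed as locals.
def render_output(targets:, output:)
  ERB.new(OUTPUT_TEMPLATE, trim_mode: "-")
     .result_with_hash(targets: targets, output: output)
end

# Sample results keyed by file path, made up for illustration.
targets = {
  "test/a_test.rb" => Target.new(output: StepOutputs.new(analyze_test: "2 gaps"), final_output: "done"),
  "test/b_test.rb" => Target.new(output: StepOutputs.new(analyze_test: "no gaps"), final_output: "done")
}
puts render_output(targets: targets, output: PostOutputs.new(generate_report: "Overall: 2 gaps"))
```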

#### Use Cases

This pattern is ideal for:

- **Code migrations**: Setup migration tools, process files, generate migration report
- **Test optimization**: Baseline metrics, optimize tests, aggregate improvements
- **Documentation generation**: Analyze codebase, generate docs per module, create index
- **Dependency updates**: Check versions, update files, verify compatibility
- **Security audits**: Setup scanners, check each file, generate security report
- **Performance analysis**: Establish baselines, analyze components, summarize findings

See the [pre/post processing example](examples/pre_post_processing) for a complete working demonstration.


## Development

After checking out the repo, run `bundle install` to install dependencies. Then, run `bundle exec rake` to run the tests and linter. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
12 changes: 6 additions & 6 deletions examples/grading/generate_recommendations/output.txt
@@ -1,16 +1,16 @@
 ========== TEST RECOMMENDATIONS ==========
-<%- if response["recommendations"].empty? -%>
+<%- if response.recommendations.empty? -%>
 No recommendations found.
 <%- else -%>
-<%- response["recommendations"].each_with_index do |rec, index| -%>
+<%- response.recommendations.each_with_index do |rec, index| -%>
 Recommendation #<%= index + 1 %>:
-Description: <%= rec["description"] %>
-Impact: <%= rec["impact"] %>
-Priority: <%= rec["priority"] %>
+Description: <%= rec.description %>
+Impact: <%= rec.impact %>
+Priority: <%= rec.priority %>

 Code Suggestion:

-<%= rec["code_suggestion"] %>
+<%= rec.code_suggestion %>

 <%- end -%>
 <%- end -%>
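
The change above moves the template from string-key lookups to dot notation. One way a response object could support both styles is a small wrapper around the parsed hash. The `DotAccess` class below is a hypothetical sketch written for this example, not the class Roast uses.

```ruby
# Hypothetical wrapper (not Roast's implementation): lets a template read
# parsed data as response["recommendations"] or response.recommendations.
class DotAccess
  def initialize(hash)
    @hash = hash.transform_keys(&:to_s)
  end

  # Bracket access, accepting string or symbol keys.
  def [](key)
    wrap(@hash[key.to_s])
  end

  def respond_to_missing?(name, include_private = false)
    @hash.key?(name.to_s) || super
  end

  # Dot access: only zero-argument calls that match a known key.
  def method_missing(name, *args)
    return self[name] if args.empty? && @hash.key?(name.to_s)

    super
  end

  private

  # Wrap nested hashes and arrays so dot access works at every level.
  def wrap(value)
    case value
    when Hash then DotAccess.new(value)
    when Array then value.map { |item| wrap(item) }
    else value
    end
  end
end

# Sample data, made up for illustration.
response = DotAccess.new(
  "recommendations" => [{ "description" => "Add edge-case test", "priority" => "high" }]
)
response.recommendations.each_with_index do |rec, index|
  puts "Recommendation ##{index + 1}: #{rec.description} (#{rec["priority"]})"
end
```

Unknown keys still raise `NoMethodError` through `super`, so a typo in a template fails loudly instead of rendering an empty value.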
111 changes: 111 additions & 0 deletions examples/pre_post_processing/README.md
@@ -0,0 +1,111 @@
# Pre/Post Processing Example: Test Suite Optimization

This example demonstrates how to use Roast's pre/post processing framework to optimize an entire test suite across multiple files.

## Overview

The workflow processes multiple test files, but performs setup and aggregation tasks only once:

- **Pre-processing**: Runs once before any test files are processed
  - Gathers baseline metrics for comparison
  - Sets up the test environment

- **Main workflow**: Runs for each test file matching the target pattern
  - Analyzes test quality and coverage
  - Improves test coverage
  - Optimizes test performance
  - Validates changes

- **Post-processing**: Runs once after all test files have been processed
  - Aggregates metrics from all files
  - Generates a comprehensive report
  - Cleans up the environment

## Workflow Structure

```yaml
name: test_optimization
model: gpt-4o
target: "test/**/*_test.rb"

pre_processing:
- gather_baseline_metrics
- setup_test_environment

steps:
- analyze_test_file
- improve_test_coverage
- optimize_test_performance
- validate_changes

post_processing:
- aggregate_metrics
- generate_summary_report
- cleanup_environment
```

## Directory Structure

```
pre_post_processing/
├── workflow.yml
├── pre_processing/
│   ├── gather_baseline_metrics/
│   │   └── prompt.md
│   └── setup_test_environment/
│       └── prompt.md
├── analyze_test_file/
│   └── prompt.md
├── improve_test_coverage/
│   └── prompt.md
├── optimize_test_performance/
│   └── prompt.md
├── validate_changes/
│   └── prompt.md
└── post_processing/
    ├── aggregate_metrics/
    │   └── prompt.md
    ├── generate_summary_report/
    │   └── prompt.md
    └── cleanup_environment/
        └── prompt.md
```

## Key Features Demonstrated

1. **Shared State**: Pre-processing results are available to all subsequent steps
2. **Result Aggregation**: Post-processing has access to results from all workflow executions
3. **One-time Operations**: Setup and cleanup happen only once, regardless of target count
4. **Metrics Collection**: Each file's results are stored and aggregated for reporting

## Running the Example

```bash
cd examples/pre_post_processing
roast workflow.yml
```

This will:
1. Run pre-processing steps once
2. Process each test file matching `test/**/*_test.rb`
3. Run post-processing steps once with access to all results
4. Generate a comprehensive optimization report

## Use Cases

This pattern is ideal for:
- **Code migrations**: Setup migration tools, process files, generate migration report
- **Performance audits**: Baseline metrics, analyze files, aggregate improvements
- **Documentation generation**: Analyze codebase, generate docs per file, create index
- **Dependency updates**: Check current versions, update files, verify compatibility
- **Security scanning**: Setup scanners, check each file, generate security report

## Customization

To adapt this example for your use case:

1. Update the `target` pattern to match your files
2. Modify pre-processing steps for your setup needs
3. Adjust main workflow steps for your processing logic
4. Customize post-processing for your reporting requirements
5. Use appropriate AI models for each step type
23 changes: 23 additions & 0 deletions examples/pre_post_processing/analyze_test_file/prompt.md
@@ -0,0 +1,23 @@
# Analyze Test File

Current test file: {{file}}

Please analyze this test file and identify:

1. **Test Structure**: Number of test cases, test suites, and overall organization
2. **Coverage Gaps**: Areas of the code that aren't adequately tested
3. **Test Quality Issues**:
   - Tests that are too brittle or implementation-dependent
   - Missing edge cases
   - Unclear test descriptions
   - Excessive mocking that reduces test value
4. **Performance Issues**:
   - Slow setup/teardown methods
   - Inefficient test data generation
   - Unnecessary database operations
5. **Opportunities for Improvement**:
   - Tests that could be parameterized
   - Common patterns that could be extracted to helpers
   - Better use of test fixtures or factories

Provide specific, actionable recommendations for each issue found.
17 changes: 17 additions & 0 deletions examples/pre_post_processing/improve_test_coverage/prompt.md
@@ -0,0 +1,17 @@
# Improve Test Coverage

Based on the analysis of {{file}}, implement the following improvements:

1. **Add Missing Test Cases**: Write tests for uncovered code paths, edge cases, and error conditions
2. **Improve Test Descriptions**: Make test names more descriptive and follow consistent naming conventions
3. **Enhance Assertions**: Add more specific assertions to catch regressions
4. **Test Data**: Use more realistic test data that better represents production scenarios
5. **Remove Redundancy**: Eliminate duplicate tests or merge similar ones

Generate the improved test code and explain the rationale for each change.

Remember to:
- Maintain backward compatibility with existing test interfaces
- Follow the project's testing conventions and style guide
- Ensure new tests are fast and deterministic
- Add appropriate comments for complex test scenarios