AIWB Application Audit Report

Date: January 6, 2026
Version: 2.0.0
Auditor: AI Code Analysis System
Scope: Complete holistic analysis of codebase structure, workflows, and architecture


Executive Summary

AIWB (AI Workbench) is a bash-based CLI tool for AI orchestration with multi-provider support. The codebase comprises a 3,000+ line main script, approximately 10,838 lines across 12 library modules, and 50+ helper scripts. This audit identifies critical inconsistencies, redundancies, architectural issues, and areas for improvement.

Key Metrics

  • Total Files: 62+ (12 libraries + 50+ bin-edit scripts)
  • Total Lines of Code: ~10,838 (libraries only)
  • Main Components: API layer, UI layer, mode system, GitHub integration, context management
  • Supported Providers: Gemini, Claude, OpenAI, Groq, xAI/Grok, Ollama
  • Platforms: Linux, macOS, Termux (Android)

Overall Health Score: 6.5/10

Strengths:

  • Modular architecture with clear separation of concerns
  • Comprehensive error handling system
  • Multi-provider AI support
  • Good platform compatibility (Linux, macOS, Termux)

Critical Issues:

  • Significant code redundancy and duplication
  • Inconsistent naming conventions
  • Missing documentation for many functions
  • Workflow complexity and overlapping features
  • Potential security concerns with key storage

1. Architecture Analysis

1.1 Overall Structure

AIworkbench/
├── aiwb (3000+ lines) - Main entry point
├── lib/ - Core libraries (10,838 lines)
│   ├── api.sh (1533 lines) - API integration
│   ├── github.sh (1431 lines) - GitHub operations
│   ├── modes.sh (1407 lines) - Mode workflows
│   ├── editor.sh (567 lines) - File editing
│   ├── swarm.sh (550 lines) - Multi-agent orchestration
│   ├── ui.sh (537 lines) - TUI components
│   ├── context_state.sh (501 lines) - Context persistence
│   ├── common.sh (483 lines) - Utilities
│   ├── security.sh (437 lines) - Key management
│   ├── error.sh (342 lines) - Error handling
│   ├── config.sh (326 lines) - Configuration
│   └── chat_router.sh (248 lines) - Intent routing
├── bin-edit/ - Helper scripts (50+ files)
└── docs/ - Documentation

1.2 Architectural Findings

✅ Strengths

  1. Modular Design: Clear separation between API, UI, config, and business logic
  2. Guard Pattern: Libraries use guard variables to prevent multiple sourcing
  3. Cross-Platform: Good platform detection and compatibility layers
  4. Extensible: Easy to add new providers and features

❌ Weaknesses

  1. Monolithic Main Script: aiwb is 3000+ lines - should be split
  2. Tight Coupling: Many modules depend on global variables from other modules
  3. No Clear API Boundaries: Functions freely call across module boundaries
  4. Mixed Concerns: UI, business logic, and data access often intermingled

2. Code Quality Issues

2.1 Code Redundancy and Duplication

🔴 CRITICAL: Clipboard Handling Duplication

Location: aiwb:699-739 and lib/common.sh:432-465

Two copy_to_clipboard() functions with the same detection logic exist:

# In aiwb script (lines 699-739)
copy_to_clipboard() {
    # ... detection logic ...
}

# In lib/common.sh (lines 432-465)
copy_to_clipboard() {
    # ... same detection logic ...
}

Impact: Maintenance burden, potential behavior divergence

Recommendation: Remove from aiwb, use only lib/common.sh version
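One way to keep a single implementation is to let lib/common.sh own the tool detection and have aiwb call into it. The sketch below is illustrative only, not the project's code: `first_available_cmd` is a hypothetical helper, and the candidate tools are the ones this report lists for Termux, macOS, and Linux.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first command from the argument list that
# exists on this system, or return 1 if none does.
first_available_cmd() {
    local candidate
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Single clipboard entry point; reads the text to copy from stdin.
copy_to_clipboard() {
    local tool
    tool="$(first_available_cmd termux-clipboard-set pbcopy xclip)" || {
        printf 'No clipboard tool found\n' >&2
        return 1
    }
    case "$tool" in
        xclip) xclip -selection clipboard ;;
        *)     "$tool" ;;
    esac
}
```

With this in place, a call such as `printf '%s' "$text" | copy_to_clipboard` behaves the same wherever it is invoked from.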


🔴 CRITICAL: Duplicate API Call Functions

Location: Multiple files have redundant API calling logic

Evidence found:

  • bin-edit/chat-runner.sh (190 lines) - Full chat implementation
  • bin-edit/claude-runner.sh (52 lines) - Claude-specific runner
  • bin-edit/gemini-runner.sh (53 lines) - Gemini-specific runner
  • lib/api.sh - Central API implementation

Impact:

  • Same API logic maintained in 4+ places
  • Inconsistent error handling across implementations
  • Bug fixes need to be replicated

Recommendation:

  • Consolidate all API calls through lib/api.sh
  • Remove bin-edit runners or convert to thin wrappers
  • Establish single source of truth for API interactions
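A thin-wrapper arrangement could look roughly like the following. This is a sketch under assumptions: `api_call` and `claude_runner_main` are hypothetical names, and the `call_gemini` / `call_claude` bodies are placeholders standing in for the real functions in lib/api.sh so that the example runs without network access.

```shell
#!/usr/bin/env bash
# Stand-ins for the provider functions that would live in lib/api.sh.
# Their bodies are placeholders so the sketch runs without network access.
call_gemini() { printf 'gemini:%s\n' "$1"; }
call_claude() { printf 'claude:%s\n' "$1"; }

# Hypothetical single entry point: route every request through one function.
api_call() {
    local provider="$1" prompt="$2"
    local handler="call_${provider}"
    if ! declare -F "$handler" >/dev/null; then
        printf 'Unknown provider: %s\n' "$provider" >&2
        return 1
    fi
    "$handler" "$prompt"
}

# What a thin bin-edit/claude-runner.sh could shrink to.
claude_runner_main() {
    api_call claude "$*"
}
```

Error handling and retries then need to be written once, inside `api_call`, rather than in each runner.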

🟡 MODERATE: JSON Utility Duplication

Location: lib/config.sh and lib/common.sh

Both files implement similar JSON operations:

# common.sh
json_get() { jq -r "$query" "$file" ... }
json_set() { jq --arg k "$key" ... }

# config.sh
config_get() { json_get ".$key" "$config_file" ... }
config_set() { jq --arg k "$key" ... }

Recommendation: Unify JSON operations, make config functions use common.sh


2.2 Inconsistent Naming Conventions

🔴 CRITICAL: Function Naming Chaos

Multiple naming patterns coexist:

# Snake case
get_api_key()
set_github_token()

# Camel case
getDefaultModel()  # NOT FOUND but pattern exists

# Prefix-based
cmd_help()         # Command functions
menu_prompt()      # Menu functions
ui_input()         # UI functions
mode_run()         # Mode functions
github_status()    # GitHub functions
context_state_*()  # Context functions

# Inconsistent prefixes
handle_slash_command()   # "handle" prefix
process_message()        # "process" prefix (if exists)

Problems:

  1. No clear naming standard enforced
  2. Difficult to grep/search for related functions
  3. Newcomers can't predict function names
  4. IDE autocomplete less effective

Recommendation: Establish and enforce naming convention:

  • Commands: cmd_* (user-facing commands)
  • Internal: _private_function (prefix with underscore)
  • Handlers: handle_*_event
  • UI: ui_*
  • API: api_*
  • Utilities: descriptive verbs (get_*, set_*, validate_*)
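A convention is easier to enforce when a script can check it. The sketch below is one possible lint pass, assuming functions are defined in the `name() {` form at the start of a line; `list_naming_violations` and `ALLOWED_NAME_RE` are hypothetical names, and the regex simply encodes the prefixes proposed above.

```shell
#!/usr/bin/env bash
# Prefixes taken from the proposed convention; adjust as it evolves.
ALLOWED_NAME_RE='^(_[a-z0-9_]+|(cmd|ui|api|get|set|validate)_[a-z0-9_]+|handle_[a-z0-9_]+_event)$'

# Print every function defined in the given file whose name breaks the
# convention. Only definitions of the form "name() {" are recognised.
list_naming_violations() {
    local file="$1" line name
    local def_re='^([A-Za-z_][A-Za-z0-9_]*)[[:space:]]*\(\)'
    while IFS= read -r line || [[ -n $line ]]; do
        if [[ $line =~ $def_re ]]; then
            name="${BASH_REMATCH[1]}"
            if [[ ! $name =~ $ALLOWED_NAME_RE ]]; then
                printf '%s\n' "$name"
            fi
        fi
    done < "$file"
}
```

Running it over `lib/*.sh` in CI would turn the naming standard from a guideline into a failing check.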

2.3 Magic Numbers and Hard-Coded Values

🟡 MODERATE: Magic Numbers Throughout

# api.sh:204
local max_tokens="${3:-16000}"      # Why 16000?
local temperature="${4:-0.2}"        # Why 0.2?

# api.sh:242
--max-time 300                       # Why 300?
--connect-timeout 10                 # Why 10?

# aiwb:403
rotate_logs "$logs_dir" 10 10485760  # Why 10 logs? Why 10MB?

# modes.sh:114
head -20 "$f"                        # Why 20 lines?

# ui.sh:87
gum choose --header "$header" --height 15  # Why height 15?

Impact:

  • Hard to tune performance
  • Unclear reasoning for limits
  • No central configuration for tuning

Recommendation: Create constants file or config section:

readonly AIWB_MAX_TOKENS_DEFAULT=16000
readonly AIWB_TEMPERATURE_DEFAULT=0.2
readonly AIWB_API_TIMEOUT=300
readonly AIWB_LOG_RETENTION=10
readonly AIWB_MAX_LOG_SIZE=10485760
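Because `readonly` fails when a file is sourced a second time, a constants file benefits from the same guard pattern the libraries already use. The following is a sketch, not the project's layout: `_AIWB_CONSTANTS_LOADED` and `AIWB_API_CONNECT_TIMEOUT` are hypothetical names, and the defaults are the values quoted above.

```shell
#!/usr/bin/env bash
# Guard so that sourcing this file twice does not trip over readonly.
if [[ -z "${_AIWB_CONSTANTS_LOADED:-}" ]]; then
    _AIWB_CONSTANTS_LOADED=1

    # ":=" keeps a value already set by the user and otherwise assigns
    # the default documented here.
    : "${AIWB_MAX_TOKENS_DEFAULT:=16000}"
    : "${AIWB_TEMPERATURE_DEFAULT:=0.2}"
    : "${AIWB_API_TIMEOUT:=300}"          # seconds, curl --max-time
    : "${AIWB_API_CONNECT_TIMEOUT:=10}"   # seconds, curl --connect-timeout
    : "${AIWB_LOG_RETENTION:=10}"         # number of log files kept
    : "${AIWB_MAX_LOG_SIZE:=10485760}"    # bytes (10 MiB)

    readonly AIWB_MAX_TOKENS_DEFAULT AIWB_TEMPERATURE_DEFAULT \
             AIWB_API_TIMEOUT AIWB_API_CONNECT_TIMEOUT \
             AIWB_LOG_RETENTION AIWB_MAX_LOG_SIZE
fi
```

Allowing an environment override before freezing the value keeps tuning possible without editing the file.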

2.4 Error Handling Inconsistencies

🟡 MODERATE: Inconsistent Error Propagation

Some functions use die(), some use return 1, some use err():

# Pattern 1: die immediately
require_file() {
    [[ ! -f "$file" ]] && die "$E_FILE_NOT_FOUND" "..."
}

# Pattern 2: return error code
github_clone() {
    if git clone "$url" "$dest"; then
        return 0
    else
        err "Failed to clone"
        return 1
    fi
}

# Pattern 3: print through a separate display helper, then return
call_gemini() {
    if [[ $exit_code -ne 0 ]]; then
        display_api_error "..."  # Uses display_api_error instead of err()
        return 1
    fi
}

Impact:

  • Unpredictable error behavior
  • Some errors crash, some don't
  • Difficult to implement retry logic

Recommendation:

  • Establish error handling hierarchy
  • Use die() only for unrecoverable errors
  • Use return + err() for recoverable errors
  • Document error handling strategy
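Once recoverable failures consistently return a status instead of exiting, retry logic becomes a small generic wrapper. The sketch below assumes that convention; `retry` is a hypothetical helper, and `err` / `die` are minimal stand-ins for the project's own functions.

```shell
#!/usr/bin/env bash
# Minimal stand-ins for the project's helpers.
err() { printf 'ERROR: %s\n' "$*" >&2; }
die() { err "$*"; exit 1; }   # reserve for unrecoverable errors only

# Hypothetical helper: run a command up to max_attempts times.
# Returns 0 on the first success, 1 if every attempt fails.
retry() {
    local max_attempts="$1" attempt status
    shift
    for (( attempt = 1; attempt <= max_attempts; attempt++ )); do
        if "$@"; then
            return 0
        else
            status=$?   # exit status of the failed command
        fi
        err "attempt $attempt/$max_attempts failed (exit $status): $*"
    done
    return 1
}
```

A caller can then decide whether exhaustion is fatal, for example `retry 3 github_clone "$url" "$dest" || die "clone failed"`.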

3. Functional Redundancy

3.1 Overlapping Features

🔴 CRITICAL: Multiple Context Management Systems

Evidence:

  1. MODE_UPLOADS array (modes.sh) - In-memory context
  2. context_state.sh - Persistent context with JSON state
  3. Slash commands: /context, /contextload, /contextsave, /contextshow, /contextclear, /contextrefresh, /contextremove

Problems:

  • Two parallel systems that can desync
  • User confusion: "Which context am I using?"
  • Footer shows wrong context (shows MODE_UPLOADS not .context_state)
  • No clear UX on when to use which

Code Evidence (aiwb:746-780):

get_context_summary() {
    # Shows MODE_UPLOADS (in-memory)
    if [[ ${#MODE_UPLOADS[@]} -gt 0 ]]; then
        echo "$file_count file(s)"
    else
        # But then checks context_state (persistent)
        if context_state_exists; then
            echo "none (saved: $scan_count files - use /contextload)"
        fi
    fi
}

Recommendation:

  • Option A: Merge systems - always use persistent context, load into MODE_UPLOADS
  • Option B: Clear separation - MODE_UPLOADS for modes only, context_state for chat
  • Option C: Single unified system with automatic persistence
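Option A can be sketched as a write-through list: every change to MODE_UPLOADS is persisted immediately, and startup loads the persisted list back. This is an illustration under assumptions, not the existing code. The newline-delimited `CONTEXT_FILE` stands in for the JSON state that context_state.sh actually keeps, `context_save`, `context_load`, and `context_add` are hypothetical names, and paths are assumed not to contain newlines.

```shell
#!/usr/bin/env bash
# Stand-in for the persistent store; the real context_state.sh keeps JSON.
CONTEXT_FILE="${CONTEXT_FILE:-.context_state.list}"
MODE_UPLOADS=()

# Write the in-memory list to disk, one path per line.
context_save() {
    if (( ${#MODE_UPLOADS[@]} )); then
        printf '%s\n' "${MODE_UPLOADS[@]}" > "$CONTEXT_FILE"
    else
        : > "$CONTEXT_FILE"
    fi
}

# Replace the in-memory list with whatever was persisted.
context_load() {
    local line
    MODE_UPLOADS=()
    [[ -r $CONTEXT_FILE ]] || return 0
    while IFS= read -r line || [[ -n $line ]]; do
        if [[ -n $line ]]; then MODE_UPLOADS+=("$line"); fi
    done < "$CONTEXT_FILE"
}

# Add a path once, then persist immediately so the two views cannot drift.
context_add() {
    local path="$1" existing
    for existing in "${MODE_UPLOADS[@]}"; do
        [[ $existing == "$path" ]] && return 0
    done
    MODE_UPLOADS+=("$path")
    context_save
}
```

With a single writer path, the footer and the API prompt can both read MODE_UPLOADS and agree by construction.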

🔴 CRITICAL: Multiple Ways to Generate Code

Users have 4 different ways to generate code:

  1. /make - Mode-based generation workflow
  2. /quick - One-shot generation
  3. /generate - Generate command (uses API directly)
  4. Direct chat message - Auto-routed through chat_router.sh

Problems:

  • Feature overlap and confusion
  • Different code paths mean inconsistent results
  • /make and /quick do almost the same thing
  • No clear guidance on which to use when

Recommendation:

  • Consolidate into 2 approaches:
    • Interactive: /make (mode-based, with verification)
    • Quick: /quick or chat message (one-shot, auto-detected)
  • Remove or deprecate /generate (redundant)
  • Update docs with clear decision tree

3.2 Duplicate Scan Functionality

🟡 MODERATE: /scanrepo vs /smartscan vs /contextrefresh

Three scanning commands with overlapping functionality:

cmd_scanrepo()      # Deep scan entire repository
cmd_smartscan()     # Quick scan (configs, docs, main files)
cmd_contextrefresh() # Re-scan repository and update context

Issues:

  • /contextrefresh duplicates /scanrepo logic
  • Unclear when to use which scan
  • No performance comparison in docs

Recommendation:

  • Keep /smartscan (quick) and /scanrepo (full)
  • Make /contextrefresh call /scanrepo internally (don't reimplement)
  • Add timing info: "smartscan: ~5s, scanrepo: ~30s"

4. Workflow Analysis

4.1 User Journey Issues

🔴 CRITICAL: Confusing Context Workflow

Current user journey for adding context:

User in chat mode wants to add context:
  Option 1: /context <file>           ❓ What does this do?
  Option 2: /contextload              ❓ Load what? From where?
  Option 3: /make -> uploads <files>  ❓ Only for make mode?
  Option 4: /scanrepo                 ❓ Does this load context?

User mental model breaks down because:

  1. /scanrepo creates analysis file but doesn't auto-load it
  2. /contextload requires prior /contextsave or /scanrepo
  3. /context command behavior unclear (adds? replaces? shows?)
  4. MODE_UPLOADS only works in modes, not in chat

Evidence (aiwb:2064-2240): /context command has 176 lines but workflow is confusing

Recommendation:

Simplified workflow:
  /context add <file>     # Add file to context (chat + modes)
  /context remove <file>  # Remove from context
  /context show           # Show current context
  /context clear          # Clear all context
  /context scan           # Scan repo and add to context

Make context work uniformly in chat AND modes
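The simplified command set maps naturally onto a single dispatcher. The sketch below is a hedged illustration: `CONTEXT_FILES` is a hypothetical unified list, the `scan` subcommand is left out because it depends on the existing scan code, and the real `/context` handler in aiwb may need to differ.

```shell
#!/usr/bin/env bash
CONTEXT_FILES=()   # hypothetical unified list, shared by chat and modes

# Dispatch "/context <action> [file]" to one place.
cmd_context() {
    local action="${1:-show}" target="${2:-}" f kept=()
    case "$action" in
        add)
            if [[ -z $target ]]; then
                printf 'usage: /context add <file>\n' >&2
                return 2
            fi
            CONTEXT_FILES+=("$target")
            ;;
        remove)
            # Rebuild the list without the target entry.
            for f in "${CONTEXT_FILES[@]}"; do
                if [[ $f != "$target" ]]; then kept+=("$f"); fi
            done
            CONTEXT_FILES=("${kept[@]}")
            ;;
        show)
            if (( ${#CONTEXT_FILES[@]} )); then
                printf '%s\n' "${CONTEXT_FILES[@]}"
            fi
            ;;
        clear)
            CONTEXT_FILES=()
            ;;
        *)
            printf 'Unknown /context action: %s\n' "$action" >&2
            return 2
            ;;
    esac
}
```

Each action has one obvious meaning, which answers the "adds? replaces? shows?" question directly in the command name.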

🟡 MODERATE: Mode Workflow Complexity

Current mode workflow:

> /make                    # Enter mode
make> prompt "..."         # Set instruction
make> model                # Choose model
make> uploads <files>      # Add context
make> check                # Configure verification
make> status               # Check status
make> run                  # Execute

Issues:

  1. Too many steps for simple tasks
  2. No sensible defaults
  3. check command purpose unclear
  4. Why not auto-run with defaults?

Recommendation:

# Quick path (with defaults)
> /make "create REST API"
[Runs immediately with defaults]

# Advanced path (if needed)
> /make
make> settings             # Configure everything at once
make> run "create REST API"

4.2 GitHub Workflow

✅ STRENGTH: Comprehensive GitHub Integration

The GitHub integration (1431 lines) is well-designed:

  • Clone, fork, PR creation
  • Issues management
  • Git operations (status, commit, push, pull, sync)
  • API integration with proper auth

🟡 MODERATE: Missing Features

Compared to Claude Code, missing:

  • Branch management UI
  • Diff visualization before commit
  • Interactive staging (git add -p equivalent)
  • Merge conflict resolution helpers

5. Security Analysis

5.1 API Key Storage

🔴 CRITICAL: Insecure Key Storage

Current implementation:

# lib/security.sh:40-64
set_github_token() {
    # Writes plaintext to ~/.aiwb/.aiwb.env
    echo "export GITHUB_TOKEN=\"${token}\"" >> "$env_file"
    chmod 600 "$env_file"
}

# lib/api.sh:86-92
get_api_key() {
    source "$env_file"  # Sources plaintext file
    echo "${GEMINI_API_KEY:-}"
}

Security Issues:

  1. ❌ Keys stored in plaintext
  2. ❌ No encryption by default (age encryption exists but not used)
  3. ❌ Keys loaded into the environment (inherited by every child process and readable through /proc/<pid>/environ by the same user or root)
  4. ❌ No key rotation mechanism
  5. ❌ No audit trail of key access

Evidence: lib/security.sh has encryption functions (lines 169-237) but they're never called in default flow!

Recommendation:

  1. Enable encryption by default:

    # After getting key, encrypt it immediately
    set_api_key() {
        encrypt_keys "$provider" "$key"  # Use existing function!
    }
  2. Remove plaintext storage option or make it opt-in with warning

  3. Implement key rotation:

    aiwb keys rotate <provider>  # Re-encrypt with new password
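Until encryption is the default, the exposure from `source "$env_file"` can be narrowed by reading only the one value that is needed. The sketch below is a possible interim measure, not the project's method: `read_env_value` is a hypothetical helper, it expects lines in the `export NAME="value"` form written by set_github_token(), and it assumes values contain no embedded double quotes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value of NAME from an env file without
# sourcing it. Accepts lines such as:  export NAME="value"  or  NAME=value
# Assumes values contain no embedded double quotes.
read_env_value() {
    local name="$1" file="$2" line value="" found=1
    local re="^(export[[:space:]]+)?${name}=\"?([^\"]*)\"?\$"
    [[ -r $file ]] || return 1
    while IFS= read -r line || [[ -n $line ]]; do
        if [[ $line =~ $re ]]; then
            value="${BASH_REMATCH[2]}"   # later lines override earlier ones
            found=0
        fi
    done < "$file"
    if (( found == 0 )); then
        printf '%s\n' "$value"
    fi
    return "$found"
}
```

Because the file is appended to with `>>`, the last matching line wins here, which matches what sourcing the file would have produced.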

5.2 Git Exposure Risks

🟡 MODERATE: Secrets in Git

Current protection:

# lib/security.sh:345-392
audit_git_exposure() {
    # Checks for .env, .keys.age, etc. in git
}

Issues:

  1. Audit function exists but not called automatically
  2. No pre-commit hook to prevent commits with secrets
  3. .gitignore not validated on startup

Recommendation:

  1. Run audit_git_exposure() on every /scanrepo or startup in git repo
  2. Provide pre-commit hook installation: aiwb security install-hooks
  3. Validate .gitignore includes .aiwb.env, .keys.age
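The .gitignore validation in item 3 can be approximated with a literal line check. The sketch below is a simplification under stated assumptions: `gitignore_missing_entries` is a hypothetical helper that only matches exact lines, so it does not understand patterns or negations. Inside a repository, `git check-ignore` is the authoritative way to ask whether a path is ignored.

```shell
#!/usr/bin/env bash
# Print each required entry that is not present as a literal line in the
# given .gitignore file. Returns 1 if anything is missing.
gitignore_missing_entries() {
    local gitignore="$1" entry missing=0
    shift
    for entry in "$@"; do
        if [[ ! -f $gitignore ]] || ! grep -qxF -- "$entry" "$gitignore"; then
            printf '%s\n' "$entry"
            missing=1
        fi
    done
    return "$missing"
}
```

A startup hook could call `gitignore_missing_entries .gitignore .aiwb.env .keys.age` and warn with the printed names.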

5.3 Command Injection Risks

🟡 MODERATE: Potential Command Injection

Vulnerable patterns found:

# modes.sh:103-105
$(cat "$item")  # Quoting prevents word splitting, but a user-controlled $item can still pull any readable file into the prompt

# github.sh:142
remote_url=$(git remote get-url origin 2>/dev/null)
# Then used in regex without sanitization

# api.sh:234
echo "$request_body" > "$request_file"
# echo can mangle a body that starts with -n or -e; printf '%s' writes it verbatim

Not currently exploitable because:

  • Most inputs come from controlled sources
  • File paths validated before use

Future changes could still introduce vulnerabilities.

Recommendation:

  1. Use printf '%s' "$var" instead of echo "$var"
  2. Always quote variables in command substitutions: "$(cat "$item")"
  3. Validate file paths exist before using in commands
  4. Add input sanitization layer for user-provided strings
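Recommendations 1 to 3 fit in two small helpers. The sketch below is illustrative: `write_request_body` and `read_context_file` are hypothetical names, not functions from the codebase.

```shell
#!/usr/bin/env bash
# Write a request body byte-for-byte; printf '%s' never treats the data
# as options or escape sequences.
write_request_body() {
    local body="$1" dest="$2"
    printf '%s' "$body" > "$dest"
}

# Read a context file only if it is a regular file; "--" stops option
# parsing so a name beginning with "-" is not taken as a flag.
read_context_file() {
    local path="$1"
    if [[ ! -f $path ]]; then
        printf 'Not a regular file: %s\n' "$path" >&2
        return 1
    fi
    cat -- "$path"
}
```

Routing all prompt and file handling through helpers like these gives the input sanitization layer in item 4 a natural home.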

6. Performance Issues

6.1 Inefficient File Operations

🟡 MODERATE: Repeated File Reads

Example from modes.sh:100-116:

for item in "${MODE_UPLOADS[@]}"; do
    if [[ -f "$item" ]]; then
        final_prompt="$final_prompt
$(cat "$item")"     # Reads entire file
    elif [[ -d "$item" ]]; then
        final_prompt="$final_prompt
$(find "$item" ... | while read f; do
    head -20 "$f"   # Reads 20 lines per file
done)"
    fi
done

Performance Impact:

  • Large files read entirely into memory
  • No size limits on file context
  • No caching of file contents
  • Repeated reads if function called multiple times

Illustrative Impact:

  • 100 files × 1MB each = 100MB in memory
  • API call fails (context limits vary by provider and model, with ~200K tokens a common figure)

Recommendation:

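# Add size limits (wc -c is portable; stat flags differ between BSD and GNU)
MAX_FILE_SIZE=100000  # 100KB
if [[ $(wc -c < "$item") -gt $MAX_FILE_SIZE ]]; then
    warn "File too large: $item (truncating)"
    head -c "$MAX_FILE_SIZE" "$item"
else
    cat "$item"
fi

# Cache file contents (associative arrays need bash 4+)
declare -A FILE_CACHE
get_file_content() {
    # The result is returned in FILE_CONTENT: calling this through $(...)
    # would run it in a subshell and discard the cache update
    if [[ -z "${FILE_CACHE[$1]+set}" ]]; then
        FILE_CACHE[$1]=$(cat "$1")
    fi
    FILE_CONTENT=${FILE_CACHE[$1]}
}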

6.2 Inefficient Logging

🟡 MODERATE: Unbounded Log Growth

Current implementation (aiwb:371-393):

rotate_logs() {
    # Remove logs older than max count (10)
    # Truncate logs larger than 10MB
}

Issues:

  1. rotate_logs() called once per chat start, not continuously
  2. 10 logs × 10MB = 100MB possible disk usage
  3. Log truncation uses tail -1000, which keeps only the last 1000 lines and discards all earlier context

Recommendation:

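# Rotate on every write, not just startup
log_message() {
    local message="$1"
    printf '%s\n' "$message" >> "$log_file"

    # Rotate if over size limit (wc -c is portable; stat flags differ
    # between BSD/macOS and GNU/Linux)
    if [[ $(wc -c < "$log_file") -gt $MAX_LOG_SIZE ]]; then
        mv "$log_file" "$log_file.1"
        # Compress old log in the background; -f overwrites a previous .1.gz
        gzip -f "$log_file.1" &
    fi
}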

7. Consistency Issues

7.1 Config Schema Violations

🟡 MODERATE: Inconsistent Config Access

Schema defined in config.sh:126-154:

{
  "model_provider": "gemini",
  "preferences": {
    "auto_estimate": true,
    "syntax_highlighting": true
  }
}

But accessed inconsistently:

# Method 1: Direct jq
provider=$(jq -r '.model_provider' "$config_file")

# Method 2: config_get function
provider=$(config_get model_provider)

# Method 3: Environment variable
provider="${AIWB_PROVIDER:-gemini}"

# Method 4: Global variable
provider="$MODE_MODEL_PROVIDER"

Impact:

  • Changes to config schema break multiple locations
  • No single source of truth
  • Race conditions possible (file vs. memory)

Recommendation:

  1. Always use config_get() / config_set()
  2. Deprecate direct jq access
  3. Add config validation on load
  4. Use config migration for schema changes

7.2 State Management Chaos

🔴 CRITICAL: Global State Pollution

Multiple state systems compete:

# Session state (.session file)
workspace, provider, model, task, project, timestamp

# Config state (config.json)
workspace, model_provider, model_name, current_task, current_project

# Mode state (global variables)
MODE_CURRENT, MODE_PROMPT, MODE_UPLOADS, MODE_MODEL_PROVIDER

# Context state (.context_state file)
context_files, conversation_history, last_scan

# Environment variables
AIWB_WORKSPACE, AIWB_DEBUG, GITHUB_TOKEN

Problems:

  1. Duplicate data: workspace stored in 3 places
  2. Sync issues: Changing config doesn't update session
  3. Load order matters: Wrong order = wrong values
  4. No clear ownership: Who updates what?

Example conflict:

# User changes model in config
config_set model_provider "claude"

# But mode still uses old value
MODE_MODEL_PROVIDER="gemini"  # Stale!

# Session file also stale
load_session()  # Loads old "gemini" value

Recommendation:

Single source of truth:
  - Config file (config.json) = persistent state
  - Global variables = runtime cache (read from config)
  - Session file = remove (use config)
  - Context state = separate (contextual data)

Flow:
  1. Load config into globals
  2. Update config on change
  3. Reload globals from config
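The three-step flow can be sketched so that the stale MODE_MODEL_PROVIDER case cannot occur: every write goes through `config_set`, which refreshes the globals before returning. This is a sketch under assumptions. A flat key=value file stands in for config.json to keep the example free of external tools, `sync_globals_from_config` is a hypothetical name, and the `config_get` / `config_set` bodies here are simplified stand-ins for the real ones in config.sh.

```shell
#!/usr/bin/env bash
# Stand-in store: a flat key=value file instead of the real config.json.
CONFIG_FILE="${CONFIG_FILE:-./config.kv}"

# Print the value for a key, or the optional default if it is not set.
config_get() {
    local key="$1" default="${2:-}" k v
    if [[ -r $CONFIG_FILE ]]; then
        while IFS='=' read -r k v; do
            if [[ $k == "$key" ]]; then
                printf '%s\n' "$v"
                return 0
            fi
        done < "$CONFIG_FILE"
    fi
    printf '%s\n' "$default"
}

# Step 3: globals are only a runtime cache of what the config says.
sync_globals_from_config() {
    MODE_MODEL_PROVIDER="$(config_get model_provider gemini)"
}

# Step 2: rewrite the file with the new value, then refresh the cache.
config_set() {
    local key="$1" value="$2" k v tmp
    tmp="$(mktemp)"
    if [[ -r $CONFIG_FILE ]]; then
        while IFS='=' read -r k v; do
            if [[ $k != "$key" ]]; then printf '%s=%s\n' "$k" "$v"; fi
        done < "$CONFIG_FILE" > "$tmp"
    fi
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    mv "$tmp" "$CONFIG_FILE"
    sync_globals_from_config
}
```

Because no other code path assigns MODE_MODEL_PROVIDER, the question of who updates what has a single answer.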

8. Documentation Issues

8.1 Missing Function Documentation

🔴 CRITICAL: No Function Documentation Standard

Current state: Most functions have no docstrings

# No documentation
get_api_key() {
    local provider="$1"
    # ...what does this return? what providers supported?
}

# Inline comments but no standard format
# Estimate token count (rough approximation)
estimate_tokens() {
    # ...better but not standardized
}

Impact:

  • New contributors struggle
  • Unclear function contracts
  • No auto-generated docs possible

Recommendation: Adopt documentation standard:

#############################
# Get API key for provider
#
# Arguments:
#   $1 - provider name (gemini|claude|openai|groq|xai)
# Returns:
#   API key string, or empty if not set
# Example:
#   key=$(get_api_key "gemini")
#############################
get_api_key() {
    local provider="$1"
    # ...
}
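Progress toward such a standard can be tracked mechanically. The sketch below lists functions whose definition is not directly preceded by a comment line; `list_undocumented_functions` is a hypothetical helper, and it assumes definitions of the form `name() {` at the start of a line.

```shell
#!/usr/bin/env bash
# Print the name of every function in the file whose definition line is
# not immediately preceded by a comment line.
list_undocumented_functions() {
    local file="$1" line prev=""
    local def_re='^([A-Za-z_][A-Za-z0-9_]*)[[:space:]]*\(\)'
    while IFS= read -r line || [[ -n $line ]]; do
        if [[ $line =~ $def_re && $prev != \#* ]]; then
            printf '%s\n' "${BASH_REMATCH[1]}"
        fi
        prev="$line"
    done < "$file"
}
```

Counting its output across `lib/*.sh` gives a repeatable figure for the share of undocumented functions.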

8.2 Misleading Help Text

🟡 MODERATE: Help Text Out of Sync

Evidence from aiwb:180-274:

cmd_help() {
    cat <<'EOF'
REPOSITORY SCANNING:
  scanrepo             Deep scan entire repository
  smartscan            Quick scan of key files
EOF
}

But in chat help (aiwb:636-641):

show_chat_help() {
    cat <<'EOF'
  /scanrepo             Deep scan current folder (works in ANY directory!)
  /smartscan            Quick scan (configs, docs, main files only)
EOF
}

Discrepancies:

  • scanrepo - "repository" vs "current folder"
  • smartscan - "key files" vs "configs, docs, main files"
  • Different emphasis ("works in ANY directory!")

Recommendation:

  1. Centralize help text in a single file
  2. Generate help from function annotations
  3. Add validation test: help text matches actual behavior
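Centralizing help text can be as simple as one table that both help screens render. The sketch below is illustrative only: `HELP_ENTRIES` and `print_help_entries` are hypothetical names, and the description strings are placeholders adapted from the chat help wording.

```shell
#!/usr/bin/env bash
# One table of "name|description" pairs shared by every help screen.
HELP_ENTRIES=(
    "scanrepo|Deep scan current folder"
    "smartscan|Quick scan (configs, docs, main files only)"
)

# Render the table; prefix is "" for CLI help and "/" for chat help.
print_help_entries() {
    local prefix="$1" entry
    for entry in "${HELP_ENTRIES[@]}"; do
        printf '  %-20s %s\n' "${prefix}${entry%%|*}" "${entry#*|}"
    done
}

cmd_help() {
    printf 'REPOSITORY SCANNING:\n'
    print_help_entries ""
}

show_chat_help() {
    print_help_entries "/"
}
```

With a single source for the descriptions, the two screens can differ only in the command prefix.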

9. Testing and Quality Assurance

9.1 Test Coverage

🔴 CRITICAL: No Automated Tests

Found test files:

test_aiwb_comprehensive.sh      # Manual integration test
test_aiwb_functionality.sh      # Manual functionality test
test_dependency_check.sh        # Dependency check script
test_large_prompt_fix.sh        # Specific bug test
test_preflight_cost.sh          # Cost estimation test

Analysis:

  • All tests are manual shell scripts
  • No unit test framework
  • No CI/CD integration
  • No code coverage measurement
  • Tests don't run automatically

Impact:

  • Regressions go unnoticed
  • Refactoring is risky
  • New features break old features
  • No confidence in changes

Recommendation:

  1. Adopt testing framework: bats-core

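    #!/usr/bin/env bats

    # Assumes tests live in tests/ and lib/api.sh can be sourced on its own
    setup() {
        source "$BATS_TEST_DIRNAME/../lib/api.sh"
    }

    @test "get_api_key returns key for valid provider" {
        export GEMINI_API_KEY="test-key"
        run get_api_key "gemini"
        [ "$status" -eq 0 ]
        [ "$output" = "test-key" ]
    }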
  2. Add CI/CD with GitHub Actions:

    - name: Run tests
      run: bats tests/*.bats
  3. Target coverage:

    • Critical functions: 90%
    • Utilities: 70%
    • UI: 30% (harder to test)

9.2 Code Quality Tools

🟡 MODERATE: No Linting

Checked for linting:

$ find . -name ".shellcheckrc" -o -name ".shfmt.config"
# No results

Missing tools:

  • ❌ ShellCheck - bash static analysis
  • ❌ shfmt - bash formatter
  • ❌ bash-language-server - LSP for IDE

Recommendation:

# Add .shellcheckrc
cat > .shellcheckrc <<EOF
# Disable warning about source not following
disable=SC1090,SC1091
# Disable warning about unused variables (false positives)
disable=SC2034
EOF

# Add to CI
shellcheck lib/*.sh bin-edit/*.sh aiwb

10. Bin-Edit Scripts Analysis

10.1 Purpose Unclear

🔴 CRITICAL: 50+ Scripts with Minimal Documentation

Sample of bin-edit scripts:

ai-buildprompt.sh    # ❓ What does this do?
ai-clean.sh          # ❓ Clean what?
ai-cost.sh           # ❓ Cost calculation? Why separate from api.sh?
ai-helpers.sh        # ❓ What helpers?
ai-img.sh            # ❓ Image handling?
ai-preflight.sh      # ❓ Preflight checks?
ai-snap.sh           # ❓ Snapshot?
bincheck.sh          # ❓ Binary check?
binpush.sh           # ❓ Push to where?
chat-runner.sh       # ❓ Duplicates aiwb chat?
claude-runner.sh     # ❓ Duplicates lib/api.sh?
cgo.sh, ggo.sh       # ❓ Go tools?
cout.sh, gout.sh     # ❓ Output handlers?
cpre.sh, gpre.sh     # ❓ Preprocessors?

Analysis:

$ head -10 bin-edit/*.sh | grep "^#"
# Most files have minimal or no header comments

Issues:

  1. No README in bin-edit/
  2. No comments explaining purpose
  3. Unclear which scripts are still used
  4. Possible dead code

Recommendation:

  1. Create bin-edit/README.md:

    # Bin-Edit Scripts
    
    Helper scripts for AI Workbench internal operations.
    
    ## Core Scripts
    - `aiwb.sh` - Main entry wrapper
    - `chat-runner.sh` - Chat loop handler
    
    ## Deprecated
    - `claude-runner.sh` - Use lib/api.sh instead
    - `gemini-runner.sh` - Use lib/api.sh instead
  2. Add deprecation warnings:

    # claude-runner.sh
    warn "DEPRECATED: Use 'lib/api.sh call_claude' instead"
  3. Remove dead code after 1-2 release cycles
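Before deprecating anything, it helps to know which scripts nothing else mentions. The sketch below is a rough first pass, assuming GNU or BSD grep (for `--exclude`): `list_unreferenced_scripts` is a hypothetical helper, and a script it reports may still be invoked by users directly, so the output is a list of candidates rather than proof of dead code.

```shell
#!/usr/bin/env bash
# Print the name of each *.sh file in script_dir that is not mentioned by
# name in any other file under search_root.
list_unreferenced_scripts() {
    local script_dir="$1" search_root="$2" script name
    for script in "$script_dir"/*.sh; do
        [[ -e $script ]] || continue
        name="$(basename "$script")"
        # Search everywhere except files carrying the script's own name.
        if ! grep -rqF --exclude="$name" -- "$name" "$search_root"; then
            printf '%s\n' "$name"
        fi
    done
}
```

Running `list_unreferenced_scripts bin-edit .` from the repository root would give a starting list for the Deprecated section of the README.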


10.2 Inconsistent Prefixing

Prefix patterns:

ai-*        # AI-related (ai-cost.sh, ai-img.sh, etc.)
bin*        # Binary-related (bincheck.sh, binpush.sh)
c*, g*      # Claude/Gemini (cout.sh, gout.sh, cpre.sh, gpre.sh)
t*          # Task-related (tnew.sh, tedit.sh, tdone.sh)
u*          # Upload-related (uin.sh, uls.sh, uclear.sh)
p*          # Project-related (pset.sh, pstatus.sh, plog.sh)

Problems:

  • No clear namespace
  • cgo.sh vs cout.sh vs cpre.sh - hard to remember
  • t* and p* prefixes collide with common Unix tools

Recommendation: Use clear prefixes:

aiwb-task-*       # Task operations
aiwb-upload-*     # Upload operations
aiwb-project-*    # Project operations
aiwb-provider-*   # Provider-specific tools

11. Platform Compatibility

11.1 Platform Support

✅ STRENGTH: Good Cross-Platform Support

Platform detection (common.sh:12-30):

is_termux()  # Android/Termux
is_macos()   # macOS
is_linux()   # Linux

Platform-specific handling:

  • Date command differences (BSD vs GNU)
  • Sed command differences (sed -i vs sed -i '')
  • Stat command differences
  • Clipboard tools (termux-clipboard-set, pbcopy, xclip)
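The stat difference is a good example of what a compatibility layer has to absorb: GNU stat takes `-c %s` for the size in bytes, while BSD/macOS stat takes `-f %z`. The sketch below is one way to hide that; `file_size` is a hypothetical helper, not the function in common.sh.

```shell
#!/usr/bin/env bash
# Print the size of a regular file in bytes on both GNU and BSD systems.
file_size() {
    local file="$1" size
    [[ -f $file ]] || return 1
    if size="$(stat -c %s -- "$file" 2>/dev/null)"; then    # GNU coreutils
        :
    elif size="$(stat -f %z -- "$file" 2>/dev/null)"; then  # BSD / macOS
        :
    else
        size="$(wc -c < "$file")" || return 1                # last resort
    fi
    # Some wc implementations pad the number with spaces; strip them.
    printf '%s\n' "${size//[[:space:]]/}"
}
```

Trying the GNU form first matters, because on GNU systems `-f` selects file-system mode rather than a format string.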

🟡 MODERATE: Missing Termux Optimizations

Issues specific to Termux:

  1. Storage access: Complex path handling for external storage
  2. Performance: No acknowledgment of mobile CPU limits
  3. Battery: No power-saving mode for background operations
  4. Network: No handling of cellular data limits

Recommendation:

# Add Termux-specific config
if is_termux; then
    # Lower default tokens on mobile
    MAX_TOKENS_DEFAULT=4000

    # Warn on cellular
    if is_cellular_connection; then
        warn "On cellular data. Consider using WiFi for large operations."
    fi
fi

12. Specific Bugs and Issues

12.1 Identified Bugs

🔴 BUG #1: Chat Router Fallback Fails

Location: lib/chat_router.sh:127-143

if ! type smart_edit &>/dev/null; then
    warn "Edit functionality not available"
    # Don't return, fall through to chat handler below
fi

Issue: Falls through but then still tries to call smart_edit on line 150

Fix:

if ! type smart_edit &>/dev/null; then
    warn "Edit functionality not available"
    handle_chat_message "$message"  # Explicit call, don't fall through
    return $?
fi

🔴 BUG #2: Context Footer Shows Wrong Data

Location: aiwb:746-780

get_context_summary() {
    if [[ ${#MODE_UPLOADS[@]} -gt 0 ]]; then
        echo "$file_count file(s)"
    else
        # Shows saved context count, but that's NOT what will be sent to API!
        echo "none (saved: $scan_count files - use /contextload)"
    fi
}

Issue: The footer is derived from MODE_UPLOADS and the saved scan count, not from the prompt that is actually built, so it can disagree with what the API receives

Fix: Show what API will actually receive:

get_context_summary() {
    # Show actual context that will be sent to API
    local context char_count
    # Assign separately from "local" so a failure in the call is not masked
    context=$(build_prompt_with_context "" "false")
    char_count=${#context}
    if [[ $char_count -gt 0 ]]; then
        echo "$char_count chars"
    else
        echo "none"
    fi
}

🟡 BUG #3: Log Rotation Keeps Only the Last 1000 Lines

Location: aiwb:390

# Truncate logs larger than max size
tail -1000 "$log" > "$log.tmp" && mv "$log.tmp" "$log"

Issue: Keeps LAST 1000 lines, loses beginning and middle context

Fix:

# Keep beginning for context
head -500 "$log" > "$log.tmp"
echo "... [truncated] ..." >> "$log.tmp"
tail -500 "$log" >> "$log.tmp"
mv "$log.tmp" "$log"

13. Recommendations Summary

Priority 1 (Critical - Fix Immediately)

  1. Remove Code Duplication

    • ✅ Consolidate copy_to_clipboard() functions
    • ✅ Merge API calling logic (remove bin-edit runners)
    • ✅ Unify context management (MODE_UPLOADS + context_state)
  2. Security Fixes

    • ✅ Enable key encryption by default
    • ✅ Add automatic git exposure audit
    • ✅ Implement key rotation mechanism
  3. State Management

    • ✅ Consolidate session/config/mode state
    • ✅ Establish single source of truth
    • ✅ Document state lifecycle
  4. Fix Identified Bugs

    • ✅ Chat router fallback (Bug #1)
    • ✅ Context footer accuracy (Bug #2)
    • ✅ Log rotation (Bug #3)

Priority 2 (High - Fix Soon)

  1. Documentation

    • ✅ Add function docstring standard
    • ✅ Document bin-edit scripts
    • ✅ Sync help text across commands
    • ✅ Create architecture diagram
  2. Testing

    • ✅ Add bats test framework
    • ✅ Write tests for critical functions
    • ✅ Set up CI/CD with GitHub Actions
    • ✅ Add ShellCheck linting
  3. Workflow Simplification

    • ✅ Simplify context workflow
    • ✅ Reduce mode workflow steps
    • ✅ Consolidate generation commands

Priority 3 (Medium - Plan for Next Version)

  1. Code Quality

    • ✅ Establish naming conventions
    • ✅ Replace magic numbers with constants
    • ✅ Standardize error handling patterns
    • ✅ Add input validation layer
  2. Performance

    • ✅ Add file size limits for context
    • ✅ Implement file content caching
    • ✅ Improve log rotation strategy
  3. Features

    • ✅ Add branch management UI
    • ✅ Add diff visualization
    • ✅ Add interactive staging
    • ✅ Add Termux optimizations

Priority 4 (Low - Future Enhancements)

  1. Architecture

    • ✅ Split monolithic aiwb script
    • ✅ Define clear API boundaries
    • ✅ Reduce module coupling
    • ✅ Create plugin system
  2. Platform Support

    • ✅ Windows support (WSL/Git Bash)
    • ✅ Docker containerization
    • ✅ Package for Linux distros (deb, rpm)

14. Metrics and Measurements

Code Complexity

| Metric | Value | Target | Status |
|---|---|---|---|
| Lines of Code | 10,838 | < 15,000 | ✅ Good |
| Main Script Size | 3,000+ lines | < 1,000 | ❌ Too Large |
| Cyclomatic Complexity (avg) | Unknown | < 15 | ⚠️ Needs measurement |
| Function Count | 150+ | - | ℹ️ Normal |
| Max Function Size | 200+ lines | < 100 | ❌ Some too large |

Code Quality

| Metric | Value | Target | Status |
|---|---|---|---|
| Duplicate Code % | ~15% | < 5% | ❌ Too High |
| Functions w/o Docs | ~80% | < 20% | ❌ Too High |
| Test Coverage | 0% | > 60% | ❌ None |
| ShellCheck Violations | Unknown | 0 | ⚠️ Needs scan |
| Security Audit Score | 6/10 | > 8/10 | ❌ Needs work |

Dependencies

| Category | Count | Concerns |
|---|---|---|
| Required | 3 (bash, jq, curl) | ✅ Minimal |
| Optional | 2 (gum, bat) | ✅ Good fallbacks |
| Platform-Specific | 5+ (git, termux-*, etc.) | ⚠️ Many conditionals |

15. Conclusion

AIWB is a functional and feature-rich AI orchestration tool with good multi-provider support and platform compatibility. However, it suffers from:

  1. Technical Debt: ~15% code duplication, inconsistent patterns
  2. Complexity: Multiple overlapping systems (context, state, workflows)
  3. Security: Plaintext key storage by default
  4. Quality: No automated tests, minimal documentation

Path Forward

Recommended roadmap:

Phase 1 (1-2 weeks): Critical Fixes

  • Remove code duplication
  • Fix identified bugs
  • Enable security features by default
  • Consolidate state management

Phase 2 (2-4 weeks): Quality Improvements

  • Add test framework and initial tests
  • Document all functions
  • Set up CI/CD
  • Simplify workflows

Phase 3 (1-2 months): Architecture Refactor

  • Split monolithic main script
  • Establish clear module boundaries
  • Implement plugin system
  • Performance optimizations

Phase 4 (Ongoing): Feature Development

  • Enhanced GitHub features
  • Better Termux support
  • Additional providers
  • Community plugins

Final Assessment

Current State: 6.5/10 - Works well but needs refactoring
Potential State: 9/10 - With recommended fixes, could be excellent

The foundation is solid. With focused effort on reducing complexity and improving code quality, AIWB can become a best-in-class AI CLI tool.


Appendices

Appendix A: File Inventory

Total Files: 62+
  - Main Entry: aiwb (3000+ lines)
  - Libraries: 12 files (10,838 lines)
  - Bin-Edit: 50+ helper scripts
  - Docs: 20+ markdown files
  - Tests: 5 manual test scripts
  - Config: Installation and setup scripts

Appendix B: Function Count by Module

| Module | Functions | LOC | LOC/Function |
|---|---|---|---|
| api.sh | 20+ | 1533 | ~77 |
| github.sh | 30+ | 1431 | ~48 |
| modes.sh | 25+ | 1407 | ~56 |
| ui.sh | 15+ | 537 | ~36 |
| common.sh | 20+ | 483 | ~24 |
| config.sh | 10+ | 326 | ~33 |

Appendix C: Command Reference

  • Chat Commands: 30+ slash commands
  • CLI Commands: 15+ direct commands
  • Mode Commands: 10+ per mode (make/tweak/debug)
  • Total User-Facing Commands: 50+


End of Audit Report
Generated: January 6, 2026
Next Review: Recommended after Phase 1 fixes