Description
Problem Statement
Currently, there is no way to limit spending on LLM calls during agent execution. I'm always frustrated when an agent with a bug, infinite loop, or unexpectedly complex task consumes unlimited tokens, leading to:
- Unexpectedly high API bills
- No visibility into costs until after execution completes
- No way to set guardrails for production deployments
- No alerts when costs are approaching limits
Proposed Solution
Add the ability to set token and cost (USD) budgets for agent executions. When limits are approached or exceeded, the system should warn, pause, or stop execution based on configuration.
Key Components:
- `BudgetConfig` - Configuration dataclass for limits, thresholds, and callbacks
- `CostTracker` - Tracks usage, calculates costs, enforces limits
- `GraphExecutor` integration - Check budget after each LLM call
- CLI flags - `--max-cost`, `--max-tokens`, `--warn-at`
Example Usage:

```python
executor = GraphExecutor(
    runtime=runtime,
    llm=llm,
    budget_config=BudgetConfig(
        max_cost_usd=5.00,
        max_total_tokens=100_000,
        warn_at_percentage=75.0,
        on_exceed="pause",
        webhook_url="https://hooks.slack.com/...",
    ),
)

result = await executor.execute(graph, goal, input_data)
print(f"Cost: ${result.total_cost_usd:.4f}")
print(f"Budget exceeded: {result.budget_exceeded}")
```

Alternatives Considered
1. External monitoring only - Use third-party tools (LangSmith, Helicone) to track costs after the fact. Downside: no real-time enforcement; costs are already incurred.
2. LLM provider limits - Set spending limits at the API provider level (OpenAI, Anthropic). Downside: not granular per agent; no pause/resume capability.
3. Pre-execution estimation - Estimate costs before running, based on graph complexity. Downside: inaccurate for dynamic agents; doesn't handle retries.
4. Token-only limits (no USD) - Limit tokens without cost calculation. Downside: different models have different per-token costs.
Additional Context
Use Cases
- Development: Limit costs while testing new agents
- Production: Set hard limits to prevent runaway costs
- Budgeting: Track costs per agent/goal for billing
- Alerts: Slack/webhook notifications when approaching limits
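For the alerts use case, a warning could be turned into a Slack-style webhook payload. A hedged sketch — the `build_alert_payload` helper is hypothetical and not part of the proposal; only the payload shape (a `"text"` field, as used by Slack incoming webhooks) is assumed:

```python
def build_alert_payload(agent: str, total_cost: float, limit: float) -> dict:
    """Format a Slack-style webhook payload for a budget warning.

    Hypothetical helper: the framework would POST this dict to the
    configured webhook_url when a warning threshold is crossed.
    """
    pct = 100.0 * total_cost / limit
    return {"text": f"[budget] {agent}: ${total_cost:.2f} of ${limit:.2f} ({pct:.0f}%)"}
```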
Model Pricing Reference
Default pricing should be included for common models:
| Model | Input (per 1M) | Output (per 1M) |
|---|---|---|
| gpt-4o | $2.50 | $10.00 |
| gpt-4o-mini | $0.15 | $0.60 |
| claude-3-5-sonnet | $3.00 | $15.00 |
| claude-3-5-haiku | $0.25 | $1.25 |
| gemini-1.5-pro | $1.25 | $5.00 |
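As a sketch of how the per-model rates above could drive cost calculation (the `PRICING` table and `estimate_cost` helper are hypothetical names; the rates are the per-1M-token figures from the reference table):

```python
# Hypothetical pricing table: USD per 1M tokens as (input_rate, output_rate),
# mirroring the reference rates above.
PRICING: dict[str, tuple[float, float]] = {
    "gpt-4o": (2.50, 10.00),
    "gpt-4o-mini": (0.15, 0.60),
    "claude-3-5-sonnet": (3.00, 15.00),
    "claude-3-5-haiku": (0.25, 1.25),
    "gemini-1.5-pro": (1.25, 5.00),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one call, from per-1M-token rates."""
    input_rate, output_rate = PRICING[model]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000
```

For example, 10k input plus 2k output tokens on gpt-4o-mini comes to 0.0015 + 0.0012 = $0.0027, which is why a USD budget (alternative 4 above) can't be replaced by a raw token count: the same tokens on gpt-4o would cost roughly 17x more.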
Related Roadmap Items
- Guardrails (Phase 2)
- Basic observability hooks (Phase 1)
Implementation Ideas
New Files
| File | Description |
|---|---|
| `core/framework/graph/cost_tracker.py` | `CostTracker` and `BudgetConfig` classes |
| `core/tests/test_cost_tracker.py` | Unit tests |
Modified Files
| File | Changes |
|---|---|
| `core/framework/graph/executor.py` | Add `budget_config` param, check limits after each node |
| `core/framework/graph/__init__.py` | Export new classes |
| `core/framework/cli.py` | Add `--max-cost`, `--max-tokens` flags |
New Fields in ExecutionResult

```python
@dataclass
class ExecutionResult:
    # ... existing fields ...
    total_cost_usd: float = 0.0
    budget_exceeded: bool = False
    cost_summary: dict[str, Any] = field(default_factory=dict)
```

Key Classes
```python
@dataclass
class BudgetConfig:
    max_input_tokens: int | None = None
    max_output_tokens: int | None = None
    max_total_tokens: int | None = None
    max_cost_usd: float | None = None
    warn_at_percentage: float = 80.0
    on_exceed: str = "stop"  # "warn", "pause", "stop"
    on_warning: Callable | None = None
    webhook_url: str | None = None
```
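To make the threshold semantics concrete, here is a hedged sketch of how `warn_at_percentage` and `on_exceed` might be evaluated against a running total (the `check_budget` helper is hypothetical, not part of the proposed API):

```python
def check_budget(total_cost: float, max_cost: float,
                 warn_at_percentage: float = 80.0) -> str:
    """Classify spend against the limit: 'ok', 'warn', or 'exceeded'.

    Hypothetical helper illustrating the intended semantics: a warning
    fires at warn_at_percentage of the limit, and the configured
    on_exceed action ("warn"/"pause"/"stop") fires at 100%.
    """
    if total_cost >= max_cost:
        return "exceeded"
    if total_cost >= max_cost * warn_at_percentage / 100.0:
        return "warn"
    return "ok"
```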
```python
class CostTracker:
    def record_usage(self, model: str, input_tokens: int, output_tokens: int) -> CostSnapshot: ...
    def should_stop(self) -> bool: ...
    def should_pause(self) -> bool: ...
    def get_summary(self) -> dict: ...
```

@TimothyZhang7 can you assign me this issue