Skip to content
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 8 additions & 2 deletions .github/workflows/push-dev-docker-images.yml
Original file line number Diff line number Diff line change
Expand Up @@ -43,8 +43,14 @@ jobs:
name: Create Docker Image Tags
run: |
SHORT_SHA=$(echo ${GITHUB_SHA} | cut -c1-8)
echo "DOCKER_IMAGE_TAG=${{ github.ref_name }}-${SHORT_SHA}" >> $GITHUB_ENV
echo "DOCKER_IMAGE_TAG_WITH_DATE=${{ github.ref_name }}-${SHORT_SHA}-$(date +%s)" >> $GITHUB_ENV
# Sanitize branch name for Docker tag only if it contains forward slashes
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems unrelated

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The workflow was failing - it hadn't been run in months so it seemed that something might have changed either on github's end or somewhere else. This fixed it.

if [[ "${{ github.ref_name }}" == *"/"* ]]; then
SANITIZED_REF_NAME=$(echo "${{ github.ref_name }}" | sed 's/\//-/g')
else
SANITIZED_REF_NAME="${{ github.ref_name }}"
fi
echo "DOCKER_IMAGE_TAG=${SANITIZED_REF_NAME}-${SHORT_SHA}" >> $GITHUB_ENV
echo "DOCKER_IMAGE_TAG_WITH_DATE=${SANITIZED_REF_NAME}-${SHORT_SHA}-$(date +%s)" >> $GITHUB_ENV
echo "BUILD_DATE=$(date +%s)" >> $GITHUB_ENV
SHA=$(git rev-parse $(git rev-parse --abbrev-ref HEAD))
echo "SHA=$SHA" >> $GITHUB_ENV
Expand Down
89 changes: 89 additions & 0 deletions docs/OSMO-53-routing-fix-proposal.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,89 @@
# OSMO-53: Large Swap Routing Fix

## Problem
500K USDC→BTC swap returns **94% slippage** (~0.4 BTC instead of ~4.4 BTC) because router selects low-liquidity orderbook pool (Pool 1930, ~$42K liquidity) due to its 0% spread factor.

---

## Behavior Comparison

### 🔴 Previous (Production)
```
500K USDC request
Route ranking: Test full 500K on each route
Split algorithm: Orderbook error SILENTLY IGNORED
- Code: `result, _ := route.CalculateTokenOutByTokenIn(...)`
❌ Bad route passed to frontend → 94% slippage
```

### 🟡 Current (Staging)
```
440K USDC request
Route ranking: Test full 440K on each route
- All routes ERROR at this amount
❌ "No routes provided" error → User can't swap
```
**Fix applied**: Errors now surfaced in `dynamic_splits.go`
**New issue**: All routes filtered out before split algorithm runs

### 🟢 Proposed
```
440K USDC request
Route ranking: Test full 440K → All routes error
FALLBACK: Test probe amount (10% = 44K)
- Orderbook: Error (still over capacity)
- CL pools: ✅ Success
Split algorithm: Allocates across working CL pools
✅ Valid quote returned
```

---

## Changes Required

| File | Status | Change |
|------|--------|--------|
| `dynamic_splits.go` | ✅ Done | Handle route errors (return 0 instead of ignoring) |
| `optimized_routes.go` | 🔧 Needed | Add probe fallback when all routes fail at full amount |

---

## Why Fallback (Not First-Line Probe)?

| Approach | Normal Swaps | Large Swaps |
|----------|--------------|-------------|
| **Fallback** | Fast (1 test) | 2 tests |
| **Always probe** | Slower (probe + split always) | Same |

Most traffic is normal-sized → fallback keeps them fast, only pays extra cost for large swaps.

---

## Edge Cases Handled

| Scenario | Result |
|----------|--------|
| Probe still too big for all routes | "No routes" error (correct - truly no capacity) |
| Probe works for orderbook too | Split algorithm errors on larger increments → excludes it |

**Defense in depth**: Probe identifies viable routes → Split algorithm handles capacity limits dynamically.

---

## Test Cases

- [ ] 400K USDC→BTC: Should work (baseline)
- [ ] 440K USDC→BTC: Should work after fix
- [ ] 500K USDC→BTC: Should work with reasonable slippage (not 94%)
- [ ] Small amounts (<$10K): Should remain fast (no probe triggered)


115 changes: 115 additions & 0 deletions docs/PR-689-message.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,115 @@
# PR Title

fix(router): handle large swaps exceeding route capacity (OSMO-53)

---

# PR Body

## Summary

Fixes critical bug where large USDC→BTC swaps (500K+) resulted in 94% slippage due to router selecting low-liquidity orderbook pool.

## Linear Issue

[OSMO-53: SQS router causes 94% slippage for large USDC to BTC swaps](https://linear.app/osmosis/issue/OSMO-53/sqs-router-causes-94percent-slippage-for-large-usdc-to-btc-swaps)

## Problem

1. Orderbook Pool 1930 has ~$42K liquidity but 0% spread factor
2. Router selected it for large swaps because it appeared "cheapest"
3. Error from insufficient liquidity was **silently ignored** in `dynamic_splits.go`
4. Result: Bad route passed to frontend → 94% slippage

## Solution (Three-Part Fix)

### Part 1: Error Handling in Split Algorithm
`dynamic_splits.go` - Routes that error now return 0 output instead of being silently ignored.

### Part 2: Probe Fallback for Route Ranking
`optimized_routes.go` - When all routes fail at full amount:
- Retry with 10% probe amount to identify viable routes
- Pass working routes to split algorithm
- Split algorithm handles actual capacity allocation

### Part 3: Prevent Inaccurate Quote Returns (CodeRabbit feedback)
`router_usecase.go` - When probe fallback is used:
- Track `usedProbeFallback` flag through the call chain
- Skip early single-route returns (quote OutAmount would be inaccurate)
- Force split algorithm to run for accurate calculations
- Return clear error: `"requested amount exceeds available liquidity, try a smaller amount"`

## Solution Flow

```mermaid
flowchart TD
A["User requests swap<br/>e.g. 500K USDC→BTC"] --> B["estimateAndRankSingleRouteQuote()<br/><i>optimized_routes.go</i>"]

B --> C{"All routes fail<br/>at full amount?"}
C -->|No| D["Return quote +<br/>usedProbeFallback = false"]
C -->|Yes| E["🔧 FIX #2: Probe Fallback<br/>Try 10% of amount"]

E --> F{"Probe succeeds?"}
F -->|No| G["Return error:<br/>no viable routes"]
F -->|Yes| H["Set usedProbeFallback = true<br/>Return routes for splitting"]

D --> I["GetOptimalQuoteOutGivenIn()<br/><i>router_usecase.go</i>"]
H --> I

I --> J{"Single route OR<br/>splits disabled?"}
J -->|Yes| K{"🔧 FIX #3: Check flag<br/>usedProbeFallback?"}
J -->|No| L["Run Split Algorithm<br/><i>dynamic_splits.go</i>"]

K -->|No| M["Return single route quote<br/>(OutAmount is accurate)"]
K -->|Yes| N["Force split algorithm<br/>(OutAmount was approximate)"]

N --> L

L --> O["🔧 FIX #1: Error Handling<br/>Routes that error → return 0"]
O --> P{"Valid split found?"}

P -->|Yes| Q["Return accurate split quote"]
P -->|No| R{"usedProbeFallback?"}

R -->|Yes| S["Return error:<br/>'amount exceeds liquidity,<br/>try smaller amount'"]
R -->|No| T["Return single route<br/>as fallback"]

style E fill:#4dabf7,color:#000
style K fill:#4dabf7,color:#000
style O fill:#4dabf7,color:#000
style S fill:#fab005,color:#000
```

## Flow Comparison

| Amount | Before | After |
|--------|--------|-------|
| 400K USDC | ✅ Works | ✅ Works |
| 500K USDC | ❌ 94% slippage (wrong quote shown) | ✅ Splits across CL pools or returns clear error |

## Files Changed

- `router/usecase/dynamic_splits.go` - Handle route errors in DP algorithm
- `router/usecase/optimized_routes.go` - Add probe fallback + `usedProbeFallback` return
- `router/usecase/router_usecase.go` - Propagate flag, skip early returns when probe used
- `router/usecase/export_test.go` - Update test exports for new signatures
- `router/usecase/optimized_routes_test.go` - Handle new return value
- `router/usecase/dynamic_splits_test.go` - Handle new return value

## Testing

- [ ] 400K USDC→BTC returns ~4.4 BTC (baseline)
- [ ] 500K USDC→BTC returns reasonable output OR clear error message
- [ ] Small swaps (<$10K) remain fast (no probe triggered)
- [ ] Direct pool quotes return error when amount exceeds pool capacity

## Related

Fixes OSMO-53







97 changes: 97 additions & 0 deletions docs/PR-checkpoint-orderbook-capacity-fix.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,97 @@
# Implementation Checkpoint: Orderbook Capacity-Aware Force-Add

## Summary

This PR implements **Option 2** from the OSMO-53 fix proposal: "Only Force-Add if Orderbook Has Capacity". The canonical orderbook route is now only force-added to cached routes when it has sufficient liquidity for the requested swap amount.

## Problem Solved

Previously, the canonical orderbook was **always** force-added back to cached routes after ranking (even when filtered out due to insufficient liquidity). This caused issues for large swaps like 500K USDC → BTC, where the low-liquidity orderbook pool (Pool 1930, ~$42K liquidity) would be reconsidered on every request.

## Changes

### `router/usecase/router_usecase.go`

- Added `orderbookHasSufficientCapacity()` helper function that checks if an orderbook pool can handle the requested swap amount
- Modified the force-add logic to only re-add canonical orderbook if:
- Pool data can be fetched successfully
- OrderbookData is available
- `tokenIn.Amount <= capacity` (using same mirrored data as quote calculation)
- Added 4 debug log points prefixed with `OSMO-53:` for troubleshooting:
- `OSMO-53: candidate routes found`
- `OSMO-53: routes after ranking`
- `OSMO-53: orderbook capacity check`
- `OSMO-53: canonical orderbook force-add decision`

### `router/usecase/export_test.go`

- Exported `OrderbookHasSufficientCapacity()` for testing
- Exported `SetPoolsUsecase()` for test setup

### `router/usecase/router_usecase_test.go`

- Added `TestOrderbookHasSufficientCapacity()` with 9 test cases:
- Sufficient capacity for BID direction
- Insufficient capacity for BID direction (OSMO-53 scenario)
- Exact capacity match
- Sufficient capacity for ASK direction
- Insufficient capacity for ASK direction
- Nil CosmWasmPoolModel handling
- Nil OrderbookData handling
- Pool fetch error handling
- Zero capacity handling

## Developer Concern Addressed

> "There's a part of the SQS code that mirrors the contract to avoid expensive queries, we'd need to ensure we don't cause invalid quotes"

**Resolution**: The capacity check uses the **same** `OrderbookData` (specifically `BidAmountToExhaustAskLiquidity` / `AskAmountToExhaustBidLiquidity`) that is already used in `CalculateTokenOutByTokenIn()` for quote calculation. No new data source is introduced, so no new staleness concerns exist.

## How It Works With OSMO-53

```
500K USDC request
Route ranking: Test full 500K
- Orderbook: Error (insufficient capacity)
- CL pools: Error (too much for single route)
PROBE FALLBACK (optimized_routes.go fix):
Test 50K probe amount
- Orderbook: Error (still over 42K capacity)
- CL pools: Success
Split algorithm runs with CL pools
CACHE WRITE (this fix):
- Orderbook was in candidates but filtered
- Capacity check: 42K < 500K → SKIP force-add
- Cache only contains working CL routes
Valid quote returned
```

## Testing

```bash
# Run capacity check unit tests
go test -v -run TestOrderbookHasSufficientCapacity ./router/usecase/...

# Run all router tests
go test -v ./router/usecase/...

# Enable debug logs (set in config or env)
SQS_LOGGER_LEVEL=debug
```

## Next Steps

- [ ] Code review
- [ ] Run full test suite
- [ ] Deploy to staging for OSMO-53 scenario testing
- [ ] Verify debug logs show expected behavior
Loading
Loading