Skip to content

Conversation

@shuhuiluo
Copy link
Contributor

Refactor _v2Swap to resolve stack too deep error

Previously UniversalRouter can only be compiled via the IR pipeline due to the "stack too deep" error which takes a long time especially in Foundry. After refactoring _v2Swap and modifying SignatureTransfer.permitWitnessTransferFrom to

bytes32 hash = permit.hashWithWitness(witness, witnessTypeString);
_permitTransferFrom(permit, transferDetails, owner, hash, signature);

it not only saves gas but the project can also be compiled with via_ir = false which allows for faster compilation and testing

Compute v2 and v3 pool address using inline assembly and explicitly clean the upper bits

Previously the pool address was computed in pure Solidity and casted via address(uint160(uint256())). However the upper 12 bytes are not explicitly cleaned and the address is later used in solmate::SafeTransferLib.safeTransfer which is written in inline assembly. When compiled with via_ir enabled, all tests pass. However, after making the aforementioned changes, some Foundry tests failed in a ERC20.transfer with via_ir disabled. It was discovered that the dirty upper bits of pool address are the culprit, but somehow the IR pipeline may clean the address after keccak256. Nonetheless, pairForPreSorted and computePoolAddress are rewritten in inline assembly to save gas and clean the upper bits. Closes #290.

Add Uniswap v3 Foundry tests for more granular gas comparison

There weren't Uniswap v3 tests in Foundry. In order to validate gas optimizations to be made, test contracts UniswapV3Test and V3MockMock have been added. forge snapshot --diff is run after each modification to validate the gas savings.

Replace conditional statements with bitwise operations

The current Solidity optimizer isn't smart enough to reduce ternary expressions to fewer opcodes and likely translate them to JUMPI. Therefore TernaryLib utils written in inline assembly have been added to replace ternary expressions as much as possible. sortTokens is also refactored to TernaryLib but inlined where appropriate.

In general for x = y ? a : b, when y = true

x = a = 0 ^ a = b ^ b ^ a = b ^ (b ^ a) * y

When y = false

x = b = b ^ 0 = b ^ (b ^ a) * y

Therefore x = y ? a : b is equivalent to x = b ^ (b ^ a) * y according to the properties of xor.

@shuhuiluo shuhuiluo changed the title Gas savings Save gas and clean the upper bits of computed pool address properly May 8, 2023

/// @notice Sorts two uint256 in ascending order
/// @dev Equivalent to: `a < b ? (a, b) : (b, a)`
function sort2(uint256 a, uint256 b) internal pure returns (uint256, uint256) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this used anywhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No.


/// @notice Swaps two uint256 if `condition` is true
/// @dev Equivalent to: `condition ? (b, a) : (a, b)`
function swapIf(bool condition, uint256 a, uint256 b) internal pure returns (uint256, uint256) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since this is a swap router, swap can easily get confused here. maybe reverseOrderIf / reverseIf / switchIf / switchOrderIf

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opt for switchIf.

pragma solidity >=0.5.0;

/// @title Library for replacing ternary operator with efficient bitwise operations
library TernaryLib {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

slick ternary ops in here!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One function is a ternary, while the other is not. I wonder if SortLib would be more readable and make more sense?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only sortTokens is used for sorting. But the rest of the library is meant for replacement of the built-in ternary operator.

/// @title Library for replacing ternary operator with efficient bitwise operations
library TernaryLib {
/// @notice Equivalent to the ternary operator: `condition ? a : b`
function ternary(bool condition, uint256 a, uint256 b) internal pure returns (uint256 res) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not used anywhere either?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was more of an illustration. Now that overloaded versions of ternary() on int256 and address are used in V3SwapRouter, should we leave this one here for completeness?

)
)
);
assembly ("memory-safe") {
Copy link
Contributor

@ewilz ewilz May 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I like the gas savings here. Wonder if it's worth a comment:

// accomplishes the following:
// address(keccak256(abi.encodePacked(hex'ff', factory, keccak256(abi.encodePacked(token0, token1)), initCodeHash)))

to get a feel for what this is doing at a glance

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, I think that's pretty helpful for readability, one for the V3Lib would probably make sense too. (sorry, missed that)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

assembly {
// amountOutReceived = uint256(-(zeroForOne ? amount1Delta : amount0Delta))
// no need to check for underflow
amountOutReceived := sub(0, xor(amount0Delta, mul(xor(amount1Delta, amount0Delta), zeroForOne)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could these ternaries in this file get abstracted into the Library with all the xors??

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually yes. I was afraid the optimizer wouldn't inline properly. But it seems the gas level remains the same as long as we uncheck the unary - on int256

unchecked {
    uint256 amountOutReceived = uint256(-zeroForOne.ternary(amount1Delta, amount0Delta));
    if (amountOutReceived != amountOut) revert V3InvalidAmountOut();
}

For

hasMultiplePools ? address(this) : recipient

using hasMultiplePools.ternary(address(this), recipient) in place also saves gas due to less stack shuffling.

For zeroForOne := xor(isExactIn, lt(tokenOut, tokenIn)) though, using

zeroForOne = isExactIn.ternary(tokenIn < tokenOut, tokenOut < tokenIn);

wouldn't be cheaper. And writing zeroForOne = isExactIn ^ (tokenOut < tokenIn) isn't allowed since "Operator ^ not compatible with types bool and bool.".

Writing

sqrtPriceLimitX96 = zeroForOne.ternary(MIN_SQRT_RATIO + 1, MAX_SQRT_RATIO - 1);

instead of xor and literal hexes in assembly also wouldn't be cheaper. We could define the literal hexes as contract level constants though if that improves readability.

Copy link
Contributor

@ewilz ewilz May 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool thanks for the explanation. We definitely strive to strike a balance between readability and gas savings (or else we could just write low-level code for everything :)

Even with our steepest gas savings tests, these ternaries that cannot be abstracted away would only save < 100 extra gas per full swap, So I'd personally vote to just omit them. Can always get a third opinion..

Also some context, our v2 UniversalRouter has already been audited so we cannot include this, these gas savings would go into V3 of the router.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can always abstract them away. It will just costs a few more opcodes. I'm okay with replacing sqrtPriceLimitX96 done in assembly with literal hexes with zeroForOne.ternary(MIN_SQRT_RATIO + 1, MAX_SQRT_RATIO - 1). But frankly speaking, is

zeroForOne = isExactIn ? tokenIn < tokenOut : tokenOut < tokenIn;

more understandable and cleaner than

assembly {
    zeroForOne := xor(isExactIn, lt(tokenOut, tokenIn))
}

with derivations?

@hensha256 hensha256 requested a review from a team as a code owner January 10, 2025 13:22
Copy link

@rv64m rv64m left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔍 Code Review Summary

The pull request successfully resolves a critical 'stack too deep' error and implements substantial gas optimizations by leveraging inline assembly and bitwise operations. Key risks identified are the trade-offs in code maintainability and readability due to these low-level optimizations. The most critical finding is the correct handling of 'dirty' upper bits in address calculations, a subtle but crucial bug fix. The overall assessment is positive, recommending a merge after addressing the suggested comments to improve long-term code clarity.

📋 Reviewed Changes

  • contracts/modules/Permit2Payments.sol: Analyzed payment logic and integration with Permit2, focusing on transfer authorization.
  • contracts/modules/uniswap/TernaryLib.sol: Introduction of a new library to replace ternary operators with gas-efficient bitwise operations using assembly.
  • contracts/modules/uniswap/v2/UniswapV2Library.sol: Refactored pairForPreSorted to use inline assembly for CREATE2 address calculation, including explicit clearing of upper address bits.
  • contracts/modules/uniswap/v2/V2SwapRouter.sol: Refactored _v2Swap function into a loop to resolve a 'stack too deep' error and integrated TernaryLib for gas optimization.
  • contracts/modules/uniswap/v3/BytesLib.sol: Reviewed new library for byte manipulation, used in V3 path handling.
  • contracts/modules/uniswap/v3/V3SwapRouter.sol: Refactored computePoolAddress to use inline assembly and integrated TernaryLib for optimizations. Reviewed the logic for exact output swaps using transient storage.
  • test/foundry-tests/UniswapV2.t.sol: New Foundry test suite for Uniswap V2 functionality.
  • test/foundry-tests/UniswapV3.t.sol: New Foundry test suite for Uniswap V3 functionality to validate gas savings.
  • test/foundry-tests/uniswapTokens/v3MockMock.t.sol: New mock contract for V3 testing.
  • test/integration-tests/gas-tests/snapshots/CheckOwnership.gas.test.ts.snap: Gas snapshot update.
  • test/integration-tests/gas-tests/snapshots/CryptoPunk.gas.test.ts.snap: Gas snapshot update.
  • test/integration-tests/gas-tests/snapshots/Element.gas.test.ts.snap: Gas snapshot update.
  • test/integration-tests/gas-tests/snapshots/Foundation.gas.test.ts.snap: Gas snapshot update.
  • test/integration-tests/gas-tests/snapshots/LooksRareV2.gas.test.ts.snap: Gas snapshot update.
  • test/integration-tests/gas-tests/snapshots/NFT20.gas.test.ts.snap: Gas snapshot update.
  • test/integration-tests/gas-tests/snapshots/NFTX.gas.test.ts.snap: Gas snapshot update.
  • test/integration-tests/gas-tests/snapshots/Payments.gas.test.ts.snap: Gas snapshot update.
  • test/integration-tests/gas-tests/snapshots/SeaportV1_4.gas.test.ts.snap: Gas snapshot update.
  • test/integration-tests/gas-tests/snapshots/SeaportV1_5.gas.test.ts.snap: Gas snapshot update.
  • test/integration-tests/gas-tests/snapshots/Sudoswap.gas.test.ts.snap: Gas snapshot update.
  • test/integration-tests/gas-tests/snapshots/Uniswap.gas.test.ts.snap: Gas snapshot update.
  • test/integration-tests/gas-tests/snapshots/UniversalRouter.gas.test.ts.snap: Gas snapshot update.
  • test/integration-tests/gas-tests/snapshots/UniversalVSSwapRouter.gas.test.ts.snap: Gas snapshot update.
  • test/integration-tests/gas-tests/snapshots/X2Y2.gas.test.ts.snap: Gas snapshot update.

📊 Architecture & Flow Diagrams

Diagram 1

flowchart TD
    subgraph "V2SwapRouter._v2Swap Execution Flow"
        direction LR
        Start("Start _v2Swap") --> LoopInit{"Loop i  0 to path.length-2"}
        LoopInit -->|Next Hop| GetReserves["Get reserves from current pair"]
        GetReserves --> CalcInputAmt["Calculate amountInput\n(balance - reserve)"]
        CalcInputAmt --> CalcOutputAmt["Calculate amountOutput\nvia getAmountOut"]
        CalcOutputAmt --> DetNextHop{"Is this the final hop"}
        DetNextHop -->|No| CalcNextPair["Calculate next pair address"]
        DetNextHop -->|Yes| SetRecipient["Set recipient as next hop"]
        CalcNextPair --> CallSwap("pair.swap(amountOut -> nextPair)")
        SetRecipient --> CallSwap
        CallSwap --> NextIter{"Update current pair  nextPair"}
        NextIter --> LoopInit
        LoopInit -->|End Loop| End("End _v2Swap")
    end
Loading

🤔 Review Quality Assessment

The review provides a comprehensive analysis of the changes, covering gas optimization, architectural improvements, and correctness. It successfully identified the critical fix for 'dirty' address bits when compiling without the IR pipeline and highlighted the maintainability trade-offs of the low-level optimizations. The assessment of the transient state pattern in V3 swaps demonstrates a thorough understanding of potential security and state-management risks. Confidence in the review's quality and coverage is high, as it addresses both the explicit goals of the PR and its implicit consequences on the codebase's long-term health.

💡 Suggestions Summary

  • Inline Comments: 2 suggestions on modified code lines
  • General Suggestions: 1 suggestions for overall improvements

Inline suggestions are provided as comments on specific lines below.

📝 Additional Suggestions

The following suggestions apply to code outside the current PR diff:

1. 🔴 The v3SwapExactOutput function uses a transient state variable, maxAmountInCached, to pass the amountInMaximum constraint into the swap callback. This pattern is inherently fragile, as a failure to reset the variable could disrupt subsequent transactions. While the implementation is correct, the fragility of this pattern should be explicitly noted in the code to prevent future regressions during maintenance.

File: contracts/modules/uniswap/v3/V3SwapRouter.sol (Line 151)
Confidence: 85%

    function v3SwapExactOutput(
        address recipient,
        uint256 amountOut,
        uint256 amountInMaximum,
        bytes calldata path,
        address payer
    ) internal {
        // This transient state pattern is fragile. It's critical that `maxAmountInCached` is reset
        // after the swap to ensure subsequent transactions in the same block are not affected.
        // Extensive test coverage for this reset mechanism is advised.
        maxAmountInCached = amountInMaximum;

// Compute the CREATE2 pair address and clean the upper bits.
pair := and(keccak256(0x0b, 0x55), 0xffffffffffffffffffffffffffffffffffffffff)
// Restore the free memory pointer.
mstore(0x40, fmp)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 The inline assembly for pool address calculation correctly clears the upper bits of the address, a critical fix for compilations without via_ir. However, the code lacks a comment explaining why this is necessary. Adding context would improve long-term maintainability by clarifying the purpose of the bitmask, which prevents a subtle but critical bug.

Confidence: 100%

Suggested Implementation:

Suggested change
mstore(0x40, fmp)
// Compute the CREATE2 pair address and clean the upper bits.
// keccak256 returns a 32-byte hash, but an address is 20 bytes. We must explicitly
// mask the result to prevent 'dirty' upper bits from causing issues in downstream
// contract calls, which can occur when not compiling with via-ir.
pair := and(keccak256(0x0b, 0x55), 0xffffffffffffffffffffffffffffffffffffffff)

(uint256 reserve0, uint256 reserve1,) = IUniswapV2Pair(pair).getReserves();
(uint256 reserveInput, uint256 reserveOutput) =
TernaryLib.switchIf(input == token0, reserve1, reserve0);
uint256 amountInput = ERC20(input).balanceOf(pair) - reserveInput;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 The use of TernaryLib and direct xor-based logic in assembly replaces standard conditional statements with a pattern that is highly efficient but also unreadable. This significantly increases the cognitive load for developers unfamiliar with this specific bitwise trick, posing a risk to future maintenance. Adding a brief explanatory comment at the point of use would improve clarity without sacrificing performance.

Confidence: 95%

Suggested Implementation:

Suggested change
uint256 amountInput = ERC20(input).balanceOf(pair) - reserveInput;
// Use bitwise operations to efficiently select reserves without a conditional jump
(uint256 reserveInput, uint256 reserveOutput) =
TernaryLib.switchIf(input == token0, reserve1, reserve0);

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

The upper 12 bytes of CREATE2 pool address computed are not cleaned

3 participants