An open-source template demonstrating how to charge for AI inference on a pay-per-token basis using the x402 protocol.
Features · How It Works · Running locally
- Accurate Token Metering: Extract actual token usage from AI responses using Vercel AI SDK
- Flexible Pricing: Set your own price per token (e.g., $0.000001 per token)
- Real-time Cost Display: Users see the exact cost of each AI response in the UI
- Asynchronous Settlement: Payment is settled after streaming completes for optimal UX
- Pre-verification: Verify signed payment data before processing requests
- Maximum Amount Protection: Set a max token limit to cap potential costs
- Post-inference Settlement: Charge only for actual tokens used
- Onchain Payments: Paid in the token and chain of your choice with gasless transactions
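Together, the per-token price and the max token cap bound the most a user can ever be charged. A quick sketch of that arithmetic, using the default values from `lib/constants.ts` (USDC uses 6 decimals):

```typescript
// Default pricing constants from lib/constants.ts (USDC uses 6 decimals).
const PRICE_PER_INFERENCE_TOKEN_WEI = 1;         // 0.000001 USDC per token
const MAX_INFERENCE_TOKENS_PER_CALL = 1_000_000; // 1M-token cap

// The maximum amount the user authorizes up front, in USDC wei...
const maxChargeWei = PRICE_PER_INFERENCE_TOKEN_WEI * MAX_INFERENCE_TOKENS_PER_CALL;
// ...and in human-readable USDC.
const maxChargeUsdc = maxChargeWei / 10 ** 6;

console.log(maxChargeWei, maxChargeUsdc); // 1000000 1
```

So with the defaults, a single call can never settle for more than 1 USDC, regardless of how many tokens the model streams.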
This template demonstrates a complete pay-per-token flow:
- Payment Verification (`verifyPayment`)
  - User signs a payment authorization with a maximum amount
  - Server verifies the signature before processing the request
  - Prevents unauthorized inference calls
- AI Inference (`streamText`)
  - Process the chat request and stream the AI response to the user
  - Non-blocking payment flow ensures optimal UX
  - Extract token usage via the `onFinish` callback
- Asynchronous Settlement (`settlePayment`)
  - Calculate the final price: `PRICE_PER_INFERENCE_TOKEN_WEI × totalTokens`
  - Settle the payment on-chain after streaming completes
  - Only charge for actual tokens consumed
- Cost Display
  - Stream token metadata to the frontend via `messageMetadata`
  - Display a cost card below each AI response
  - Full transparency for users
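The steps above can be sketched end-to-end. This is a simplified mock, not the template's actual code: `verifyPayment` and `settlePayment` below are stand-in stubs that only mimic the call order, and every type and parameter shape here is an assumption for illustration.

```typescript
// Stand-in stubs — the real functions come from the thirdweb x402 SDK.
type PaymentAuth = { signature: string; maxAmountWei: number };

async function verifyPayment(auth: PaymentAuth): Promise<boolean> {
  // Real code verifies the signed authorization with the facilitator.
  return auth.signature.length > 0;
}

async function settlePayment(amountWei: number): Promise<void> {
  // Real code settles the payment on-chain via the facilitator.
  console.log(`settled ${amountWei} USDC wei`);
}

const PRICE_PER_INFERENCE_TOKEN_WEI = 1;

// runInference stands in for streamText; it resolves to the total tokens used.
async function handleChat(auth: PaymentAuth, runInference: () => Promise<number>) {
  // 1. Verify the signed authorization before doing any inference.
  if (!(await verifyPayment(auth))) throw new Error("402 Payment Required");

  // 2. Process the request and collect actual token usage.
  const totalTokens = await runInference();

  // 3. Settle only for the tokens actually consumed.
  await settlePayment(PRICE_PER_INFERENCE_TOKEN_WEI * totalTokens);

  // 4. Token metadata would be streamed to the frontend for the cost display.
  return totalTokens;
}
```

For instance, `handleChat({ signature: "0xsig", maxAmountWei: 1_000_000 }, async () => 1234)` verifies, "runs" inference, and settles 1234 USDC wei.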
Backend - Token Extraction & Payment Settlement (`app/api/chat/route.ts`):

```typescript
const stream = streamText({
  // ... model config
  onFinish: async (event) => {
    const totalTokens = event.totalUsage.totalTokens;
    const finalPrice = PRICE_PER_INFERENCE_TOKEN_WEI * totalTokens;
    await settlePayment({
      facilitator: twFacilitator,
      network: arbitrum,
      price: { amount: finalPrice.toString(), asset: usdcAsset },
      // ... other params
    });
  },
});
```

Frontend - Cost Display (`components/messages.tsx`):

```typescript
const totalTokens = metadata?.totalTokens;
const costInUsdc = (PRICE_PER_INFERENCE_TOKEN_WEI * totalTokens) / 10 ** 6;
```

- Next.js App Router for server-side rendering and performance
- Vercel AI SDK for LLM API and streaming
- thirdweb x402 for HTTP micropayments and payment infrastructure
You will need the following API keys and environment variables:
- AI Provider API Keys: Anthropic, Fireworks, or Groq (depending on which model you want to use)
- thirdweb Credentials: For x402 payment infrastructure
- Get your secret key from thirdweb dashboard
- Client ID for frontend wallet connection
- Clone the repository

```bash
git clone <repository-url>
cd x402-ai-inference
```

- Install dependencies

```bash
pnpm install
```

- Set up environment variables
Create a `.env.local` file in the root directory:

```bash
# AI Provider API Keys
ANTHROPIC_API_KEY=your_anthropic_api_key
OPENAI_API_KEY=your_openai_api_key

# thirdweb Configuration
THIRDWEB_SECRET_KEY=your_thirdweb_secret_key
THIRDWEB_SERVER_WALLET_ADDRESS=your_server_wallet_address
NEXT_PUBLIC_THIRDWEB_CLIENT_ID=your_thirdweb_client_id
```

Important: Never commit your `.env.local` file. It contains secrets that would allow others to access your AI provider and thirdweb accounts.
- Configure pricing (Optional)

Edit `lib/constants.ts` to adjust your pricing:

```typescript
export const PRICE_PER_INFERENCE_TOKEN_WEI = 1; // 0.000001 USDC per token
export const MAX_INFERENCE_TOKENS_PER_CALL = 1000000; // 1M tokens max
```

You can also change the chain and token used for the payment in that file.
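A worked example under the default constants above (the 1,234-token response is a hypothetical figure, not output from the template):

```typescript
const PRICE_PER_INFERENCE_TOKEN_WEI = 1; // default from lib/constants.ts
const totalTokens = 1234;                // hypothetical token usage for one response

const finalPriceWei = PRICE_PER_INFERENCE_TOKEN_WEI * totalTokens; // amount settled on-chain
const costInUsdc = finalPriceWei / 10 ** 6;                        // USDC has 6 decimals

console.log(finalPriceWei, costInUsdc); // 1234 0.001234
```

Doubling `PRICE_PER_INFERENCE_TOKEN_WEI` to 2 would double both figures; the math is a straight multiply plus a 6-decimal conversion.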
- Start the development server

```bash
pnpm dev
```

Your app should now be running on localhost:3000.
- Connect a wallet with USDC on Arbitrum
- Send a chat message to trigger an AI inference
- The app will:
- Verify your payment signature
- Stream the AI response
- Settle payment based on actual tokens used
- Display the cost below the response
This project is open source and available under the MIT License.