GPU Compute Exchange · Beta

Route AI compute at cost.

Flopex is a real-time GPU exchange that routes your inference jobs to the fastest, cheapest provider in milliseconds — across Groq, DeepInfra, Together AI, Featherless, and RunPod.

16K+ models available
<200ms avg. routing latency
5 live GPU providers
Quick start
# One line to switch from OpenAI
import requests

prompt = "Explain LPUs in one sentence."  # any prompt

response = requests.post(
    "https://api.flopex.ai/v1/inference",
    headers={"Authorization": "Bearer sk_live_..."},
    json={"model": "llama-3-70b", "input": prompt},
)
# billing.cost_usd is included in every response
Groq LPU · 40ms · DeepInfra · $0.07/1M · Together AI · 154 models · Featherless · 15,886 models · RunPod · GPU burst · Real-time pricing · Automatic failover · 35% cheaper on average
How it works

The exchange clears in
milliseconds.

01
Request
You send a job
One API call with your model, prompt, and performance profile. Economy, balanced, or fast.
02
Quote
All providers bid
Flopex pings every eligible provider simultaneously. Each returns a real-time cost and latency quote.
03
Route
Exchange clears
The winning provider is selected by price, latency, reliability, and your profile. No human in the loop.
04
Settle
You see the cost
Every response includes exact token counts and cost in USD. Your balance updates in real time.
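Client-side, the four steps above collapse into a single call. The sketch below is illustrative only: the `profile` field and the exact `billing` keys are assumptions extrapolated from the step descriptions, not a confirmed Flopex schema.

```python
# Hypothetical sketch of the request → quote → route → settle flow.
# The "profile" value and the billing keys are assumptions, not a
# documented schema. Quoting and routing happen entirely server-side.

API_URL = "https://api.flopex.ai/v1/inference"

def build_job(prompt, model="llama-3-70b", profile="balanced"):
    """Step 01: one payload with model, prompt, and performance profile."""
    assert profile in ("economy", "balanced", "fast")
    return {"model": model, "input": prompt, "profile": profile}

def settle_summary(billing):
    """Step 04: exact token counts and USD cost from the response."""
    return f"{billing['total_tokens']} tokens, ${billing['cost_usd']:.4f}"

def run_job(prompt, profile="balanced"):
    import requests  # only needed when actually calling the exchange
    resp = requests.post(
        API_URL,
        headers={"Authorization": "Bearer sk_live_..."},
        json=build_job(prompt, profile=profile),
        timeout=10,
    )
    resp.raise_for_status()
    body = resp.json()
    print(settle_summary(body["billing"]))
    return body
```

Because the exchange clears per request, there is no provider to pick ahead of time; the profile is the only routing knob the client holds.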
Live providers

Five GPU networks,
one API.

Groq · Speed tier (LPU) · ~40ms avg. latency
DeepInfra · Economy tier · $0.07 per 1M tokens
Together AI · Balanced tier · 154 models available
Featherless · Long tail · 15,886 models available
RunPod · GPU burst · H100 on-demand GPUs
Transparent pricing
You pay exactly what we charge. No hidden markups.
Model           Direct provider   Via Flopex
llama-3.1-8b    $0.20/1M          $0.07/1M  (-65%)
llama-3.3-70b   $0.88/1M          $0.59/1M  (-33%)
qwen-2.5-7b     $0.30/1M          $0.10/1M  (-67%)
deepseek-r1     $3.00/1M          $3.00/1M  (best)
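The discount column follows directly from the two price lists. A quick check of the arithmetic (all prices in USD per 1M tokens, taken from the table above):

```python
# Verify the per-model discounts implied by the pricing table.
direct = {"llama-3.1-8b": 0.20, "llama-3.3-70b": 0.88,
          "qwen-2.5-7b": 0.30, "deepseek-r1": 3.00}
flopex = {"llama-3.1-8b": 0.07, "llama-3.3-70b": 0.59,
          "qwen-2.5-7b": 0.10, "deepseek-r1": 3.00}

def discount_pct(model):
    """Percent saved vs. going direct, rounded to a whole percent."""
    return round(100 * (direct[model] - flopex[model]) / direct[model])

for m in direct:
    print(m, f"-{discount_pct(m)}%")
# llama-3.1-8b -65%, llama-3.3-70b -33%, qwen-2.5-7b -67%, deepseek-r1 -0%
```

For deepseek-r1 the exchange price matches the best direct price, hence "best" rather than a discount.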