Orbitrage

Run 3X more agentic workflows
for the same budget

Every step of your agent has a model that wins on cost and quality. Finding each one by hand takes weeks. We do it once and hand you the answer.

Every model under every modality
Claude
OpenAI
DeepSeek
Grok
Mistral
Gemini
Meta AI
Perplexity
Cohere
Groq
Gemma
Phi
Qwen
GLM
Kimi
MiniMax
Skywork
ElevenLabs
Suno
Udio
Fish Audio
Midjourney
Runway
Stability AI
Luma AI
Pika
Ideogram
Adobe Firefly
Lightricks
ByteDance
Hugging Face
Replicate
Together AI
Fireworks
Intelligence

The right model for every step.

Orbitrage profiles each call in your pipeline and routes it to the model that scores highest on quality per dollar — automatically, with no code changes required.
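
To make "quality per dollar" concrete, here is a minimal sketch of how one step might be scored. It is an illustration, not Orbitrage's actual routing logic: the `Candidate` type, the `pick_model` helper, the model labels, and every number are hypothetical, and the optional quality floor is an assumption borrowed from the FAQ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str        # hypothetical model label
    quality: float   # graded score in [0, 1] for this pipeline step
    cost_usd: float  # average cost per call for this step (must be > 0)


def pick_model(candidates, quality_floor=0.0):
    """Return the candidate with the best quality-per-dollar ratio
    among those that meet the quality floor."""
    eligible = [c for c in candidates if c.quality >= quality_floor]
    if not eligible:
        raise ValueError("no candidate meets the quality floor")
    return max(eligible, key=lambda c: c.quality / c.cost_usd)


# Illustrative numbers only, not measurements.
candidates = [
    Candidate("model-a", quality=0.92, cost_usd=0.010),
    Candidate("model-b", quality=0.88, cost_usd=0.002),
    Candidate("model-c", quality=0.60, cost_usd=0.001),
]

best = pick_model(candidates, quality_floor=0.85)
print(best.name)
```

With the floor at 0.85, the cheapest candidate is excluded and the choice falls to the model with the best ratio among those that remain. Without a floor, the ratio alone would favor the cheapest model even when its quality is poor, which is why a floor is assumed here.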

Visibility

See exactly where your budget goes.

The audit dashboard breaks down every API call by model, cost, latency, and outcome. Know your spend down to the token before your next invoice arrives.
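
As a rough picture of what a per-model breakdown involves, the sketch below rolls up a handful of call records by model. The record fields (`model`, `cost_usd`, `latency_ms`, `ok`) and the `summarize` helper are assumptions made for illustration; the real dashboard's schema is not described here.

```python
from collections import defaultdict

# Hypothetical call log; field names and values are illustrative.
calls = [
    {"model": "model-a", "cost_usd": 0.010, "latency_ms": 800, "ok": True},
    {"model": "model-a", "cost_usd": 0.030, "latency_ms": 1200, "ok": False},
    {"model": "model-b", "cost_usd": 0.002, "latency_ms": 300, "ok": True},
]


def summarize(calls):
    """Aggregate call count, total cost, mean latency, and success rate per model."""
    totals = defaultdict(lambda: {"calls": 0, "cost_usd": 0.0, "latency_ms": 0, "ok": 0})
    for call in calls:
        t = totals[call["model"]]
        t["calls"] += 1
        t["cost_usd"] += call["cost_usd"]
        t["latency_ms"] += call["latency_ms"]
        t["ok"] += 1 if call["ok"] else 0
    return {
        model: {
            "calls": t["calls"],
            "cost_usd": t["cost_usd"],
            "avg_latency_ms": t["latency_ms"] / t["calls"],
            "success_rate": t["ok"] / t["calls"],
        }
        for model, t in totals.items()
    }


report = summarize(calls)
for model, row in report.items():
    print(model, row)
```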

FAQ

Frequently asked.

Still curious? Drop us a note at hello@onlyrouter.ai.

  • Will routing change my outputs?
    Only on routes you explicitly approve, and only after the cheaper model has matched your live traffic in shadow. Per-route quality floors trigger automatic rollback the moment drift is detected.
  • Do you support self-hosted models?
    Yes. Any OpenAI-compatible endpoint works — including vLLM, Together, Fireworks, Bedrock, Groq, and your own deployments behind a private gateway.
  • What if my traffic is too small to route?
    The audit will tell you. We don't pitch a rollout if there isn't meaningful spend to compress.
  • Is the audit really free?
    Yes. A read-only mirror runs against a copy of your traffic for one week and produces a single report. No SDK change, no commitment.
  • How does shadow testing work?
    We replay a slice of live traffic through the cheaper model and grade outputs against the original. Shadow runs continuously, and a route only goes live once your quality floor holds for a configurable window.
  • Can routing fall back automatically?
    Every policy ships with hard backstops on latency, quality, and uncertainty. A breach aborts the route and falls back to the original model in real time, no human in the loop required.
  • How does pricing work?
    A flat percentage of the spend we save. If the bill doesn't move, you don't pay. The audit itself is always free.
  • Where does my data live?
    Self-hosted by default. The gateway runs in your VPC, and prompts, responses, and logs never leave your network. A managed cloud deployment is available on request.
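
The shadow-testing and rollback answers above can be pictured as a small state machine. This is a sketch under stated assumptions, not the product's implementation: the `ShadowRoute` class is hypothetical, the "configurable window" is modeled as a count of consecutive graded samples, and "drift" is modeled as a single score below the floor.

```python
from collections import deque


class ShadowRoute:
    """Hypothetical route that goes live once the quality floor holds for a
    full window of graded shadow samples, and rolls back on any breach."""

    def __init__(self, quality_floor, window):
        self.quality_floor = quality_floor
        self.scores = deque(maxlen=window)  # most recent graded scores
        self.live = False

    def record(self, score):
        """Record one graded score and return whether the route is live."""
        self.scores.append(score)
        window_full = len(self.scores) == self.scores.maxlen
        if not self.live:
            # Promote only when every score in a full window meets the floor.
            if window_full and min(self.scores) >= self.quality_floor:
                self.live = True
        elif score < self.quality_floor:
            # Breach: fall back to the original model and restart the window.
            self.live = False
            self.scores.clear()
        return self.live


route = ShadowRoute(quality_floor=0.9, window=3)
states = [route.record(s) for s in [0.95, 0.92, 0.91, 0.80, 0.95]]
print(states)
```

In this run the route goes live on the third sample, rolls back on the fourth, and has to earn a fresh full window before it can go live again.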