OpenRout.ing
ModelsPricingFeaturesDocsLog inGet API Key
100% Open Source · Zero Logs · Fully Private

World's first agentic
inference platform.

0
open-source models
0
tokens/sec — live
This hour
0
Today
0
This week
0
This month
0
G
GLM 5.2
OpenRouter: £0.95/M prompt · £3.24/M completion
OpenRouter 50B tok/mo
£104,675
openrout.ing Plus
£179/mo
x585 cheaper
K
Kimi K2.7 Code
OpenRouter: £0.58/M prompt · £2.77/M completion
OpenRouter 50B tok/mo
£83,740
openrout.ing Plus
£179/mo
x468 cheaper

No proprietary models. No data logging. No telemetry. Every model is fully open source and your prompts are never stored. From £79/mo fixed. OpenRouter charges £188,415/mo for the same tokens.

DeepSeekMeta LlamaQwenGLMMistralKimiGemma
Featured Flagship Models

The best open-source models.
At a fraction of the cost.

Why so much cheaper

Same models.
Hundreds of times cheaper.

Serverless GPU Orchestration

We don't rent you a GPU. We intelligently route requests across a shared fleet of GPUs, loading and unloading models in real-time. Every millisecond of compute is utilised — no idle hardware, no wasted spend. You pay for inference, not for sitting empty.

Multi-Tenant Model Serving

Thousands of users share the same model instances simultaneously. A single 70B model serving 500 concurrent requests costs a fraction per-user compared to dedicated hardware. This is how we deliver frontier models at £0.00395/M tokens instead of £1.59/M.

Flat-Rate, Fixed Tokens

No per-token billing. Each plan includes a set token allowance — 20B, 50B, 100B, up to 1T — with concurrency limits matched to the tier. No surprise invoices, no throttling mid-request. You know exactly what you pay and exactly what you get.

Zero Logs, Fully Private

Prompts and completions are never stored, logged, or transmitted to third parties. Ephemeral by design. No telemetry. No usage analytics. GDPR compliant by default.

100% Open Source

Every model we serve is open source. No lock-in. No proprietary dependencies. Verify the weights yourself. If we ever go down, take the weights and self-host.

Smart Model Caching

Popular models stay hot in GPU memory. Rare models load on-demand in seconds, then free their resources. You get instant access to 23,480+ models without paying for 23,480+ GPUs. Only the inference you use, nothing more.

Model showcase

Only open source.
No exceptions.

Flagship256K ctx

GLM-5.2

753B MoE. Multilingual reasoning

Code256K ctx

Kimi K2.7 Code

1T MoE coding specialist

Flagship256K ctx

DeepSeek V4 Pro

Frontier open-source reasoning

Code256K ctx

Qwen3 Coder 480B

480B MoE. Ultimate coding model

View all 23,480+ models →
Price per million tokens

OpenRouter charges
hundreds of times more.

Two APIs, one key

Raw inference on
raw.openrout.ing

Intelligent inference on
api.openrout.ing

One API key. Two endpoints. raw.openrout.ing/v1 gives you direct OpenAI-compatible completions — pure model access, no orchestration, minimum latency. api.openrout.ing/v1 wraps every request in a full agentic runtime: autonomous tool execution, persistent memory, MCP server connections, workflow orchestration, knowledge retrieval, guardrails, and multi-step reasoning — all included free.

Raw API

raw.openrout.ing/v1

POST /v1/chat/completions OpenAI-compatible format All 23,480+ open-source models Streaming & non-streaming Tool calling (parallel) JSON mode & structured output Logprobs & top_logprobs Seed for reproducibility

Intelligent API Free

api.openrout.ing/v1

POST /v1/agents/:id/chat Autonomous multi-step execution Persistent memory across sessions MCP tool connections (23,480+) Workflow DAG orchestration Knowledge base RAG retrieval Fact extraction & graph construction Guardrails & safety filtering Multi-agent swarm coordination Cron & event scheduling Observability traces & logs

How intelligent inference works

AGENTS
Create an agent with a system prompt, tools, and knowledge. Send a message — the agent decides which tools to call, executes them, reads the output, and loops until the task is complete. No manual orchestration. Unlimited agents per account.
MEMORY
Three-tier hot/warm/cold memory. Short-term context stays in the active conversation. Medium-term facts auto-extracted and stored. Long-term knowledge persisted in vector stores with semantic retrieval. Agents remember across sessions — no re-explaining.
MCP SERVERS
9,973 MCP-compatible servers ready to connect. File systems, databases, APIs, browsers, code execution environments — your agent can interact with any external tool or data source via the Model Context Protocol. No custom integrations needed.
WORKFLOWS
Define DAG pipelines with branching, conditional logic, parallel execution, and error recovery. Chain agents together — one extracts data, another validates, a third takes action. Workflows run on schedule or on event. Unlimited workflows.
KNOWLEDGE
Upload documents, connect databases, or point to URLs. The platform auto-chunks, embeds, and indexes everything. Agents retrieve relevant context via semantic search at inference time. Unlimited knowledge bases per account.
GUARDRAILS
Seven built-in safety and compliance layers: input filtering, output validation, PII redaction, topic restriction, toxicity detection, instruction-injection defence, and custom rule engine. Apply per-agent or account-wide.
MULTI-AGENT
Swarm orchestration with supervisor pattern. A coordinator agent delegates tasks to specialist agents — researcher, coder, validator — each with their own tools, memory, and knowledge. Results merge back into a single response.
OBSERVABILITY
Full execution traces — every tool call, every reasoning step, every memory retrieval, every token. Eight integrated tools: traces, metrics, logs, error tracking, latency profiling, token accounting, cost attribution, and audit logs.
Pricing

Fixed price.
No per-token billing.

Every open-source model. Full agentic platform. Zero logs.

Basic

£79 /mo
£79/mo
£0.00395/M tokens
Tokens
20B
Agents
Unlimited
Conc
4
23,480+ models (up to 228B params)
Intelligence API (auto-summarize)
256K context window
Streaming & batch inference
Zero logs. Fully private.
Cancel anytime
Get Started

Plus

£179 /mo
£179/mo
£0.00358/M tokens
Tokens
50B
Agents
Unlimited
Conc
5
All 23,480+ open-source models
Agentic ecosystem included
Intelligence API (auto-summarize)
256K context window
Streaming & batch inference
Zero logs. Fully private.
Cancel anytime
Get Started
Recommended

Pro

£299 /mo
£299/mo
£0.00299/M tokens
Tokens
100B
Agents
Unlimited
Conc
8
All 23,480+ open-source models
Agentic ecosystem included
Intelligence API (auto-summarize)
256K context window
Streaming & batch inference
Zero logs. Fully private.
Cancel anytime
Priority model loading
Get Started

Pro+

£479 /mo
£479/mo
£0.00192/M tokens
Tokens
250B
Agents
Unlimited
Conc
12
All 23,480+ open-source models
Agentic ecosystem included
Intelligence API (auto-summarize)
256K context window
Streaming & batch inference
Zero logs. Fully private.
Cancel anytime
Priority model loading
Get Started

Ultra

£679 /mo
£679/mo
£0.00136/M tokens
Tokens
500B
Agents
Unlimited
Conc
24
All 23,480+ open-source models
Agentic ecosystem included
Intelligence API (auto-summarize)
256K context window
Streaming & batch inference
Zero logs. Fully private.
Cancel anytime
Priority model loading
Get Started

Enterprise

£999 /mo
£999/mo
£0.0010/M tokens
Tokens
1T
Agents
Unlimited
Conc
30
All 23,480+ open-source models
Agentic ecosystem included
Intelligence API (auto-summarize)
256K context window
Streaming & batch inference
Zero logs. Fully private.
Cancel anytime
Priority model loading
Custom model hosting
Get Started

Open source AI.
Zero compromise.

Get API Key →