Infrastructure before hype.

Control every LLM request before it reaches your provider.

Pulse is the edge control plane for AI requests: block threats, enforce spend, route smartly, and inspect every call.

One endpoint for 19 native providers, plus observability, guardrails, routing, and a playground in one runtime.

Cloudflare edge · 250k free requests/month · 19 native providers · 96% detection, 0% false positives · Incident brief PDF · Langfuse/OTLP export

19

Native providers

+ MCP gateway

~1.41 ms

Median Worker CPU

Cloudflare production p50

96%

ThreatPrint detection

0% false positive

1 URL

Integration

Swap the base URL

Block risky prompts before upstream spend

ThreatPrint and session drift checks run before provider calls.

Enforce budgets and policy at the edge

Durable Objects, KV, and Worker policy gates keep spend decisions close to traffic.

Trace every request in real time

Pipeline spans, W3C trace context, live feed, and exports make every hop inspectable.

Export incident-ready evidence

Incident briefs include timeline, ThreatPrint result, spend impact, evidence, and redactions.

Control-first architecture

Pulse keeps decisions on the edge. Cloudflare Workers handle auth, limits, budgets, ThreatPrint, and routing in the hot path. R2 handles cold archive. That split keeps latency low and lets us price serious controls for every team.

Not more hype. More leverage.

One API, every model you care about

OpenAI/openai
Anthropic/anthropic
Google AI/google
Groq/groq
Mistral/mistral
Together/together
Fireworks/fireworks
Perplexity/perplexity
Cohere/cohere
DeepSeek/deepseek
xAI/xai
AWS Bedrock/bedrock
GCP Vertex/vertex
Azure OpenAI/azure
OpenRouter/openrouter
HuggingFace/huggingface
Replicate/replicate
Ollama/ollama
Custom/custom

Commercial, hyperscaler, open-source, and self-hosted — all behind a single endpoint. Missing yours?

Product tour

Five surfaces your team lives in — one edge runtime

Tap a surface to see how Pulse presents it. Below that is the full named capability matrix (Gateway through Platform) kept in lockstep with the Pulse-Proxy README and worker code.

One endpoint

AI Gateway

A single OpenAI-compatible URL fans out to 19 native providers. Keep the SDK you already use — we speak every upstream's native wire format.

  • OpenAI, Anthropic, Google, Groq, DeepSeek, xAI, Bedrock, Vertex, and more
  • Streaming, tools, vision, JSON mode — all pass through unchanged
  • Per-provider retries and timeouts tuned at the edge
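
Streaming is the easiest place to see the pass-through claim. A minimal sketch using plain `fetch` against the base URL and `X-Pulse-Key` header from the quickstart — the naive line splitting and helper name are illustrative, not SDK code:

```typescript
// Each SSE line from /v1/chat/completions carries a JSON chunk; pull the
// text delta out. Pure helper, easy to test offline.
export function extractDelta(sseLine: string): string {
  if (!sseLine.startsWith("data: ") || sseLine === "data: [DONE]") return "";
  try {
    const chunk = JSON.parse(sseLine.slice(6));
    return chunk.choices?.[0]?.delta?.content ?? "";
  } catch {
    return "";
  }
}

// Streaming through Pulse: same wire format as the provider, only the host
// changes. Note the naive per-read line split -- a production reader should
// buffer partial lines across chunks.
export async function streamChat(prompt: string): Promise<string> {
  const res = await fetch("https://proxy.orionslock.com/openai/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
      "X-Pulse-Key": process.env.PULSE_KEY ?? "",
    },
    body: JSON.stringify({
      model: "gpt-4o",
      stream: true,
      messages: [{ role: "user", content: prompt }],
    }),
  });
  let text = "";
  const reader = res.body!.pipeThrough(new TextDecoderStream()).getReader();
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    for (const line of value.split("\n")) text += extractDelta(line);
  }
  return text;
}
```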

Unified request path

your app
langchain
crewai
cursor

Pulse edge

< 2 ms

openai
anthropic
gemini
bedrock
/v1/chat/completions — native wire format for every provider

Full stack

Every capability named, nothing hand-wavy

Pipeline stages mirror Pulse-Proxy execution order. Pillars list what ships today — optional bindings called out explicitly.

Request path

Same order as the production Worker: admission → upstream → async finalize (April 13, 2026 ThreatPrint benchmark: 96% detection, 0% false positives on the published corpus).

  1. Admission: 10 MB body cap, stale-timestamp reject, CORS
  2. Auth: X-Pulse-Key lookup, tenant resolution
  3. Rate limit: Per-key KV sliding window
  4. Idempotent: Dedup + replay via idempotency_keys
  5. Budget: Durable Object hot path, Supabase fallback
  6. Governor: Spend caps, routing, shadow, quotas, circuits
  7. ThreatPrint: Prescore + full scan, <2 ms budget
  8. Upstream: Tee-stream to provider with key pool
  9. Output: JSON-schema firewall, repair, canary check
  10. Finalize: Signed receipts, logs, live WS feed, OTLP, archive
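
Stage 03's per-key KV sliding window is a standard technique. A minimal in-memory sketch of how such a window admits or rejects — an illustration of the approach, not the production Worker's KV code:

```typescript
// Illustrative sliding-window rate limiter (not Pulse's actual Worker code).
// Timestamps of recent requests are kept per key; a request is admitted only
// if fewer than `limit` requests landed inside the trailing window.
export class SlidingWindow {
  private hits = new Map<string, number[]>();
  constructor(private limit: number, private windowMs: number) {}

  allow(key: string, now: number = Date.now()): boolean {
    // Drop timestamps that have slid out of the window.
    const recent = (this.hits.get(key) ?? []).filter(
      (t) => now - t < this.windowMs,
    );
    if (recent.length >= this.limit) {
      this.hits.set(key, recent);
      return false; // over limit -> reject before any upstream spend
    }
    recent.push(now);
    this.hits.set(key, recent);
    return true;
  }
}
```

The edge version stores the window in KV per `X-Pulse-Key`, so the decision stays in the hot path rather than a round trip to a central store.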

Gateway

One edge endpoint, native wire format per provider.

6 areas
  • 19 native providers

    OpenAI, Anthropic, Google, Groq, Mistral, Cohere, DeepSeek, xAI, Bedrock, Vertex, Azure, and more

  • MCP gateway

    HTTP POST + SSE forwarding for Model Context Protocol clients

  • Bedrock SigV4

    Converse path with host allowlist and signed upstream

  • Custom endpoints

    X-Pulse-Base-URL for OpenAI-compatible self-hosted models

  • Idempotency keys

    Safe POST retries + replay via idempotency_keys table

  • Request attribution

    X-Pulse-Meta-*, X-Pulse-User-Id, session/app/feature/env headers
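
Putting the gateway headers together: a hedged sketch of a request to a self-hosted OpenAI-compatible model via `X-Pulse-Base-URL`, with attribution headers attached. The header names come from the list above; the `custom` path segment, upstream URL, and values are placeholders:

```typescript
// Build a request for a self-hosted model routed through Pulse.
// X-Pulse-Base-URL points the edge at your own OpenAI-compatible endpoint;
// the X-Pulse-User-Id / X-Pulse-Meta-* headers attribute the traffic.
export function customEndpointRequest(pulseKey: string, upstreamBase: string) {
  return {
    url: "https://proxy.orionslock.com/custom/v1/chat/completions", // path segment assumed
    method: "POST" as const,
    headers: {
      "Content-Type": "application/json",
      "X-Pulse-Key": pulseKey,
      "X-Pulse-Base-URL": upstreamBase,      // your OpenAI-compatible host
      "X-Pulse-User-Id": "user_123",         // attribution (placeholder value)
      "X-Pulse-Meta-feature": "support-bot", // free-form metadata (placeholder)
    },
    body: JSON.stringify({
      model: "my-local-model",
      messages: [{ role: "user", content: "ping" }],
    }),
  };
}
```

Then `await fetch(req.url, req)` as usual; pair it with an idempotency key for safe POST retries.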

Observability

Every call traced, streamed, and exportable.

7 areas
  • Live WebSocket feed

    Stream new requests to the dashboard in real time

  • Pipeline + W3C trace

    Hierarchical spans with elapsed-time breakdowns

  • Policy receipts

    Signed request/response hashes with policy version and route context

  • OTLP export

    Generic OTLP and Langfuse destinations, SSRF-guarded

  • Incident-brief PDF

    One-click postmortem export for any request

  • Phoenix cold archive

    Request logs flushed to R2 on a cron schedule

  • Saved views

    URL-shareable filters across range, provider, status, trace focus
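
Because Pulse speaks W3C trace context, you can hand it a `traceparent` header from your own tracer and its pipeline spans will link under your trace. The header layout below follows the W3C Trace Context spec; the helper itself is a sketch:

```typescript
import { randomBytes } from "node:crypto";

// Build a W3C traceparent header: version-traceId-spanId-flags.
// 32 hex chars of trace id, 16 of span id, "01" = sampled.
export function makeTraceparent(): string {
  const traceId = randomBytes(16).toString("hex");
  const spanId = randomBytes(8).toString("hex");
  return `00-${traceId}-${spanId}-01`;
}
```

Send it as `headers: { traceparent: makeTraceparent(), ... }` alongside `X-Pulse-Key`, and the same id shows up in the live feed and OTLP export.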

Security

Measured protection, not marketing promises.

6 areas
  • ThreatPrint scanning

    96% detection / 0% false positives on the published corpus, <2 ms budget

  • Prompt Guard 2

    Optional Workers AI edge judge for richer classification

    optional binding

  • Session drift blocking

    KV-tracked hostile turn count per session

  • Output schema firewall

    JSON-schema validate, repair, and firewall-event log

  • Cross-tenant canaries

    Honeytokens that surface data leakage across tenants

  • SSRF guards

    Egress policy on OTLP, governor webhooks, SSO metadata

Governor

Policy decisions before upstream spend happens.

8 areas
  • Spend caps

    Per-key, per-project, per-org caps enforced at the edge

  • Conditional routes

    Regex + metadata rules that pick provider/model

  • Provider fallbacks

    Safely parsed fallback chains; never a silent re-route to OpenAI

  • Shadow routing

    Mirror traffic to a secondary provider for comparison

  • Weighted model routing

    Percentage split between models at the gateway

  • Key pools

    Round-robin across encrypted upstream credentials

  • Tool & task quotas

    Per-tool and per-task budgets with reservations

  • Circuit breakers

    Automatic skip + recovery on provider failure
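
Weighted model routing is a cumulative draw over the configured percentages. A small sketch of the selection logic — an illustration of the technique, not the gateway's implementation or its config format:

```typescript
// Illustrative weighted pick: `routes` maps model -> weight (any positive
// numbers; they need not sum to 100). `rand` in [0, 1) selects proportionally;
// it is injectable so the draw is deterministic in tests.
export function pickWeighted(
  routes: Record<string, number>,
  rand: number = Math.random(),
): string {
  const entries = Object.entries(routes);
  const total = entries.reduce((sum, [, w]) => sum + w, 0);
  let cursor = rand * total;
  for (const [model, weight] of entries) {
    cursor -= weight;
    if (cursor < 0) return model;
  }
  return entries[entries.length - 1][0]; // guard against float edge cases
}
```

A 90/10 split therefore sends roughly nine in ten requests to the first model, with the split applied at the gateway so clients never change.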

LLM Ops

Ship prompt + model changes with real evidence.

8 areas
  • Prompt templates

    Versioned templates with threat scans and activation flow

  • Datasets

    CSV import, row editing, eval fixtures

  • Evaluations

    Scorers, runs, score distributions, result drilldown

  • Experiments

    Draft A/B splits, run, pause, compare aggregates

  • Annotations

    Human ratings on request logs with CSV export

  • Prompt replay

    Re-run a batch of historical requests against a new template

  • Multi-model playground

    Side-by-side comparison of up to 4 providers, with ThreatPrint per slot

  • Semantic cache

    Optional Vectorize + Workers AI embedding cache

    optional binding

Admin

Run it as a team without surprise outages.

6 areas
  • Orgs + projects

    Multi-tenant membership with project-scoped budgets

  • SAML SSO

    Test-before-enforce flow, org-scoped metadata

  • Audit log

    Admin activity stream with JSON diffs

  • Virtual keys

    Provider credentials stored AES-256-GCM; plaintext never leaves the server

  • Policy bundles

    Versioned policy snapshots with drift audit

  • Stripe billing

    Checkout, customer portal, webhook-driven entitlements

Platform

Meet developers where they already are.

6 areas
  • Node SDK

    OpenAI + Anthropic + LangChain + Vercel AI SDK adapters

  • Python SDK

    PulseRuntime client + OpenAI and Anthropic shims

  • CLI

    login / keys list-create-revoke / test / status

  • Self-host Docker

    workerd image with documented graceful degradation

  • OpenAPI spec

    Published contract for proxy, dashboard, governor, MCP

  • Reproducible benchmarks

    Open corpus + script in Pulse-Proxy/BENCHMARKS.md

Edge pipeline

Ten stages, one request.

Every proxied call flows through these stages on Cloudflare Workers — from admission control to output firewall — before your app sees a single byte of response.

Stage 01 · Admission: 10 MB body cap, stale-timestamp reject, CORS

Stage 02 · Auth: X-Pulse-Key lookup, tenant resolution

Stage 03 · Rate limit: Per-key KV sliding window

Stage 04 · Idempotent: Dedup + replay via idempotency_keys

Stage 05 · Budget: Durable Object hot path, Supabase fallback

Stage 06 · Governor: Spend caps, routing, shadow, quotas, circuits

Stage 07 · ThreatPrint: Prescore + full scan, <2 ms budget

Stage 08 · Upstream: Tee-stream to provider with key pool

Stage 09 · Output: JSON-schema firewall, repair, canary check

Stage 10 · Finalize: Signed receipts, logs, live WS feed, OTLP, archive

See it running

Every call, narrated.

This is the live feed view from the dashboard — allow, block, repair — all coming off Cloudflare Workers the instant each request completes. Illustrative replay on this page; real traces start streaming the moment you sign in.

  • · Allow rows include governor decisions, rate-limit state, and idempotency hits.
  • · Blocks name the exact reason — ThreatPrint, session drift, policy.
  • · Repairs flag output-firewall activity, never a silent rewrite.

Live feed

Illustrative replay

200 OK · req_42aa · openai/gpt-4o-mini · 412 ms · 1240 tok

streamed · cached governor decision

200 OK · req_09b2 · anthropic/claude-3-5-sonnet · 690 ms · 880 tok

routed via shadow fallback

403 Blocked · req_ff17 · openai/gpt-4o · 3 ms · 0 tok

threatprint: jailbreak (score 0.94)

200 OK · req_18c3 · google/gemini-2.5-flash · 341 ms · 540 tok

prompt-guard-2 clean

Illustrative only. Real traces stream from Cloudflare Workers to the dashboard over a signed websocket once you sign in.

Quickstart

Point your SDK at Pulse. That's the whole diff.

No rewrites, no middleware, no vendor lock-in. Pulse speaks every provider's native wire format.

quickstart.ts · one-line diff
import OpenAI from "openai";

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: "https://proxy.orionslock.com/openai/v1",
  defaultHeaders: { "X-Pulse-Key": process.env.PULSE_KEY! },
});

const r = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello, Pulse." }],
});

Drop-in

Same SDK, same wire format. Swap the base URL and ship.

Global edge

Runs on Cloudflare Workers across 250+ Points of Presence.

Auto-secured

ThreatPrint scores every prompt before it reaches upstream.

Dashboard

One view. Every provider. Every request.

Live spend, token counts, latency, threat scores, and incident replay — aggregated from Pulse itself, not scraped per-provider.

pulse.orionslock.com/dashboard

Control plane

March 2026 · month-to-date

Sync Now

Total MTD Spend

$23.47

+5% vs last month

Requests

18.2k

99.98% uptime

Tokens Used

10.5M

input + output

Potential Savings

$12.60

3 tips

Daily Spend

OpenAI
Anthropic
Google · Groq · others

Spend by Provider

OpenAI
$14.82
Anthropic
$6.90
Groq
$1.10
Google
$0.65

How it works

From zero to full visibility in minutes

01

Swap one env var

Point OPENAI_BASE_URL (or the equivalent) at proxy.orionslock.com/<provider>. Keep your existing SDK — OpenAI, Anthropic, Vertex, Bedrock, whatever.

02

Traffic runs through the edge

Every request is ThreatPrint-scanned (under 2 ms scan budget), cost-metered, rate-limited, and logged on Cloudflare Workers (median CPU ~1.41 ms).

03

See and control your AI

Unified dashboards for spend, tokens, latency, and threats. Governor rules can reroute or block before traffic leaves your perimeter.

Built on, and compatible with

Cloudflare Workers

Edge runtime

Managed Postgres

Audit + auth data

AES-256-GCM

Keys at rest

Security review

By request

Stripe billing

Cards + invoices

HTTPS only

TLS 1.3 everywhere

19

Native providers

+ MCP gateway

96%

ThreatPrint detection

0% false positive

< 2 ms

Scan budget

~1.41 ms median CPU

250+

Cloudflare PoPs

Global edge runtime

Why Pulse

Infra shouldn't be gated

Most gateways are priced for procurement cycles. Pulse ships production controls at a builder price, without cutting core safety or visibility.

$25 Pro, not enterprise tax

Flat pricing that starts generous on the Developer tier and scales predictably, well before enterprise procurement kicks in.

All features, one plan

ThreatPrint, Governor, waste detection, playground — no "contact sales" upgrades to unlock basic production safety.

Ship today, not next quarter

Three-line integration, no new SDK to learn, no proxy config YAML that needs its own team to maintain.

Features

Everything you need to run LLMs in production

Spend tracking is just the beginning. Pulse ships with the things teams keep bolting on themselves.

ThreatPrint security scanning

Behavioral + structural detection for prompt injection, PII exfiltration, jailbreaks, and URL-based data leaks — on every request, before upstream.

Governor routing

Failover chains, cost-aware routing, and policy-based blocks. If OpenAI is down, Pulse re-routes to Anthropic automatically with the same request shape.

Unified spend dashboard

Daily spend, token counts, per-model breakdowns, projected month-end, and attribution across keys and teams — no per-provider logins.

Budget alerts via email

Set monthly limits per key, per team, or per org. Pulse emails you at 50/80/100% so you never wake up to a bill surprise.

Waste detection engine

Spots oversized system prompts, frontier models used on trivial tasks, missed prompt-cache hits, and suggests drop-ins that cut cost by 40–80%.

Virtual keys, AES-256-GCM

Provider credentials and upstream key pools are AES-256-GCM encrypted before they touch the database — plaintext never leaves the server.

Live request feed + replay

Stream every call as it happens. Inspect prompts, responses, threat signals, routing decisions, and latency breakdowns with full payload replay.

Prompt versioning & evals

Ship prompt changes with git-style diffs, run them against datasets, and promote to production only when the eval score ticks up.

Multi-model playground

Compare up to four providers on the same prompt. ThreatPrint runs per-slot so you can see how different models handle the same payload.

Pick your path

Targeted next steps.

Four personas, four distinct paths through the product. Pick one — we won't try to sell you the rest of the tour until you want it.

Drop-in at your base URL.

Swap one env var, keep your native OpenAI / Anthropic / Google SDK. Pulse preserves the wire format per provider.

Pricing

Simple, transparent pricing

Start on Developer. Upgrade when you need higher request volume and seats.

Developer

$0/mo

Mission-tier. Real product, no credit card required.

  • 250k requests / month
  • 1 seat
  • 14-day retention
  • Full proxy gateway + cost tracking
  • Virtual keys
  • Basic ThreatPrint
  • No credit card required

Pro

$25/mo

Best fit for production builders shipping daily.

  • 2M requests / month
  • 3 seats
  • 30-day retention
  • Full ThreatPrint
  • Output schema firewall
  • Cross-tenant canary detection
  • Prompt cache metrics
  • Slack + email alerts
  • CLI + webhooks
  • MCP gateway

Team

$125/mo

Cross-functional governance for high-volume teams.

  • 20M requests / month
  • 10 seats
  • 90-day retention
  • SAML SSO + RBAC
  • Audit logs + policy bundles
  • WebSocket live feed
  • Priority support (48-hour response)

Need more than 10 seats, custom retention, or a private deployment?

Contact us

Get your AI stack under control

One endpoint, every model, full observability and security. Developer tier, no card, five-minute setup.

Questions? Email us — we reply within 24 hours.

You're one env var away

$ export OPENAI_BASE_URL=https://proxy.orionslock.com/openai/v1
$ export PULSE_KEY=plse_live_…
$ npm run dev
✓ routed through Pulse