Infrastructure before hype.

Control every LLM request before it reaches your provider.

Pulse is the edge control plane for AI requests: block threats, enforce spend, route smartly, and inspect every call.

One endpoint for 19 native providers, plus observability, guardrails, routing, and a playground in one runtime.

Cloudflare edge · 250k free requests/month · 19 native providers · 96% detection, 0% false positives · Incident brief PDF · Langfuse/OTLP export

19

Native providers

+ MCP gateway

~1.41 ms

Median Worker CPU

Cloudflare production p50

96%

ThreatPrint detection

0% false positive

1 URL

Integration

Swap the base URL

Block risky prompts before upstream spend

ThreatPrint and session drift checks run before provider calls.

Enforce budgets and policy at the edge

Durable Objects, KV, and Worker policy gates keep spend decisions close to traffic.

Trace every request in real time

Pipeline spans, W3C trace context, live feed, and exports make every hop inspectable.

Export incident-ready evidence

Incident briefs include timeline, ThreatPrint result, spend impact, evidence, and redactions.

Control-first architecture

Pulse keeps decisions on the edge. Cloudflare Workers handle auth, limits, budgets, ThreatPrint, and routing in the hot path. R2 handles cold archive. That split keeps latency low and lets us price serious controls for every team.

Not more hype. More leverage.

One API, every model you care about

OpenAI/openai
Anthropic/anthropic
Google AI/google
Groq/groq
Mistral/mistral
Together/together
Fireworks/fireworks
Perplexity/perplexity
Cohere/cohere
DeepSeek/deepseek
xAI/xai
AWS Bedrock/bedrock
GCP Vertex/vertex
Azure OpenAI/azure
OpenRouter/openrouter
HuggingFace/huggingface
Replicate/replicate
Ollama/ollama
Custom/custom

Commercial, hyperscaler, open-source, and self-hosted — all behind a single endpoint. Missing yours?

Product tour

Five surfaces your team lives in — one edge runtime

Tap a surface to see how Pulse presents it. Below that is the full named capability matrix (Gateway through Platform) kept in lockstep with the Pulse-Proxy README and worker code.

One endpoint

AI Gateway

A single OpenAI-compatible URL fans out to 19 native providers. Keep the SDK you already use — we speak every upstream's native wire format.

  • OpenAI, Anthropic, Google, Groq, DeepSeek, xAI, Bedrock, Vertex, and more
  • Streaming, tools, vision, JSON mode — all pass through unchanged
  • Per-provider retries and timeouts tuned at the edge
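
Streaming is the easiest place to see the pass-through claim. A minimal sketch using plain `fetch` against the base URL and `X-Pulse-Key` header from the quickstart — the naive line splitting and helper name are illustrative, not SDK code:

```typescript
// Each SSE line from /v1/chat/completions carries a JSON chunk; pull the
// text delta out. Pure helper, easy to test offline.
export function extractDelta(sseLine: string): string {
  if (!sseLine.startsWith("data: ") || sseLine === "data: [DONE]") return "";
  try {
    const chunk = JSON.parse(sseLine.slice(6));
    return chunk.choices?.[0]?.delta?.content ?? "";
  } catch {
    return "";
  }
}

// Streaming through Pulse: same wire format as the provider, only the host
// changes. Note the naive per-read line split -- a production reader should
// buffer partial lines across chunks.
export async function streamChat(prompt: string): Promise<string> {
  const res = await fetch("https://proxy.orionslock.com/openai/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
      "X-Pulse-Key": process.env.PULSE_KEY ?? "",
    },
    body: JSON.stringify({
      model: "gpt-4o",
      stream: true,
      messages: [{ role: "user", content: prompt }],
    }),
  });
  let text = "";
  const reader = res.body!.pipeThrough(new TextDecoderStream()).getReader();
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    for (const line of value.split("\n")) text += extractDelta(line);
  }
  return text;
}
```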

Unified request path

your app
langchain
crewai
cursor

Pulse edge

< 2 ms

openai
anthropic
gemini
bedrock
/v1/chat/completions — native wire format for every provider

Full stack

Every capability named, nothing hand-wavy

Pipeline stages mirror Pulse-Proxy execution order. Pillars list what ships today — optional bindings called out explicitly.

Request path

Same order as the production Worker: admission → upstream → async finalize (April 13, 2026 ThreatPrint benchmark: 96% detection, 0% false positives on the published corpus).

  1. Admission: 10 MB body cap, stale-timestamp reject, CORS
  2. Auth: X-Pulse-Key lookup, tenant resolution
  3. Rate limit: Per-key KV sliding window
  4. Idempotent: Dedup + replay via idempotency_keys
  5. Budget: Durable Object hot path, Supabase fallback
  6. Governor: Spend caps, routing, shadow, quotas, circuits
  7. ThreatPrint: Prescore + full scan, <2 ms budget
  8. Upstream: Tee-stream to provider with key pool
  9. Output: JSON-schema firewall, repair, canary check
  10. Finalize: Signed receipts, logs, live WS feed, OTLP, archive
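
Stage 03's per-key KV sliding window is a standard technique. A minimal in-memory sketch of how such a window admits or rejects — an illustration of the approach, not the production Worker's KV code:

```typescript
// Illustrative sliding-window rate limiter (not Pulse's actual Worker code).
// Timestamps of recent requests are kept per key; a request is admitted only
// if fewer than `limit` requests landed inside the trailing window.
export class SlidingWindow {
  private hits = new Map<string, number[]>();
  constructor(private limit: number, private windowMs: number) {}

  allow(key: string, now: number = Date.now()): boolean {
    // Drop timestamps that have slid out of the window.
    const recent = (this.hits.get(key) ?? []).filter(
      (t) => now - t < this.windowMs,
    );
    if (recent.length >= this.limit) {
      this.hits.set(key, recent);
      return false; // over limit -> reject before any upstream spend
    }
    recent.push(now);
    this.hits.set(key, recent);
    return true;
  }
}
```

The edge version stores the window in KV per `X-Pulse-Key`, so the decision stays in the hot path rather than a round trip to a central store.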

Gateway

One edge endpoint, native wire format per provider.

6 areas
  • 19 native providers

    OpenAI, Anthropic, Google, Groq, Mistral, Cohere, DeepSeek, xAI, Bedrock, Vertex, Azure, and more

  • MCP gateway

    HTTP POST + SSE forwarding for Model Context Protocol clients

  • Bedrock SigV4

    Converse path with host allowlist and signed upstream

  • Custom endpoints

    X-Pulse-Base-URL for OpenAI-compatible self-hosted models

  • Idempotency keys

    Safe POST retries + replay via idempotency_keys table

  • Request attribution

    X-Pulse-Meta-*, X-Pulse-User-Id, session/app/feature/env headers
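
Putting the gateway headers together: a hedged sketch of a request to a self-hosted OpenAI-compatible model via `X-Pulse-Base-URL`, with attribution headers attached. The header names come from the list above; the `custom` path segment, upstream URL, and values are placeholders:

```typescript
// Build a request for a self-hosted model routed through Pulse.
// X-Pulse-Base-URL points the edge at your own OpenAI-compatible endpoint;
// the X-Pulse-User-Id / X-Pulse-Meta-* headers attribute the traffic.
export function customEndpointRequest(pulseKey: string, upstreamBase: string) {
  return {
    url: "https://proxy.orionslock.com/custom/v1/chat/completions", // path segment assumed
    method: "POST" as const,
    headers: {
      "Content-Type": "application/json",
      "X-Pulse-Key": pulseKey,
      "X-Pulse-Base-URL": upstreamBase,      // your OpenAI-compatible host
      "X-Pulse-User-Id": "user_123",         // attribution (placeholder value)
      "X-Pulse-Meta-feature": "support-bot", // free-form metadata (placeholder)
    },
    body: JSON.stringify({
      model: "my-local-model",
      messages: [{ role: "user", content: "ping" }],
    }),
  };
}
```

Then `await fetch(req.url, req)` as usual; pair it with an idempotency key for safe POST retries.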

Observability

Every call traced, streamed, and exportable.

7 areas
  • Live WebSocket feed

    Stream new requests to the dashboard in real time

  • Pipeline + W3C trace

    Hierarchical spans with elapsed-time breakdowns

  • Policy receipts

    Signed request/response hashes with policy version and route context

  • OTLP export

    Generic OTLP and Langfuse destinations, SSRF-guarded

  • Incident-brief PDF

    One-click postmortem export for any request

  • Phoenix cold archive

    Request logs flushed to R2 on a cron schedule

  • Saved views

    URL-shareable filters across range, provider, status, trace focus
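
Because Pulse speaks W3C trace context, you can hand it a `traceparent` header from your own tracer and its pipeline spans will link under your trace. The header layout below follows the W3C Trace Context spec; the helper itself is a sketch:

```typescript
import { randomBytes } from "node:crypto";

// Build a W3C traceparent header: version-traceId-spanId-flags.
// 32 hex chars of trace id, 16 of span id, "01" = sampled.
export function makeTraceparent(): string {
  const traceId = randomBytes(16).toString("hex");
  const spanId = randomBytes(8).toString("hex");
  return `00-${traceId}-${spanId}-01`;
}
```

Send it as `headers: { traceparent: makeTraceparent(), ... }` alongside `X-Pulse-Key`, and the same id shows up in the live feed and OTLP export.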

Security

Measured protection, not marketing promises.

6 areas
  • ThreatPrint scanning

    96% detection / 0% false positives on the published corpus, <2 ms budget

  • Prompt Guard 2

    Optional Workers AI edge judge for richer classification

    optional binding

  • Session drift blocking

    KV-tracked hostile turn count per session

  • Output schema firewall

    JSON-schema validate, repair, and firewall-event log

  • Cross-tenant canaries

    Honeytokens that surface data leakage across tenants

  • SSRF guards

    Egress policy on OTLP, governor webhooks, SSO metadata

Governor

Policy decisions before upstream spend happens.

8 areas
  • Spend caps

    Per-key, per-project, per-org caps enforced at the edge

  • Conditional routes

    Regex + metadata rules that pick provider/model

  • Provider fallbacks

    Safely parsed fallback chains; never a silent re-route to OpenAI

  • Shadow routing

    Mirror traffic to a secondary provider for comparison

  • Weighted model routing

    Percentage split between models at the gateway

  • Key pools

    Round-robin across encrypted upstream credentials

  • Tool & task quotas

    Per-tool and per-task budgets with reservations

  • Circuit breakers

    Automatic skip + recovery on provider failure
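
Weighted model routing is a cumulative draw over the configured percentages. A small sketch of the selection logic — an illustration of the technique, not the gateway's implementation or its config format:

```typescript
// Illustrative weighted pick: `routes` maps model -> weight (any positive
// numbers; they need not sum to 100). `rand` in [0, 1) selects proportionally;
// it is injectable so the draw is deterministic in tests.
export function pickWeighted(
  routes: Record<string, number>,
  rand: number = Math.random(),
): string {
  const entries = Object.entries(routes);
  const total = entries.reduce((sum, [, w]) => sum + w, 0);
  let cursor = rand * total;
  for (const [model, weight] of entries) {
    cursor -= weight;
    if (cursor < 0) return model;
  }
  return entries[entries.length - 1][0]; // guard against float edge cases
}
```

A 90/10 split therefore sends roughly nine in ten requests to the first model, with the split applied at the gateway so clients never change.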

LLM Ops

Ship prompt + model changes with real evidence.

8 areas
  • Prompt templates

    Versioned templates with threat scans and activation flow

  • Datasets

    CSV import, row editing, eval fixtures

  • Evaluations

    Scorers, runs, score distributions, result drilldown

  • Experiments

    Draft A/B splits, run, pause, compare aggregates

  • Annotations

    Human ratings on request logs with CSV export

  • Prompt replay

    Re-run a batch of historical requests against a new template

  • Multi-model playground

    Side-by-side comparison of up to 4 providers, with ThreatPrint per slot

  • Semantic cache

    Optional Vectorize + Workers AI embedding cache

    optional binding

Admin

Run it as a team without surprise outages.

6 areas
  • Orgs + projects

    Multi-tenant membership with project-scoped budgets

  • SAML SSO

    Test-before-enforce flow, org-scoped metadata

  • Audit log

    Admin activity stream with JSON diffs

  • Virtual keys

    Provider credentials stored AES-256-GCM; plaintext never leaves the server

  • Policy bundles

    Versioned policy snapshots with drift audit

  • Stripe billing

    Checkout, customer portal, webhook-driven entitlements

Platform

Meet developers where they already are.

6 areas
  • Node SDK

    OpenAI + Anthropic + LangChain + Vercel AI SDK adapters

  • Python SDK

    PulseRuntime client + OpenAI and Anthropic shims

  • CLI

    login / keys list-create-revoke / test / status

  • Self-host Docker

    workerd image with documented graceful degradation

  • OpenAPI spec

    Published contract for proxy, dashboard, governor, MCP

  • Reproducible benchmarks

    Open corpus + script in Pulse-Proxy/BENCHMARKS.md

Edge pipeline

Ten stages, one request.

Every proxied call flows through these stages on Cloudflare Workers — from admission control to output firewall — before your app sees a single byte of response.

Stage 01 · Admission: 10 MB body cap, stale-timestamp reject, CORS

Stage 02 · Auth: X-Pulse-Key lookup, tenant resolution

Stage 03 · Rate limit: Per-key KV sliding window

Stage 04 · Idempotent: Dedup + replay via idempotency_keys

Stage 05 · Budget: Durable Object hot path, Supabase fallback

Stage 06 · Governor: Spend caps, routing, shadow, quotas, circuits

Stage 07 · ThreatPrint: Prescore + full scan, <2 ms budget

Stage 08 · Upstream: Tee-stream to provider with key pool

Stage 09 · Output: JSON-schema firewall, repair, canary check

Stage 10 · Finalize: Signed receipts, logs, live WS feed, OTLP, archive

See it running

Every call, narrated.

This is the live feed view from the dashboard — allow, block, repair — all coming off Cloudflare Workers the instant each request completes. Illustrative replay on this page; real traces start streaming the moment you sign in.

  • · Allow rows include governor decisions, rate-limit state, and idempotency hits.
  • · Blocks name the exact reason — ThreatPrint, session drift, policy.
  • · Repairs flag output-firewall activity, never a silent rewrite.

Live feed

Illustrative replay

200 OK · req_42aa · openai/gpt-4o-mini · 412 ms · 1240 tok

streamed · cached governor decision

200 OK · req_09b2 · anthropic/claude-3-5-sonnet · 690 ms · 880 tok

routed via shadow fallback

403 Blocked · req_ff17 · openai/gpt-4o · 3 ms · 0 tok

threatprint: jailbreak (score 0.94)

200 OK · req_18c3 · google/gemini-2.5-flash · 341 ms · 540 tok

prompt-guard-2 clean

Illustrative only. Real traces stream from Cloudflare Workers to the dashboard over a signed websocket once you sign in.

Quickstart

Point your SDK at Pulse. That's the whole diff.

No rewrites, no middleware, no vendor lock-in. Pulse speaks every provider's native wire format.

quickstart.ts · one-line diff
import OpenAI from "openai";

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: "https://proxy.orionslock.com/openai/v1",
  defaultHeaders: { "X-Pulse-Key": process.env.PULSE_KEY! },
});

const r = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello, Pulse." }],
});

Drop-in

Same SDK, same wire format. Swap the base URL and ship.

Global edge

Runs on Cloudflare Workers across 250+ Points of Presence.

Auto-secured

ThreatPrint scores every prompt before it reaches upstream.

Dashboard

One view. Every provider. Every request.

Live spend, token counts, latency, threat scores, and incident replay — aggregated from Pulse itself, not scraped per-provider.

pulse.orionslock.com/dashboard

Control plane

March 2026 · month-to-date

Sync Now

Total MTD Spend

$23.47

+5% vs last month

Requests

18.2k

99.98% uptime

Tokens Used

10.5M

input + output

Potential Savings

$12.60

3 tips

Daily Spend

OpenAI
Anthropic
Google · Groq · others

Spend by Provider

OpenAI
$14.82
Anthropic
$6.90
Groq
$1.10
Google
$0.65

How it works

From zero to full visibility in minutes

01

Swap one env var

Point OPENAI_BASE_URL (or the equivalent) at proxy.orionslock.com/<provider>. Keep your existing SDK — OpenAI, Anthropic, Vertex, Bedrock, whatever.

02

Traffic runs through the edge

Every request is ThreatPrint-scanned (under 2 ms scan budget), cost-metered, rate-limited, and logged on Cloudflare Workers (median CPU ~1.41 ms).

03

See and control your AI

Unified dashboards for spend, tokens, latency, and threats. Governor rules can reroute or block before traffic leaves your perimeter.

Built on, and compatible with

Cloudflare Workers

Edge runtime

Managed Postgres

Audit + auth data

AES-256-GCM

Keys at rest

Security review

By request

Stripe billing

Cards + invoices

HTTPS only

TLS 1.3 everywhere

19

Native providers

+ MCP gateway

96%

ThreatPrint detection

0% false positive

< 2 ms

Scan budget

~1.41 ms median CPU

250+

Cloudflare PoPs

Global edge runtime

Why Pulse

Infra shouldn't be gated

Most gateways are priced for procurement cycles. Pulse ships production controls at a builder price, without cutting core safety or visibility.

$25 Pro, not enterprise tax

Flat pricing that starts generous on the Developer tier and scales predictably, well before enterprise procurement kicks in.

All features, one plan

ThreatPrint, Governor, waste detection, playground — no "contact sales" upgrades to unlock basic production safety.

Ship today, not next quarter

Three-line integration, no new SDK to learn, no proxy config YAML that needs its own team to maintain.

Features

Everything you need to run LLMs in production

Spend tracking is just the beginning. Pulse ships with the things teams keep bolting on themselves.

ThreatPrint security scanning

Behavioral + structural detection for prompt injection, PII exfiltration, jailbreaks, and URL-based data leaks — on every request, before upstream.

Governor routing

Failover chains, cost-aware routing, and policy-based blocks. If OpenAI is down, Pulse re-routes to Anthropic automatically with the same request shape.

Unified spend dashboard

Daily spend, token counts, per-model breakdowns, projected month-end, and attribution across keys and teams — no per-provider logins.

Budget alerts via email

Set monthly limits per key, per team, or per org. Pulse emails you at 50/80/100% so you never wake up to a bill surprise.

Waste detection engine

Spots oversized system prompts, frontier models used on trivial tasks, missed prompt-cache hits, and suggests drop-ins that cut cost by 40–80%.

Virtual keys, AES-256-GCM

Provider credentials and upstream key pools are AES-256-GCM encrypted before they touch the database — plaintext never leaves the server.

Live request feed + replay

Stream every call as it happens. Inspect prompts, responses, threat signals, routing decisions, and latency breakdowns with full payload replay.

Prompt versioning & evals

Ship prompt changes with git-style diffs, run them against datasets, and promote to production only when the eval score ticks up.

Multi-model playground

Compare up to four providers on the same prompt. ThreatPrint runs per-slot so you can see how different models handle the same payload.

Pick your path

Targeted next steps.

Four personas, four distinct paths through the product. Pick one — we won't try to sell you the rest of the tour until you want it.

Drop-in at your base URL.

Swap one env var, keep your native OpenAI / Anthropic / Google SDK. Pulse preserves the wire format per provider.

Pricing

Simple, transparent pricing

Start on Developer. Upgrade when you need higher request volume and seats.

Developer

$0/mo

Mission-tier. Real product, no credit card required.

  • 250k requests / month
  • 1 seat
  • 14-day retention
  • Full proxy gateway + cost tracking
  • Virtual keys
  • Basic ThreatPrint
  • No credit card required

Pro

$25/mo

Best fit for production builders shipping daily.

  • 2M requests / month
  • 3 seats
  • 30-day retention
  • Full ThreatPrint
  • Output schema firewall
  • Cross-tenant canary detection
  • Prompt cache metrics
  • Slack + email alerts
  • CLI + webhooks
  • MCP gateway

Team

$125/mo

Cross-functional governance for high-volume teams.

  • 20M requests / month
  • 10 seats
  • 90-day retention
  • SAML SSO + RBAC
  • Audit logs + policy bundles
  • WebSocket live feed
  • Priority support (48-hour response)

Need more than 10 seats, custom retention, or a private deployment?

Contact us

Get your AI stack under control

One endpoint, every model, full observability and security. Developer tier, no card, five-minute setup.

Questions? Email us — we reply within 24 hours.

You're one env var away

$ export OPENAI_BASE_URL=https://proxy.orionslock.com/openai/v1
$ export PULSE_KEY=plse_live_…
$ npm run dev
✓ routed through Pulse