OpenCompress API
Change your base URL. Keep everything else.
Same SDK, same code, same models — 40-60% cheaper.
Sign up and grab your sk-occ-* key from the dashboard. Free $10 credit, no card required.
Point your OpenAI SDK at our endpoint — that's the only change.
https://www.opencompress.ai/api/v1

Every request is compressed, routed to the upstream LLM, and billed at the reduced rate.
Quick Start
Two lines change. Everything else stays the same.
from openai import OpenAI

client = OpenAI(
    base_url="https://www.opencompress.ai/api/v1",
    api_key="sk-occ-your-key-here",
)

response = client.chat.completions.create(
    model="anthropic/claude-sonnet-4.6",
    messages=[{"role": "user", "content": "Hello"}],
)

print(response.choices[0].message.content)

Bring Your Own Key
Already have an API key from OpenAI, Anthropic, or another provider? Pass it via headers and we'll route to your account — you only pay for the compression savings.
from openai import OpenAI

client = OpenAI(
    base_url="https://www.opencompress.ai/api/v1",
    api_key="sk-occ-your-key-here",
    default_headers={
        "X-Upstream-Key": "sk-your-openai-key",
        "X-Upstream-Base-Url": "https://api.openai.com/v1",
    },
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
)

How it works: your requests reach the LLM with your own upstream key, so the provider bills your account directly. We compress the prompt first so you use fewer tokens, and X-Upstream-Base-Url tells us where to forward the request.
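Both modes use the same request shape; only the client configuration differs. A minimal sketch of the two setups — the helper function and key values are illustrative, not part of the API:

```python
def client_config(occ_key, upstream_key=None, upstream_base_url=None):
    """Build OpenAI-SDK constructor kwargs for OpenCompress.

    Managed mode: pass only your sk-occ-* key and OpenCompress
    bills you at the reduced rate.  BYOK mode: add the two
    X-Upstream-* headers and your provider bills you directly.
    """
    config = {
        "base_url": "https://www.opencompress.ai/api/v1",
        "api_key": occ_key,
    }
    if upstream_key is not None:
        config["default_headers"] = {
            "X-Upstream-Key": upstream_key,
            "X-Upstream-Base-Url": upstream_base_url
            or "https://api.openai.com/v1",
        }
    return config

# Managed billing: no extra headers.
managed = client_config("sk-occ-your-key-here")

# BYOK: requests are forwarded to your own OpenAI account.
byok = client_config(
    "sk-occ-your-key-here",
    upstream_key="sk-your-openai-key",
    upstream_base_url="https://api.openai.com/v1",
)
```

Either dict can then be splatted into the SDK constructor with OpenAI(**managed) or OpenAI(**byok).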
Streaming
Set stream=True and get SSE exactly like the OpenAI API. Compression happens before forwarding — streaming latency is unaffected.
response = client.chat.completions.create(
    model="anthropic/claude-sonnet-4.6",
    messages=[{"role": "user", "content": "Write a haiku"}],
    stream=True,
)

for chunk in response:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

Why Use the API
Drop-in replacement for any OpenAI-compatible workflow.
OpenAI-Compatible
Drop-in replacement. Same SDK, same endpoints, same response format.
40-60% Cheaper
Prompts are compressed before forwarding. You keep 80% of the savings.
All Models
ChatGPT, Claude, Gemini, DeepSeek, Grok, Kimi — all via one endpoint.
Streaming
Full SSE streaming. stream=true works like the OpenAI API.
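The headline numbers above can be sanity-checked with a little arithmetic. A sketch, assuming tokens are billed linearly and that "keep 80% of the savings" means the fee is 20% of the tokens saved (prices here are illustrative, not a rate card):

```python
def effective_cost(prompt_tokens, price_per_token,
                   compression_ratio, savings_share=0.80):
    """Estimated net cost of a compressed prompt.

    compression_ratio: fraction of the original tokens removed,
    e.g. 0.5 means the prompt shrinks to half its size.
    savings_share: fraction of the savings you keep (80% here).
    """
    full_cost = prompt_tokens * price_per_token
    saved = full_cost * compression_ratio
    # You keep savings_share of the saved amount; the rest is the fee.
    return full_cost - saved * savings_share

# 100k prompt tokens at $2.50 / 1M tokens, prompt halved by compression:
full = 100_000 * 2.50 / 1_000_000            # $0.25 uncompressed
net = effective_cost(100_000, 2.50 / 1_000_000, 0.5)
print(f"${full:.3f} -> ${net:.3f}")          # $0.250 -> $0.150
```

With a 50% compression ratio this works out to 40% cheaper overall, the low end of the 40-60% range; more compressible prompts land higher in the range.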
Start Saving
Get your API key and start saving on every LLM call in under a minute.