Subscription inference API

permute.sh

OpenAI-compatible text and image inference with API keys, monthly plans, usage tracking, and 30-day request history.

Platform

One API, one account, one console.

Permute keeps model access, billing, and request history in one place.

OpenAI-Compatible

Use the same SDKs; change only the base URL and API key.
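
One way to keep that switch out of application code is to read the endpoint and key from the environment. The sketch below is illustrative only: `PERMUTE_API_KEY`, `PERMUTE_BASE_URL`, and `client_kwargs` are hypothetical names chosen for this example, not part of Permute or the OpenAI SDK. The default base URL is the one shown in the Drop-In Endpoint example on this page.

```python
import os

# Base URL taken from the Drop-In Endpoint example on this page.
DEFAULT_BASE_URL = "https://api.permute.sh/v1"


def client_kwargs(env=None) -> dict:
    """Return keyword arguments for an OpenAI-style client constructor.

    PERMUTE_API_KEY and PERMUTE_BASE_URL are hypothetical variable names
    chosen for this sketch.
    """
    env = os.environ if env is None else env
    api_key = env.get("PERMUTE_API_KEY")
    if not api_key:
        raise RuntimeError("PERMUTE_API_KEY is not set")
    return {
        # Fall back to the documented endpoint when no override is given.
        "base_url": env.get("PERMUTE_BASE_URL", DEFAULT_BASE_URL),
        "api_key": api_key,
    }
```

With the OpenAI Python SDK installed, `OpenAI(**client_kwargs())` would then build the client, and the rest of the calling code stays unchanged.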

Subscription Quota

Plans include text and image usage each month.

Connected Usage

Direct API usage and connected app usage can be shown separately.

Request History

Request history is available for 30 days.

Overview

Built for production use.

Permute is built for teams that want one API, one billing flow, and one account.

Independent service

Permute runs as its own inference product with its own accounts, billing, and support.

Production-first

Use stable model names, one billing flow, and one console.

Focused product

The public product stays focused on keys, plans, models, billing, and status.

Principles

What the product is built to do.

  • OpenAI-compatible API contracts
  • Subscription-based usage limits
  • 30-day request history by default
  • Stable model names with clear pricing

Drop-In Endpoint

Same SDKs, controlled quota.

Use one OpenAI-compatible endpoint for text and image requests, with billing, usage, and account access managed in the same console.

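inference.py
from openai import OpenAI

# Point the standard OpenAI SDK at Permute; only the base URL and key change.
client = OpenAI(
    base_url="https://api.permute.sh/v1",
    api_key="pm_...",  # your Permute API key
)

response = client.chat.completions.create(
    model="deepseek-v4-flash",
    messages=[{"role": "user", "content": "Ship it"}],
)

# Print the assistant's reply.
print(response.choices[0].message.content)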
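
Because the endpoint is described as OpenAI-compatible, the same base URL should also accept image requests. The sketch below builds, without sending, the two HTTP requests using only the Python standard library. Several parts are assumptions: the `/chat/completions` and `/images/generations` paths and the Bearer header follow the OpenAI wire format, `lumina` is a hypothetical model ID guessed from the catalog's display name, and `build_request` is a helper made up for this example.

```python
import json
import urllib.request

BASE_URL = "https://api.permute.sh/v1"  # same base URL as the SDK example


def build_request(path: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST in the OpenAI wire format."""
    return urllib.request.Request(
        url=f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Text request: the same model and message as the SDK example.
chat_req = build_request(
    "/chat/completions",
    api_key="pm_...",
    payload={
        "model": "deepseek-v4-flash",
        "messages": [{"role": "user", "content": "Ship it"}],
    },
)

# Image request: "lumina" is a hypothetical model ID, not confirmed by this page.
image_req = build_request(
    "/images/generations",
    api_key="pm_...",
    payload={"model": "lumina", "prompt": "A paper boat on a calm lake"},
)

print(chat_req.full_url)
print(image_req.full_url)
```

Passing either request to `urllib.request.urlopen` with a real key would send it. In practice the SDK is the simpler route, and this sketch mainly shows that both request types share one endpoint and one key.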

Model Catalog

Text and image routes from day one.

Text and image generation ship together in v1, with service health surfaced through the console and status page.

Text Models

  • GLM-5.1
  • GLM-5
  • Kimi K2.6
  • Kimi K2.5
  • DeepSeek V4
  • DeepSeek V4 Flash
  • DeepSeek V3.2

Image Models

  • Grok Image
  • ChatGPT Image
  • Lumina
  • Anima
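
The display names in the catalog are not necessarily the IDs passed as `model`; the only ID this page confirms is `deepseek-v4-flash`. OpenAI-compatible services commonly expose a `GET /models` route that returns the available IDs, but whether Permute does is an assumption here. A minimal sketch of parsing that response shape, using a made-up sample body and a hypothetical helper `model_ids`:

```python
import json


def model_ids(response_body: str) -> list[str]:
    """Extract sorted model IDs from an OpenAI-style model list response."""
    payload = json.loads(response_body)
    return sorted(item["id"] for item in payload.get("data", []))


# Sample body in the OpenAI list format. "deepseek-v4-flash" appears in the
# example on this page; "example-image-model" is a placeholder, not a real ID.
sample = json.dumps({
    "object": "list",
    "data": [
        {"id": "example-image-model", "object": "model"},
        {"id": "deepseek-v4-flash", "object": "model"},
    ],
})

print(model_ids(sample))
```

If the route exists, `client.models.list()` in the OpenAI Python SDK issues the corresponding request, and the returned IDs are the values to pass as `model`.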

Enterprise

Custom model hosting for large-volume teams.

Need a model that is not listed? For enterprise volume, Permute can host requested models on dedicated infrastructure, handle maintenance, and provide an API endpoint for your workload.

For enterprise plans or custom model hosting, contact our sales team.

  • Models requested for large-volume use
  • Dedicated hosting and maintenance on Permute infrastructure
  • Private API endpoint with custom pricing and plan terms