Business models

API Platform / Pay-Per-Call

The pure usage-based pattern: there is no recurring fee, customers pay only for what they consume, and the unit of consumption is something countable (API calls, tokens, messages, minutes, GB). The hard parts are not the pricing curve - they are ingest at scale, deduplication, real-time visibility, and reconciling what the customer thinks they used against what you billed them.

Real-world examples. OpenAI (per-token, varies by model), Twilio (per-message, per-minute, per-recipient-country), Stripe API (per-call quotas, per-charge fees), Anthropic API (per-token), Mapbox (per-tile-load), SendGrid (per-email). Common shape: a published rate card with graduated discounts above thresholds, real-time usage dashboards in the customer portal, and a monthly invoice for the prior month's aggregated consumption.

The shape of the problem

The deceptive simplicity of "$X per call" hides:

Event-rate scale. A real API platform can receive millions of events per minute. The metering pipeline can't be the bottleneck - and it can't drop events under load, because dropped events become uncollected revenue.
Idempotency at the edge. Network retries, client-side bugs, and outages all produce duplicate event submissions. Without deduplication, customers get billed for calls they never made, and they can prove it from their own logs.
Real-time vs period totals. Customers expect to see a usage number in their dashboard within seconds of making a call - while the invoice number for the period only finalizes at month end. The same underlying data has to feed both views.
Tier crossings during a period. Graduated discounts mean the per-unit rate changes mid-period as the customer crosses thresholds. The invoice has to walk the tiers and bill the cheaper tail correctly.
Multi-dimensional metering. "Per call" is rarely just per call - it's per-call × per-region or per-token × per-model. The metering must split the same event stream multiple ways.
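To make the multi-dimensional point concrete, here is a minimal sketch of splitting one event stream along several dimensions at once. It is an illustration only: `split_by` is a hypothetical helper, not part of the Kontorion API, and the event shape (a `quantity` plus a `metadata` mapping) is an assumption modelled on the usage-event payload shown later on this page.

```python
from collections import defaultdict
from decimal import Decimal


def split_by(events, dimensions):
    """Sum event quantities per combination of metadata dimensions.

    `events` are plain dicts with a `quantity` and a `metadata` mapping
    (an assumed shape, not a Kontorion type).
    """
    totals = defaultdict(Decimal)
    for event in events:
        metadata = event.get("metadata", {})
        # Missing metadata keys land in an explicit "unknown" bucket
        # instead of being dropped silently.
        key = tuple(metadata.get(d, "unknown") for d in dimensions)
        totals[key] += Decimal(str(event["quantity"]))
    return dict(totals)


events = [
    {"quantity": 1, "metadata": {"region": "us-east-1", "model": "gpt-4"}},
    {"quantity": 3, "metadata": {"region": "us-east-1", "model": "gpt-4"}},
    {"quantity": 2, "metadata": {"region": "eu-west-1", "model": "gpt-4"}},
    {"quantity": 5, "metadata": {"region": "eu-west-1"}},
]

# The same stream, split two different ways.
per_region = split_by(events, ["region"])
per_region_model = split_by(events, ["region", "model"])
```

The same events feed both views; only the grouping key changes, which is why the split belongs in the metering layer rather than in each consumer.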

Kontorion blueprint

Concern	Kontorion primitive
High-volume event ingest	`POST /v1/transaction/usage-event` with `idempotency_key`
Tier-walking pricing	STAIRCASE pricing model on the product
Real-time customer visibility	`GET /v1/usage/aggregate` (filter by `customer_id`)
Per-region or per-model breakdown	Meter `filter_tree` / `dimensions` on event `metadata`
Unmatched / unknown SKU events	Keyed product with `unmatched_price_key_policy`
Customer-side retry safety	Server-side dedup on `dedup_key_path` per meter window

Build it

1. Define the metered product

Code
curl -X POST https://api.kontorion.eu/v1/products \
  -H "Authorization: Bearer sk_live_..." \
  -d '{
    "name": "API Calls",
    "quantity_source": "METERED",
    "pricing_model": "STAIRCASE",
    "tax_category": "DEFAULT"
  }'

2. Define the meter and bind it to the product

Metering is a two-call flow: first create a meter that defines how raw events are aggregated, then bind it to the product as the billable quantity.

Code
# 2a. Create the meter
curl -X POST https://api.kontorion.eu/v1/meters \
  -H "Authorization: Bearer sk_live_..." \
  -d '{
    "key": "api_calls",
    "name": "API Calls",
    "metric_key": "api_calls",
    "function": "SUM",
    "value_field": "quantity",
    "dedup_key_path": "idempotency_key"
  }'

Code
# 2b. Bind the meter to the product (meter_id from 2a)
curl -X POST https://api.kontorion.eu/v1/products/prod_api_calls/meter-bindings \
  -H "Authorization: Bearer sk_live_..." \
  -d '{
    "meter_id": "mtr_api_calls",
    "pricing_var_name": "quantity",
    "role": "BILLABLE_QUANTITY"
  }'

function is the aggregation (SUM, COUNT, MAX, MIN, AVG, UNIQUE_COUNT, LAST, P95). value_field names the event field to aggregate (required for everything except COUNT / UNIQUE_COUNT). Setting dedup_key_path to your event's idempotency_key ensures that two events with the same key within the meter window count as one. Omit window to use the calendar-period default; for sliding, tumbling, or session windows pass a window object (kind is CALENDAR_HOURLY, CALENDAR_DAILY, CALENDAR_PERIOD, SLIDING, TUMBLING, or SESSION).
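As a mental model for what a SUM meter with a dedup_key_path does, the sketch below reproduces the behaviour locally. It is not Kontorion's implementation: `aggregate_sum` is a hypothetical function, the whole list is treated as a single meter window, and counting events that lack a key is an assumption made for the example.

```python
from decimal import Decimal


def aggregate_sum(events, value_field="quantity", dedup_key_path="idempotency_key"):
    """Sum `value_field` over events, counting each dedup key once.

    The list stands in for one meter window. The first event seen for a
    key wins; later events with the same key are ignored.
    """
    seen = set()
    total = Decimal("0")
    for event in events:
        key = event.get(dedup_key_path)
        if key is not None:
            if key in seen:
                continue  # duplicate submission, e.g. a client retry
            seen.add(key)
        # Assumption: events without a key cannot be deduplicated,
        # so they are always counted.
        total += Decimal(str(event[value_field]))
    return total


window = [
    {"idempotency_key": "req_1", "quantity": 1},
    {"idempotency_key": "req_1", "quantity": 1},  # retry of req_1
    {"idempotency_key": "req_2", "quantity": 2},
]
window_total = aggregate_sum(window)
```

Here the retried `req_1` contributes once, so the window total is 3 rather than 4.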

3. Publish the graduated rate card

Code
curl -X POST https://api.kontorion.eu/v1/prices \
  -H "Authorization: Bearer sk_live_..." \
  -d '{
    "product_id": "prod_api_calls",
    "plan_id": "plan_pay_as_you_go",
    "list_price": "0.00",
    "currency": "USD",
    "tiers": [
      { "up_to": 10000,    "unit_amount": "0.10" },
      { "up_to": 100000,   "unit_amount": "0.05" },
      { "up_to": 1000000,  "unit_amount": "0.03" },
      { "up_to": null,     "unit_amount": "0.01" }
    ]
  }'

Tier unit_amount is a decimal string in the price's currency. Each tier rate applies only to the units inside its band - graduated semantics. A customer making 250,000 calls pays 10k × $0.10 + 90k × $0.05 + 150k × $0.03 = $1,000 + $4,500 + $4,500 = $10,000.
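The tier walk is easy to check by hand or in a test suite. The sketch below recomputes the graduated charge for the rate card above. `graduated_charge` and `TIERS` are hypothetical names for this example, and it assumes each `up_to` is an inclusive upper bound; it is a way to verify an invoice, not a description of how Kontorion computes it internally.

```python
from decimal import Decimal

# (up_to, unit_amount) pairs mirroring the rate card above.
TIERS = [
    (10_000, Decimal("0.10")),
    (100_000, Decimal("0.05")),
    (1_000_000, Decimal("0.03")),
    (None, Decimal("0.01")),  # None marks the open-ended last tier
]


def graduated_charge(units, tiers=TIERS):
    """Walk the tiers, billing each band of units at that band's rate."""
    charge = Decimal("0")
    lower = 0  # upper bound of the previous tier
    for up_to, unit_amount in tiers:
        if units <= lower:
            break  # no units left for this or any later tier
        upper = units if up_to is None else min(units, up_to)
        charge += (upper - lower) * unit_amount
        lower = upper
    return charge


charge_250k = graduated_charge(250_000)  # Decimal("10000.00")
```

Using `Decimal` rather than floats keeps the result exact, which matters when the same arithmetic is used to reconcile against an invoice.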

4. Ingest events at runtime

Code
curl -X POST https://api.kontorion.eu/v1/transaction/usage-event \
  -H "Authorization: Bearer sk_live_..." \
  -d '{
    "customer_id": "cust_apico",
    "subscription_id": "sub_pay_as_you_go",
    "product_id": "prod_api_calls",
    "metric_key": "api_calls",
    "quantity": 1,
    "timestamp": "2025-04-15T14:30:00.123Z",
    "idempotency_key": "req_8b1d7c20-4d6c-4d04-9f78-2a3a06d7b2fc",
    "metadata": {
      "endpoint": "/v1/completions",
      "region": "us-east-1",
      "model": "gpt-4"
    }
  }'

Critical fields:

idempotency_key should be your internal request ID. Submitting the same key twice is a no-op.
timestamp should be when the API call actually occurred, not when the event was forwarded - this matters when batched event submission lags.
metadata keys can drive meter filter_tree / dimensions and analytics breakdowns later.
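One way to keep these fields consistent is to build the event body in a single place, at the moment the request is handled. The sketch below assumes that approach: `build_usage_event` is a hypothetical helper, the field names are taken from the curl example above, and sending the body (and retrying it with the same key) is left to your HTTP client.

```python
from datetime import datetime, timezone


def build_usage_event(request_id, occurred_at, *, customer_id, subscription_id,
                      product_id, metric_key, quantity=1, metadata=None):
    """Build the JSON body for a usage event.

    `request_id` is your own internal request ID, reused verbatim as the
    idempotency key so every retry of the same request carries the same key.
    `occurred_at` is when the API call happened, not when it is forwarded.
    """
    if occurred_at.tzinfo is None:
        raise ValueError("occurred_at must be timezone-aware")
    # Normalise to UTC with millisecond precision and a trailing "Z".
    timestamp = (
        occurred_at.astimezone(timezone.utc)
        .isoformat(timespec="milliseconds")
        .replace("+00:00", "Z")
    )
    return {
        "customer_id": customer_id,
        "subscription_id": subscription_id,
        "product_id": product_id,
        "metric_key": metric_key,
        "quantity": quantity,
        "timestamp": timestamp,
        "idempotency_key": request_id,
        "metadata": dict(metadata or {}),
    }


event = build_usage_event(
    "req_8b1d7c20-4d6c-4d04-9f78-2a3a06d7b2fc",
    datetime(2025, 4, 15, 14, 30, 0, 123000, tzinfo=timezone.utc),
    customer_id="cust_apico",
    subscription_id="sub_pay_as_you_go",
    product_id="prod_api_calls",
    metric_key="api_calls",
    metadata={"endpoint": "/v1/completions", "region": "us-east-1", "model": "gpt-4"},
)
```

Because the key and timestamp are derived from the original request rather than from the send attempt, a batch that is forwarded late or retried produces an identical body.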

5. Show the customer their real-time usage

Read totals from GET /v1/usage/aggregate. The filters (customer_id, product_id, subscription_id, metric_key) are query params - there is no customer ID in the path. bucket (hour | day | week) and since (RFC3339) are required; until defaults to now, and metric defaults to count (pass sum_quantity to sum the event quantity field).

Code
curl "https://api.kontorion.eu/v1/usage/aggregate?\
bucket=day&\
metric=sum_quantity&\
customer_id=cust_apico&\
subscription_id=sub_pay_as_you_go&\
product_id=prod_api_calls&\
since=2025-04-01T00:00:00Z&\
until=2025-04-30T23:59:59Z" \
  -H "Authorization: Bearer sk_live_..."
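If you assemble this URL in application code, a small builder can enforce the required parameters before the request is sent. The following is a sketch under stated assumptions: `usage_aggregate_url` is a hypothetical helper, and the parameter names and allowed values are only those listed in the paragraph above.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.kontorion.eu/v1/usage/aggregate"
ALLOWED_BUCKETS = {"hour", "day", "week"}
ALLOWED_FILTERS = {"customer_id", "product_id", "subscription_id", "metric_key"}


def usage_aggregate_url(bucket, since, until=None, metric=None, **filters):
    """Build the query URL; `bucket` and `since` are required."""
    if bucket not in ALLOWED_BUCKETS:
        raise ValueError(f"bucket must be one of {sorted(ALLOWED_BUCKETS)}")
    unknown = set(filters) - ALLOWED_FILTERS
    if unknown:
        raise ValueError(f"unknown filters: {sorted(unknown)}")
    params = {"bucket": bucket, "since": since}
    # Optional parameters are omitted so the documented defaults apply
    # (until defaults to now, metric defaults to count).
    if until is not None:
        params["until"] = until
    if metric is not None:
        params["metric"] = metric
    params.update({k: v for k, v in filters.items() if v is not None})
    return f"{BASE_URL}?{urlencode(params)}"


url = usage_aggregate_url(
    "day",
    "2025-04-01T00:00:00Z",
    until="2025-04-30T23:59:59Z",
    metric="sum_quantity",
    customer_id="cust_apico",
)
```

`urlencode` percent-encodes the colons in the RFC3339 timestamps, which is equivalent to the unencoded form used in the curl example.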

Response (time-bucketed; rows carries one entry per bucket between since and until):


Code
{
  "data": {
    "bucket": "day",
    "since": "2025-04-01T00:00:00Z",
    "until": "2025-04-30T23:59:59Z",
    "metric": "sum_quantity",
    "group_by": "none",
    "series": ["total"],
    "rows": [
      {
        "bucket": "2025-04-15T00:00:00Z",
        "sum_totals": { "total": "247534" }
      }
    ]
  }
}

Decimal sum_quantity totals come back as strings (in sum_totals) so they round-trip without float loss; with the default count metric the buckets carry integer totals instead. The latest bucket reflects ingested events within seconds. For the raw event list (newest first, cursor-paginated) call GET /v1/usage with the same filters.
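Because the totals arrive as decimal strings, parse them with a decimal type rather than a float when rolling buckets up into a period total. A minimal sketch, assuming the response shape shown above and a request made with metric=sum_quantity (`period_total` is a hypothetical helper):

```python
import json
from decimal import Decimal


def period_total(response_body, series="total"):
    """Add up the per-bucket decimal totals for one series."""
    data = json.loads(response_body)["data"]
    total = Decimal("0")
    for row in data["rows"]:
        # sum_totals values are strings, so Decimal parses them exactly.
        total += Decimal(row["sum_totals"][series])
    return total


sample = json.dumps({
    "data": {
        "bucket": "day",
        "metric": "sum_quantity",
        "series": ["total"],
        "rows": [
            {"bucket": "2025-04-14T00:00:00Z", "sum_totals": {"total": "1200.5"}},
            {"bucket": "2025-04-15T00:00:00Z", "sum_totals": {"total": "247534"}},
        ],
    }
})
month_to_date = period_total(sample)
```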

Variations

Multi-dimensional pricing (per-call × per-model). Make the product keyed with price_key_label: "model". Add one price per model (gpt-4, gpt-4-turbo, gpt-3.5). Events carry price_key: "gpt-4". The same event stream produces per-model lines on the invoice.
Free tier with overage. Set the first STAIRCASE tier to unit_amount: "0.00". The customer sees usage but is only billed above the free threshold.
Per-region pricing. Add a country_code field on the price for geo-targeted rates, or use a meter filter_tree / dimensions on metadata.region to split a single event stream into per-region products.
Spending caps. Track usage as events are ingested and reconcile the running total against a monthly cap the customer sets for themselves; throttle or notify when usage crosses 90% of the cap.
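The spending-cap check itself is a small comparison that lives in your application. The sketch below is one possible shape, not a Kontorion feature: `cap_status` and its return values are hypothetical, and the 90% warning ratio is the one mentioned above.

```python
from decimal import Decimal


def cap_status(spend, cap, warn_ratio=Decimal("0.90")):
    """Classify month-to-date spend against a customer-set cap.

    Returns "ok", "warn" (at or above warn_ratio of the cap),
    or "exceeded" (at or above the cap).
    """
    spend = Decimal(str(spend))
    cap = Decimal(str(cap))
    if cap <= 0:
        raise ValueError("cap must be positive")
    if spend >= cap:
        return "exceeded"
    if spend >= cap * warn_ratio:
        return "warn"
    return "ok"


status = cap_status("450.00", "500.00")  # exactly 90% of the cap
```

A "warn" result would typically trigger a notification, while "exceeded" is where throttling applies; which action to take is a product decision rather than a billing one.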

What you don't have to build

High-throughput event ingest pipeline (the /v1/transaction/usage-event endpoint absorbs the volume)
Idempotency / deduplication on retried submissions (built into the meter via dedup_key_path)
Real-time aggregation between events and the dashboard query (eventually-consistent within seconds)
Tier-walking math for graduated pricing (STAIRCASE tiers handle it deterministically)
Per-region or per-model fan-out (meter filter_tree / dimensions or keyed products do it without app-layer code)
Reconciliation between dashboard usage and invoice quantity (same source of truth on both sides)

Next steps

Usage Metering - aggregators, filters, deduplication
Pricing Models - STAIRCASE walks vs VOLUME vs PACKAGE
Keyed Prices - one product, many priced dimensions
Analytics - per-customer, per-product usage trends

Last modified on July 25, 2026
