openai_cost_calculator

Instant, accurate USD cost estimates for OpenAI & Azure OpenAI

Turn any API response (Chat Completions or Responses API, streaming or non-streaming) into precise, per-request costs — 8-decimal strings or Decimal via a typed dataclass.

Overview

openai_cost_calculator is a tiny, production-hardened helper that reads the usage counters returned by OpenAI/Azure OpenAI and outputs the exact USD cost for that call. It supports cached tokens, undated models, streaming generators, and both classic and new SDKs.

  • Typed API for exact financial arithmetic (Decimal)
  • Legacy API for drop-in string output (8 decimal places)
  • Pricing loaded from a remote CSV with a 24h cache + local overrides
  • Offline mode for pinned environments (no network calls)

Installation

pip install openai-cost-calculator

Package name on PyPI uses dashes; import name uses underscores.

Quick start

Tip: Prefer the typed API for money math.

One-line (legacy string API)

from openai import OpenAI
from openai_cost_calculator import estimate_cost

client = OpenAI()
resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role":"user","content":"Hi there!"}],
)

print(estimate_cost(resp))
# {'prompt_cost_uncached': '0.00000150',
#  'prompt_cost_cached'  : '0.00000000',
#  'completion_cost'     : '0.00000600',
#  'total_cost'          : '0.00000750'}

Typed API (recommended)

from openai import OpenAI
from openai_cost_calculator import estimate_cost_typed

client = OpenAI()
resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role":"user","content":"Hi there!"}],
)

cost = estimate_cost_typed(resp)
print(cost.total_cost)                 # Decimal('0.00000750')
print(cost.as_dict(stringify=True))    # strings, 8 dp
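Under the hood the cost math is plain token-count × rate arithmetic done in Decimal. A minimal sketch that reproduces the quick-start numbers, assuming gpt-4o-mini rates of $0.15 input / $0.60 output per 1M tokens and a hypothetical 10-token prompt and 10-token completion (your actual counts and rates will differ):

```python
from decimal import Decimal

def token_cost(tokens: int, usd_per_million: str) -> Decimal:
    # Exact arithmetic: tokens * (USD per 1M tokens) / 1_000_000
    return Decimal(tokens) * Decimal(usd_per_million) / Decimal(1_000_000)

prompt_cost = token_cost(10, "0.15")      # hypothetical 10 uncached prompt tokens
completion_cost = token_cost(10, "0.60")  # hypothetical 10 completion tokens
total = prompt_cost + completion_cost

print(f"{total:.8f}")  # 0.00000750
```

Because every intermediate value is a Decimal, sums across many requests stay exact instead of accumulating float error.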

Responses API

resp = client.responses.create(
    model="gpt-4.1-mini",
    input=[{"role":"user","content":"Hi there!"}],
)

# Both work
from openai_cost_calculator import estimate_cost, estimate_cost_typed
print(estimate_cost(resp))         # dict[str, str]
print(estimate_cost_typed(resp))   # CostBreakdown

CostBreakdown dataclass

Typed results are returned as a frozen dataclass with Decimal fields:

CostBreakdown(
  prompt_cost_uncached: Decimal,
  prompt_cost_cached:   Decimal,
  completion_cost:      Decimal,
  total_cost:           Decimal
)

Use .as_dict(stringify=True|False) to convert to 8-dp strings (legacy) or raw Decimals.
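For illustration, a rough sketch (not the library's actual source) of what a frozen dataclass with such an as_dict method can look like:

```python
from dataclasses import dataclass, fields
from decimal import Decimal

@dataclass(frozen=True)
class CostBreakdownSketch:
    prompt_cost_uncached: Decimal
    prompt_cost_cached: Decimal
    completion_cost: Decimal
    total_cost: Decimal

    def as_dict(self, stringify: bool = False) -> dict:
        # 8-dp strings for the legacy shape, raw Decimals otherwise
        return {
            f.name: f"{getattr(self, f.name):.8f}" if stringify else getattr(self, f.name)
            for f in fields(self)
        }

c = CostBreakdownSketch(
    Decimal("0.0000015"), Decimal("0"), Decimal("0.000006"), Decimal("0.0000075")
)
print(c.as_dict(stringify=True)["total_cost"])  # 0.00000750
```

frozen=True makes instances immutable, which is a sensible property for a financial result object.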

Legacy API

estimate_cost(response) → dict[str,str] keeps your existing code working. Returns:

{
  "prompt_cost_uncached": "…",
  "prompt_cost_cached"  : "…",
  "completion_cost"     : "…",
  "total_cost"          : "…"
}

Pricing utilities

The library resolves model rates from a remote CSV and merges local overrides.

Function                            Description
refresh_pricing()                   Force-reload the remote CSV (bypasses the 24-hour cache).
set_offline_mode(True)              Disable all network fetches; only local overrides are used.
add_pricing_entry(name, date, …)    Add/override a single (model, YYYY-MM-DD) row.
add_pricing_entries([...])          Bulk add/override multiple rows.
clear_local_pricing()               Drop all in-process overrides (remote cache unaffected).

Note: The pricing CSV contains only Text token prices (no Token-Type column). If a model string lacks a date suffix, the calculator uses today’s date and selects the latest pricing row with date ≤ today.

Pricing sources & cache

  • Remote CSV (GitHub) is fetched at most once every 24 hours per process.
  • Local overrides always take precedence over remote rows on key collision.
  • Offline mode disables remote fetches — ideal for air-gapped or pinned deployments.

from openai_cost_calculator import set_offline_mode, refresh_pricing
set_offline_mode(False)  # allow using the remote CSV
refresh_pricing()        # refresh the 24h cache now
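The 24-hour cache behaviour can be approximated by a simple TTL wrapper around a loader function — an illustrative sketch, not the library's internals:

```python
import time

CACHE_TTL = 24 * 3600  # seconds

class TtlCache:
    """Fetch-once-per-TTL wrapper around a loader function."""

    def __init__(self, fetch):
        self._fetch = fetch
        self._data = None
        self._fetched_at = 0.0

    def get(self, force: bool = False):
        # Re-fetch when forced (cf. refresh_pricing) or when the TTL expired
        expired = time.time() - self._fetched_at > CACHE_TTL
        if force or self._data is None or expired:
            self._data = self._fetch()
            self._fetched_at = time.time()
        return self._data

calls = []
cache = TtlCache(lambda: calls.append(1) or "csv-rows")
cache.get(); cache.get()   # second call is served from cache
cache.get(force=True)      # explicit refresh bypasses the TTL
print(len(calls))  # 2
```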

Streaming

Pass the generator directly. The helper walks the stream and uses the last chunk that carries .usage.

stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role":"user","content":"Hi!"}],
    stream=True,
    stream_options={"include_usage": True},
)

from openai_cost_calculator import estimate_cost_typed
cost = estimate_cost_typed(stream)
print(cost.total_cost)
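The stream handling amounts to scanning chunks for the final usage payload. A minimal sketch using stand-in chunk objects (not the library's code):

```python
from types import SimpleNamespace

def last_usage(chunks):
    # Remember the last chunk that carries a non-None .usage
    usage = None
    for chunk in chunks:
        u = getattr(chunk, "usage", None)
        if u is not None:
            usage = u
    return usage

chunks = [
    SimpleNamespace(usage=None),  # content deltas carry no usage
    SimpleNamespace(usage=None),
    SimpleNamespace(usage={"prompt_tokens": 4, "completion_tokens": 12}),
]
print(last_usage(chunks))  # {'prompt_tokens': 4, 'completion_tokens': 12}
```

Note that walking a real stream consumes the generator, so a caller that also wants the streamed content should collect the chunks first and then hand the collected list to the cost helper.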

Error handling

All recoverable errors raise CostEstimateError with a clear message (e.g., missing pricing row, bad input).

from openai_cost_calculator import estimate_cost_typed, CostEstimateError
try:
    cost = estimate_cost_typed(resp)
except CostEstimateError as e:
    print("Could not estimate cost:", e)

Azure OpenAI responses include the original model string (e.g., gpt-4o-mini-2024-07-18); deployment names are ignored.

Cookbook — add pricing for any model

You can cost any provider’s response as long as you teach the calculator the price point for that (model_name, model_date). In offline mode the library won’t reach the network; only your overrides are used.

from litellm import completion
from openai_cost_calculator import estimate_cost_typed, set_offline_mode, add_pricing_entry

set_offline_mode(True)
# Teach the library a new price point:
add_pricing_entry(
    "ollama/qwen3:30b", "2025-08-01",
    input_price=0.20,         # USD per 1M input tokens
    output_price=0.60,        # USD per 1M output tokens
    cached_input_price=0.04,  # optional
)

response = completion(
    model="ollama/qwen3:30b",
    messages=[{"role": "user", "content": "respond in 20 words. who are you?"}],
    api_base="http://localhost:11434"
)

print(estimate_cost_typed(response))

Works for any model name: choose a stable identifier (e.g., your deployment name), pick a date string (YYYY-MM-DD) you control, and supply the per-1M token prices. Add more entries over time to reflect price changes by date.

Troubleshooting

“Pricing not found” after a new model launch

  1. Check that the model/date row exists in the project’s pricing CSV.
  2. If it exists, call refresh_pricing() (24h cache).
  3. Otherwise, temporarily add a local row with add_pricing_entry(), and open an issue on the project’s GitHub so the pricing CSV can be updated.

“cached_tokens = 0” even with caching

Request usage details: classic API needs include_usage_details=True; streaming needs stream_options={"include_usage": True}.
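Once cached_tokens is populated, the prompt cost splits into two parts: cached tokens bill at the discounted rate and the remainder at the full rate. A sketch of that split with hypothetical rates:

```python
from decimal import Decimal

def prompt_costs(prompt_tokens: int, cached_tokens: int,
                 input_price: str, cached_price: str):
    # Cached tokens bill at the discounted rate; the rest at the full rate
    def per(n, p):
        return Decimal(n) * Decimal(p) / Decimal(1_000_000)
    uncached = per(prompt_tokens - cached_tokens, input_price)
    cached = per(cached_tokens, cached_price)
    return uncached, cached

# Hypothetical: 1000 prompt tokens, 600 of them cached,
# $0.15/1M input and $0.075/1M cached-input rates
uncached, cached = prompt_costs(1000, 600, "0.15", "0.075")
print(f"{uncached:.8f} {cached:.8f}")  # 0.00006000 0.00004500
```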

What if the model string has no date?

The library uses today’s date and selects the latest CSV row with date ≤ today.
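Splitting a dated model string from an undated one is straightforward; a sketch of the kind of parsing involved (a hypothetical helper, not the library's API):

```python
import re
from datetime import date

DATED = re.compile(r"^(?P<name>.+)-(?P<y>\d{4})-(?P<m>\d{2})-(?P<d>\d{2})$")

def split_model(model: str):
    # "gpt-4o-mini-2024-07-18" -> ("gpt-4o-mini", date(2024, 7, 18));
    # undated strings return None so the caller can fall back to today
    m = DATED.match(model)
    if not m:
        return model, None
    return m["name"], date(int(m["y"]), int(m["m"]), int(m["d"]))

print(split_model("gpt-4o-mini-2024-07-18")[0])  # gpt-4o-mini
```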

License & contributions

MIT License © 2025 Orkun Kınay & Murat Barkın Kınay. PRs that enhance robustness (SDK changes, pricing formats) are welcome.