just-use-nan.
· v1.0 · LIVE
INFRASTRUCTURE / flat-rate inference / no token meter

STOP PAYING
PER TOKEN FOR
MODELS YOU
DON'T EVEN
OWN

Every API call is a slot machine. Your retry loop. Your runaway agent at 3am. Your customer hitting the same endpoint 400 times because of a bug. All metered. All billed. All compounding into an invoice written by someone who profits when your code misbehaves.

// 01_PROBLEM

The problem.

comparison · metered vs flat-rate

Token billing makes your costs unpredictable by design. The more your product works, the more you pay. The better your prompts get, the longer they grow. Success is the punishment.
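The arithmetic behind that claim is simple enough to sketch. A minimal break-even calculator, assuming an illustrative blended metered rate (the €2.5/1M figure is a placeholder, not any vendor's actual price; only the €70 flat rate comes from this page):

```python
FLAT_RATE_EUR = 70.0             # flat monthly price (from this page)
METERED_EUR_PER_1M_TOKENS = 2.5  # hypothetical blended input+output rate

def metered_cost(tokens: int) -> float:
    """Monthly bill on a per-token meter."""
    return tokens / 1_000_000 * METERED_EUR_PER_1M_TOKENS

def break_even_tokens() -> int:
    """Token volume at which the meter overtakes the flat rate."""
    return int(FLAT_RATE_EUR / METERED_EUR_PER_1M_TOKENS * 1_000_000)

print(break_even_tokens())        # 28000000 tokens/month
print(metered_cost(100_000_000))  # 250.0 vs 70.0 flat
```

Past the break-even point, every extra token is margin you hand over; below it, the flat rate is the premium you pay for a bill that never surprises you.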

× closed APIs · pay per token ✓ NaN · flat €70/month
× a buggy loop wakes you up to a $2k bill ✓ burn all the tokens you want · the bill never moves
× rate limits throttle production at 3am ✓ shared cluster sized for builders who actually ship
× prompts logged, stored, used to train ✓ zero logs · prompts never leave your session
× price hikes with zero negotiation ✓ same price next month · same price next year
× model weights are secret ✓ Qwen, Gemma, DeepSeek · public weights · audit everything
× models deprecated whenever they feel like it ✓ models voted in by the community every quarter
× you cap your own users to protect your margin ✓ ship the product you actually wanted to ship
× lock-in cost exceeds annual migration budget ✓ OpenAI-compatible API · swap base_url and you're out
// 02_NUMBERS

The numbers.

receipts · 100% verified
€70 · flat monthly rate
96GB · shared GPU memory
0 · prompts logged
// 03_THESIS

The thesis.

manifesto · one per page
OPEN MODELS GOT GOOD ENOUGH TWELVE MONTHS AGO. THE GPUS EXIST. THE STACK IS SOLVED. YOU'RE STILL PAYING PER TOKEN BECAUSE NOBODY TOLD YOU YOU COULD STOP.
// 04_LOSS

What token billing costs you.

6 categories · honest audit
// PREDICTABLE COSTS
A flat fee means you can plan, price, and sleep. With token billing every demo day, every viral tweet, every aggressive user is a financial event. You stop building features because you can't model the cost.
// PRODUCT VELOCITY
Token meters punish iteration. Long context? Expensive. Tool calls in a loop? Expensive. Reasoning models thinking out loud? Very expensive. You ship worse products because the better version costs too much to test.
// USER EXPERIENCE
Free tier capped at 10 messages. Pro tier capped at 200. Why? Not because the model can't handle more. Because your margin can't. Your users feel it. They churn. The cap was always about you, not them.
// DATA SOVEREIGNTY
Every prompt you send becomes their training signal. Your competitors' queries. Your customers' secrets. Their next model. Your liability. Reading the ToS doesn't make it disappear. It just confirms it.
// MODEL CHOICE
Open models like Qwen, Gemma, DeepSeek, Llama closed the gap. They're inside the cluster. You don't have to wait for a closed lab to release the version that's already on Hugging Face. The future is open and it already shipped.
// EXIT VELOCITY
NaN speaks the OpenAI API. If you ever want to leave, change one base_url and you're out. No proprietary SDK, no bespoke endpoints, no lock-in. The cost of switching is one line of code. That's the deal.
// 05_CODE

It's literally one line.

code · actual diff

Stop pretending migration is hard. NaN exposes an OpenAI-compatible API. If your code already calls OpenAI, this is the entire diff:

$ closed API · metered
from openai import OpenAI

client = OpenAI(
  api_key="sk-..."
)

response = client.chat.completions.create(
  model="gpt-4o",
  messages=your_data
)
# meter spinning · invoice growing
$ NaN · flat rate
from openai import OpenAI

client = OpenAI(
  api_key="sk-...",
  base_url="https://nan.builders/v1"
)

response = client.chat.completions.create(
  model="qwen3.6",
  messages=your_data
)
# €70/month · meter never started
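The same swap can live in deploy config instead of code. A minimal sketch: the openai Python SDK (v1+) reads the OPENAI_BASE_URL environment variable on its own, so the helper below just makes the fallback explicit; the default URL shown is the closed API's standard endpoint.

```python
import os

DEFAULT_BASE_URL = "https://api.openai.com/v1"

def resolve_base_url() -> str:
    """Pick the inference endpoint from the environment.

    Set OPENAI_BASE_URL=https://nan.builders/v1 in your deploy config
    and the same code runs against NaN; unset it and you're back on
    the closed API. No code change, no redeploy of logic.
    """
    return os.environ.get("OPENAI_BASE_URL", DEFAULT_BASE_URL)

os.environ["OPENAI_BASE_URL"] = "https://nan.builders/v1"
print(resolve_base_url())  # https://nan.builders/v1
```

Keeping the endpoint in the environment means the exit stays a one-line change in config rather than a code review: the lock-in cost never grows past a single variable.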
// 06_RECEIPTS

Their pricing pages say this.

direct quotes · verbatim
"PRICED PER 1M INPUT TOKENS" // "OUTPUT TOKENS BILLED SEPARATELY" // "CACHED INPUT AT A DIFFERENT RATE" // "REASONING TOKENS COUNT AS OUTPUT" // "RATE LIMITS APPLY PER TIER" // "PRICING SUBJECT TO CHANGE" // "USAGE MAY BE USED TO IMPROVE OUR MODELS" // "OVERAGES BILLED AT END OF CYCLE"
// 07_CTA

Just use NaN.

deployment · immediate

€70 a month. Shared GPU. Open models. No token meter. No surprise invoices. No prompts logged. The only thing standing between you and a predictable AI bill is the decision to stop feeding the meter.

flat rate
open models
zero logs
no token meter