just-use-nan.
· v1.0 · LIVE
INFRASTRUCTURE / flat-rate inference / no token meter

STOP PAYING
PER TOKEN FOR
MODELS YOU
DON'T EVEN
OWN

Every API call is a slot machine. Your retry loop. Your runaway agent at 3am. Your customer hitting the same endpoint 400 times because of a bug. All metered. All billed. All compounding into an invoice written by someone who profits when your code misbehaves.

// 01_PROBLEM

The problem.

comparison · metered vs flat-rate

Token billing makes your costs unpredictable by design. The more your product works, the more you pay. The better your prompts get, the longer they grow. Success is the punishment.
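The arithmetic behind that claim is simple enough to sketch. A minimal break-even calculator, assuming an illustrative blended metered rate (the €2.5/1M figure is a placeholder, not any vendor's actual price; only the €70 flat rate comes from this page):

```python
FLAT_RATE_EUR = 70.0             # flat monthly price (from this page)
METERED_EUR_PER_1M_TOKENS = 2.5  # hypothetical blended input+output rate

def metered_cost(tokens: int) -> float:
    """Monthly bill on a per-token meter."""
    return tokens / 1_000_000 * METERED_EUR_PER_1M_TOKENS

def break_even_tokens() -> int:
    """Token volume at which the meter overtakes the flat rate."""
    return int(FLAT_RATE_EUR / METERED_EUR_PER_1M_TOKENS * 1_000_000)

print(break_even_tokens())        # 28000000 tokens/month
print(metered_cost(100_000_000))  # 250.0 vs 70.0 flat
```

Past the break-even point, every extra token is margin you hand over; below it, the flat rate is the premium you pay for a bill that never surprises you.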

× closed APIs · pay per token ✓ NaN · flat €70/month
× a buggy loop wakes you up to a $2k bill ✓ burn all the tokens you want · the bill never moves
× rate limits throttle production at 3am ✓ shared cluster sized for builders who actually ship
× prompts logged, stored, used to train ✓ zero logs · prompts never leave your session
× price hikes with zero negotiation ✓ same price next month · same price next year
× model weights are secret ✓ Qwen, Gemma, DeepSeek · public weights · audit everything
× models deprecated whenever they feel like it ✓ models voted in by the community every quarter
× you cap your own users to protect your margin ✓ ship the product you actually wanted to ship
× lock-in cost exceeds annual migration budget ✓ OpenAI-compatible API · swap base_url and you're out
// 02_NUMBERS

The numbers.

receipts · 100% verified
€70 · flat monthly rate
96GB · shared GPU memory
0 · prompts logged
// 03_THESIS

The thesis.

manifesto · one per page
OPEN MODELS GOT GOOD ENOUGH TWELVE MONTHS AGO. THE GPUS EXIST. THE STACK IS SOLVED. YOU'RE STILL PAYING PER TOKEN BECAUSE NOBODY TOLD YOU YOU COULD STOP.
// 04_LOSS

What token billing costs you.

6 categories · honest audit
// PREDICTABLE COSTS
A flat fee means you can plan, price, and sleep. With token billing every demo day, every viral tweet, every aggressive user is a financial event. You stop building features because you can't model the cost.
// PRODUCT VELOCITY
Token meters punish iteration. Long context? Expensive. Tool calls in a loop? Expensive. Reasoning models thinking out loud? Very expensive. You ship worse products because the better version costs too much to test.
// USER EXPERIENCE
Free tier capped at 10 messages. Pro tier capped at 200. Why? Not because the model can't handle more. Because your margin can't. Your users feel it. They churn. The cap was always about you, not them.
// DATA SOVEREIGNTY
Every prompt you send becomes their training signal. Your competitors' queries. Your customers' secrets. Their next model. Your liability. Reading the ToS doesn't make it disappear. It just confirms it.
// MODEL CHOICE
Open models like Qwen, Gemma, DeepSeek, Llama closed the gap. They're inside the cluster. You don't have to wait for a closed lab to release the version that's already on Hugging Face. The future is open and it already shipped.
// EXIT VELOCITY
NaN speaks the OpenAI API. If you ever want to leave, change one base_url and you're out. No proprietary SDK, no bespoke endpoints, no lock-in. The cost of switching is one line of code. That's the deal.
// 05_CODE

It's literally one line.

code · actual diff

Stop pretending migration is hard. NaN exposes an OpenAI-compatible API. If your code already calls OpenAI, this is the entire diff:

$ closed API · metered
from openai import OpenAI

client = OpenAI(
  api_key="sk-..."
)

response = client.chat.completions.create(
  model="gpt-4o",
  messages=your_data
)
# meter spinning · invoice growing
$ NaN · flat rate
from openai import OpenAI

client = OpenAI(
  api_key="sk-...",
  base_url="https://nan.builders/v1"
)

response = client.chat.completions.create(
  model="qwen3.6",
  messages=your_data
)
# €70/month · meter never started
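The same swap can live in deploy config instead of code. A minimal sketch: the openai Python SDK (v1+) reads the OPENAI_BASE_URL environment variable on its own, so the helper below just makes the fallback explicit; the default URL shown is the closed API's standard endpoint.

```python
import os

DEFAULT_BASE_URL = "https://api.openai.com/v1"

def resolve_base_url() -> str:
    """Pick the inference endpoint from the environment.

    Set OPENAI_BASE_URL=https://nan.builders/v1 in your deploy config
    and the same code runs against NaN; unset it and you're back on
    the closed API. No code change, no redeploy of logic.
    """
    return os.environ.get("OPENAI_BASE_URL", DEFAULT_BASE_URL)

os.environ["OPENAI_BASE_URL"] = "https://nan.builders/v1"
print(resolve_base_url())  # https://nan.builders/v1
```

Keeping the endpoint in the environment means the exit stays a one-line change in config rather than a code review: the lock-in cost never grows past a single variable.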
// 06_RECEIPTS

Their pricing pages say this.

direct quotes · verbatim
"PRICED PER 1M INPUT TOKENS" // "OUTPUT TOKENS BILLED SEPARATELY" // "CACHED INPUT AT A DIFFERENT RATE" // "REASONING TOKENS COUNT AS OUTPUT" // "RATE LIMITS APPLY PER TIER" // "PRICING SUBJECT TO CHANGE" // "USAGE MAY BE USED TO IMPROVE OUR MODELS" // "OVERAGES BILLED AT END OF CYCLE"
// 07_CTA

Just use NaN.

deployment · immediate

€70 a month. Shared GPU. Open models. No token meter. No surprise invoices. No prompts logged. The only thing standing between you and a predictable AI bill is the decision to stop feeding the meter.

flat rate
open models
zero logs
no token meter