// guide

How to save Claude credits.
9 techniques that actually work.

One user wasted 1 million tokens uploading PDFs to Claude — then discovered the same documents as markdown used 90% fewer tokens. Here's how Claude's limits actually work and how to make them go much further.

// quick answer

Claude Pro uses a 5-hour rolling window — not a monthly bucket. The fastest way to save credits is to convert PDFs and DOCX files to markdown before uploading (60–90% fewer tokens), turn off extended thinking for routine tasks, and start fresh conversations instead of continuing long threads.

How Claude's credits actually work

“Credits” in Claude.ai aren't a prepaid balance. They're a rolling usage window that refreshes continuously. Understanding the mechanics is the first step to using them efficiently.

Claude Pro

~44K tokens

per 5-hour window

$20/mo

Claude Max 5x

~220K tokens

per 5-hour window

$100/mo

Claude Max 20x

~880K tokens

per 5-hour window

$200/mo

Input tokens

Your prompts, conversation history, file contents, system prompts, tool definitions. Charged at the base rate.

Output tokens

All text Claude generates. Costs 5× more than input — the most important cost driver to control.

Extended thinking tokens

Claude's internal reasoning. Billed at output token rates. Can cost 9× more than a standard response for the same task.

Cache reads

Reused context from prompt caching. 90% discount — $0.30/MTok instead of $3.00/MTok on Sonnet.

Vision tokens

(width × height) / 750. A 1,000×1,000px image = ~1,334 tokens. Crop aggressively.

What burns Claude credits the fastest

PDF uploads

Repeated headers/footers, font metadata, layout coordinates. A 50-page PDF = 75,000 tokens. Same content as markdown = 21,000.

Extended thinking (always on)

Billed at output rates — 5× more expensive than input. Disable for summarising, formatting, and simple Q&A.

Long conversations

Claude re-reads every message on every turn. Message 20 pays for messages 1–19 each time.

Tool/MCP overload

Each MCP server loaded adds up to 18,000 tokens to the system prompt per turn, used or not.

9 techniques to save Claude credits

01

Convert files to markdown before uploading

60–90% reduction

This is the single most impactful change most users can make. A 50-page PDF burns 75,000 tokens. Converted to clean markdown, it uses ~21,000. The PDF carries repeated headers, footers, layout metadata, and font references — the model pays for all of it. Markdown carries only content.

02

Use prompt caching for repeated context

90% on cache reads

If you regularly start sessions with the same system prompt, codebase snippet, or reference document, Claude API prompt caching stores it and recharges at just $0.30/MTok (vs $3.00/MTok standard). For claude.ai users, upload documents to a Project once — they persist across conversations without re-tokenising on every turn.

03

Start new conversations instead of continuing long ones

~20% reduction

Claude re-reads the entire conversation history on every single message. By message 15, you're paying for messages 1–14 on every turn. For a new task that doesn't need prior context, use /clear or start a fresh chat. Use /compact to create a compressed summary before switching topics within the same session.

04

Turn off extended thinking for routine tasks

Up to 9× cheaper

Extended thinking tokens are billed at output token rates — already 5× more expensive than input. For complex reasoning tasks, extended thinking is worth it. For summarising, formatting, or simple Q&A, disable it. The savings are immediate and significant.

05

Use Claude Projects for document-heavy workflows

7–10× fewer tokens

Claude Projects use retrieval-augmented generation (RAG) — they retrieve only the relevant parts of uploaded documents rather than loading the full document into context. Upload your reference material once to a Project, then ask questions against it. This avoids re-uploading and loading full docs on every turn.

06

Constrain output length explicitly

40–70% on output cost

Output tokens cost 5× more than input tokens on most Claude models. If you ask an open-ended question, Claude may generate 1,000 tokens when 200 would have answered it. Add length constraints: 'Reply in 3 bullet points', 'Max 150 words', 'JSON only'. Set stop sequences to terminate generation at a defined endpoint.

07

Route to Haiku for simple tasks

60–80% cost reduction

Claude Haiku 4.5 costs $1/MTok input and $5/MTok output. Claude Sonnet 4.6 costs $3/$15. Claude Opus 4.7 costs $5/$25. For summarisation, extraction, reformatting, or classification tasks, Haiku produces equivalent quality at one-fifth the cost. Reserve Sonnet and Opus for tasks that actually require advanced reasoning.

08

Minimise tool and MCP server definitions

Up to 18K tokens/turn

Each MCP server loaded into Claude Code adds up to 18,000 tokens to the system prompt per turn — even if that server is never used. Unload servers you don't need for the current task. In the API, use dynamic toolsets that only include the tools relevant to the current request.

09

Crop and resize images before sending

Up to 96% on vision tokens

Vision tokens are calculated as (width × height) / 750. A 1,000×1,000 pixel screenshot costs ~1,334 tokens. A 200×200 crop of the relevant area costs ~54 tokens. Crop, resize, and reduce image resolution before sending to Claude. Use a focused crop rather than a full-screen capture.

Claude API pricing reference (2026)

ModelInputOutputCache read
Claude Opus 4.7$5.00/MTok$25.00/MTok$0.50/MTok
Claude Sonnet 4.6$3.00/MTok$15.00/MTok$0.30/MTok
Claude Haiku 4.5$1.00/MTok$5.00/MTok$0.10/MTok

Batch API: 50% discount on all rates. Cache write: 1.25× input rate, one-time cost.

Frequently asked questions

How many credits does Claude Pro give me?

Claude Pro ($20/month) gives approximately 44,000 tokens per 5-hour rolling window — not a monthly bucket. Claude Max 5x ($100/month) gives ~220,000 tokens per 5-hour window. There is no fixed monthly credit limit; instead, you have a rolling window that refreshes continuously based on when you last hit the limit.

When does my Claude usage reset?

Claude uses a 5-hour rolling window, not a daily or monthly reset. If you hit the limit at 2pm, you can resume at 7pm. The window resets 5 hours after you first sent a message in that session — not at midnight or a fixed time each day.

What burns Claude credits the fastest?

The fastest credit-burners are: (1) Extended thinking — billed at output token rates which are 5x the input rate. (2) PDF uploads — a 50-page PDF can consume 75,000 tokens, the same content as markdown uses ~21,000. (3) Long conversations — Claude re-reads the entire thread on every message. (4) Tool use — each MCP server adds up to 18,000 tokens to the system prompt per turn.

Does Claude Pro include API credits?

No. The $20/month Claude Pro subscription and the Anthropic API are billed completely separately. Pro gives you claude.ai chat access with a 5-hour rolling window. API access is pay-as-you-go at $3–$5 per million input tokens (model-dependent). Claude Code also requires separate 'extra usage' credits since early 2026.

Can I buy more Claude credits?

Yes. Claude.ai has an 'Extra Usage' feature in Settings > Usage that lets you purchase additional capacity billed at API rates. The daily limit is $2,000 per day. Alternatively, upgrading to Claude Max 5x ($100/month) or Max 20x ($200/month) gives 5x or 20x the standard Pro window.

How much do PDFs cost in Claude tokens vs markdown?

A 50-page PDF typically consumes 70,000–75,000 tokens in Claude. The same document converted to clean markdown uses approximately 21,000 tokens — a 72% reduction. The savings come from stripping repeated headers/footers, layout metadata, font references, and formatting overhead that PDFs carry on every page.

Stop wasting tokens on PDFs

Convert your files to LLM-optimised markdown and reduce Claude token usage by up to 90%. Free, in-browser.