Originally posted by kaize on X
For a long time, I blamed Claude for being too restrictive. Then I realized I was thinking about it all wrong.
Claude doesn’t count messages — it counts tokens. Once that clicked, everything changed. Here are the 10 habits that dramatically cut my token usage and kept me well within my limits.
1. Edit Your Prompt — Don’t Send a Follow-Up
When Claude misses the mark, the temptation is to fire back with “No, I meant…” or “That’s not what I wanted…” Resist that urge.
Every follow-up message gets added to the conversation history, and Claude re-reads all of it on every single turn. The token cost isn’t linear — it’s quadratic:
Total tokens = S × N(N+1) / 2 (S = avg tokens per exchange, N = number of messages)
At ~500 tokens per exchange:
- 5 messages → 7,500 tokens
- 10 messages → 27,500 tokens
- 20 messages → 105,000 tokens
- 30 messages → 232,000 tokens
Message 30 costs 31× more than message 1.
What to do instead: Click Edit on your original message, fix it, and regenerate. The old exchange is replaced — not stacked on top.
2. Start a Fresh Chat Every 15–20 Messages
Long conversations are brutal on your token budget. A 100+ message chat at ~500 tokens per exchange burns over 2.5 million tokens — the vast majority of which is just re-reading old history.
One developer tracked his usage and found that 98.5% of tokens went toward re-reading context. Only 1.5% was actually generating output.
What to do instead: When a chat gets long, ask Claude to summarize the conversation, copy that summary, open a new chat, and paste it as your first message.
3. Batch Your Questions Into One Message
Splitting questions across multiple messages feels more natural, but it’s one of the costliest habits you can have. Three separate prompts means three full context loads.
Instead of:
- “Summarize this article”
- “Now list the main points”
- “Now suggest a headline”
Write: “Summarize this article, list the main points, and suggest a headline.”
You save tokens and often get better answers — Claude sees the full picture from the start.
4. Upload Recurring Files to Projects
Every time you upload the same PDF to a new chat, Claude re-tokenizes it from scratch. If you work with contracts, briefs, style guides, or any long documents, this adds up fast.
What to do instead: Use the Projects feature. Upload your file once, and it gets cached. Every conversation inside that project can reference it without burning tokens again.
5. Set Up Memory & User Preferences
How many times have you opened a new chat and typed something like “I’m a marketer, I write in a casual tone, keep paragraphs short…”? Those setup messages are pure token waste.
What to do instead: Go to Settings → Memory and User Settings and save your role, communication style, and preferences once. Claude will apply them automatically to every new chat.
6. Turn Off Features You’re Not Using
Web search, connectors, and Explore mode all add tokens to every response — even when you don’t need them.
- Writing your own content? Turn off Search and Tools.
- Not doing complex reasoning? Keep Advanced Thinking off by default, and only enable it if your first attempt falls short.
Rule of thumb: If you didn’t turn it on intentionally, turn it off.
7. Use Haiku for Simple Tasks
Grammar checks, brainstorming, formatting, quick translations — Claude Haiku handles all of this at a fraction of the cost of Sonnet or Opus.
A simple mental model:
- Haiku → quick, simple tasks (low cost)
- Sonnet → real work (medium cost)
- Opus → deep, complex thinking (high cost)
Routing simple tasks to Haiku can free up 50–70% of your budget for work that actually needs a more powerful model.
8. Spread Your Work Across the Day
Claude uses a rolling 5-hour window — not a midnight reset. Messages sent at 9 AM won’t count against you by 2 PM.
If you blow through your limit in one morning session, you’re leaving most of your daily allowance on the table.
What to do instead: Break your day into 2–3 sessions (morning, afternoon, evening). By the time you return, your earlier usage has rolled off and your limit has refreshed.
9. Work During Off-Peak Hours
As of March 26, 2026, Anthropic adjusts how quickly your 5-hour limit is consumed based on demand. During peak hours — 5–11 AM Pacific / 8 AM–2 PM Eastern on weekdays — the same query costs more against your limit than it would off-peak.
Your weekly allowance doesn’t change, but shifting heavy tasks to evenings or weekends stretches it significantly. If you’re in Europe, Latin America, or Asia, peak hours may fall during your afternoon — worth checking based on your time zone.
10. Enable Extra Usage as a Safety Net
If you’re on a Pro, Max 5x, or Max 20x plan, consider enabling Overage under Settings → Usage.
When your session limit runs out, Claude doesn’t cut you off — it switches to pay-as-you-go billing at API rates. You set a monthly spending cap so there are no surprise charges.
This isn’t about saving tokens. It’s about not losing your work at the worst possible moment.
The Bottom Line
These habits feel like a lot at first. But once they’re second nature, hitting your limit becomes a rare event. You may even find yourself downgrading from a max plan — because you simply don’t need it anymore.
Claude doesn’t count messages. It counts tokens. Use them wisely.
