IF you have a local AI installed AND it's DECENT, then you can have Claude Code (or whatever AI coding tool you're using) intelligently choose between your expensive AI and your LOCAL AI, depending on the need at the moment. Even if you're on the fixed-fee monthly plan with Claude Code, they WILL cut you off if you use too much. That token count timer gets reset once per week.
Without further ado, here is the prompt:
```text
YOU ARE NOW ON EMERGENCY RATION MODE FOR CLAUDE TOKENS! WE JUST GOT A
WARNING THAT WE'RE CLOSE TO BEING CUT OFF! MINIMIZE your use of Claude.
Make ample use of the 2 ollama MCP tools using qwen3-coder:latest
for everything except planning, tool chaining, and anything more
than boilerplate coding.
```
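If it helps to picture what that prompt is asking for, the rule boils down to a small routing decision. Here's a rough sketch of it in Python. The task labels and the `pick_backend` function are names I made up for illustration; nothing here is an actual Claude Code or Ollama API, and the real decision is made by the model reading the prompt, not by code like this.

```python
# Illustrative only: the task labels and pick_backend are hypothetical names,
# not part of Claude Code or Ollama.
KEEP_ON_CLAUDE = {"planning", "tool_chaining", "complex_coding"}


def pick_backend(task_kind: str, local_model: str = "qwen3-coder:latest") -> str:
    """Return which model should handle a task while in ration mode."""
    if task_kind in KEEP_ON_CLAUDE:
        return "claude"
    # Everything else (boilerplate, simple edits, summaries) goes local.
    return f"ollama:{local_model}"


if __name__ == "__main__":
    for kind in ("planning", "boilerplate", "tool_chaining", "docstrings"):
        print(f"{kind:>14} -> {pick_backend(kind)}")
```

The point of keeping the "stay on Claude" set small is that anything not explicitly listed defaults to the cheap local model, which is the same bias the prompt is trying to create.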
You might even want to add it to your project's CLAUDE.md file, or even to your global CLAUDE.md file.
Obviously, replace "Claude" with whatever AI you're using and "qwen3-coder:latest" with whatever model you want it to use. But if you're using any other model, then you MUST have Claude TEST your model first:
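Pasting the rule into CLAUDE.md by hand works fine, but if you'd rather script it (say, across several projects), something like the following should do. This is just a sketch: the `add_ration_rule` helper and the `## EMERGENCY RATION MODE` heading are my own invention, and you pass in whichever CLAUDE.md path you want to update.

```python
from pathlib import Path

# Assumed wording and heading; edit to match the prompt you actually use.
RATION_RULE = """\
## EMERGENCY RATION MODE
MINIMIZE your use of Claude. Make ample use of the 2 ollama MCP tools
using qwen3-coder:latest for everything except planning, tool chaining,
and anything more than boilerplate coding.
"""


def add_ration_rule(claude_md: Path, rule: str = RATION_RULE) -> bool:
    """Append the rule to CLAUDE.md once; return True if the file changed."""
    existing = claude_md.read_text(encoding="utf-8") if claude_md.exists() else ""
    if rule.strip() in existing:
        return False  # rule is already there, leave the file alone
    if existing and not existing.endswith("\n"):
        existing += "\n"
    # Put a blank line between any old content and the new rule.
    claude_md.write_text(existing + ("\n" if existing else "") + rule, encoding="utf-8")
    return True

# Example: add_ration_rule(Path("CLAUDE.md"))
```

Running it twice is harmless, since it checks for the rule before appending.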
```text
Test all of my local Ollama models using the ollama mcp tool(s)
installed to see how many tasks they can reliably perform that you
can then outsource work to without compromising too much on quality
and give me a table of scores and suggest to me which is the best one
OR which ones are best at which tasks, then update your "EMERGENCY
RATION" rules in CLAUDE.md to use the models you've verified as
worthy for the tasks they should be used for. Also, look online to
see if there are reports of other free, open source, downloadable
models that will fit MY hardware that people are having success with
in this regard, download them via Ollama and test them too, IF I have
enough drive space to download them.
```
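If you're curious what that kind of test looks like under the hood, or want to sanity-check Claude's scores yourself, here's a minimal sketch that talks to Ollama's local REST API directly (`POST /api/generate` on the default port 11434). The two tasks and the "does the answer contain this substring" pass check are placeholders I made up. A real evaluation would need far better checks, so treat the output as a rough smoke test, not a benchmark.

```python
import json
import urllib.request
from typing import Callable

OLLAMA_URL = "http://localhost:11434"  # Ollama's default local address


def ollama_generate(model: str, prompt: str) -> str:
    """Send one non-streaming prompt to a local Ollama server."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.loads(resp.read())["response"]


# Hypothetical mini-benchmark: (task name, prompt, substring the answer must contain).
TASKS = [
    ("boilerplate", "Write a Python function named add that returns a + b.", "def add"),
    ("docstring", "Write a one-line docstring for a function that reverses a string.", "revers"),
]


def score_models(models, generate: Callable[[str, str], str] = ollama_generate):
    """Return {model: {task: passed}} using a crude substring check."""
    table = {}
    for model in models:
        table[model] = {
            name: expected.lower() in generate(model, prompt).lower()
            for name, prompt, expected in TASKS
        }
    return table


def passing_models(table, task):
    """List the models that passed a given task."""
    return [model for model, results in table.items() if results.get(task)]

# Example (needs a running Ollama server with the model pulled):
# print(score_models(["qwen3-coder:latest"]))
```

The `generate` parameter is there so you can swap in a fake function and try the scoring logic without a server running.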
The 2 Ollama MCP tools I use are:
- ollama
- ollama-claude
These were recommended by Claude.
---
*This tip was written entirely by me, a human, not A.I.*
This Prompt will SAVE YOU THOUSANDS OF $$$!!!
By Mike