How I Cut GPT Input Costs 10× by Turning Off the Vector Store on the Ham Radio Practice Exams

I finally found out why my Extra Class AI Tutor was spending nearly ten times as much on input tokens as on output tokens. It wasn’t the math, the cache, or the prompt; it was the vector store. Turning it off cut token usage from 17,021 to 1,743 in a single move.
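As a quick sanity check on the "10×" in the title, the two token counts quoted above can be compared directly. This is only a minimal sketch: the function name `token_reduction` is made up for illustration, and it assumes the two figures are measured the same way (for example, tokens for one comparable request), which the teaser does not spell out.

```python
def token_reduction(before: int, after: int) -> tuple[float, float]:
    """Return (ratio, percent_saved) for a before/after token count.

    Hypothetical helper for illustration; not part of any library.
    """
    if before <= 0 or after <= 0:
        raise ValueError("token counts must be positive")
    ratio = before / after                       # how many times smaller
    percent_saved = (1 - after / before) * 100   # share of tokens removed
    return ratio, percent_saved


# Figures quoted in the post: vector store on vs. off.
ratio, percent_saved = token_reduction(17021, 1743)
print(f"{ratio:.2f}x fewer tokens, {percent_saved:.1f}% saved")
```

With those numbers the ratio comes out a little under ten, so "10×" is a fair rounding rather than an exact figure.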