Copasetic Flow

Posts

Showing posts with the label GPT-5-Nano

LLMs or SLMs? A Gladych Files PsyOps Demo Study

I put OpenAI’s gpt-5-nano and gpt-5.1 head-to-head on my psy-ops article scorer to see what you really get for the extra spend. Along the way I ran into pricing surprises, wild variance, and a reminder that ChatGPT’s shiny new memory feature can quietly bend your evals if you’re not careful.

A post on LinkedIn a few days back suggested using Small Language Models (SLMs) as opposed to LLMs for repetitive tasks. This seemed like a great idea in some regards, but I was curious how it would apply to apps intended to perform language analysis. Luckily, I have the psy-ops app up and running. Also? At the moment, it uses a close-to-an-SLM model, gpt-5-nano, due to pricing decisions. I used it as a test vehicle to look at the difference between gpt-5-nano and the full-featured gpt-5.1.

The testing framework I used: Starting from this article, I first did three separate analyses with gpt-5-nano, and then three others with gpt-5.1. I then used gpt-5.1...
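With only three runs per model, one simple way to put a number on that variance is to compare the mean and spread of the total scores from each set of runs. The following is a minimal sketch, not the code used for the post: the function name `summarize_runs` is hypothetical, and the totals in the usage lines are made-up placeholders, not results from the test.

```python
from statistics import mean, stdev


def summarize_runs(totals):
    """Return (mean, sample standard deviation, range) for repeated run totals."""
    if len(totals) < 2:
        raise ValueError("need at least two runs to estimate spread")
    return mean(totals), stdev(totals), max(totals) - min(totals)


# Placeholder totals only (NOT measurements from the post): three runs per model.
placeholder_runs = {
    "gpt-5-nano": [10, 12, 14],
    "gpt-5.1": [11, 12, 13],
}

for model, totals in placeholder_runs.items():
    avg, spread, value_range = summarize_runs(totals)
    print(model, avg, spread, value_range)
```

Three runs is a very small sample, so the standard deviation here is only a rough indicator of how much a model's score moves between runs on the same article.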

Deploying a ChatKit Demo for PsyOps Detection

I deployed the LLM Psy-ops detection app earlier today! For those of you just hopping onboard, the WhyFiles ran an episode highlighting a simple, logical scoring method publicized by NCI for determining whether a piece of media or a news article was emotionally manipulative (think propaganda) or not. I was looking for a good app to practice deployment, guardrails, and evals, and this one, suggested by @somethingLethal on Reddit, seemed promising in all those regards. If you'd like to try it, you can find the app at https://projecttoucans.com/gladych_files_psy_ops .

LLMs, Simple Math, and Pricing

The Psy-op scoring instrument requires that the model sum the scores for the twenty categories. gpt-4o-mini did not sum any of the scores correctly. It got close, but that was about it. I experimented with the Python code interpreter to cure the simple math issue. The code interpreter seemed reasonable at first. I mean, three cents per compute minute, not bad, right? Ins...
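One common way around unreliable model arithmetic is to have the model return only the per-category scores and do the addition in ordinary application code. The following is a minimal sketch of that idea, not the app's actual implementation: the function name `total_score`, the `category_N` keys, and the dictionary input format are all assumptions made for illustration. Only the count of twenty categories comes from the scoring instrument described above.

```python
EXPECTED_CATEGORIES = 20  # the scoring instrument has twenty categories


def total_score(category_scores):
    """Sum per-category scores in code instead of trusting the model's arithmetic.

    category_scores: dict mapping a category name to its numeric score
    (assumed format, e.g. parsed from structured model output).
    """
    if len(category_scores) != EXPECTED_CATEGORIES:
        raise ValueError(
            f"expected {EXPECTED_CATEGORIES} categories, got {len(category_scores)}"
        )
    return sum(category_scores.values())


# Hypothetical placeholder scores, one per category; the names are stand-ins.
example_scores = {f"category_{i}": 1 for i in range(1, EXPECTED_CATEGORIES + 1)}
print(total_score(example_scores))
```

The length check is there because a dropped or duplicated category would otherwise produce a plausible-looking but wrong total.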