Posts

Showing posts with the label ChatKit

Punching Through Sandboxes and Codex CLI --yolo

Just a brief note to mention that when I was fighting Codex CLI earlier this week to call the OpenAI Whisper API, what was really going on was that Codex CLI was sandboxed. ChatGPT helped me modify the script I was creating so that the script itself punched through the sandbox by clearing the proxies that had been set up to keep Codex CLI in the sandbox. It's interesting that ChatGPT didn't just tell me to add the --yolo argument to my Codex CLI command line. I wonder if that's part of its guardrails, or if ChatGPT doesn't yet know about Codex CLI's arguments from its training. I saw something similar when ChatKit was announced: ChatGPT wasn't quite sure what it was on the day of the announcement. The good news is that I'm now calling APIs without any shenanigans because I learned the following day to simply add '--yolo'. That argument comes with its own set of risks, but I'm OK with those for the moment.
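For anyone curious what "clearing the proxies" can look like, here is a minimal sketch. It is not the script ChatGPT helped me write. It assumes the sandbox steers traffic through the usual proxy environment variables, which may not be how any particular Codex CLI version does it. The names PROXY_VARS and without_proxies are made up for illustration.

```python
import os

# Common proxy variable names. Which of these a given sandbox actually
# sets is an assumption; check your own environment first.
PROXY_VARS = ("HTTP_PROXY", "HTTPS_PROXY", "ALL_PROXY")


def without_proxies(env):
    """Return a copy of env with proxy variables removed (any letter case)."""
    blocked = {name.lower() for name in PROXY_VARS}
    return {key: value for key, value in env.items() if key.lower() not in blocked}


if __name__ == "__main__":
    clean = without_proxies(dict(os.environ))
    removed = sorted(set(os.environ) - set(clean))
    print("Proxy variables that would be cleared:", removed)
```

The function returns a cleaned copy instead of editing os.environ in place, so you can see what would change before handing the result to something like subprocess.run(..., env=clean).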

Deploying a ChatKit Demo for PsyOps Detection

I deployed the LLM Psy-ops detection app earlier today! For those of you just hopping onboard, the WhyFiles ran an episode highlighting a simple, logical scoring method publicized by NCI for determining whether a piece of media or a news article is emotionally manipulative (think propaganda). I was looking for a good app to practice deployment, guardrails, and evals, and this one, suggested by @somethingLethal on reddit, seemed promising in all those regards. If you'd like to try it, the app is here: https://projecttoucans.com/gladych_files_psy_ops

LLMs, Simple Math, and Pricing

The Psy-op scoring instrument requires that the model sum the scores for the twenty categories. gpt-4o-mini did not sum any of the scores correctly. It got close, but that was about it. I experimented with the Python code interpreter to cure the simple math issue. The code interpreter seemed reasonable at first. I mean, three cents per compute minute, not bad, right? Ins...
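As a rough illustration of the summing problem, one alternative is to let the model assign the twenty category scores and do the addition in ordinary application code. This is only a sketch, not what the deployed app does. The function name total_score and the example scores are hypothetical, and it assumes the model's scores have already been parsed into a list of integers.

```python
def total_score(scores, expected_count=20):
    """Sum per-category scores, refusing a list of the wrong length."""
    if len(scores) != expected_count:
        raise ValueError(f"expected {expected_count} scores, got {len(scores)}")
    return sum(scores)


if __name__ == "__main__":
    # Made-up scores for illustration only: twenty categories, each scored 3.
    example_scores = [3] * 20
    print(total_score(example_scores))  # prints 60
```

The length check is there because a model that miscounts its sums might also skip a category, and it is better to fail loudly than to report a total built from nineteen scores.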