
Posts

Showing posts with the label LLM Guardrails

LLM Evals Lab Book: The Importance of Statistics and Also Stigmergy

Recap

During an analysis of a travel manifest, two agents (referred to as polecats in Gastown terminology) were accidentally handed the same manifest page as input. The agents produced different results. One agent found an association between Lucia Hobson and Nikola Tesla, a very valuable association for the research project. The other agent did not. A set of eval experiments ensued to determine how often polecats missed the association. The initial answer was that they missed it quite frequently, with only 3 out of 16 agents making the association.

Models Used

In the following, all agents are using Sonnet 4.6. Orchestration is handled with Gastown.

New Findings

On the fourth batch of five test case runs, four polecats made the Tesla association. The chance of this happening randomly was less than 3% in the absence of any other process changes.

Fisher's Test from Gemini

Fisher's Exact Test (Recommended) This compares your two distinct groups (the past 16 tests vs. the new...
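The "less than 3%" figure can be checked by hand. The following is a minimal sketch, not the post's own calculation: it assumes a 2x2 table built from the counts quoted above (4 hits and 1 miss in the new batch, 3 hits and 13 misses in the earlier 16 runs) and computes a one-sided Fisher's exact test with only the standard library. The function name `fisher_exact_greater` is hypothetical.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the table [[a, b], [c, d]].

    Returns the probability of seeing `a` or more hits in the first row
    if hits were spread across both rows purely by chance.
    """
    row1 = a + b          # size of the new batch
    col1 = a + c          # total hits across both groups
    n = a + b + c + d     # total runs
    upper = min(row1, col1)
    tail = sum(
        comb(col1, k) * comb(n - col1, row1 - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(n, row1)


# Counts taken from the excerpt above.
new_hits, new_misses = 4, 1     # new batch of five runs
old_hits, old_misses = 3, 13    # earlier 16 runs

p_value = fisher_exact_greater(new_hits, new_misses, old_hits, old_misses)
print(f"one-sided p = {p_value:.4f}")  # prints "one-sided p = 0.0251"
```

With these counts the tail probability is 511/20349, or about 2.5%, which agrees with the stated figure.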

Deploying a ChatKit Demo for PsyOps Detection

I deployed the LLM Psy-ops detection app earlier today! For those of you just hopping on board, the WhyFiles ran an episode highlighting a simple, logical scoring method publicized by NCI for determining whether a piece of media or a news article is emotionally manipulative (think propaganda). I was looking for a good app to practice deployment, guardrails, and evals, and this one, suggested by @somethingLethal on Reddit, seemed promising in all those regards. If you'd like to try it, you can find the app at https://projecttoucans.com/gladych_files_psy_ops .

LLMs, Simple Math, and Pricing

The Psy-ops scoring instrument requires that the model sum the scores for the twenty categories. gpt-4o-mini did not sum any of the scores correctly. It got close, but that was about it. I experimented with the Python code interpreter to cure the simple math issue. The code interpreter seemed reasonable at first. I mean, three cents per compute minute, not bad, right? Ins...
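For illustration, here is a sketch of one way to take the addition away from the model entirely: have the model return per-category scores and do the sum in ordinary application code. This is an assumption about how such an app could be structured, not a description of the deployed one. The function name `total_score`, the category names, and the score values are all hypothetical placeholders.

```python
def total_score(category_scores, expected_categories=20):
    """Sum per-category scores deterministically.

    `category_scores` maps a category name to its numeric score.
    Raises ValueError if the model returned the wrong number of categories,
    so a missing or duplicated category is caught before the total is shown.
    """
    if len(category_scores) != expected_categories:
        raise ValueError(
            f"expected {expected_categories} categories, "
            f"got {len(category_scores)}"
        )
    return sum(category_scores.values())


# Placeholder data: twenty made-up categories, each scored 3.
example_scores = {f"category_{i}": 3 for i in range(1, 21)}
total = total_score(example_scores)
print(total)  # prints 60
```

The count check is there because a model that miscounts a sum may also drop or repeat a category, and a silent total over nineteen items would look plausible.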