Recap

During an analysis of a travel manifest, two agents (referred to as polecats in Gastown terminology) were accidentally handed the same manifest page as input. The agents produced different results. One agent found an association between Lucia Hobson and Nikola Tesla, a very valuable association for the research project. The other agent did not. A set of eval experiments ensued to determine how often polecats missed the association. The initial answer was that they missed it quite frequently: only 3 out of 16 agents made the association.

Models Used

In the following, all agents are using Sonnet 4.6. Orchestration is handled with Gastown.

New Findings

On the fourth batch of five test case runs, four polecats made the Tesla association. In the absence of any other process changes, the chances of this happening randomly were less than 3%.

Fisher's Test from Gemini

Fisher's Exact Test (Recommended)

This compares your two distinct groups (the past 16 tests vs. the new...
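The "less than 3%" figure in the recap can be sanity-checked with a few lines of Python. This is a minimal sketch, not the calculation Gemini actually ran: it assumes the comparison is 3 hits in the earlier 16 runs against 4 hits in the new batch of 5, and that the test is one-sided (asking only whether the new batch did better). The function name `fisher_exact_one_sided` is made up for illustration.

```python
from math import comb


def fisher_exact_one_sided(hits_new, n_new, hits_old, n_old):
    """One-sided Fisher's exact test.

    Returns the probability of seeing at least `hits_new` hits in the new
    group if hits were scattered at random across all runs, with the total
    number of hits and the two group sizes held fixed.
    """
    total_runs = n_new + n_old
    total_hits = hits_new + hits_old
    # Number of ways to choose which runs belong to the new group.
    denom = comb(total_runs, n_new)
    # The new group cannot hold more hits than it has runs, or than exist.
    max_hits = min(n_new, total_hits)
    p = 0.0
    for k in range(hits_new, max_hits + 1):
        # k hits and (n_new - k) misses land in the new group.
        p += comb(total_hits, k) * comb(total_runs - total_hits, n_new - k) / denom
    return p


# Assumed counts from the recap: 3 of 16 earlier runs, 4 of 5 new runs.
p_value = fisher_exact_one_sided(hits_new=4, n_new=5, hits_old=3, n_old=16)
print(f"one-sided p = {p_value:.4f}")  # one-sided p = 0.0251
```

Under those assumptions the result is about 2.5%, which lines up with the "less than 3%" claim. If SciPy is available, `scipy.stats.fisher_exact([[4, 1], [3, 13]], alternative="greater")` should report the same p-value.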
With all the work that's been put into making agents "correct" by construction, I gotta say, sometimes I need an LLM agent to take a chance at just being wrong. I'm working on a book project called The Gladych Files. While the book is narrative nonfiction about the history of general relativity research, it explores the liminal space inhabited by the very rich fringe-scientist speculators of the 1950s who funded mainstream general relativity advances (more or less by accident). In those spaces, you'll find Tesla, the architect of the FBI building, Timothy Leary's LSD explorations, and many, many other things, institutions, and people. I've accumulated hundreds of pages of historical documents from various archives, and I'm using orchestrated agentic AI (in the form of Gastown) to review those documents. So far, the analysis has gone well, but last week I saw something that made me look up. I'd accidentally input the same archive page twice, so i...