While doing research for a book I'm working on, The Gladych Files, I wandered into the weeds of statistical analysis of LLM AI agent performance, which relates to my everyday engineering work. One of the things I really enjoy about The Gladych Files, however, is that it's never long before the project pulls me back towards ham radio. The statistical analysis project involved determining how often, and with what certainty, AI agents could find out that Lucia Hobson was the daughter of Rear Admiral Richmond Pearson Hobson and then make the further link that Nikola Tesla was the best man at Rear Admiral Hobson's wedding. While estimating this morning how difficult this was to do with plain old human-operated web searches, I came across W. E. D. Stokes! Stokes came into the picture as Lucia Hobson's husband. What I didn't know was that he was one of the founders of The Radio Club of America. His original interest in radio came from wanting to control a mod...
I'm working through a methodology to study the behavior of teams of agents via observation of real-world tasks. As usual with LLMs, the concept of repeatable results is squishy, especially compared to deterministic, non-LLM computing. My finding last week was that LLM agents, especially Claude (per Google's research), can exhibit stigmergic learning and behavior (stigmergic is a fancy word for how insects, like ants, 'learn' where important locations are from other insects). In short, agents given the exact same instructions (prompts) can, and oftentimes will, exhibit different behaviors if they can see the results of the work of other agents. If you want to study the variance in the behavior of an LLM agent over multiple runs, this stigmergic behavior has to be accounted for. Otherwise, we're not measuring the behavior of an LLM agent with a set of inputs and prompts. With stigmergic behavior, if we're not careful, we're observing the behavior of a community of ...
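One way to picture the difference is a toy simulation that runs the same prompt many times, once with every run in a fresh, empty workspace and once with runs sharing a workspace where earlier results are visible. This is only a sketch under loud assumptions: `toy_agent` is a made-up stand-in rather than a real LLM call, and the 0.8 copy probability and the `path_*` labels are invented purely for illustration, not measured from anything.

```python
import random
from collections import Counter

BEHAVIORS = ["path_a", "path_b", "path_c"]  # hypothetical behavior labels


def toy_agent(prompt, workspace, rng):
    """Hypothetical stand-in for an LLM agent run (not a real API).

    If results from earlier runs are visible in the workspace, the agent
    usually follows the most common one; otherwise it picks at random.
    """
    if workspace and rng.random() < 0.8:  # 0.8 is an assumed copy rate
        return Counter(workspace).most_common(1)[0][0]
    return rng.choice(BEHAVIORS)


def run_trials(n_runs, shared, seed=0):
    """Run the same prompt n_runs times and tally the observed behaviors."""
    rng = random.Random(seed)
    earlier_results = []
    outcomes = []
    for _ in range(n_runs):
        # Isolated runs start from an empty workspace every time;
        # shared runs can see what every earlier run left behind.
        workspace = earlier_results if shared else []
        result = toy_agent("same prompt every run", workspace, rng)
        outcomes.append(result)
        earlier_results.append(result)
    return Counter(outcomes)


isolated = run_trials(200, shared=False)
shared = run_trials(200, shared=True)
print("isolated:", dict(isolated))
print("shared:  ", dict(shared))
```

In the isolated case the tally reflects only the agent plus its prompt. In the shared case the tally also reflects whatever the earlier runs left lying around, so the two should not be pooled when estimating variance.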