Copasetic Flow

Posts

Showing posts with the label LLM

TIL: GasTown Extended git to Offer Rig Setup on first git status

This is kind of cool! I started a new project today using codex --yolo. (I didn't want to use my Claude tokens until I'd worked out the design and architecture a bit more.) When I got done coding with codex, I asked it to create a repo for the work we did. That seemed to work, but I wanted to verify that the repo was OK, so I typed gt status and was immediately presented with an offer to set up a rig. I answered y, and the extension was off and running, adding my new repo to Gas Town. I didn't actually have Gas Town up and running on my Linux client, so I had to get it started and then try again manually with cd gt && gt rig add ssm_overlay https://github.com/hcarter333/ssm_overlay.git and it just worked!

Lab Book 2026-07-20 Claude, Many Agents, and Subagents

Just noting a feeling, nothing quantitative, but I think I've noticed that if more Claude agents are working at once, they each tend to create fewer subagents. If that's true, and it's a big if, I wonder if there's something at Anthropic that throttles or budgets overall effort to an account. Just to be clear, I've also watched five polecats descend into a complete and utter subagent storm like this one. True or not, I either need the storms to stop or I need to gain the ability to control them, so I'm working more on subagent storm control today. As promised, I did not watch my agents. Instead, I showed the mayor how to watch and manage polecats creating subagents. Here's the start of my conversation. At that stage, I hadn't considered the obvious fact in the figure above that subagents are occasionally beyond recursive. (As an aside, the kid here tells me this is actually a plot point in one of Martha Wells's Murderbot books.) The next thing I asked...

Lab Book 2026-07-19: Is the Anthropic Increased Token Deal Related to Fan Experiments?

Working with Fable last week, I noticed that it can get very excited about fanning out tasks (to the point of crashing my WSL session). I experimented with prompting agents to split out tasks on their own, but haven't found a successful way to do this yet. Consequently, I reverted to my original prompt, which does not explicitly call for task splits. Then, this morning, running on Sonnet 4.6, another task storm sprang up. I wonder if the new 50% higher token limit till August is Anthropic's way of buying themselves some headroom while they experiment with models creating subagents? Are other people seeing the same thing? It's particularly worrisome that Sonnet 4.6 has started a subagent storm because it really doesn't have the context to deal with the results.

LLM Lab Book 2026-07-12: Claude fable-5 agent forks

I'm still tracking down what makes some agents find a Nikola Tesla research finding and why others do not. Today, that's led me into investigating Claude CLI's forking harness. A few notes from me. Forking looks pretty spectacular! The agent kicks off a subagent that automatically has a copy of the parent's context. The subagent doesn't add to the parent's context until it's done. So, it seems to make things cheaper, at least for my passenger manifest research. The agent that used forking made the Tesla association. The other two agents with the same inputs and the same model did not use forking and did not find the Tesla association. This is important. It seems that agents that don't fork lack the persistence to look for more than one "really good" finding. They make that one good finding, and then kind of take any results for the rest of the passengers as good enough. Each forked subagent is looking for its own "really good" finding ...
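The forking behavior described above can be pictured with a toy model. This is only a sketch under stated assumptions, not how Claude CLI actually implements forking: a "context" here is just an array of strings, and forkContext, runForkedSubagent, and the work callback are hypothetical names made up for illustration. It shows the two properties noted in the post: the subagent starts with a copy of the parent's context, and the parent's context only grows by the final result once the subagent is done.

```javascript
// Toy model of forking. Not Claude CLI's real implementation.
// A "context" is modeled as a plain array of message strings (an assumption).

function forkContext(parentContext) {
  // The subagent starts from a copy, so its intermediate work
  // cannot grow the parent's context.
  return [...parentContext];
}

function runForkedSubagent(parentContext, task, work) {
  const childContext = forkContext(parentContext);
  childContext.push(`task: ${task}`);

  // `work` is a hypothetical callback standing in for the subagent's research.
  // It may append as much as it likes to the child's context.
  const result = work(childContext);

  // Only the final result reaches the parent, and only after the subagent is done.
  parentContext.push(`result(${task}): ${result}`);
  return { childContextLength: childContext.length, result };
}

// Illustrative values only.
const parent = ['manifest page 1', 'research instructions'];
const outcome = runForkedSubagent(parent, 'passenger A', (ctx) => {
  ctx.push('search notes 1');
  ctx.push('search notes 2');
  return 'one finding';
});

console.log(parent.length, outcome.childContextLength);
```

In this model the child's context holds every intermediate note, while the parent only pays for one extra entry per forked task, which is one way to read the "seems to make things cheaper" observation.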

Variance in Research; Manual Agent Orchestration - LLM Lab Book 2026-07-02

More instances of variance in research results, this time around Isidore Nobel. Also, saving tokens by keeping conversations short.

Missing Isidore Nobel

This was probably due to a misspelling on the British manifest of Nobel's last name as Noble. The U.S. manifest has it as Nobel.

Another midweek usage reset

I would say this was caused by the introduction of fable yesterday, but my usage numbers didn't reset until late last night. Yesterday, my limits were at about 21% and set to renew on Saturday.

What do ticket number clusters reveal in the sorta solved Hedy Lamarr mystery?

First, an update. Hedy Lamarr aka Hildegard Mandl is present on the English version of the manifest. You might remember she is not present on the United States side. No answers yet as to why. However, the English manifest has ticket numbers and people traveling together seem to have similar ticket numbers, so this is a reminder to look for something there.

Why we sometimes want substrate-mediated pro...
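A miss like Nobel vs. Noble is the kind of thing a tolerant surname comparison can catch. The sketch below is an assumption on my part, not the method the agents used: it compares surnames by plain Levenshtein edit distance and keeps candidates within a small threshold. The function names, the threshold of 2, and the sample list are all hypothetical.

```javascript
// Plain Levenshtein edit distance (insert, delete, substitute), row by row.
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // delete from a
        curr[j - 1] + 1,    // insert into a
        prev[j - 1] + cost  // substitute or match
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Return candidate surnames within maxDistance edits of the target surname.
// maxDistance = 2 is an illustrative choice, not a tuned value.
function findLikelyMatches(surname, candidates, maxDistance = 2) {
  const target = surname.toLowerCase();
  return candidates.filter(
    (candidate) => levenshtein(target, candidate.toLowerCase()) <= maxDistance
  );
}

// Hypothetical sample of surnames as they might appear on one manifest.
const sampleSurnames = ['Noble', 'Slick', 'Mandl'];
const likelyMatches = findLikelyMatches('Nobel', sampleSurnames);
console.log(likelyMatches);
```

Swapping two adjacent letters counts as two edits under plain Levenshtein, which is why the threshold here is 2 and not 1. A threshold that loose will also produce false positives on short surnames, so matches are best treated as leads to check, not as confirmed identities.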

LLM Lab Book 2026-06-30: Using LLMs with Datasette-agent and Database Prep vs Token Usage on Claude

I'm condensing the steps to move from travel manifest page to human-readable findings to SQLite database here.

History Research Contextual Recap

I'm working on a history of physics research project, The Gladych Files, that explores how industrialists interested in fringe physics wound up actually funding mainstream general relativity research. As part of that research, I've been looking at the travel manifests of various industrialists and research scientists from the 1930s to the 1950s. Because there are literally thousands of passengers on their combined voyages, I'm using LLM agents orchestrated through Gas Town to coordinate the research. At present, I am working on a bit of a mystery. Multiple sources state that Hedy Lamarr came to the United States aboard the S.S. Normandie, arriving on September 30th, 1937. That's the same ship that Tom Slick (one of the industrialists of whom I spoke above) took across the Atlantic. There's only one problem. Hedy is...

Tom Slick, Hedy Lamarr the Normandy and Other Things: Lab Book for 2026-06-28/29

I haven't mentioned my portfolio site here before, but I did manage to fix page view tracking on it today, so that's kind of nice. It was the final project for a digital portfolio class I took at City College of San Francisco and highly recommend.

History Research Recap

I'm offloading a significant amount of research work for my history of physics book, The Gladych Files, onto an orchestrated platform of LLM agents in Gas Town. The bulk of the work is to research passengers on trips the main characters of the book took from the 1930s to the 1950s. One of the main subjects of the book is Tom Slick. While returning from a trip to attempt to spot the Loch Ness Monster while he was a student at Yale (seriously, I love this book), he was aboard a ship, the Normandie, with Hedy Lamarr. The ship also had over 1,000 other passengers, including the grandchildren of Henri Matisse. My agentic AI research team comes to task because it's not a small project to research each of a t...

Controlling other Geo-Apps with CesiumJS MCP

Moving the camera to see the sky in CesiumJS maps has always been a little bit difficult for me. So, when CesiumJS announced their baseline MCP for controlling the camera on CesiumJS maps, I leaped at the chance to try out an MCP and to grab hold of better control of my map camera. This week, the subject of eclipses came up in my Gladych Files research. (Ferry Barrows Colton, famed National Geographic science writer of the 1940s, was part of the 1947 Brazil eclipse expedition and was also on board the Normandie with Tom Slick in 1937.) That reminded me of the following picture I took of the 2017 eclipse from Wyoming. I've wanted to identify the stars on that picture for years, so I was curious if CesiumJS had accurate constellation maps for a given date and time. Turns out, they do. But, how to look at the stars? I revived my version of the MCP camera control server for CesiumJS in a few minutes by starting Codex in the repo directory on my local machine, a...

fable-5 down for now per US Government Directive

It was fun getting to use Anthropic's Fable-5 for a few days. Hopefully the chance will come up again. For the moment, the US government has denied access to non-US citizens.

Can Agents Think Outside the Box?

With all the work that's been put into making agents "correct" by construction, I gotta say, sometimes I need an LLM agent to take a chance at just being wrong. I'm working on a book project called The Gladych Files. While the book is narrative nonfiction about the history of general relativity research, it explores the liminal space inhabited by very rich fringe scientist speculators of the 1950s who funded mainstream general relativity advances (more or less by accident). In those spaces, you'll find Tesla, the architect of the FBI building, Timothy Leary's LSD explorations and many, many other things, institutions, and people. I've accumulated hundreds of pages of historical documents from various archives, and I'm using orchestrated agentic AI (in the form of Gas Town) to review those documents. So far, the analysis has gone well, but last week I saw something that made me look up. I'd accidentally input the same archive page twice, so i...

Gladych Files Lab Book: Document OCR vs LLM Model vs Cost, or Claude Opus is Cheaper than Sonnet for OCR!

I started my lab book entries when I was a physics graduate student. It's kind of amusing and kind of cool how far I've come. I have the equivalent of a grad student (aka Claude Opus 4.7) working for me now. I spent some time over the weekend setting up an OCR framework for a book research project of mine. I've been coming up to speed on evals, so I decided to run one to determine which model was the most accurate and cost-effective for doing OCR on travel manifest pages. I stepped the eval along rather than automating it and talked the results through with Opus as I went. First, it turns out that Opus at low effort is the most accurate and the most cost-effective choice! That was a surprise. The result has to do with Opus' ability to look at higher-res images, which means it needs to think less for OCR vs. Sonnet. Second, at the end of the eval, as I was preparing to write up my results, it occurred to me that I could ask my grad student to do it instead. Here's...

Working with Process Revision Control

I took time to play with a new Dolt-enabled app example called Quorum last night. Quorum sets 13 LLM agents with different defined personas loose on a user's question. The agents come up with solutions to the question and then discuss their individual solutions with each other to arrive at a consensus. There's much more detail in this blog post that accompanies the app. Quorum is cool. It is not, however, what I wanted to talk about here. Instead, I'm going to focus on the blog post for the app. In short, I'm very excited to see ideas that I've used to manage verification processes for years get codified into tools for LLM agents. Here's one of the important parts: "I can shut down the app, lose the server, or disappear entirely — and the deliberation history remains, publicly accessible and cryptographically verified." Imagine what an engineer can do to work back through their debug hypothesis tree with that sort of infrastructure! As the article'...

ChatGPT 5.4 Confused this Morning oaicite index

I asked ChatGPT 5.4 Thinking to write some JSON-LD text to summarize a blog post this morning, along with alt text for the post's images, and got this. When I asked it to file a bug report on itself, I didn't really expect it to succeed, but I didn't expect more oaicite index listings for the proposed issue description. (Kinda obviously, I also haven't had enough coffee yet :) Anyone else seeing this?

Project TouCans | AI-Tutored Technician Class Ham Radio Practice Exam

Just a quick note that the AI-tutored Project TouCans exams are up and running for the latest US technician class question pool for the license exam. The exam sprang out of two different projects here at the home QTH. KO6BTY is studying for her extra class license. Also, I'm learning about developing code with AI. And voila: AI-tutored ham radio practice exams. Try the AI-tutored Technician Class practice exam now. Project TouCans technician class practice exams.

Read more about this project:
First demo of OpenAI ChatKit enabled exams
First release of extra class exam based on OpenAI responses API
Removing the vector store to reduce costs
Experiences with Vector Stores
Early debug to add contexts by local compute and storage
First release of extra class exam with no AI

Lab Notebook: GPT-5 Help Agent for Ham Radio Exams Debug

Debug notes from getting the AI help feature of the free ham radio exams to work today.

Grabbing a text answer from OpenAI still works:

That's from this method

    async function retrieveTextWithFileSearch({ system, user }) {
      const vsId = localStorage.getItem('vector_store_id');
      if (!vsId) throw new Error('No vector_store_id found.');
      answText = answText + " " + user;
      const resp = await openai('/responses', {
        body: {
          model: 'gpt-4.1-mini',
          input: [
            { role: 'system', content: system },
            { role: 'user', content: answText }
          ],
          tools: [{ type: 'file_search', vector_store_ids: [vsId] }]
        }
      });

With the agent flow, it doesn't work

    async function retrieveTextWithAgent({ system, user }) {
      // 0) Make sure we have ...