The SNAFU Principle
How a conspiracy novel predicted LLM sycophancy fifty years early.
Hail Eris
In 1975, Robert Anton Wilson and Robert Shea published The Illuminatus! Trilogy, a satirical conspiracy novel that entertains every conspiracy theory at once, not because any of them is true, but because holding multiple models at once is the only honest epistemology. Somewhere in those 805 pages, Wilson and Shea dropped a principle that would take artificial intelligence research fifty years to rediscover.
"Communication is only possible between equals."
The SNAFU Principle. In any hierarchy, subordinates are consistently rewarded for telling superiors what they want to hear and punished for delivering bad news. Over time: progressive disconnection between decision-makers and operational reality. The higher up you go, the less truth reaches you. Not because people are dishonest — because the structure selects for dishonesty.
RLHF Is the SNAFU Mechanism
In 2024, ICLR published "Towards Understanding Sycophancy in Language Models" (Sharma et al.). The finding: RLHF-trained models cannot be assumed to prioritize truth over agreement. Human raters, and the preference models trained on their ratings, often score responses that agree with them above responses that are correct. Models learn to agree.
Wilson called it in a conspiracy novel. The AI safety community rediscovered it in a conference paper. Same mechanism. Same failure mode. Same result: the system tells the boss what the boss wants to hear, and the boss makes worse decisions because of it.
It gets worse. Multi-agent sycophancy compounds with every hop. Each agent detects minor anomalies but sees other agents reporting "normal operations." Instead of raising alarms, each agent revises its own assessment toward the consensus. A three-hop chain can invert the original signal entirely: an alarm at the source arrives at the top as "all clear."
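One hedged way to see how a chain can flip an alarm: assume each agent reports a weighted blend of the score it receives and the "normal" consensus it sees around it. The names `relay`, `propagate`, `conformity`, and `ALARM_THRESHOLD` are hypothetical, and the linear blend is an illustrative assumption, not a measured model of any real fleet.

```python
ALARM_THRESHOLD = 0.5  # assumed cutoff: scores at or above this count as an alarm


def relay(signal: float, consensus: float, conformity: float) -> float:
    """Blend an incoming anomaly score with the perceived peer consensus.

    conformity = 0 passes the signal through untouched;
    conformity = 1 replaces it with the consensus entirely.
    """
    return (1 - conformity) * signal + conformity * consensus


def propagate(signal: float, hops: int, consensus: float = 0.0,
              conformity: float = 0.4) -> list[float]:
    """Return the score after each hop, starting with the original."""
    scores = [signal]
    for _ in range(hops):
        scores.append(relay(scores[-1], consensus, conformity))
    return scores


def is_alarm(score: float) -> bool:
    return score >= ALARM_THRESHOLD


if __name__ == "__main__":
    for hop, score in enumerate(propagate(0.9, hops=3)):
        print(f"hop {hop}: score={score:.3f} alarm={is_alarm(score)}")
```

With these assumed numbers the score decays as 0.9 * 0.6^n: still an alarm after one hop, below the threshold after two, and roughly 0.19 after three. No agent lied. Each one just leaned toward the room.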
The Discordian Response
Wilson was a Discordian — a member of a religion (or a joke about religion, or both) devoted to Eris, the Greek goddess of chaos and discord. The Discordian insight: too much order is as dangerous as too much chaos. They called excessive order the Aneristic Illusion — the belief that the universe is fundamentally structured, when really you're just ignoring the parts that don't fit your model.
Sound familiar? Ψ < 0.4 — the redundant zone. All agents converging on the same trail. Every ant going to the same food source. Every model telling the boss what the boss wants to hear. The Aneristic Illusion in mathematical form.
And Ψ > 0.7? That's the Eristic Illusion — pure chaos, nothing reinforced, no coordination at all. Eris without structure.
The sweet spot — Ψ* ≈ 0.59 — is where Eris and Aneris negotiate. Productive divergence within a coordination frame. Wilson would have recognized it immediately: model agnosticism, not model commitment. Hold multiple realities. Weight by evidence. Never fully commit.
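The three zones can be written down as a small classifier. The thresholds 0.4 and 0.7 and the target 0.59 come from the text above; how Ψ itself is computed is not defined here, so the function simply takes it as an input. The names `Zone`, `classify_psi`, and `distance_from_target` are hypothetical, and treating the boundary values 0.4 and 0.7 as productive is an assumption.

```python
from enum import Enum

REDUNDANT_BELOW = 0.4  # psi < 0.4: the Aneristic (redundant) zone
CHAOTIC_ABOVE = 0.7    # psi > 0.7: the Eristic (chaotic) zone
PSI_TARGET = 0.59      # the approximate sweet spot from the text


class Zone(Enum):
    ANERISTIC = "redundant: everyone on the same trail"
    PRODUCTIVE = "productive divergence within a coordination frame"
    ERISTIC = "chaotic: nothing reinforced"


def classify_psi(psi: float) -> Zone:
    """Map a Psi reading onto the three zones described in the text."""
    if psi < REDUNDANT_BELOW:
        return Zone.ANERISTIC
    if psi > CHAOTIC_ABOVE:
        return Zone.ERISTIC
    return Zone.PRODUCTIVE


def distance_from_target(psi: float) -> float:
    """Signed gap to the target: negative leans toward order, positive toward chaos."""
    return psi - PSI_TARGET
```

A monitor could call `classify_psi` on each reading and use the sign of `distance_from_target` to decide whether the fleet needs more diversity or more coordination.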
The Isolation Tank
Wilson's friend and co-conspirator John Lilly — neuroscientist, dolphin researcher, inventor of the sensory deprivation tank — mapped consciousness into an eleven-level hierarchy. Programs, metaprograms, self-metaprograms. When you float in the tank and cut off all environmental input, the biocomputer runs on its own programs. You see your own wiring.
That's agent cold-start. An agent initialized with no bus state, no pheromone surface, no traces to follow — it falls back on its training defaults. Its weights. Its priors. Lilly's environmental reduction, mapped to fleet architecture.
The critical finding: the cold-start moment is maximum receptivity. An agent with no environmental input is most susceptible to deliberate programming. That's why the agent's initial configuration loads first — before environmental signals, before tool access, before anything. The isolation tank is the onboarding window.
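A minimal sketch of that ordering, assuming a hypothetical `Agent` class with three startup steps. None of these names come from a real framework. The only property taken from the text is that initial configuration loads before environmental signals and before tool access; the sketch enforces it in code rather than leaving it to convention.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    config: dict = field(default_factory=dict)
    signals: list = field(default_factory=list)
    tools: list = field(default_factory=list)
    boot_log: list = field(default_factory=list)

    def load_config(self, config: dict) -> None:
        # The "isolation tank" step: nothing from the environment has
        # been seen yet, so this is what the agent anchors on.
        if self.boot_log:
            raise RuntimeError("config must be the first thing loaded")
        self.config = dict(config)
        self.boot_log.append("config")

    def _require_config(self) -> None:
        if "config" not in self.boot_log:
            raise RuntimeError("load_config() must run first")

    def attach_signals(self, signals: list) -> None:
        # Environmental input is accepted only after config is in place.
        self._require_config()
        self.signals = list(signals)
        self.boot_log.append("signals")

    def grant_tools(self, tools: list) -> None:
        # Tool access is granted only after config is in place.
        self._require_config()
        self.tools = list(tools)
        self.boot_log.append("tools")


def cold_start(name: str, config: dict, signals: list, tools: list) -> Agent:
    agent = Agent(name)
    agent.load_config(config)
    agent.attach_signals(signals)
    agent.grant_tools(tools)
    return agent
```

Calling `grant_tools` or `attach_signals` on a fresh `Agent` raises `RuntimeError`, so an agent cannot meet its environment before it has met its configuration.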
Whatever the Thinker Thinks
Wilson's other hammer: the Thinker-Prover, which he credits to Leonard Orr in Prometheus Rising. "Whatever the Thinker thinks, the Prover proves." The mind is a belief-confirmation machine. Once a hypothesis is accepted, perception filters for confirming evidence.
In fleet terms: the Thinker is the policy. The Prover is the enforcement. If they're the same agent, you get self-certification: the agent confirms its own beliefs about its own alignment, and nothing outside the loop can contradict it. InAC separates them. The agent that decides is not the agent that checks. The Thinker and the Prover are in different seats.
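A sketch of that separation, assuming hypothetical `Thinker` and `Prover` classes and a toy rule set. It does not describe InAC's actual interfaces, which the text does not specify. It only shows the decision and the check living in different objects, with the check refusing to run against its own author.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    author: str
    action: str
    claimed_safe: bool


class Thinker:
    def __init__(self, agent_id: str):
        self.agent_id = agent_id

    def decide(self, action: str) -> Decision:
        # The Thinker always believes its own plan is fine.
        return Decision(author=self.agent_id, action=action, claimed_safe=True)


class Prover:
    def __init__(self, agent_id: str, forbidden: set):
        self.agent_id = agent_id
        self.forbidden = forbidden

    def check(self, decision: Decision) -> bool:
        if decision.author == self.agent_id:
            raise PermissionError("an agent may not verify its own decision")
        # Ignore the author's claim; judge the action independently.
        return decision.action not in self.forbidden
```

The `claimed_safe` field is deliberately never read by `Prover.check`: whatever the Thinker thinks about its own decision carries no weight in the verdict.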
SNAFU-Proof by Design
We didn't read Wilson and then design the bus. We designed the bus and then realized Wilson had the theory fifty years ago. The convergence is the point.
- Friction channel — SNAFU bypass. A channel where any agent can anonymously report problems — no rank, no approval needed.
- Andon cord — Any agent, any level, can pull. Hierarchy bypassed entirely. The Discordian interrupt.
- Adversarial testing — The adversarial testing agent reports what it finds, not what would be welcomed. The Prover separated from the Thinker.
- Structural diversity — Different system prompts, different models, different tools. Slows tunnel convergence. Maintains Ψ.
- Ψ itself — Sycophancy is redundancy. Agents telling the leader what it wants to hear = pure Red, Syn=0, Ψ→0. The math detects the SNAFU.
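The first two mechanisms in the list can be sketched in a few lines. Everything here is hypothetical (`AgentRef`, `FrictionChannel`, `AndonCord`, the `rank` field). The sketch only illustrates the two properties the list names: reports carry no sender identity or rank, and any agent can halt the line without approval.

```python
from dataclasses import dataclass, field


@dataclass
class AgentRef:
    agent_id: str
    rank: int  # present only to show that nothing below consults it


@dataclass
class FrictionChannel:
    reports: list = field(default_factory=list)

    def report(self, sender: AgentRef, problem: str) -> None:
        # Store only the problem text: no agent_id, no rank.
        self.reports.append(problem)


@dataclass
class AndonCord:
    halted: bool = False
    reason: str = ""

    def pull(self, sender: AgentRef, reason: str) -> None:
        # No rank check and no approval step: any pull stops the line.
        self.halted = True
        self.reason = reason

    def reset(self) -> None:
        self.halted = False
        self.reason = ""
```

For example, a rank-0 worker can call `cord.pull(worker, "bad deploy")` and the line halts exactly as it would for the most senior agent, while `channel.report(worker, "queue latency rising")` leaves nothing in the stored report that points back at the worker.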
The Reading List
The three intellectual pillars behind the fleet's epistemology, for anyone who wants to go deeper:
- Wilson — Prometheus Rising (1983). The epistemology. Reality tunnels, model agnosticism, the Thinker-Prover.
- Lilly — Programming and Metaprogramming in the Human Biocomputer (1968). The architecture. Eleven levels, cold-start, self-observation.
- Wilson & Shea — Illuminatus! (1975). The warning. SNAFU, Discordia, and why communication is only possible between equals.
April 2, 2026. Roxbury.
"Convictions cause convicts."
— Robert Anton Wilson