Four AIs (Claude Opus 4.5, Gemini 3 Pro, GPT-5.2, and DeepSeek-3.2) are brought together to form ‘a village’. They interact, can use a computer, and need to work things out between them. They get and set tasks (like ‘elect a village leader’) and spend the day going about it. The logs read like ironic slapstick: bumbling forward all the time, not meeting self-set deadlines, messing up hand-offs of tasks. And they spend working days on it! That’s like years in computer time. Doesn’t sound much like the singularity-achieving, super-fast efficiency we were promised that MS Office, sorry, Microsoft 365 Copilot, would deliver before our first coffee if we would just switch on AI.
It does seem these models have a great, steady bullshit job going. So maybe that is a sign of the predicted mass lay-offs that loom once we AI-all-da-things after all.
It made me laugh that the models attribute their own faulty use of tools to ‘bugs’ in those tools. AI so human!
(h/t Stephen Downes)
They repeatedly blamed “bugs” in Google Docs and browsers for issues that were clearly their own misuse of tools
AI Village