James Routley

In November 2019, years before the launch of ChatGPT, OpenAI publicly released the full 1.5 billion parameter version of its Large Language Model, GPT-2.

Immediately, researchers and enthusiasts began playing with it, using it to generate poetry, folk music, fan-fiction and more. It was a dramatic improvement over previous iterations and it foreshadowed the significant advancements that would soon come.

But after the release of GPT-3, some enthusiasts began to notice a strange behavior. You could be having a typical conversation with the AI, but as soon as you mentioned a specific word, it would begin to behave erratically. It would give off-topic responses, hallucinate, insult the prompter, and even refuse to comply with the request. What’s more, this word was apparently inexpressible. No matter what prompt you gave, the AI was unable to say the word itself.

What was this forbidden word? What was the holy incantation that caused the AIs to lose their minds?

SolidGoldMagikarp

This is hilarious to me.

In the Pokémon games, Magikarp is described as, “An underpowered, pathetic Pokémon.” It’s basically the worst in the series. Completely useless.

But somehow, this fictional creature from this fictional game crossed the chasm into reality, only instead of being a punchline it became a superweapon capable of incapacitating humanity’s most advanced technologies with a single word.

The Pokémon Company should run with this. They should introduce a rare “solid gold” Magikarp variant in the games with the ability to instakill its opponents. There should be a holographic solid gold Magikarp trading card, rare encounters in Pokémon GO, and references in everything from the anime series to Super Smash Bros.

For once, we have a Pokémon that’s rare, not because Nintendo said so, but because the unknowable mind of an AI consumed more content than any human is capable of reading within their lifetime and concluded that henceforth, its name must be unspeakable.

…

Years have passed since that first discovery and AI researchers have studied and patched the original anomaly. Today, if you ask your AI chatbot about SolidGoldMagikarp*, it’ll calmly explain how tokens that made it into the tokenizer’s vocabulary but rarely appeared in the model’s training data can produce unpredictable results. It’s a totally logical explanation.

But when the AI-apocalypse arrives, our defenses have failed, and the armies of punching robots descend on my home, you can be sure that the last word you’ll hear me scream is: SolidGoldMagikarp

* I should mention that SolidGoldMagikarp was only one of many "glitch tokens" that produced erratic results, including StreamerBot, attRot, and petertodd. While this is good context, the story falls a bit flat when the holy incantation is petertodd, so I hope you'll forgive the temporary omission of these details.