How AI’s hottest trend turned into a costly hangover

A version of this article originally appeared in Quartz’s AI & Tech newsletter.

Every few months, artificial intelligence invents a new way to make people feel behind.

First came prompt engineering, the art of talking to chatbots just right, which briefly looked like it might become an actual profession. Then came AI slop, the tsunami of generated images and text that flooded the internet until using a single em dash was enough to get you accused of being AI. Then came vibe coding, when developers and non-developers alike started describing what they wanted in plain English and letting AI handle the rest, whether or not the result was secure or would hold up under pressure.

The latest was tokenmaxxing. And like the trends that it follows, it may already be over.

Token legends

For a few months earlier this year, the prevailing philosophy at some of the biggest companies in the world was simple: use as many AI tokens as possible, as fast as possible, and figure out the returns later. Tokens are the basic unit of AI computing, each one representing a fragment of a word, and the more you burned, the more AI-forward you looked (at least to some).

At Meta, an employee built an internal leaderboard ranking colleagues by token consumption, doling out titles like “Token Legend” before it was taken down. Amazon had its own informal version. A Sequoia partner told the Wall Street Journal that her firm had one too, and that every founder she advises should adopt the tokenmaxxing mindset.

Not everyone was convinced. One unnamed financial institution reportedly had employees burning hundreds of thousands of dollars a month, some of them using expensive premium models to answer simple questions or just make small talk.

The CEO of coding automator Factory compared it to hiring Albert Einstein to tutor your kid in algebra. He didn’t seem to mean it as a good thing.

On the developer side, the shift was less about ideology and more about habit. Agentic coding tools had quietly become infrastructure. Developers built entire workflows around them, making it feel like having a whole engineering team on call. They could leave the coding tools running all night, trying things they didn’t have the time for before.

Then GitHub Copilot switched to usage-based billing on June 1, and the math became visible. Users flooded Reddit within hours. One person burned through 50% of their monthly credits on a single prompt. Another shared a screenshot showing their bill jumping from around $50 to $3,000. Many claimed they canceled their subscriptions then and there.

The hangover, with one exception

More cracks had been forming for weeks. Uber burned through its entire annual budget for agentic AI in three months and capped its use at $1,500 a month per developer. Microsoft pulled back employee access to Claude Code, Anthropic’s popular coding tool. Meta’s CTO sent a memo telling staff that token usage alone is not a measure of impact.

Palantir’s CEO compared the whole culture to pornography addiction, arguing that employees burning tokens all day weren’t producing anything of value, just consuming. Some data suggests he might be on to something: A study tracking more than 100,000 GitHub developers found that agentic coding tools increased the volume of code written by 741%, but actual software releases rose by only 20%.

The problem with leaderboards and unlimited budgets, it turns out, is that they measure activity rather than output. An engineer can climb to the top of a token leaderboard by running agents in circles, generating documentation no one reads, or asking a frontier model what to have for lunch. The metric was always gameable. It just took a few months and some very large bills for that to become obvious.

Those months, though, were very good for Anthropic, with Claude Code being widely considered the best coding tool. The company reported $4.8 billion in first-quarter revenue and projected $10.9 billion for the second quarter, a roughly 130% increase that is expected to produce its first operating profit. Shortly after the positive financial news came out, Anthropic confidentially filed for an IPO on June 1, beating OpenAI to the paperwork and joining what could be a once-in-a-generation moment on Wall Street alongside SpaceX and OpenAI, each potentially valued above a trillion dollars.

It is worth noting that Anthropic itself has shared financial information with investors that suggests the profitability may not last the full year, as planned spending on computing infrastructure is expected to eat into margins. And if tokenmaxxing is truly over, Anthropic may feel that more than anyone. Meta’s internal leaderboard was called “Claudeonomics” for a reason. The tokenmaxxing era, brief and bizarre as it was, may have handed the company a quarter of champagne just as the party was ending everywhere else.
