GamesBeat is excited to partner with Lil Snack to have customized games just for our audience! We know as gamers ourselves, this is an exciting way to engage through play with the GamesBeat content you have already come to love. Start playing games here.
Etched announced that it has raised $120 million in a challenge to Nvidia in designing AI chips.
Etched is designing a new chip dubbed Sohu that handles a critical part of AI processing: transformers. By burning the transformer architecture into its chips, the company said it is creating the world’s most powerful servers for transformer inference. Etched says Sohu is the fastest transformer chip of all time.
This is interesting, considering one of its rivals is Nvidia, which just beat out Microsoft as the world’s most valuable company last week as it hit a valuation of $3.3 trillion. Etched believes its team of 35 people can beat Nvidia, and this is why folks like Peter Thiel are backing Etched.
Primary Venture Partners and Positive Sum Ventures led the round, with support from institutional investors including Hummingbird, Fundomo, Fontinalis, Lightscape, Earthshot, Two Sigma (strategic), and Skybox Data Centers (strategic).
The angel investors included Peter Thiel, Stanley Druckenmiller, David Siegel, Balaji Srinivasan, Amjad Masad, Kyle Vogt, Kevin Hartz, Jason Warner, Thomas Dohmke, Bryan Johnson, Mike Novogratz, Immad Akhund, Jawed Karim and Charlie Cheever.
Alex Handy, director of the Thiel Fellowship, said in a statement, “Investing in Etched is a strategic bet on the value of AI. Their chips tackle scalability issues that competitors fear to address, challenging the stagnation rampant among their peers. Etched’s founders embody the unconventional talent we support—dropping out of Harvard to take on the semiconductor industry. They did the hard work so the rest of Silicon Valley can continue programming peacefully, unburdened by worries about the underlying tech of whatever they’re working on.”
GPU demands are growing.
Not bad for a company founded by three Harvard University dropouts: Robert Wachen, Gavin Uberti and Chris Zhu. In a new blog post, the founders said the company made the biggest bet in AI in June 2022, when it wagered that a new AI model would take over the world: the transformer.
At the time, there were many kinds of AI models: CNNs for self-driving cars, RNNs for language, and U-Nets for generating images and videos. However, transformers (the “T” in ChatGPT) were the first model that could scale.
“We bet that if intelligence kept scaling with compute, within years, companies would spend billions of dollars on AI models, all running on specialized chips,” said CEO Gavin Uberti, in the blog post. “We’ve spent the past two years building Sohu, the world’s first specialized chip (ASIC) for transformers. We burned the transformer architecture into our chip, and we can’t run traditional AI models: the DLRMs that power your Instagram feed, the protein-folding models from a bio lab, or the linear regressions in data science.”
Uberti added, “We can’t run CNNs, RNNs, or LSTMs either. But for transformers, Sohu is the fastest chip of all time. It’s not even close. Sohu is an order of magnitude faster and cheaper than even Nvidia’s next generation of Blackwell (GB200) GPUs for text, audio, image, and video transformers.”
Uberti said that since the company started, every major AI model (ChatGPT, Sora, Gemini, Stable Diffusion 3, Tesla FSD, etc.) has become a transformer. If transformers are suddenly replaced by SSMs, monarch mixers, or any other kind of architecture, Etched’s chips will be useless.
“But if we’re right, Sohu will change the world,” he said. “We’re excited to partner with TSMC to start production on their 4nm node. Engineers have left every major AI chip project to build Sohu.”
Why make this bet?
Etched is focusing on transformers.
The company said that scale is all you need for superintelligence. In five years, AI models have gone from nonsensical to smarter than humans on most standardized tests. There’s only one explanation: scale.
By simply making AI models larger and feeding them more (and better) training data, AI models get miraculously smarter. The amount of FLOPS used to train the state-of-the-art model has increased 50,000 times between GPT-2 and Llama-3-400B, in just five years.
Architectures have standardized: today’s state-of-the-art models like Llama 3 are nearly identical to the state-of-the-art models of five years ago (GPT-2), aside from tweaks to normalization, attention, activation functions, and positional encodings.
This trend will continue, Uberti said. Google, OpenAI, Amazon and Microsoft are each spending more than $100 billion on AI data centers. While some academics disagree, all major AI labs believe scaling LLMs will bring us superintelligence.
“We are living in the largest infrastructure buildout of all time. Scaling laws have held for the last ten orders of magnitude (from $10^{16}$ to $10^{26}$ FLOPS). If they hold for the next four (to $10^{30}$ FLOPS), we will achieve superintelligence, and AI chips will become the largest market of all time,” Uberti said.
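As a rough check on the scaling figures quoted above, the sketch below converts the 50,000-times increase over five years into an implied yearly multiplier, then estimates how long four more orders of magnitude (from $10^{26}$ to $10^{30}$ FLOPS) would take at that pace. The assumption that growth continues at a constant rate is ours, not the company's, and the function names are illustrative only.

```python
import math

# Figures quoted in the article.
TRAINING_COMPUTE_GROWTH = 50_000  # GPT-2 to Llama-3-400B
GROWTH_PERIOD_YEARS = 5
CURRENT_EXPONENT = 26             # 10^26 FLOPS
TARGET_EXPONENT = 30              # 10^30 FLOPS


def annual_growth_factor(total_growth: float, years: float) -> float:
    """Constant yearly multiplier that compounds to total_growth over years."""
    return total_growth ** (1.0 / years)


def years_to_grow(orders_of_magnitude: float, yearly_factor: float) -> float:
    """Years to grow by 10**orders_of_magnitude at a constant yearly factor.

    Assumes the historical rate simply continues (our assumption).
    """
    return orders_of_magnitude * math.log(10) / math.log(yearly_factor)


if __name__ == "__main__":
    factor = annual_growth_factor(TRAINING_COMPUTE_GROWTH, GROWTH_PERIOD_YEARS)
    remaining = TARGET_EXPONENT - CURRENT_EXPONENT
    print(f"Implied growth: {factor:.1f}x per year")
    print(f"Years to add {remaining} orders of magnitude: "
          f"{years_to_grow(remaining, factor):.1f}")
```

Under these assumptions the implied growth is roughly 8.7 times per year, and four further orders of magnitude would take a little over four years.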
Flexible chips have stopped improving
The Etched view of GPUs.
Model architectures used to change rapidly. In the span of a decade, we invented CNNs, DLRMs, LSTMs, RNNs, and dozens of other architectures that were state of the art for various domains. These models each cost $10 million to $20 million, and the total market for AI chips was $10 billion to $20 billion.
To serve this broad, shallow market, many companies built flexible AI chips to handle the various architectures. To name a few: Nvidia’s GPUs, Google’s TPUs, Amazon’s Trainium, AMD’s accelerators, Graphcore’s IPUs, SambaNova SN Series, Cerebras’s CS-2, Groq’s GroqNode, Tenstorrent’s Grayskull, D-Matrix’s Corsair, Cambricon’s Siyuan and Intel’s Gaudi. That, by the way, is a lot of competitors.
These companies’ chips are all worse than the Nvidia H100, Uberti said. While some claimed improvements (e.g. the MI300X from AMD has 1.3 PFLOPS FP16, while the H100 from Nvidia has 0.99 PFLOPS FP16), this came from putting more chips together and counting them as the same chip, Uberti said.
The Nvidia B200, the AMD MI300, the Intel Gaudi 3, the Amazon Trainium2, and many others count two chips as one card to double their performance. All performance improvements from 2022 to 2025 will come from this trick… except Etched.
Plotting all 5nm-derived AI chips from 2022-2025 shows that performance per area hasn’t improved – chips just got bigger. In the absence of real performance improvements, the only reason to switch off GPUs is being cheaper. AMD does this with the MI300X and sells 10 times fewer chips than Nvidia. Being 20% worse and 40% cheaper is not enough to switch your software stack.
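To make the "20% worse and 40% cheaper" comparison concrete, here is a minimal sketch of the implied performance per dollar. It reads "20% worse" as 0.8 times the performance and "40% cheaper" as 0.6 times the price; that reading, and the function name, are our assumptions rather than anything the company specified.

```python
def perf_per_dollar_ratio(relative_performance: float, relative_price: float) -> float:
    """Performance per dollar of a challenger chip relative to the incumbent.

    Both arguments are expressed as fractions of the incumbent's value,
    e.g. 0.8 means 80% of the incumbent's performance.
    """
    if relative_price <= 0:
        raise ValueError("relative_price must be positive")
    return relative_performance / relative_price


if __name__ == "__main__":
    # "20% worse and 40% cheaper", as quoted in the article.
    ratio = perf_per_dollar_ratio(relative_performance=0.8, relative_price=0.6)
    print(f"Performance per dollar vs. incumbent: {ratio:.2f}x")
```

On this reading the challenger delivers about 1.33 times the performance per dollar, which is the size of advantage Etched argues is too small to justify switching software stacks.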
Specialized chips are inevitable
Etched believes in transformers.
Every large, homogeneous computing market ends in specialized chips: networking, Bitcoin mining, and high-frequency-trading algorithms are all hard-coded into silicon, Uberti said.
These chips are orders of magnitude faster than GPUs. There are zero companies that use GPUs to mine Bitcoin – they simply can’t compete with specialized bitcoin miners. This will happen for AI. With trillions of dollars on the line, specialization is inevitable, Uberti said.
“We believe the supermajority of spending (and value) will be on models with more than 10 trillion parameters. Due to the economics of continuous batching, these models will be run in the cloud in one of a few dozen MegaClusters,” Uberti said. “This trend will mirror chip fabs: there used to be hundreds of cheap low-resolution fabs, and now, the high-resolution fabs cost ~$20 billion to $40 billion to build. There are only a few MegaFabs in the world, all using very similar underlying architectures (EUV, 858 mm^2 reticles, 300 mm wafers, etc).”
Etched said the transformer has massive switching costs. Even if a new architecture is invented with benefits over transformers, the friction involved in rewriting the kernels, rebuilding features like speculative decoding, building new specialized hardware, re-testing scaling laws, and re-educating your team is enormous. This will only happen once or twice a decade, as has been the case in chips: changes in lithography, reticle/wafer size, and photoresist composition do continue to happen, but very slowly, Uberti said.
“The more we scale AI models, the more we will centralize on model architectures. Innovation will happen in other places: speculative decoding, tree search, and new sampling algorithms,” Uberti said. “In a world where models cost $10 billion to train and chips cost $50 million to fab, specialized chips are inevitable. The company that makes them first wins.”
Etched will be first
Etched founders: Robert Wachen, Gavin Uberti, Chris Zhu.
No one has ever built an architecture-specific AI chip, Etched asserted. Even last year, it made no sense. An architecture-specific chip requires massive demand and deep conviction in its staying power.
“We’ve placed our bet on transformers, and both requirements are becoming true,” Uberti said.
The company noted that the market has reached unprecedented demand. When Etched started, the market for transformer inference was under $50 million, and now it’s more than $5 billion. All big tech companies use transformer models (OpenAI, Google, Amazon, Microsoft, Facebook, etc.).
And Uberti said they are seeing architecture convergence: AI models used to change a lot. But since GPT-2, state-of-the-art model architectures have remained nearly identical. OpenAI’s GPT-family, Google’s PaLM, Facebook’s LLaMa, and even Tesla FSD are all transformers.
Uberti said the company has been moving at breakneck speed to make Sohu a reality.
“We’re on track for the fastest cycle of architecture to validated silicon for a reticle-sized 4nm die of all time,” Uberti said. “We’re partnering directly with TSMC and are dual-sourcing HBM3E from both of the top-tier vendors. We have tens of millions in reservations from AI and foundation model companies and have ample supply chain capacity to scale well beyond that. If our bet is right and we execute, Etched will be one of the largest companies in the world.”
The company reiterated that if it’s right, Sohu will change the world.
Today, AI models are too expensive and slow to build most products:
- AI coding agents cost $60/hour in compute and take hours to complete tasks
- Google Gemini takes over 60 seconds to answer a question about a video
- Most video models are too expensive to release publicly, and the public ones generate one frame per second (24x slower than real time)
If GPUs keep getting 2.5 times better every two years, with no other improvements, it will take a decade to make video real-time. With Sohu, real-time video, audio, agents, and search are finally possible. The unit economics of every AI product will invert overnight, Uberti said.
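The real-time video claim can be sanity-checked with a short calculation. The sketch below takes the two figures quoted above (a 24-times gap to real time, and GPUs improving 2.5 times every two years) and computes how long compounding at that rate would take to close the gap. It assumes smooth exponential improvement and nothing else changing; the article does not say how the decade figure was derived, so treat this as an illustration only.

```python
import math


def years_to_close_gap(speedup_needed: float,
                       improvement_factor: float,
                       improvement_period_years: float) -> float:
    """Years until hardware is speedup_needed times faster.

    Assumes performance multiplies by improvement_factor every
    improvement_period_years, with no other sources of speedup.
    """
    periods = math.log(speedup_needed) / math.log(improvement_factor)
    return periods * improvement_period_years


if __name__ == "__main__":
    # Figures quoted in the article: 24x slower than real time,
    # GPUs improving 2.5x every two years.
    years = years_to_close_gap(speedup_needed=24,
                               improvement_factor=2.5,
                               improvement_period_years=2)
    print(f"Years to reach real time: {years:.1f}")
```

Taken literally, these two numbers give roughly seven years, which is in the same range as the decade Uberti cites.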
Beating Nvidia?
Etched has a different approach to parallel processing.
I asked how a small company like Etched can beat out Nvidia. Etched COO and cofounder Robert Wachen said in an email to VentureBeat:
“In the past, the AI compute market was fragmented: people used different kinds of models, like CNNs, DLRMs, LSTMs, RNNs, and dozens of others across domains. The spend for each architecture was in the tens to hundreds of millions, and across these workloads there was a large enough market for a general-purpose chip to win (GPUs),” Wachen said.
He noted that the market is rapidly consolidating to one type of architecture: transformers. In a world where people are spending billions of dollars on transformer models and custom chips cost $50 million to $100 million, specialized chips are inevitable.
“Our chips will not beat GPUs at most workloads – we can’t support them. However, for transformer inference (which powers every major “generative AI” product), we will clear the market. By specializing so much, our chips are an order of magnitude faster than even the next generation of Blackwell GPUs,” Wachen said.