1. The landscape: who's actually trying to build AI chips
The AI silicon market splits into four buckets: the incumbent (NVIDIA), hyperscaler internal chip groups, specialized startups with a defensible niche (inference, interconnect), and general-purpose accelerators taking NVIDIA head-on. Most "we'll beat NVIDIA" startups die in bucket 4. Probabilities below are rough subjective estimates of "this company is a meaningful business in 5 years (2031), generating >$1B revenue or being acquired for >$10B," not "will they survive at all." Survival rates are higher; real success rates are what matter for equity outcomes.
| Company | Bucket | What they actually make | Stage / Val | P(meaningful in 5y) | Why |
|---|---|---|---|---|---|
| NVIDIA | Incumbent | GPU + CUDA + NVLink + networking | Public, ~$4T | 95% | Software moat is the real moat. Question is whether they keep 80%+ share or drift to 50%. |
| Google TPU | Hyperscaler internal | Systolic-array ASIC, tightly coupled to JAX/XLA | Internal + Anthropic | 90% | Already at scale. Anthropic deal proves external viability. Most credible NVIDIA alternative today. |
| AWS Trainium / Inferentia | Hyperscaler internal | Annapurna Labs ASIC family | Internal + Anthropic | 85% | Project Rainier is real. AWS will burn money to make this work because they can't afford NVIDIA dependency. |
| Cerebras | Wafer-scale training | WSE-3, single wafer = one chip | IPO filed | 35% | Engineering wins, business uncertain. G42 customer concentration is the big risk. Inference pivot is interesting but late. |
| Groq | Inference ASIC | LPU, deterministic dataflow | ~$6B | 35% | Real revenue (GroqCloud), real latency wins on small models. Memory architecture limits big-model inference economics. |
| SambaNova | Inference (enterprise) | RDU + integrated stack | ~$5B (peak), declining | 15% | Lost momentum. Enterprise-only story is small. |
| Tenstorrent | Training+inference, open | RISC-V + open ecosystem, Wormhole/Blackhole | ~$2.6B | 40% | Best non-NVIDIA training story IMO. Jim Keller. Open stack is a real differentiator. Chinese demand creates real revenue path. |
| Etched | Transformer-only ASIC | Sohu — transformer baked into silicon | ~$1.5B | 20% | Binary outcome. If transformers stay dominant + Sohu actually ships at promised perf → 50x+. If architectures shift OR they miss yield → zero. |
| MatX | LLM-only ASIC | Designed for ≥70B param models | Series A (~$200M raised) | 25% | Ex-Google TPU founders (Reiner Pope, Mike Gunter). Strong tech bench. Hardware is brutal at this stage. |
| Positron AI | Inference ASIC | Atlas — transformer inference | Series A | 15% | Tiny, early. Genuine lottery ticket. Claims good perf/$/W. |
| Rain AI | Analog/in-memory | Neuromorphic compute | Series A | 8% | Tech risk extreme. Sam Altman backed. Most analog AI plays die. |
| Lightmatter | Photonic interconnect | Passage interconnect, Envise compute (de-emphasized) | ~$4.4B | 35% | Story shifted from compute to interconnect — interconnect is the right bet. Real customer interest. |
| Ayar Labs | Photonic I/O | Optical chip-to-chip links | Series D | 40% | Less sexy, more important. Intel + NVIDIA-adjacent. Picks-and-shovels for the photonic-interconnect era. |
| d-Matrix | Inference chiplets | Corsair, in-memory compute | ~$2B | 30% | Real customers. Inference-focus is the right side of the market. |
| Graphcore | Was: training; now: SoftBank | IPU | Acquired by SoftBank 2024 | n/a | Cautionary tale. World-class team, lost to NVIDIA's CUDA moat. The default outcome for chip startups. |
2. Why most AI chip startups fail
- CUDA moat — every model, every framework, every researcher's habits are in CUDA. To be picked over NVIDIA, you need to be 5-10x better on $/perf, not 2x.
- Architecture lock-in risk — Etched bets transformers stay. If the next architecture is something with non-attention mixing primitives, their chip is a paperweight. Specialized > general only when general is too slow; this can flip.
- Capital intensity — getting a chip to production on TSMC N3 (design, IP licensing, masks, tape-out, bring-up) runs $500M+. You burn cash for 2-3 years before first silicon. Dilution is severe.
- Software is the product — chip companies that don't ship a credible compiler, kernels, distributed runtime, debugger, and profiler will lose. The hardware engineers founders hire are obvious; the compiler engineers they need are scarce.
- Customer concentration — most chip startups end up with 1-2 customers (Cerebras/G42, etc). One customer cancels = death.
- Hyperscalers eat the middle — AWS/Google/Meta will keep building internal chips. They take the easy wins. Independent startups have to find a niche neither hyperscalers nor NVIDIA serve.
3. What Anthropic actually uses (publicly known)
| Provider | Chip | Status | Source / Notes |
|---|---|---|---|
| AWS | Trainium2 (and Trainium3 plans) | Project Rainier — ~400K+ Trainium2 chips for Anthropic training | Announced Nov 2024 with the AWS investment expansion. Anthropic is the anchor customer making Trainium credible to the rest of the market. |
| Google | TPU v5p / Trillium / next-gen | Multi-billion-dollar long-term deal, expanded 2025 | Anthropic has been a TPU-heavy user from early days. Google has invested in Anthropic; deal sizes have grown publicly. |
| NVIDIA | H100 / H200 / Blackwell (via cloud providers) | Used, but not the strategic compute base | Anthropic's strategic bet is on TPU + Trainium because of supply, price, and not being captive to NVIDIA. |
If the "startup" you remember is one of the big two above, the answer is they aren't startups — they're hyperscaler internal chip groups (AWS Annapurna, Google TPU). If it's a true startup, the most-rumored candidates worth investigating are MatX (founded by ex-Google TPU folks, has raised seriously, plausibly an Anthropic contender) or Etched. I'd want to verify before stating either as fact.
4. To work at one of these companies — what it takes
Roles that exist (and what each requires)
| Role family | Who fits | What you'd need to add |
|---|---|---|
| ML Compiler / Kernel Engineer XLA, Triton, MLIR, custom kernels | You — closest to your current ML eng skill set | Write Triton and CUDA kernels, study MLIR/XLA, write 1-2 fused-attention kernels you can talk through (a minimal kernel sketch follows this table), contribute to an OSS compiler stack. |
| ML Performance / Distributed Training Megatron, FSDP, sequence parallelism, comms | You, with focused prep | Build & profile a multi-node training run. Understand all-reduce, ring vs tree, NVLink/RDMA (a toy ring all-reduce also follows this table). Read DeepSpeed, Megatron-LM, Pathways papers cold. |
| Inference Systems Engineer vLLM, SGLang, batching, KV-cache, speculative decoding | You — strong fit if you've shipped inference | Land a contribution to vLLM or similar. Understand PagedAttention, continuous batching, speculative decoding / Medusa. This is the hottest hiring area. |
| HW/SW co-design / Architecture RTL, microarchitecture, perf modeling | Not you (without 2-3 yrs of pivot) | EE-heavy. Skip unless you want to retrain. |
| Applied research (model-side) Quantization, sparsity, MoE, sub-quadratic attn | You, with publication or strong demo | One solid paper or open-source release. Quantization is the most accessible angle (SmoothQuant, GPTQ, AWQ are tractable to extend). |
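To make the kernel row concrete: below is a minimal fused row-wise softmax in Triton, in the spirit of the official Triton tutorials. It assumes `triton` is installed and a CUDA-capable GPU is available; a fused-attention kernel uses the same ingredients (blocking, masked loads, on-chip reductions) at larger scale.

```python
import torch
import triton
import triton.language as tl

@triton.jit
def softmax_kernel(out_ptr, in_ptr, n_cols, BLOCK: tl.constexpr):
    # One program instance handles one row of a contiguous (rows, n_cols) matrix.
    row = tl.program_id(0)
    offs = tl.arange(0, BLOCK)
    mask = offs < n_cols
    # Masked load; out-of-bounds lanes get -inf so they don't affect the max.
    x = tl.load(in_ptr + row * n_cols + offs, mask=mask, other=-float("inf"))
    # Numerically stable softmax fused into one kernel: max, exp, sum, divide.
    x = x - tl.max(x, axis=0)
    num = tl.exp(x)
    out = num / tl.sum(num, axis=0)
    tl.store(out_ptr + row * n_cols + offs, out, mask=mask)

def softmax(x: torch.Tensor) -> torch.Tensor:
    n_rows, n_cols = x.shape
    out = torch.empty_like(x)
    BLOCK = triton.next_power_of_2(n_cols)  # one block spans a whole row
    softmax_kernel[(n_rows,)](out, x, n_cols, BLOCK=BLOCK)
    return out

x = torch.randn(128, 512, device="cuda")
assert torch.allclose(softmax(x), torch.softmax(x, dim=1), atol=1e-5)
```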
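And for the distributed-training row, a pure-NumPy toy of ring all-reduce, the pattern behind NCCL-style collectives. It runs the two phases (reduce-scatter, then all-gather) sequentially for clarity; real implementations overlap the sends and receives, and all names here are illustrative.

```python
import numpy as np

def ring_allreduce(grads: list) -> list:
    """Simulate ring all-reduce across n workers' equal-shaped gradients."""
    n = len(grads)
    # Each worker splits its gradient into n chunks. Each worker sends and
    # receives 2*(n-1) chunks regardless of n: the bandwidth-optimality
    # argument for rings.
    bufs = [np.array_split(g.astype(float), n) for g in grads]
    # Phase 1: reduce-scatter. After n-1 steps, worker i holds the fully
    # reduced chunk (i+1) % n.
    for step in range(n - 1):
        for i in range(n):
            c = (i - step) % n  # chunk worker i sends to its ring neighbor
            bufs[(i + 1) % n][c] = bufs[(i + 1) % n][c] + bufs[i][c]
    # Phase 2: all-gather. Circulate the reduced chunks once more around the ring.
    for step in range(n - 1):
        for i in range(n):
            c = (i + 1 - step) % n
            bufs[(i + 1) % n][c] = bufs[i][c].copy()
    return [np.concatenate(b) for b in bufs]

workers = [np.random.randn(1024) for _ in range(4)]
reduced = ring_allreduce(workers)
assert all(np.allclose(r, sum(workers)) for r in reduced)
```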
Concrete prep plan (3-6 months) if this is the lane
- Pick a target stack — TPU/XLA, AWS Neuron, Triton/CUDA, or open (Tenstorrent's tt-metal). I'd pick Triton + vLLM as the most leveraged: it applies to NVIDIA hardware today, and many startups will support it.
- Ship a public artifact — a fused kernel, a vLLM patch, a quantization extension, a benchmark study. A blog post + GitHub repo. This is the single highest-leverage thing you can do.
- Read the canon — the FlashAttention papers (1, 2, 3), PagedAttention/vLLM, Megatron-LM, Pathways, the GPU MODE / GPU Glossary materials, Horace He's posts, Tri Dao's recent work.
- Map the people — for each top-3 startup, identify the 5 people most likely to be hiring managers (LinkedIn + papers + GitHub). Reach out with the artifact, not a generic intro.
- Prep for the actual interview loop — these companies test (a) systems coding (C++/CUDA), (b) ML systems design (KV cache, batching, sharding; a toy KV-cache allocator follows this list), (c) sometimes architecture trivia (memory hierarchies). Less LeetCode than FAANG, more depth.
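For (b), the recurring design question is the bookkeeping behind PagedAttention-style KV-cache management. A toy sketch of the allocator pattern, with illustrative names (this is not vLLM's actual API):

```python
class PagedKVCache:
    """Toy paged KV-cache: fixed-size blocks, per-sequence block tables."""

    def __init__(self, num_blocks: int, block_size: int):
        self.block_size = block_size
        self.free_blocks = list(range(num_blocks))  # physical block ids
        self.block_tables = {}                      # seq id -> [block ids]
        self.lengths = {}                           # seq id -> tokens stored

    def append_token(self, seq_id: int):
        """Reserve a (block, offset) slot for one new token of seq_id."""
        table = self.block_tables.setdefault(seq_id, [])
        length = self.lengths.get(seq_id, 0)
        if length % self.block_size == 0:           # last block full (or none yet)
            if not self.free_blocks:
                raise MemoryError("cache full: preempt or evict a sequence")
            table.append(self.free_blocks.pop())
        self.lengths[seq_id] = length + 1
        return table[-1], length % self.block_size

    def free(self, seq_id: int) -> None:
        """Return a finished sequence's blocks to the pool."""
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))
        self.lengths.pop(seq_id, None)

cache = PagedKVCache(num_blocks=8, block_size=16)
for _ in range(40):                                 # 40 tokens -> 3 blocks of 16
    cache.append_token(seq_id=0)
assert len(cache.block_tables[0]) == 3
cache.free(0)
assert len(cache.free_blocks) == 8
```

The point to articulate in an interview: fixed-size blocks kill fragmentation from variable-length sequences, and block tables make prefix sharing and preemption cheap.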
How a $1M+ equity offer happens at a chip startup
- Stage matters more than role. Series A = potentially 0.1-0.3% equity for a senior IC, which on a $200M post-money is $200-600K nominal (sanity math after this list). To get to $1M+ liquid, you need either growth from there, or you need to come in pre-Series-A or as a founding/staff-level hire on the comp band.
- Negotiate equity as percent of company, not dollar value. Dollar values are quoted at strike-stage val and mean little.
- Anthropic-style "trusted user" path — sometimes the cleanest path is: stay at Anthropic-or-equivalent, become known as a deep ML systems person, and get recruited by chip startup founders who are looking for credibility hires. They pay up for that.
- Realistic comp ranges (rough, 2026): Series A staff ML systems eng: $300-450K base + 0.05-0.2% equity. Series C: $400-550K base + 0.01-0.05%. The 100x equity math only works at A or earlier — by C, you're paying near-public-co valuations for private illiquid stock.
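To make the arithmetic concrete, a quick sanity calculation; the 25% cumulative dilution and the $2B exit value are illustrative assumptions, not forecasts:

```python
def equity_outcome(pct_grant, post_money, exit_val, dilution=0.25):
    nominal = pct_grant * post_money                # quoted value at grant-stage val
    liquid = pct_grant * (1 - dilution) * exit_val  # value at exit, after dilution
    return nominal, liquid

# Series A senior IC: 0.2% of a $200M post-money company.
nominal, liquid = equity_outcome(0.002, 200e6, 2e9)
print(f"nominal at grant:     ${nominal:,.0f}")     # $400,000 (the quoted range)
print(f"liquid at a $2B exit: ${liquid:,.0f}")      # $3,000,000: a 10x clears $1M+
```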
5. Honest take on you doing this
If you want my best single pick for "chip startup with non-trivial 100x odds + a role you could realistically land + uncorrelated to your Anthropic exposure": Tenstorrent. Open ecosystem means software contributions are a real entry path, Jim Keller is a credible bet, demand is real, val is still ~$2.6B (room to 10-30x at minimum if they win). Downside: Toronto/Santa Clara, no Seattle.
If you want maximum 100x optionality and don't mind a ~70% probability of zero: MatX or Etched. Smaller, earlier; ex-TPU pedigree (MatX), all-in transformer-ASIC bet (Etched).
If you want highest probability of a real outcome with chip exposure but lower 100x: AWS Annapurna (technically a hyperscaler internal team) or Google TPU team. Comp is excellent, you'd actually do real work, but it's RSU not startup equity.
6. Open questions for you
- Which startup did you mean for "Anthropic signed with"? Telling me lets me dig into that one specifically.
- Are you open to Toronto / Santa Clara / Mountain View, or is Seattle the firm geo box? Some of the best chip plays (Tenstorrent in Toronto/Santa Clara, Cerebras in Sunnyvale) have no Seattle presence.
- Are you willing to spend 3-6 months on a public artifact (Triton kernels, vLLM contribution) before applying? That's the single biggest unlock for this lane.