The AI compute stack has multiple layers, each growing at different rates. The slowest-growing layer constrains total system throughput — this is the bottleneck. Use the calculator below to model what happens when these rates shift.
AI compute capacity is a chain of dependencies: you need chips to process data, memory bandwidth to feed those chips, energy to power the data centers, and high-speed interconnects to link everything together. The total usable compute grows only as fast as its weakest link.
Meanwhile, demand for compute — driven by larger models, more users, and new applications — is growing at a staggering pace. The gap between supply and demand tells you where economic pressure (and profit) concentrates.
Bottlenecks determine pricing power. Whichever layer is most constrained can charge the most. This is why NVIDIA has had extraordinary margins — chips were the bottleneck. But as chip supply scales, the bottleneck may shift to energy, interconnects, or memory — and with it, the locus of value creation.
Each slider in the calculator represents an annual growth rate for a key layer of the AI compute stack. The defaults come from Epoch AI's empirical research:
| Factor | Default | What It Measures | Source |
|---|---|---|---|
| Chip Production | 45%/yr | Total FLOPS manufactured annually across all AI accelerators (GPUs, TPUs, custom ASICs) | Epoch AI chip sales data |
| Memory Bandwidth | 28%/yr | Rate at which GPU memory bandwidth improves generation over generation (HBM2 → HBM3 → HBM3E) | Epoch AI hardware database |
| Energy Capacity | 15%/yr | Growth in available power for AI data centers, including new builds and grid capacity | IEA, Epoch AI estimates |
| Network / Interconnect | 35%/yr | Improvement in inter-GPU and inter-node bandwidth (NVLink, InfiniBand, optical interconnects) | Epoch AI hardware database |
| Compute Demand | 310%/yr | Growth in total FLOPS consumed for AI training and inference, driven by scaling laws and adoption | Epoch AI compute trends |
The calculator applies a simple but powerful principle: effective supply growth = min(supply factor growth rates). This is Liebig's Law of the Minimum applied to compute infrastructure: output is limited by the scarcest input, not by the sum of inputs.
In practice, the real system is more nuanced — factors aren't fully independent (more chips means more energy demand), and different workloads stress different bottlenecks. But this simplified model captures the first-order intuition that matters for investment and strategy: where is the constraint, and which direction is it moving?
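The core logic can be sketched in a few lines. This is an illustrative reimplementation, not the calculator's actual code; the variable names are mine, and the rates are the table's defaults.

```python
# Default annual growth rates from the table above (illustrative sketch,
# not the calculator's actual implementation).
supply_factors = {
    "chip_production": 0.45,
    "memory_bandwidth": 0.28,
    "energy_capacity": 0.15,
    "network_interconnect": 0.35,
}
demand_growth = 3.10  # 310%/yr

# Liebig's Law: effective supply grows only as fast as the slowest layer.
bottleneck, supply_growth = min(supply_factors.items(), key=lambda kv: kv[1])

# The supply-demand gap compounds at the ratio of the two growth multipliers.
gap_growth = (1 + demand_growth) / (1 + supply_growth) - 1

print(f"Bottleneck: {bottleneck} at {supply_growth:.0%}/yr")
print(f"Gap widens by {gap_growth:.0%}/yr")
```

At the defaults, energy capacity (15%/yr) is the binding constraint, and demand pulls away from effective supply at well over 200% per year.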
Scenario 1: NVIDIA solves chip supply. Crank chip production to 80%+. Watch the bottleneck shift to energy. This is roughly the 2025–2026 trajectory — TSMC expanding capacity, but data center power struggling to keep pace.
Scenario 2: Demand slows. Pull demand down to 100%/yr. At default supply rates the gap still widens, but far more slowly — and if supply factors keep scaling, it can close. This is the "AI winter" scenario where investment returns get harder.
Scenario 3: Energy breakthrough. Push energy to 50%/yr (nuclear renaissance, modular reactors). The bottleneck moves to memory bandwidth — this is what some researchers argue is the next wall for training frontier models.
Scenario 4: Everything improves equally. Set all supply factors to ~60%. The system has no single chokepoint, competition intensifies at every layer, and margins compress across the stack.
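The supply-side scenarios above can be replayed as perturbations of the default rates. A hedged sketch (function and scenario names are mine, not the calculator's; Scenario 2 is demand-side, so it leaves the supply bottleneck unchanged):

```python
DEFAULTS = {"chips": 0.45, "memory": 0.28, "energy": 0.15, "network": 0.35}

def bottleneck(factors):
    """Return (name, rate) of the slowest-growing supply layer."""
    return min(factors.items(), key=lambda kv: kv[1])

scenarios = {
    "1. chip supply solved":  {**DEFAULTS, "chips": 0.80},
    "3. energy breakthrough": {**DEFAULTS, "energy": 0.50},
    "4. uniform improvement": {k: 0.60 for k in DEFAULTS},  # all tied: no single chokepoint
}

for name, factors in scenarios.items():
    layer, rate = bottleneck(factors)
    print(f"{name}: bottleneck = {layer} ({rate:.0%}/yr)")
```

Running it confirms the narrative: raising chip production exposes energy as the constraint, raising energy exposes memory bandwidth, and uniform improvement leaves every layer equally constrained.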
This is a pedagogical toy, not a forecasting model. Key simplifications:
• Treats each factor as independent (in reality, they're coupled — more chips requires more energy)
• Uses a single "bottleneck" when real systems can be partially constrained by multiple factors
• Doesn't model geographic variation (energy constraints vary hugely by region)
• Doesn't distinguish training vs. inference compute, which face different bottlenecks
• Growth rates are annualized averages; real progress is lumpy
For rigorous analysis, see the full Epoch AI datasets linked below.
All default values are derived from publicly available research by Epoch AI, licensed under CC BY 4.0.