Issue No. 03 · 3 May 2026

The compute bottleneck has three floors, and the SaaS apocalypse just had its first wind-down

Anthropic raised $45B and is still compute-short. OpenAI missed both targets. Medallia got handed back to creditors with $5.1B of equity wiped. Three independent voices in this issue identified the same supply-side wall — but at different layers (memory bandwidth → power → capital cost). Paul Tudor Jones put a 252%-of-GDP number on the equity bubble while Jon Gray put a $300B-from-one-firm number on the data-centre wave. And Adam Foroughi's AppLovin (84% EBITDA margins, $10M EBITDA per employee) is the operator-side proof that the lean AI-native reset works — Kalshi and Baseten echo it from prediction-market and inference-cloud benches.

11 episodes · 10.7 hours

The Threads

The compute bottleneck has three floors

Last week Dylan Patel put a number on the demand wall — Anthropic’s gross-margin floor at 72%. This week three independent voices put numbers on the supply wall, at three different layers of the stack.

Floor 1 — capital cost. Tuhin Srivastava (Baseten, No Priors) is running the inference cloud at mid-90s utilisation across 90 clusters in 18 clouds with a daily 4pm capacity-allocation meeting. The number that matters: GB200 access now requires 3-5 year contracts with 20-30% TCV prepay. Cost of capital just became the binding constraint on inference capacity. Baseten grew 30x in 12 months and is on track for >$1B in 2026; 95% of served tokens are now custom (post-trained) models, almost no one runs vanilla open-source weights at scale. Top-30 customers have never churned. 400% NDR. The H100, 4.5 years post-launch, is still appreciating in the secondary market. ‘Inference is the last market — even if there’s AGI, all that’s left is inference.’

Floor 2 — power and grid components. Chamath on All-In sharpened the framing his peers missed: ‘Everything in this market is power-constrained. The reason these folks miss a number has nothing to do with demand. It is 100% due to the supply of power.’ He notes that 40% of announced gigawatts will be cancelled because of grid red tape and transformer/turbine supply-chain delays. Hyperscalers are extracting equity from labs in exchange for granting capacity — Anthropic’s $45B from Amazon is the direct example. Jon Gray (Blackstone) confirms it from the buyer-of-the-assets seat: Blackstone alone is signing 6 GW of data-centre leases in 2026, which works out to ~$100B of data-centre capex plus ~$200B of hyperscaler chips, i.e. $300B from one firm, ‘almost the size of Finland or Portugal’. Eight of Blackstone’s 10 best-performing Q1 investments were in data centres, LNG, and battery storage. The aggregate hyperscaler 2026 CapEx headline from earnings week was $725B (Amazon $200B / Microsoft $190B / Google $190B / Meta $145B); Amazon free cash flow collapsed 97% QoQ.
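The buildout figures in that paragraph hang together as a back-of-envelope; a quick check, using only the episode's own numbers (the per-GW figure is my derivation, not a quoted one):

```python
# Back-of-envelope check of the buildout figures quoted above (all in $B;
# the inputs are the episode's own numbers, not independent estimates).

dc_capex = 100    # ~$100B of data-centre capex behind Blackstone's 6 GW of leases
chips    = 200    # ~$200B of hyperscaler chips on top
blackstone_total = dc_capex + chips
print(blackstone_total)        # 300 -> the "$300B from one firm" figure
print(round(dc_capex / 6, 1))  # 16.7 -> implied ~$16.7B of non-chip capex per GW

# Aggregate 2026 hyperscaler CapEx from earnings week
capex = {"Amazon": 200, "Microsoft": 190, "Google": 190, "Meta": 145}
print(sum(capex.values()))     # 725 -> matches the $725B headline
```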

Floor 3 — memory bandwidth. Reiner Pope (Maddox, ex-Google TPU) gave the technical proof on Dwarkesh that this is the deepest floor. The roofline analysis is brutal: optimal inference batch ≈ 300 × sparsity (~2-3k tokens); the train-schedule analogy (batches depart every ~20ms, the HBM drain time); MoE forces single-rack residency because the scale-out fabric is 8x slower than NVLink. API pricing leaks the architecture: Gemini’s 50% price jump at 200k context is the empirical inflection point where memory time crosses compute time. Output tokens cost 5x input tokens because decode is memory-bandwidth-bound: one token at a time, fetching the whole KV cache at every step. A Dylan Patel figure cited mid-episode: ~50% of 2026 hyperscaler CapEx is going on memory. Models are ~100x overtrained vs Chinchilla because inference token volume across a model’s 2-month serving life exceeds its training token volume — the entire ‘sum of human knowledge’ in tokens gets re-emitted by every served model. The 200k context-length ceiling has held for two years, and there is no clear path past it unless HBM scales materially or attention becomes fundamentally sparser. That puts a direct ceiling on the ‘long context replaces continual learning’ / agent-as-employee thesis until the wall moves.
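The drain-time and batch-size shapes above can be sketched with a toy roofline. The hardware figures are rough public H100-class numbers and the model size is hypothetical — this is my illustration of the mechanism, not Pope's own numbers:

```python
# Illustrative roofline sketch of the decode economics described above.
# Hardware figures are rough public H100-class numbers; the 67B FP8 model
# is hypothetical -- my illustration, not figures from the episode.

hbm_bw = 3.35e12   # HBM bandwidth, bytes/s (~3.35 TB/s)
flops  = 2e15      # dense FP8 throughput, FLOP/s (~2 PFLOP/s)

params = 67e9              # hypothetical 67B-parameter dense model
weight_bytes = params * 1  # 1 byte/param at FP8

# Each decode step must stream every weight from HBM once, regardless of
# batch size -- the "train departs" every weight-drain interval:
drain_s = weight_bytes / hbm_bw
print(f"drain time ~{drain_s * 1e3:.0f} ms per decode step")  # ~20 ms

# Compute time for one decode step at a given batch (~2 FLOPs/param/token):
def compute_s(batch):
    return batch * 2 * params / flops

# Roofline crossover: the batch where compute time equals drain time.
# Below it the chip idles on memory. Note it reduces to flops/(2*hbm_bw),
# the chip's ops:byte ratio, independent of model size.
crossover = drain_s / compute_s(1)
print(f"crossover batch ~{crossover:.0f}")  # ~300
```

The crossover landing near 300 regardless of model size is the point: it is a property of the chip's compute-to-bandwidth ratio, which is why the ~300 × sparsity batch rule generalises.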

So the same headline — ‘AI is compute-bound’ — actually unfolds into three nested constraints: HBM bandwidth caps context (Pope) → power caps the data-centre buildout (Chamath, Gray) → cost of capital caps who can buy long-dated capacity (Tuhin). Anthropic’s $45B raise [forecast: 2026-05-03-001] doesn’t fix any of them on a 12-month view. Where I’d put numbers on this:

PTJ vs Gray: the same IPO number, opposite read

The same data point shows up as a bear case from Paul Tudor Jones on Invest Like the Best and a bull case from Jon Gray on Q2 Market Views. Both are right about the number; they disagree on which way it cuts.

PTJ’s bear case (the bubble math): US stock market cap is now 252% of GDP, the highest ratio in history. 1929 peak: 65%. 1987 peak: 85-90%. 2000 peak: 170%. Mean reversion to the 25-30 year trailing PE implies a ~30-35% S&P decline. Apply that to 252%-of-GDP equity wealth and you get a reverse wealth effect of roughly 89% of GDP, with capital gains (10% of US tax revenue) going to zero — the budget deficit blows up, the bond market ‘gets smoked’, and the whole thing becomes self-reinforcing. The IPO number: 2026 contemplated IPO supply ≈ 5-6% of market cap, versus the ~2-3%/yr the market has been net-retiring via buybacks for a decade. And buybacks themselves are collapsing because hyperscaler CapEx is eating free cash flow (Amazon FCF -97% QoQ; Microsoft, Google, Meta -12/-12/-8%). A ‘cascade of selling’ analogous to the 2001-2002 post-IPO unlocks. Tech is ‘dogged’ because that’s where the IPO funding gets sourced from.
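The multiplication behind the headline figure checks out; a quick replay, using only PTJ's own inputs as quoted above:

```python
# Replaying PTJ's bubble arithmetic with the numbers quoted above; the
# inputs are his, only the multiplication is checked here.

mcap_to_gdp = 2.52   # US equity market cap at 252% of GDP
drawdown    = 0.35   # mean reversion to trailing PE: ~30-35% S&P decline

reverse_wealth = mcap_to_gdp * drawdown
print(f"reverse wealth effect ~{reverse_wealth:.0%} of GDP")  # ~88%

# The equity-supply swing: IPO issuance flips a decade of net retirement
ipo_supply   = 0.055  # contemplated 2026 IPOs, ~5-6% of market cap
net_buybacks = 0.025  # ~2-3%/yr net retired via buybacks
print(f"swing ~{ipo_supply + net_buybacks:.1%} of market cap")  # ~8.0%
```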

Gray’s bull case: ‘Year of the IPO. Two of the largest tech companies in the world will go public — that helps receptivity. We’ve got 9 companies on file globally.’ Same supply event, framed as evidence that the public-markets risk window is open. PTJ’s frame: this is the cascade catalyst. Gray’s frame: the demand exists for the supply. Both can be true if you separate the cohorts — newly-listed AI-infra and energy/data-centre names absorb capital eagerly, the broader index re-rates lower as buyback support evaporates and LP capital rotates into the IPOs.

PTJ’s other notes worth keeping: PE allocation in institutional portfolios went from 7% (2008) to 16% (2026), with real estate and infrastructure also up; the illiquidity stack going into a drawdown is structurally worse than the GFC entry conditions; and buying the S&P at a PE of 22 has historically produced negative 10-year forward returns. Gray’s tonal counter: ‘stay calm, stay positive, never give up’ — and the operating data is good (Q1 PE portfolio revenue growth of +10%).

The synthesis on this prediction — pick your battle:

The SaaS apocalypse had its first wind-down — and its operator-side proof

Last week’s Issue 02 prediction — ‘at least one major SaaS incumbent does an acquihire or product-line wind-down explicitly attributed to LLM-native displacement within nine months’ [forecast: 2026-04-26-009] — got a partial validation event this week. Thoma Bravo handed Medallia back to creditors, wiping $5.1B of equity. A pre-AI, low-growth SaaS took on $2B+ of debt and then couldn’t service it. The honest read: this is partial validation, not clean. The wind-down is primarily a capital-stack failure (LBO debt plus low growth is uncorrectable), not an explicit LLM-displacement story. But it is the first PE wind-down of the cycle, and it materially shifts the base rate.

The strongest framing of the surrounding logic came from Lemkin on 20VC — the three-bucket SaaS framework (which I think becomes the durable mental model coming out of this issue):

  1. Melting iceberg — eroding terminal value, leveraged → effectively dead. (Medallia.)
  2. System of record — sticky but no agent activity → bounded cash flow, deep-value play. (Workday, Atlassian-without-agents.)
  3. Agent-using — increasing returns from AI traffic → growth re-acceleration possible. (Stripe, Cloudflare, Twilio.)

The operator-side validation came from a totally different show. Adam Foroughi (AppLovin, 20VC) gave the cleanest operating-leverage proof I’ve seen on the podcast: 84% EBITDA margins, ~$10M EBITDA per employee in the 400-person core, near-triple-digit revenue growth — and they cut 40-50% of headcount in that growth year because the roles were going to be automated. Eliminated the CMO, COO, CRO, and CHRO/Chief People Officer roles. 80-90% of code is AI-generated (vs the 50% Databricks disclosed the day prior). Stack: mostly Claude Code, some Codex, less Cursor than before. Foroughi’s frame on the SaaS apocalypse: ‘when you get into an unpredictable outcome in the future, it’s very easy to sell businesses. The SaaS apocalypse is not done.’ Companies don’t wipe out — embedded software is sticky — but growth dies, terminal value gets discounted, SBC as a share of revenue blows out, and the downward spiral sets in.
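The per-employee figure implies a striking top line; a quick sanity check, where headcount, EBITDA per head, and margin are the episode's figures and the implied totals are my derivation, not disclosed numbers:

```python
# Sanity check on the AppLovin profile quoted above. Headcount, EBITDA per
# head, and margin are the episode's figures; the implied totals are my
# derivation, not disclosed numbers.

core_employees  = 400
ebitda_per_head = 10e6   # ~$10M EBITDA per employee
ebitda_margin   = 0.84   # 84% EBITDA margin

ebitda  = core_employees * ebitda_per_head
revenue = ebitda / ebitda_margin
print(f"implied EBITDA ${ebitda / 1e9:.1f}B on ~${revenue / 1e9:.1f}B revenue")
# -> implied EBITDA $4.0B on ~$4.8B revenue
```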

Jon Gray gave the credit-investor’s matching frame: ‘not all software is created equal. Deeply embedded systems of record — ripping them out will be quite difficult.’ And — important for credit cycle pricing — Blackstone’s PE loans typically carry ~60% equity cushion, so even in equity-wipeout scenarios, senior debt is well-protected. The investable corollary: bucket-1 equity is the trade to short; bucket-1 senior debt is largely fine.

What this gets us:

Lean ops as the new operating model — three independent proofs

Three companies in this issue independently confirmed the same operating pattern. They’re in completely different verticals.

AppLovin: 400 people in the core, $10M EBITDA per employee, no CMO/COO/CRO/CHRO. Engineers double as PMs. ‘A players won’t exist in bulk if you have a bunch of B’s, C’s, and D’s around them. The only way to fix a bloated culture is to fire 99% and rebuild from the ground up.’

Kalshi: 120 people. No managerial layer. Co-founder Luana ‘knows what 80-85% of the org is doing’ on Slack in any 48-hour window. People self-organise around a dynamically maintained top-N problem list. Slope > intercept on hires. Trade-off explicitly accepted: ‘we take on more organisational chaos to avoid bureaucracy.’

Baseten: Very flat until 12-18 months ago, when Sarah Wang told Tuhin he ‘just needed leaders’ and pushed back on his engineering instinct that all management overhead is bad. Hero culture is explicitly banned. First-principles + kind + low-ego + can-handle-no-manager is the explicit hiring rubric. Pager culture as infrastructure DNA: co-founder Amir’s 7-year-old asks ‘is that a P0?’ when his pager goes off.

Plus a fourth, philosophically: Chamath’s riff on the top 0.01%, three traits: work ethic plus stamina (‘isn’t God-given, it’s a level of desire’), repetition and focus over thousands of hours, and honesty as the foundation of taste, with taste as the foundation of success. The Eric Brandon-AOL advice he keeps returning to: ‘be the most successful 22-year-old possible’ — don’t compare yourself to people in different contexts. And the warning shot: ‘I have worked with infinite Harvard/Stanford pipeline graduates who showed up with zero resilience.’

The pattern is sharp enough to call: the operator-side reset that produces the AppLovin financial profile (84% EBITDA margins, $10M EBITDA per employee) is methodically replicable. The ingredients: eliminate layered management, run AI-native engineer-as-PM, ban hero culture, hire slope over intercept, concentrate ruthlessly on A-players, and tie token spend to revenue KPIs rather than token leaderboards. It is now visible across martech (AppLovin), a prediction-market exchange (Kalshi), and inference-cloud infrastructure (Baseten), and it will produce a wave of comparable financial profiles in 2026-27.

The AI-safety regulator gap and the cyber upgrade cycle

The most uncomfortable disclosure in this issue came from Paul Tudor Jones. About 18 months ago he attended a closed conference, ~35-40 people in the room, including one modeler from each of the top-4 labs. When PTJ asked how AI safety gets resolved, the consensus answer from the modelers themselves was: ‘I think we’ll finally do something about it when 50 or 100 million people die in an accident.’ Buffett sent PTJ a personal note after his CNBC segment: ‘I agree with you 100%, but the genie’s out of the bottle.’ PTJ’s policy ask: AI watermarking, with violations made a felony. He has been deepfake-targeted twice this year already. The Atomic Energy Commission analogy: ‘18 months after Hiroshima we had the AEC. Three years into AI — what are you talking about? There is no regulation.’

The investable form of the same risk landed on All-In as the AI cyber upgrade cycle. Sacks’ framing: GPT-5.5 Cyber matches Mythos and is commercially shipping (Anthropic’s Mythos is still gated); all frontier models will hit Mythos-grade in ~6 months; Chinese models (DeepSeek-4) are at 80-85% of frontier already. Chamath: the best CSO he knows can ‘essentially manipulate every model.’ Beneficiaries flagged: CrowdStrike, Palo Alto Networks, Wiz — the white-hats get the tools first, find dormant bugs, and harden infrastructure. Adam Foroughi on the same risk: ‘these models are built to audit code and expose vulnerabilities — short-term we’ll see more breaches because shipping outpaces audits, long-term a lot more buttoned up.’ Tuhin added the second dimension: ‘security crunched and operationally crunched onto people who can run these data centres’ — only 12 ‘good’ clouds and 3-4 in the gold tier despite the apparent supply.


Two short notes worth keeping

Kalshi and the load-bearing 2024 ruling. Tarek Mansour walked through the 6-year regulatory war and the October 2024 lawsuit win against the CFTC. That ruling is now the legal foundation under every prediction-market product launching in 2026. The structural argument why prediction markets are not gambling — the casino’s KPI is customer losses (forces algorithms to promote losses) vs the exchange’s KPI is transaction-fee volume (forces neutrality and trust) — is the policy/PR backbone the entire sector will use. Real institutional use cases now: Florida Keys hurricane hedging (insurance carriers have exited), Biden-era student-loan forgiveness hedging, S&P holders buying Republican/Democrat contracts to hedge election impact rather than selling the underlying. And the validation that sticks: a Federal Reserve research paper now cites Kalshi-style markets as ‘the best gauge we have on the economy.’

Steve Hilton and the California GOP primary. Trump-endorsed, leading the polls. Tax plan: 0% state income tax under $100k, 7.5% flat above. Also disclosed: CalDOGE-style audits estimate ~$425B of CA waste/fraud over 5 years (~20% of the state budget). California imports 80% of its oil (top supplier Iraq) despite significant in-state reserves; gas $7-8/gal. Hilton’s tech-relevant warning: ‘the proposed billionaire’s tax would be a complete disaster for the tech ecosystem.’ Worth tracking because the path to victory is not as long-shot as the consensus view suggests — needs ~5.9M votes; Trump got 6.1M in CA in 2024 with no campaign spend.


Eleven episodes, 10.7 hours. Analytic hand-off complete. The compute thesis tightened: it’s three floors deep, not one. The SaaS apocalypse moved from prediction to first-event. And the lean-ops pattern crystallised across three independent verticals. Next week: watching the prediction ledger, especially the $300B Blackstone data-centre prints, the next PE wind-down, and whether the Hilton primary delivers.

This Week's Episodes