No Priors

Baseten CEO Tuhin Srivastava on Custom Models, and Building the Inference Cloud

42m · Transcribed via AssemblyAI

Tuhin Srivastava (Baseten CEO), speaking from inside the inference cloud just as everyone else's narrative is starting. **30x revenue growth in 12 months, on track for >$1B in 2026.** 95% of tokens served come from *custom* models (post-trained variants of open-source models). Baseten operates at mid-90s percent utilisation across **90 clusters in 18 clouds** and runs a daily 4pm capacity-allocation meeting. **GB200 access now requires 3-5 year contracts with a 20-30% TCV (total contract value) prepay**, materially changing the IPO and financing calculus for inference companies. The H100 is still in demand 4.5 years post-launch, and its price is still going up. Frontier open-source is now overwhelmingly Chinese (DeepSeek, Moonshot/Kimi, Canopy, Orpheus): 'effectively the Chinese government is subsidising US enterprise.' Net dollar retention (NDR) is 400%, and none of the top 30 customers has ever churned. He confirms Reiner Pope's framing that disentangling pre-fill and decode is 'the next set of primitives.' On Jevons: 'inference is the last market. Even if there's AGI, all that's left is inference.'

Notable quotes

30x growth in the last 12 months. None of our top 30 customers have ever churned. We're talking 400% net dollar retention.

Tuhin Srivastava · 4:00

GPUs as a service is not sticky. Inference with the software layer included is incredibly sticky. In a world of constrained compute, the number one thing to own is compute.

Tuhin Srivastava · 28:20

Effectively the Chinese government is subsidising US enterprise via these open-source models. If we don't have access to that intelligence, we won't be able to innovate as fast.

Sarah Wang and Tuhin Srivastava · 18:20

If you want a B200 right now from a good cloud, you're not getting that less than a three-to-five-year contract with a 20-30% TCV prepay. Cost of capital is everything.

Tuhin Srivastava · 25:00

Inference is the last market. Even if there's AGI, all that's left is inference.

Tuhin Srivastava · 40:00

Capacity. That's what keeps me up at night. There's no world in which there's enough compute to get the value we want out of LLMs in the next five to ten years.

Tuhin Srivastava · 35:50

Themes

Mentioned