NZVC - General Compute - Sidecar
We have secured an early allocation in General Compute, the world's first neocloud dedicated to ASIC (Application-Specific Integrated Circuit) inference — and we are making this opportunity available to our broader network.
We were the first backers of General Compute, alongside Village Global (LPs include Jeff Bezos, Bill Gates, Reid Hoffman, Eric Schmidt, and Mark Zuckerberg). We believe General Compute is positioned to become a category-defining infrastructure company at exactly the moment the AI industry is shifting from training to inference as its dominant workload.
Why we are excited
- The inference bottleneck is the defining AI infrastructure problem of the next decade. There will be over 1 billion deployed agents by 2029, each consuming tokens 24/7. LLM generation speed is physically constrained by memory bandwidth, and General Compute's SambaNova-based platform is already delivering 500–1,900 tokens/second in live benchmarks — 5–7x faster than the fastest GPU providers like together.ai. Speed is intelligence, and speed is where pricing power is migrating: Anthropic's Opus Fast Mode charges 6x for 2.5x the speed, NVIDIA acquired Groq for $20B, and OpenAI signed a $10B+ deal to move Codex to Cerebras. The market is repricing on speed, and ASIC is the only architecture that delivers it.
- A proven founder who has built and exited before. Finn Puklowski (CEO) scaled Fluency Academy to US$40M+ ARR and sold 20% for US$50M — a real exit with real cash, not a paper markup. Previously backed by General Atlantic. He brings a 300K-strong developer community as built-in distribution, and has already out-hustled his stage: personally securing the SambaNova partnership, meetings with AMD's SVP of Global Markets (en route to Lisa Su), and a referral pipeline to MiniMax (OpenRouter's #1 model at 1.82T tokens/week). Co-founder Jason Goodison (ex-Microsoft, YC alum) leads the platform and has already shipped it.
- Privileged access to the hottest silicon in the world. General Compute has acquired 10% of SambaNova's entire operating cloud (live within 8 weeks) and has secured first-mover allocation on next-gen SN50 silicon (Q1 2027) — third customer in line globally, behind only SoftBank and a European telecom. GPU incumbents cannot follow: NVIDIA and AMD underwrite GPU neoclouds and invest directly in their partners, structurally conflicting them out of the ASIC category. ASIC hardware pays back in ~11 months vs ~3 years for GPUs, creating a faster reinvestment flywheel.
Company Snapshot: General Compute
- Category: ASIC inference neocloud (the first of its kind)
- Silicon partner: SambaNova RDU (Reconfigurable Dataflow Unit)
Go-to-Market Approach
- Phase 1 (live in 8 weeks): 10 SN40 nodes — 10% of SambaNova's operating cloud
- Phase 2 (Q1 2027): 15 SN50 racks on order, third in line globally
- Phase 3 (2027–2028): Scale to 80–200 racks, primarily via asset-backed trade finance
- Platform: Live, OpenAI-compatible API, 1-line-of-code migration via OpenRouter (see the sketch after this list)
- Existing investors: Village Global, Carya, NZVC
- Comparable neocloud outcomes give a sense of the prize: CoreWeave at $68B post-IPO, Crusoe at $10B, Lambda at $2.5B — all GPU-based, and all capped by GPU architecture.
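To make the "1-line-of-code migration" claim concrete, here is a minimal sketch using the standard openai Python SDK repointed at OpenRouter. The model slug and API key are illustrative placeholders, not confirmed General Compute routing:

    # Minimal sketch: an existing OpenAI integration, repointed at OpenRouter.
    # The base_url (plus the matching key) is the one-line change; the rest of
    # the calling code stays the same.
    from openai import OpenAI

    client = OpenAI(
        base_url="https://openrouter.ai/api/v1",  # was the default api.openai.com
        api_key="YOUR_OPENROUTER_KEY",            # placeholder
    )

    response = client.chat.completions.create(
        model="minimax/minimax-m1",  # illustrative model slug
        messages=[{"role": "user", "content": "Explain ASIC inference in one sentence."}],
    )
    print(response.choices[0].message.content)

Because the API surface is identical, a customer already built on OpenAI or a GPU neocloud can trial ASIC speeds without rewriting their application.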
The AI Inference Market: A Primer
The Basics: Training vs. Inference
Every AI model goes through two phases. Training is when a model learns — ingesting massive datasets to build its capabilities. This happens once (or periodically) and is enormously compute-intensive. Inference is what happens every time you actually use the model — every ChatGPT query, every Claude response, every AI agent action. Training built the brain; inference is the thinking.
For years, training dominated the conversation and the spending. But as AI moves from research labs into production — powering coding assistants, voice agents, and autonomous systems used by millions — inference has become the dominant workload. Every token a user sees is an inference cost. At scale, inference spending dwarfs training.
The Hardware: GPUs, ASICs, and Why It Matters
GPUs (Graphics Processing Units) were originally designed to render video game graphics. Their ability to do many calculations in parallel made them useful for AI training, and NVIDIA built a trillion-dollar business on this accidental fit. But GPUs are general-purpose chips — good at many things, not optimized for any one thing. For inference specifically, they are inefficient: they draw enormous power, generate heat that requires liquid cooling, and hit a ceiling around 120 tokens per second (roughly how fast they can produce words of output).
ASICs (Application-Specific Integrated Circuits) are chips designed for one job. Bitcoin miners use ASICs. Google's TPUs are ASICs. For AI inference, companies like Cerebras, Groq, and SambaNova have built ASICs specifically to generate tokens as fast as possible. The results are dramatic: 500–1,900 tokens per second (5–15x faster than GPUs), with roughly 7x less energy consumption per rack (17 kW vs. 120 kW; see The Economics below). The tradeoff is flexibility — ASICs are less versatile than GPUs — but for inference at scale, that tradeoff is worth it.
The key insight: agents need to think fast to be useful. A coding agent, voice assistant, or robot that pauses for seconds between thoughts isn't collaborative. Faster inference literally means more intelligence per second of wall-clock time.
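A back-of-the-envelope calculation makes the gap concrete. The token speeds below are the figures cited above; the task shape (how many sequential calls an agent chains, and how long each is) is an assumption for illustration:

    # Wall-clock time for an agent task that chains many sequential model calls.
    # Token speeds are from the section above; the task shape is assumed.
    GPU_TOK_PER_S = 120      # cited GPU ceiling
    ASIC_TOK_PER_S = 1000    # mid-range of the cited 500-1,900 tok/s
    STEPS = 50               # sequential reasoning/tool-use steps (assumed)
    TOKENS_PER_STEP = 400    # output tokens generated per step (assumed)

    total_tokens = STEPS * TOKENS_PER_STEP
    gpu_minutes = total_tokens / GPU_TOK_PER_S / 60
    asic_minutes = total_tokens / ASIC_TOK_PER_S / 60
    print(f"GPU:  {gpu_minutes:.1f} min")   # ~2.8 min
    print(f"ASIC: {asic_minutes:.1f} min")  # ~0.3 min

The same task drops from roughly three minutes to roughly twenty seconds, and the gap compounds with every additional step the agent takes.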
The Players: Hyperscalers, Neoclouds, and the Stack
Hyperscalers are the giants — AWS, Azure, Google Cloud. They offer every service imaginable and rent out GPU capacity alongside everything else.
Neoclouds are a newer category: specialized cloud providers focused exclusively on AI compute. CoreWeave, Lambda, Together AI, and Crusoe are examples. They typically buy GPUs in bulk from NVIDIA and rent them out at better prices and with better AI-specific tooling than hyperscalers. Most neoclouds today are GPU-based.
General Compute is building the first ASIC-based neocloud — the same business model (specialized AI cloud), but running on purpose-built inference silicon from SambaNova instead of NVIDIA GPUs.
The Market Dynamics Right Now
Three forces are converging:
- The pricing model is flipping from cost-per-token to cost-per-speed. NVIDIA acquired Groq for $20B in late 2025. OpenAI signed a $10B+ deal with Cerebras to run Codex at 1,000+ tok/s on separate hardware. Anthropic launched an "Opus Fast Mode" charging 6x the price for 2.5x the speed — the same model, just faster. Customers are clearly willing to pay premiums for speed.
- Incumbents are structurally blocked from pivoting. NVIDIA and AMD underwrite neocloud revenue, provide preferred GPU quota, and invest directly in their cloud partners — a circular financing arrangement. When OpenAI moved Codex workloads to Cerebras, NVIDIA reportedly cut investment in that customer from ~$100B to ~$30B. Existing neoclouds can't easily switch to ASICs without cannibalizing their NVIDIA relationships.
- ASIC supply is scarce and just coming online. The best inference chips (like SambaNova's SN50, launching Q1 2027) are supply-constrained. Whoever locks in allocation now owns the capacity when demand explodes.
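The arithmetic behind the first point is worth spelling out. Using only the Opus Fast Mode figures cited above (a rough sketch, not Anthropic's actual unit economics), a speed premium compounds for whoever owns the fast capacity:

    # Worked example of cost-per-speed repricing, from the cited Fast Mode
    # figures: 6x price per token at 2.5x generation speed.
    PRICE_MULTIPLE = 6.0   # fast tier charges 6x per token (cited)
    SPEED_MULTIPLE = 2.5   # fast tier generates 2.5x faster (cited)

    # For a fixed job of N tokens, the customer pays 6x and waits 1/2.5 as long.
    time_saved = 1 - 1 / SPEED_MULTIPLE
    print(f"Customer: 6x spend for {time_saved:.0%} less wall-clock time")   # 60% less

    # The provider earns more per second of hardware time on both axes at once:
    revenue_per_hw_second = PRICE_MULTIPLE * SPEED_MULTIPLE
    print(f"Provider: {revenue_per_hw_second:.0f}x revenue per machine-second")  # 15x

Customers pay for time saved, so the provider's revenue per machine-second is the price multiple times the speed multiple — which is why speed, not just token volume, is where pricing power accrues.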
The Economics
ASIC inference clouds have fundamentally better unit economics than GPU clouds:
- Energy cost: $0.033/kWh (General Compute's rate) vs. $0.13/kWh US commercial average — roughly 75% cheaper
- Energy consumption: 17 kW per rack vs. 120 kW for GPU equivalents
- Cooling: air-cooled vs. liquid-cooled (less overhead, less complexity)
- Payback period: ~11 months for ASIC racks vs. ~3 years for GPU clouds
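The first two lines compound. A quick check of the per-rack energy bill, using only the figures above (hours per month is the only added assumption):

    # Monthly energy cost per rack, computed from the figures above.
    HOURS_PER_MONTH = 730  # ~24 * 365 / 12

    asic_kw, asic_rate = 17, 0.033    # General Compute: 17 kW rack at $0.033/kWh
    gpu_kw, gpu_rate = 120, 0.13      # GPU equivalent at the US commercial average

    asic_monthly = asic_kw * HOURS_PER_MONTH * asic_rate
    gpu_monthly = gpu_kw * HOURS_PER_MONTH * gpu_rate
    print(f"ASIC rack: ${asic_monthly:,.0f}/month")     # ~$410
    print(f"GPU rack:  ${gpu_monthly:,.0f}/month")      # ~$11,388
    print(f"Ratio: {gpu_monthly / asic_monthly:.0f}x")  # ~28x

On these numbers alone, an ASIC rack's energy bill is under 4% of a GPU rack's, which is a large part of why the payback period compresses from years to months.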
Why This Matters for Investors
The bet is simple: inference is the largest and fastest-growing segment of AI compute, agents need speeds GPUs can't deliver, and ASICs are the only path. The market is already voting with its wallet (NVIDIA acquiring Groq, OpenAI partnering with Cerebras, Anthropic monetizing speed directly). Building an ASIC-native neocloud now — with locked-in silicon allocation, cheap energy, and a customer base already trained on speed-based pricing — positions General Compute to capture the infrastructure layer of the agentic era before the incumbents can pivot.
Investment structure
The investment is structured as two separate post-money SAFEs on the YC standard form, total not to exceed US$1.0M. Tranche 1 (US$500K) converts at a US$40M post-money cap with no discount. Tranche 2 (up to US$500K additional) converts at the final seed round cap — to be set at close alongside Village Global as lead — also with no discount. This lets us anchor US$500K at today's attractive US$40M entry while preserving flexibility to top up at the market-clearing seed price once the round prices.
You are invited to join the sidecar investment fund arranged by NZVC. As a syndicate member, your interest would be held on bare trust for you by Catalist Nominee Limited. Please see the Investment Agreement and full terms in the Sidecar Investor Terms and Information folder below.
Terms for syndicate investors
- One-time management fee: 2.5% (discounted to 1.5% for NZVC Fund LPs)
- Carry: 20% (discounted to 10% for NZVC Fund LPs)
All amounts shown on this page are in US dollars unless otherwise specified.
Use of this site remains subject to the Catalist Investor Terms and Conditions.
