The emergence of neoclouds—independent GPU-as-a-service1 (GPUaaS) providers—is a direct response to two structural forces: a global scarcity of high-end compute, and the revenue diversification strategies of the largest advanced-chip producers.
Neoclouds originally emerged as stopgaps to address the GPU shortage, but their bare-metal-as-a-service (BMaaS) economics are fragile. Their long-term viability hinges on their ability to move up the stack into AI-native services, which puts them in direct competition with hyperscalers. Their future, however, likely lies not in rivaling hyperscalers but in securing positions in enduring niche markets, such as sovereign compute and specialized workloads, while also compounding the early footholds they’ve built with AI start-ups—relationships that can persist as those companies scale into multibillion-dollar platforms.
This article looks at neoclouds’ position in the market, the challenges they face as they plan their next move, and the likely mid- to long-term solutions.
The origins of neoclouds
Demand for advanced GPUs has surged in recent years as generative AI models have expanded in scale and complexity. Hyperscalers have secured the lion’s share of advanced-chip allocations to support their own rapidly growing workloads,2 leaving many AI start-ups, research labs, and enterprises unable to access capacity at the speed they require. Into this gap stepped neoclouds, a new wave of GPU cloud providers. Neoclouds offer flexible contracts, faster provisioning, and specialized infrastructure configurations. In addition, they price GPUs as much as 85 percent less than hyperscalers do, making them attractive to smaller gen AI start-ups.3
Neoclouds present lower barriers to entry than traditional cloud providers—standing up a compute cluster does not require building a full tech stack, as a hyperscale platform does—so new entrants can move quickly to capture unmet demand. More than 100 neoclouds exist in the world today: Between ten and 15 are operating at meaningful scale in the United States, and their footprint is growing across Europe, the Middle East, and Asia,4 often backed by venture capital, private equity, or sovereign-wealth capital. AI chip producers, through a deliberate diversification strategy, also encouraged this ecosystem. To broaden adoption, diversify revenue, and create competitive pressure among buyers, they have seeded these new channels, sometimes even acting as offtake customers of neoclouds.
The current state of neoclouds is not without precedent. For example, in the Cloud 1.0 era (which started in the early 2000s), start-ups grew rapidly by filling early compute gaps, but as hyperscalers expanded capacity and service breadth, nearly all of those start-ups were acquired, sidelined, or forced into niche roles.
Given the market context, can neoclouds evolve into a durable category, or will history repeat itself?
The promise of neoclouds
Four key beliefs are fueling the investment in neoclouds.
The BMaaS model is only a stepping stone. The BMaaS model that many neoclouds have adopted is inherently commoditized: It has limited differentiation, high capital intensity, and price-driven competition. But investors are not betting on BMaaS as the endgame. Instead, the investment thesis behind neoclouds is that they will be able to transition to AI-native software stacks that include training orchestration, distributed inference platforms, domain-specific stacks (for life sciences or financial services, for example), developer tools, and managed machine learning services. These layers will create stickiness, improve retention, and produce economics similar to those of software companies, supporting software-as-a-service-like valuation multiples for neoclouds.
The demand for compute is too big to ignore. Even if neoclouds stick to a BMaaS model, the demand curve for AI compute is steep and accelerating. Training and inference workload demand will continue to grow rapidly (it’s expected to reach approximately 200 gigawatts by 2030), with infrastructure supply presenting the main bottleneck.5 In such an environment, any credible provider with a rack online can reasonably expect to find buyers.
Depreciated compute fleets have sustainable long-tail value. Even after primary contracts with hyperscalers wind down, GPU fleets can retain meaningful residual value if they are repurposed for enterprise and mid-market clients. Neoclouds can use large, low-margin offtake agreements with hyperscalers to finance fleet acquisition and build scale, and then they can extend the assets’ economic life by renting them at lower rates to enterprises that don’t need the newest generation of chips. In theory, this could create an enduring business model, though it remains unclear whether enterprises will adopt AI workloads at a scale sufficient to absorb such second-cycle capacity.
Chip producers can have a true derisking effect. While support from chip producers doesn’t guarantee that neoclouds will endure, it does create an implicit backstop.6 Chip producers often provide neoclouds with preferential allocations, financing structures, and even offtake commitments, thereby increasing the chances that neoclouds will survive.
The BMaaS model’s shaky economics
While neoclouds have the potential to endure, the economics of the BMaaS model, which dominates the market today, are unpromising for three reasons.
Margins leave little room for error
With a BMaaS model, gross margins are typically 55 to 65 percent before depreciation, depending on utilization and pricing (Exhibit 1).7
At these gross margins, and given the capital intensity of buying GPUs and CPUs and standing up servers, the BMaaS model has almost no margin of safety and is highly sensitive to price and utilization fluctuations. If GPU rental prices decline even modestly, or if utilization slips below 80 percent, returns will flatline. The economics become even more fragile when debt financing is taken into account, because interest costs quickly erase any residual cushion.
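This sensitivity can be sketched with a simple unit-economics model. Every figure below—capex per GPU, rental price, utilization, opex ratio, interest rate—is a hypothetical assumption for illustration, not a number reported by any provider.

```python
# Illustrative BMaaS unit-economics sketch (all inputs are assumed, not
# sourced from any provider): annual return per GPU under different
# rental prices, utilization rates, and financing structures.

def annual_margin(
    capex_per_gpu: float,        # assumed up-front cost per GPU, incl. server share
    price_per_hour: float,       # assumed rental price per GPU-hour
    utilization: float,          # fraction of hours actually billed
    opex_ratio: float = 0.40,    # assumed power, labor, overhead as share of revenue
    depreciation_years: int = 5,
    interest_rate: float = 0.0,  # set > 0 to model debt financing
) -> float:
    """Annual pre-tax margin per GPU, after opex, depreciation, and interest."""
    revenue = price_per_hour * utilization * 24 * 365
    opex = revenue * opex_ratio
    depreciation = capex_per_gpu / depreciation_years
    interest = capex_per_gpu * interest_rate
    return revenue - opex - depreciation - interest

# Base case: $30,000 per GPU, $2.00/hr, 85% utilization, all equity.
base = annual_margin(30_000, 2.00, 0.85)
# Stress case: price softens to $1.80/hr and utilization slips to 75%.
stressed = annual_margin(30_000, 1.80, 0.75)
# Same stress case with debt at 8% interest on the full capex.
levered = annual_margin(30_000, 1.80, 0.75, interest_rate=0.08)

for label, m in [("base", base), ("stressed", stressed), ("levered", levered)]:
    print(f"{label:>9}: ${m:,.0f} per GPU per year")
```

Under these assumed inputs, the base case clears roughly $2,900 per GPU per year, a modest price-and-utilization slip shrinks that to about $1,100, and adding interest costs pushes the same stressed case below zero—illustrating how thin the cushion is.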
Price erosion and capital intensity put pressure on investments
The chip release cycle puts additional pressure on pricing levels. With each new chip generation, the price of older GPUs drops. Over a typical five-year depreciation horizon, the price of a GPU hour could decline by half or more (Exhibit 2). This dynamic requires service providers not only to recover capital within the first four to five years after the GPU becomes active in a data center to avoid stranded assets but also to continually reinvest in new GPU generations to stay competitive as older fleets lose relevance.
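A halving of GPU-hour prices over a five-year depreciation horizon implies an annual price decline of roughly 13 percent (since 0.5 raised to the power 1/5 is about 0.87), which front-loads a fleet's lifetime revenue into its first years online. The starting price and utilization figures in the sketch below are assumptions for illustration only.

```python
# Illustrative price-erosion sketch (assumed figures): if the GPU-hour
# price halves over five years, each year's market price is about 13%
# below the prior year's, so cumulative revenue is front-loaded.

decline = 1 - 0.5 ** (1 / 5)    # ~12.9% annual price decline

price = 2.00                     # assumed starting price per GPU-hour
utilization = 0.85               # assumed billed fraction of hours
cumulative = 0.0
for year in range(1, 6):
    cumulative += price * utilization * 24 * 365
    print(f"year {year}: price ${price:.2f}/hr, cumulative ${cumulative:,.0f}")
    price *= 1 - decline         # each year starts at a lower market price
```

By the end of year five the hourly price has fallen to half its starting level, which is why capital must be recovered within the first four to five years to avoid stranded assets.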
Big deals are less lucrative than they appear
According to some reports, the gross profit margin of GPU rental businesses is between 14 and 16 percent after labor, power, and depreciation costs, lower than the margins of many nontech retail businesses.8
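As a back-of-envelope reconciliation, a gross margin of roughly 60 percent before depreciation is consistent with the mid-teens figure reported after depreciation if depreciation consumes on the order of 45 percent of revenue. The percentages below are illustrative assumptions, not reported figures.

```python
# Back-of-envelope reconciliation (illustrative percentages, assumed):
# how a ~60% gross margin before depreciation can shrink to a mid-teens
# margin once depreciation is charged.

revenue = 100.0                   # normalize revenue to 100
power_and_labor = 40.0            # assumed: ~40% of revenue
gross_before_dep = revenue - power_and_labor           # 60 -> 60% margin
depreciation = 45.0               # assumed: heavy GPU depreciation load
margin_after_dep = gross_before_dep - depreciation     # 15 -> 15% margin

print(f"gross margin before depreciation: {gross_before_dep:.0f}%")
print(f"margin after depreciation:        {margin_after_dep:.0f}%")
```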
For neoclouds, these contracts are attractive less for their stand-alone economics than for what they can provide, including an almost-guaranteed baseline level of utilization and a stamp of credibility that makes the neocloud more attractive to investors—which in turn supports future fundraising. On the other side, hyperscalers are willing to pay a premium for ready-to-go capacity and the ability to use neocloud balance sheets to offload assets from their own.
These big deals have resulted in high revenue concentration for neocloud players. Public disclosures show that for some players, more than half of their revenue comes from just one or two customers.9
The road ahead for neoclouds
The neocloud model carries with it a structural paradox. A common expectation of investors for neoclouds is that they will move up the stack into AI-native software and managed services, which inevitably pits neoclouds against the hyperscalers that are, today, their anchor customers. Building orchestration layers, inference platforms, and verticalized solutions increases retention and margin potential, but it also overlaps directly with hyperscaler offerings. In addition, building a tech stack robust enough to compete and a go-to-market strategy that can break into the enterprise market takes an enormous amount of capital, time, and resources.
So what’s next for neoclouds? To escape commodity economics, neoclouds must pursue differentiation without alienating the same hyperscalers that provide their baseline utilization. Based on how the Cloud 1.0 era shook out, few players will be able to resolve this tension at scale.
We see three potential mid- to long-term paths for neoclouds to endure:
- Carve out defensible positions in niche markets. The neoclouds most likely to endure will be those that can carve out defensible positions in markets where hyperscalers are less effective or less welcome. For example, sovereign compute providers backed by governments or regional champions are valued for their independence from hyperscalers, their hyperlocal focus, and their control over sensitive data pools. Specialized providers optimized for use cases such as ultralow-latency inference or regulated verticals could also offer promising partnership opportunities.
- Stay focused on start-ups and grow from there. Another enduring path for neoclouds is to stay focused on AI start-ups as their core customer profile, rather than chasing large enterprises. By providing compute services from day one, neoclouds can build footholds and trusted relationships with start-ups that can last as the start-ups scale into multibillion-dollar companies that consume massive workloads. That level of loyalty is often difficult for hyperscalers to replicate—and that launchpad could allow neoclouds to expand to AI-native enterprises.
- Consolidate. Consolidation is another potential trajectory for today’s neoclouds. Like many Cloud 1.0–era start-ups, some companies will be absorbed by hyperscalers, telcos, or sovereign buyers. At the same time, other companies will likely fade once supply catches up.
Whether neoclouds become enduring players or fade into history will depend on their ability to evolve faster than the market around them. Those that can turn early scarcity into long-term differentiation may help define a lasting new layer in the AI infrastructure stack.