The Nvidia-powered service bundles SoftBank’s telecom edge network with central GPU data centers

In sum – what we know:

Sovereign by design – All compute and data stay within Japanese jurisdiction, targeting a gap that AWS, Azure, and GCP have yet to fill locally.
Telecom as advantage – SoftBank ties its nationwide network to central GPU data centers via AITRAS edge nodes, claiming “5G for free” on shared hardware.
Phased rollout – The beta is live now, but commercial availability isn’t until October 2026, starting with internal SoftBank group use.

SoftBank has announced its “AI Data Center GPU Cloud,” a sovereign AI infrastructure service that pushes the operator further away from its telecom roots and into direct competition with global cloud giants. The service was unveiled as a key pillar of the company’s broader “Activate AI for Society” strategy.

A beta version went live on the day of the announcement, though commercial availability isn’t scheduled until October 2026. Even then, the initial rollout will be restricted to internal use across SoftBank group companies before the service opens up to wider commercial customers.

It’s a notable move, and one that builds on a string of partnerships and pilots SoftBank has been quietly assembling over the past year, particularly with Nvidia. Rather than launching a generic GPU cloud, SoftBank is bundling its telecom assets, edge network, and AI compute into a single offering pitched squarely at customers who want their data to stay inside Japan.

The software stack

At the core of the service is SoftBank’s proprietary software stack, the Infrinia AI Cloud OS. It pulls together SoftBank’s AI computing infrastructure with the software layers needed to actually run modern AI workloads at scale, rather than leaving customers to assemble bespoke solutions themselves.

Practically, that means two main delivery modes. The first is Kubernetes as a Service (KaaS) for multi-tenant environments, giving customers a managed orchestration layer for containerized workloads. The second is Inference as a Service (Inf-aaS), exposing large language model inference through APIs. Between the two, the platform is meant to support a broad range of workloads, from model training through inference and general data processing.
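To make the KaaS mode concrete, here is a minimal sketch of what a containerized GPU workload request could look like on a managed Kubernetes layer. It assumes the service exposes GPUs under the standard `nvidia.com/gpu` resource name used by Nvidia's Kubernetes device plugin, which SoftBank has not confirmed. The function name, container image, and namespace are hypothetical placeholders.

```python
import json


def gpu_pod_manifest(name: str, image: str, gpus: int, namespace: str = "default") -> dict:
    """Build a minimal Kubernetes Pod manifest that requests whole GPUs.

    Assumption: the cluster advertises GPUs as the extended resource
    "nvidia.com/gpu" (the name used by Nvidia's device plugin).
    """
    if gpus < 1:
        raise ValueError("request at least one GPU")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,  # hypothetical image reference
                    # GPUs are requested as an extended resource under limits.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical tenant namespace and image, for illustration only.
    manifest = gpu_pod_manifest(
        "train-job", "example.com/trainer:latest", gpus=4, namespace="team-a"
    )
    print(json.dumps(manifest, indent=2))
```

In a multi-tenant setup, the namespace is typically the isolation boundary between customers or teams, which is why it is passed explicitly here.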

The pitch is fairly standard for this category — reduce total cost of ownership, cut the operational burden of running a GPU fleet, and give customers something closer to a turnkey AI platform than a raw infrastructure rental.

Hardware and technical infrastructure

On the hardware side, SoftBank is leaning heavily on Nvidia. The cloud is built on Nvidia GB200 NVL72 systems based on the Grace Blackwell architecture, hosted within Japan-based data centers and running on SoftBank’s neocloud business framework. Infrinia AI Cloud OS sits across the stack, handling everything from BIOS configuration up through Kubernetes management on the GPU platforms.

There’s also a networking story worth flagging. SoftBank is using Nvidia BlueField-3 DPUs to accelerate both vRAN and generative AI workloads, with an integrated Nvidia Spectrum Ethernet switch providing the 5G timing protocol.

Telco AI Cloud and AI-RAN integration

The AI Data Center GPU Cloud is a core component of what SoftBank is calling its “Telco AI Cloud” vision, a framing the company is pushing as next-generation social infrastructure for the AI era. The idea is to tie together central large-scale GPU data centers with multi-access edge computing distributed across SoftBank’s existing telecom network.

The edge piece runs on AITRAS, SoftBank’s fully software-defined AI-RAN solution, which is currently deployed at Nvidia’s Santa Clara headquarters. The goal is low-latency distributed inference processing at the network edge, with central data centers handling training and the heavy lifting.
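The split described above, with latency-sensitive inference at the edge and training in central data centers, can be illustrated with a simple placement rule. This is a sketch of the general idea only, not SoftBank's or AITRAS's actual scheduling logic. The class names, the `place` function, and the 50 ms threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    EDGE = "edge"        # edge nodes inside the telecom network
    CENTRAL = "central"  # large-scale GPU data centers


@dataclass(frozen=True)
class Workload:
    kind: str                 # "training" or "inference"
    latency_budget_ms: float  # how long the caller can wait for a response


# Hypothetical cutoff: inference that must answer faster than this goes to the edge.
EDGE_LATENCY_THRESHOLD_MS = 50.0


def place(workload: Workload) -> Tier:
    """Pick a tier for a workload using a simple latency-based rule."""
    if workload.kind == "training":
        # Training is throughput-bound, so it always goes to central GPUs.
        return Tier.CENTRAL
    if workload.kind == "inference" and workload.latency_budget_ms <= EDGE_LATENCY_THRESHOLD_MS:
        return Tier.EDGE
    # Everything else, including latency-tolerant inference, runs centrally.
    return Tier.CENTRAL


if __name__ == "__main__":
    print(place(Workload("inference", 20.0)).value)
    print(place(Workload("training", 0.0)).value)
```

A real scheduler would also weigh edge capacity, model size, and radio workload on the shared hardware, none of which are modeled here.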

Because the hardware is shared between AI and telecom workloads, SoftBank claims it effectively gets “5G for free” out of the same infrastructure — and Nvidia has said the approach delivers up to a 4x improvement in ROI for vRAN workloads compared to single-purpose 5G vRAN deployments.

Conclusions

Taken together, this is SoftBank pivoting from a traditional telecommunications operator into an AI infrastructure provider, and doing so by exploiting assets that pure-play cloud providers simply don’t have. The nationwide telecom network, which would otherwise be a single-purpose cost center, becomes a distributed AI platform and a source of competitive advantage.

The timing makes sense. Japanese enterprises have been increasingly vocal about data sovereignty and keeping AI processing within national borders, and the major global cloud providers — AWS, Azure, GCP — still have limited sovereign options in Japan. By guaranteeing that data and processing stay within Japanese jurisdiction, SoftBank is targeting a real gap in the market rather than trying to out-scale the hyperscalers on their own terms.
