In telecom, ‘AI RAN’ and ‘AI in the RAN’ (‘AI in RAN’) effectively refer to the same concept: the integration of artificial intelligence (AI) into radio access network (RAN) infrastructure. The first is the more specific term, after which an official industry group (the AI-RAN Alliance) has been named and tasked with driving deep integration of AI into RAN hardware, software, and operational processes. It sets out a future where AI is not just an associated tool, but a systematic part of the RAN function. The second is the more general term, encompassing various applications of AI within the RAN.
Either way, the concept – AI RAN; AI in RAN – is important for new standalone 5G (5G SA) networks as it enables advanced features that would be difficult or impossible without AI automation at RAN level. It enables live network traffic prediction, dynamic resource allocation, and predictive maintenance, while also optimizing handovers, slices, and quality of service. It drives cost efficiency through automation, optimized network deployment, and energy savings, and is crucial for enabling advanced use cases like edge computing and network slicing.
But there is a newer concept for RAN-based AI as well, which plays into the ecosystem narrative about how telcos will leverage their edge assets to rent space and host workloads (about ‘supporting AI’). It might be termed ‘AI on the RAN’ (‘AI on RAN’), on the grounds that radio networks have under-utilized compute capacity that could carry other (AI) workloads. In fact, it is a subset of the whole AI-RAN initiative, born of twin multi-tenancy and orchestration functions – providing the ability to run and manage RAN and AI workloads concurrently on the same infrastructure.
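As a rough illustration of that orchestration idea – a sketch built on our own assumptions, not any vendor’s actual scheduler or API – the RAN keeps first claim on the shared compute, while AI tenants borrow and hand back whatever headroom is left:

```python
# Illustrative sketch of AI-on-RAN multi-tenancy: RAN load is always honoured
# first, AI workloads backfill the idle headroom and are preempted when RAN
# demand rises. Class and method names are hypothetical.

class AiRanOrchestrator:
    def __init__(self, total_capacity: float = 1.0):
        self.total = total_capacity   # normalized GPU capacity of the cell site
        self.ran_demand = 0.0         # current RAN load, always served first
        self.ai_allocated = 0.0       # capacity currently lent to AI tenants

    def update_ran_load(self, demand: float) -> None:
        """Called each scheduling interval; RAN gets priority over AI jobs."""
        self.ran_demand = min(demand, self.total)
        headroom = self.total - self.ran_demand
        if self.ai_allocated > headroom:
            self.ai_allocated = headroom   # preempt AI to protect peak RAN load

    def request_ai_capacity(self, wanted: float) -> float:
        """AI tenants only ever receive whatever the RAN is not using."""
        headroom = self.total - self.ran_demand - self.ai_allocated
        granted = max(0.0, min(wanted, headroom))
        self.ai_allocated += granted
        return granted

orchestrator = AiRanOrchestrator()
orchestrator.update_ran_load(0.3)               # quiet hour: RAN uses 30% of the site
print(orchestrator.request_ai_capacity(1.0))    # AI tenant is granted the remaining 0.7
```

In quiet hours the orchestrator would grant most of the server to AI jobs; when the base station wakes up, those grants shrink back toward zero.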
SoftBank and Nvidia have run trials to show that concurrent AI and RAN processing can be done, and can maximize capacity utilization. Nvidia reckons telcos can achieve almost 100 percent RAN-compute utilization, compared with 33 percent for RAN-only workloads, while also implementing dynamic orchestration and prioritization policies to safeguard peak RAN loads. It splits AI-RAN workload distribution into three models: RAN-only, as normal, plus RAN-heavy and AI-heavy, according to how capacity is split (1:2 or 2:1) between RAN and AI workloads.
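The back-of-the-envelope arithmetic behind that utilization claim can be sketched in a few lines. The hourly load profile below is invented for illustration (the trial data is not public), and the assumption that AI jobs can backfill every idle cycle is the best case:

```python
# Hypothetical 24-hour RAN load profile (fraction of server compute used per hour);
# the figures are invented for illustration, averaging close to the ~33 percent
# RAN-only baseline cited by Nvidia.
ran_load = [0.15, 0.10, 0.10, 0.10, 0.15, 0.20, 0.35, 0.50,
            0.55, 0.50, 0.45, 0.40, 0.40, 0.40, 0.45, 0.50,
            0.55, 0.55, 0.45, 0.35, 0.30, 0.25, 0.20, 0.15]

ran_only = sum(ran_load) / len(ran_load)                              # RAN-only model
ai_ran = sum(ran + (1.0 - ran) for ran in ran_load) / len(ran_load)   # AI backfills idle GPU

print(f"RAN-only utilization: {ran_only:.0%}")   # ~34 percent
print(f"AI-RAN utilization:  {ai_ran:.0%}")      # ~100 percent once AI fills the gaps
```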
Kanika Atri, senior director of telco marketing at Nvidia, writes in a blog: “From these scenarios, it is evident that AI-RAN is highly profitable as compared to RAN-only solutions, in both AI-heavy and RAN-heavy modes. In essence, AI-RAN transforms traditional RAN from a cost centre to a profit center. The profitability per server improves with higher AI use. Even in RAN-only, AI-RAN infrastructure is more cost-efficient than custom RAN-only options.” Indeed, the whole AI-on-RAN concept was a hot topic of conversation at MWC in Barcelona in March.
Stephen Douglas at Spirent comments: “AI RAN has been around for a while for energy efficiency and spectrum management, and to optimize RAN behaviour. … But we are also starting to see these new variants, about AI in the RAN and AI on RAN – in terms of, say, using RIC applications to drive non-real-time network performance, to fine-tune behaviour or improve KPIs, or whether you could free up GPU capacity in future RAN systems for third-party apps; maybe in low-peak periods or smaller deployments.”
But while the logic looks good, the logistics are unclear. He says: “AI in RAN makes absolute sense for better energy efficiency and spectrum utilization, and so on. I am less convinced at the moment about this other concept – that if you build RAN on GPUs as well as CPUs, then you can rent space for other applications during idle RAN periods.” Douglas is not the only one to think this way. “It is a slightly imaginary concept,” responds Robert Curran at Appledore Research.
He goes on, raising questions more generally about the broader network-edge AI concept: “The idea is that if you can resell capacity, then maybe you can make a business out of it. A first generation of boxes is being built with clever software to manage combined telecom and non-telecom workloads. The problem is the monetization angle – to create a spot market for AI compute. The question is how much compute capacity is left over, how it can be packaged up and monetized – and whether there is even any demand for it.
“Which is the same with the whole [edge angle]. Because Germany, say, can run the whole country with about four data centers – without any latency issues. So the idea that you need compute power very close to the customer [maybe] depends on your geography and use cases…. Are companies really sitting around waiting for this kind of pool of AI capacity – which has to be super cheap, and might be taken away at any time because a base station wakes up and bumps you down the list? Are there applications of that nature? There may be, but it is not clear.”