The new Mixture-of-Experts series runs on the open-source Huawei MindSpore framework
In sum – what we know:
TeleChat3 series – China Telecom’s TeleAI released the first large-scale Mixture-of-Experts (MoE) models trained entirely on domestically designed semiconductors.
Domestic hardware stack – Training was conducted exclusively on Huawei’s Ascend 910B AI chips and the open-source MindSpore framework, validating the feasibility of the domestic ecosystem.
Thinking mode – The models introduce a “Thinking” mechanism that makes reasoning processes traceable, aiming to improve logic and accuracy in complex tasks.
China may not have access to US-designed hardware as easily as it might like, but it's still clearly able to develop high-end large language models. China Telecom's AI research arm, TeleAI, has open-sourced the TeleChat3 series of large language models, which are China's first large-scale Mixture-of-Experts models trained entirely on domestically designed semiconductors.
It's something of a big deal for China's home-grown AI efforts. Gaining access to Nvidia and other American-designed GPUs has been difficult at best for Chinese companies, but it now appears that China's homegrown AI stack can actually support frontier-scale model development.
A massive model
The TeleChat3 lineup includes several model sizes, with the flagship being TeleChat3-105B-A4.7B-Thinking — a fine-grained MoE architecture packing 105 billion parameters. That naming convention highlights that only 4.7 billion parameters activate during any given inference pass, which is the core advantage of MoE designs. You get high performance without the computational overhead of running a dense model at that scale. There’s also TeleChat3-36B-Thinking, a dense architecture that likely offers different trade-offs depending on deployment needs.
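The sparse-activation idea behind that "105B total, 4.7B active" naming can be sketched in a few lines. The toy example below is a generic top-k MoE router in NumPy, not TeleChat3's actual architecture: the dimensions, ReLU feed-forward experts, and top-2 routing are illustrative assumptions chosen to show why only a fraction of the parameters run per token.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions -- illustrative only, far smaller than TeleChat3's real config.
d_model = 16        # hidden size
n_experts = 8       # total experts in one MoE layer
top_k = 2           # experts activated per token

# Every expert's weights exist in memory, but only top_k experts run per token.
experts = [(rng.standard_normal((d_model, 4 * d_model)),
            rng.standard_normal((4 * d_model, d_model))) for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts))

def moe_forward(x):
    """Route one token to its top_k experts and mix their outputs."""
    logits = x @ router
    chosen = np.argsort(logits)[-top_k:]                      # highest-scoring experts
    weights = np.exp(logits[chosen])
    weights /= weights.sum()                                  # softmax over chosen only
    out = np.zeros(d_model)
    for w, i in zip(weights, chosen):
        w1, w2 = experts[i]
        out += w * (np.maximum(x @ w1, 0) @ w2)               # weighted ReLU FFN
    return out

token = rng.standard_normal(d_model)
y = moe_forward(token)

total = sum(w1.size + w2.size for w1, w2 in experts)
active = top_k * (experts[0][0].size + experts[0][1].size)
print(f"total expert params: {total}, active per token: {active}")
```

The gap between `total` and `active` is the whole point: compute per token scales with the handful of routed experts, not with the full parameter count, which is how a 105B-parameter model can run with roughly the inference cost of a 4.7B dense one.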
Training took place on computing infrastructure in Shanghai's Lingang area, with the models consuming 15 trillion tokens along the way. The entire stack runs on Huawei’s Ascend 910B AI chips paired with the MindSpore deep learning framework — another Huawei-developed project, this one open-source. China Telecom is keen to emphasize full compatibility with the broader Huawei Ascend ecosystem, including Atlas 800T A2 training servers. According to the company, Huawei’s hardware handled the “severe demands” of large-scale MoE training, though details about training efficiency, failure rates, or how this all stacks up against Nvidia hardware haven’t been shared.
China Telecom, which developed the models, was the first telco to adopt DeepSeek — but it makes sense that the company would be looking to build its own models instead.
“Thinking Mode”
One of the features in TeleChat3 is called “Thinking Mode” — a mechanism that exposes the model’s reasoning process to users. The implementation works through specific guiding symbols in dialogue templates, prompting the model to generate intermediate reasoning steps before producing a final answer. This sounds a lot like chain-of-thought prompting techniques that have become standard practice in the field, though China Telecom positions it as a distinct architectural capability.
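Mechanically, this kind of template-driven reasoning is straightforward to illustrate. The sketch below uses hypothetical `<think>`/`</think>` delimiters as stand-ins, since the specific guiding symbols in TeleChat3's dialogue templates aren't documented here; the point is only how a template can prompt intermediate reasoning and how a client can split that reasoning from the final answer.

```python
# The "<think>" / "</think>" delimiters are hypothetical stand-ins; TeleChat3's
# actual guiding symbols may differ.
THINK_OPEN, THINK_CLOSE = "<think>", "</think>"

def build_prompt(question: str) -> str:
    """Append the opening guiding symbol so the model emits reasoning first."""
    return f"User: {question}\nAssistant: {THINK_OPEN}"

def split_response(raw: str) -> tuple[str, str]:
    """Separate the traceable reasoning from the final answer."""
    if THINK_CLOSE in raw:
        reasoning, answer = raw.split(THINK_CLOSE, 1)
        return reasoning.replace(THINK_OPEN, "").strip(), answer.strip()
    return "", raw.strip()          # no delimiter: treat everything as the answer

# Simulated model output, since no model is called here.
raw = "<think>17 has no divisor in 2..4, so it is prime.</think>Yes, 17 is prime."
reasoning, answer = split_response(raw)
print(reasoning)
print(answer)
```

Exposing the reasoning span this way is what makes the process "traceable": a user or an evaluation harness can inspect the intermediate steps separately from the answer the model commits to.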
The goal is better performance on complex tasks involving logical deduction. China Telecom points to knowledge questions, mathematical reasoning, content creation, code generation, and intelligent agent applications as areas where this thinking mode should deliver advantages. The company claims performance across six core dimensions approaches “advanced international levels.” That said, no direct benchmark comparisons against GPT-5 or Claude have surfaced, so those claims deserve some skepticism until third-party evaluations emerge.
Geopolitics
There’s no way to understand the TeleChat3 release without considering the geopolitical backdrop. U.S. sanctions have cut both China Telecom and Huawei off from advanced semiconductors manufactured using American technology, which has pushed China’s tech sector to accelerate work on viable alternatives. TeleChat3 is the first public validation from a Chinese developer that large-scale MoE training can actually happen on domestic chips alone. To be clear, some of the bans on chip exports to China have been eased, but those chips still aren’t easily available, and come at a high price.
Whether this amounts to genuine technological self-sufficiency or a workaround carrying hidden costs is harder to say. Critics of China’s semiconductor push have argued that Huawei’s chips remain less efficient than Nvidia’s latest hardware, potentially demanding more silicon, more power, and more time to hit equivalent results. China Telecom hasn’t released the kind of detailed comparisons that would let anyone assess these trade-offs independently.
The release also slots into China’s broader “Artificial Intelligence+” initiative — a government-backed push to deploy AI across sectors like government services, communications, energy, and finance. TeleChat3 looks positioned as part of that effort, offering a domestically produced model that sidesteps reliance on foreign technology for sensitive applications.
In a departure from some other Chinese AI projects, China Telecom has made the model weights, inference code, and usage examples available on GitHub and ModelScope. Going open-source opens the door for academic researchers and commercial developers alike, potentially speeding adoption while also enabling some degree of independent scrutiny. Of course, it remains to be seen how much traction the models gain outside China.
