The global economy is shifting from software intelligence to embodied AI — where algorithms meet physical production
Editor’s note: This is part II of a two-part series. Read Part I: Why AI strategy is now an industrial race here.
The big idea: Software intelligence has already reorganized the global economy. Now, embodied intelligence (AI that moves, builds, and manipulates the physical world) will reorganize it again. This shift from bits to atoms changes everything: how value is created, where rents accumulate, and which infrastructure matters. Advantage will be defined not just by algorithms, but by mastering the feedback loop between machine learning and physical production.
The new production function
Embodied intelligence is not just a robot with an AI chip. It is the fusion of three technologies into a single, integrated system:
Perception & decision-making: Computer vision, sensor fusion, and real-time inference.
Actuation & control: Electric motors, hydraulics, and power electronics that convert decisions into motion.
Connectivity & coordination: Networks that allow distributed machines to synchronize behavior.
When these layers integrate tightly, they create machines that improve through physical interaction. This is a production function where capability compounds with deployment volume, not just R&D spending. The critical question is how fast this learning occurs and whether it justifies the enormous capital cost of mass deployment.
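One way to make "capability compounds with deployment volume" concrete is the classic experience-curve (Wright's law) relationship, in which unit cost or error falls by a fixed fraction each time cumulative deployed volume doubles. The sketch below is a minimal illustration; the learning rates, starting costs, and volumes are assumptions chosen for exposition, not estimates for any firm.

```python
# Illustrative experience-curve (Wright's law) sketch: unit cost falls as a
# power law of cumulative deployed volume. All numbers are assumptions.
from math import log2

def experience_curve(initial_cost: float, cumulative_units: float, learning_rate: float) -> float:
    """Unit cost after `cumulative_units` deployments.

    learning_rate = 0.80 means cost falls to 80% of its previous level
    every time cumulative volume doubles (a "20% learning curve").
    """
    exponent = log2(learning_rate)  # power-law exponent implied by the doubling rule
    return initial_cost * cumulative_units ** exponent

# Two hypothetical producers: one starts cheaper but learns slowly,
# the other starts more expensive but learns faster per doubling.
for units in (1, 10, 100, 1_000, 10_000):
    fast = experience_curve(initial_cost=100.0, cumulative_units=units, learning_rate=0.80)
    slow = experience_curve(initial_cost=80.0, cumulative_units=units, learning_rate=0.95)
    print(f"{units:>6} units   fast learner: {fast:6.1f}   slow learner: {slow:6.1f}")
```

The strategic implication of this stylized model is that over long horizons the exponent (how fast a system improves per doubling of deployment) dominates the starting position.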
This competition is already reshaping three massive industries.
Arena 1: Autonomous vehicles (the test case for learning)
The AV industry is the clearest test of simulation versus deployment learning.
Waymo’s simulation-first approach: By simulating obsessively and deploying cautiously, Waymo achieves near-flawless performance within geofenced zones. As of late 2024, Waymo operates commercial robotaxi services in San Francisco, Phoenix, and Los Angeles, with Austin announced as its next market. The challenge is generalization; expanding to a new city requires months of high-definition mapping and localized testing. The learning curve appears asymptotic: after a certain threshold, additional simulated miles yield diminishing returns in handling real-world edge cases.
Tesla’s deployment-first approach: Tesla’s Full Self-Driving (FSD) software, deployed across hundreds of thousands of vehicles in North America, generates hundreds of millions of real-world miles annually. Every disengagement, every unmapped construction zone, every unusual weather condition feeds back into model training. The fleet encounters a far larger operational space than Waymo’s geofenced deployments; the bet is that physical diversity beats virtual volume. Tesla’s approach accepts higher short-term failure rates in exchange for exposure to irreducible real-world complexity.
Baidu’s hybrid approach: Baidu’s Apollo combines HD maps for baseline safety with vision-based generalization. Government partnerships enable commercial or pilot robotaxi deployments across approximately ten Chinese cities, with scale ranging from limited pilot zones to fleets of hundreds of vehicles in cities like Wuhan and Shenzhen. This generates operational data at scale while maintaining tighter safety constraints than Tesla’s consumer-facing approach.
The empirical question: The winner will be determined by whether the edge cases of real-world driving are irreducibly complex (favoring Tesla’s exposure-driven approach) or whether simulation fidelity can eventually capture them comprehensively (favoring Waymo’s). The tens of billions of dollars being wagered will resolve this question within the next decade.
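One illustrative way to frame that bet is to contrast two stylized learning curves: a simulation-first curve that decays toward an assumed floor of unresolved edge cases, and a deployment-first curve that improves more slowly but has no floor. Everything below, including the functional forms and parameters, is an assumption for exposition, not a measurement of Waymo or Tesla.

```python
# Illustrative-only comparison of two stylized learning curves.
# Parameters and functional forms are arbitrary assumptions, not estimates.
import math

def sim_first_error(miles: float, floor: float = 0.05, scale: float = 1e6) -> float:
    # Error decays toward an assumed irreducible floor of unresolved edge cases.
    return floor + (1.0 - floor) * math.exp(-miles / scale)

def deploy_first_error(miles: float, exponent: float = 0.35) -> float:
    # Error decays as a power law of cumulative real-world miles, with no floor.
    return min(1.0, (miles / 1e4) ** -exponent)

for miles in (1e5, 1e6, 1e7, 1e8, 1e9):
    print(f"{miles:>8.0e} miles   sim-first: {sim_first_error(miles):.3f}   "
          f"deploy-first: {deploy_first_error(miles):.3f}")
```

If the floor is real, the deployment-driven curve eventually crosses below it; if simulation fidelity can push the floor toward zero, the simulation-first strategy wins. That is the empirical question in miniature.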
Arena 2: Industrial robotics (the battle for iteration speed)
Industrial robotics reveals a different dynamic: vertical integration determines iteration speed.
Traditional industrial robots from firms like FANUC, ABB, and KUKA are marvels of mechanical precision but operationally brittle. A misplaced component, unexpected lighting variation, or slightly warped part can halt production. Adding AI-driven flexibility, such as vision-based adaptation and force-sensing manipulation, requires these mechanical specialists to partner with software firms. This introduces coordination delays that slow system-level iteration.
In contrast, Chinese manufacturers like Estun and Inovance control more of the value chain. They manufacture servo motors, motion controllers, and drive systems in-house, though they still partner with AI vision specialists like Megvii and SenseTime rather than developing all software internally. This partial vertical integration is not about having superior AI; it is about faster system-level iteration. When perception algorithms improve, these firms can simultaneously optimize motor control parameters, sensor placement, and mechanical design without negotiating across organizational boundaries.
The result: Western industrial robots often exhibit higher initial performance, but partially integrated Chinese firms can improve faster through deployment-driven learning. Over a ten-year horizon, the question is whether initial capability advantage or iteration speed matters more. The answer likely varies by application.
Arena 3: Humanoid robots (the next general-purpose platform)
Humanoids represent the ultimate embodied challenge: generalized physical capability in unstructured environments.
The engineering-first model (Boston Dynamics): Atlas demonstrates stunning dexterity and represents the pinnacle of electromechanical engineering. However, Atlas has never been sold commercially; it remains a research platform. Its production cost, though undisclosed, is presumed to be extremely high given its complexity, and no clear business model has emerged despite more than a decade of development.
The manufacturing-first model (Tesla): Optimus inverts the equation. Tesla designs explicitly for manufacturability and reuses automotive supply chains (motors, batteries, power electronics, and compute platforms adapted from vehicle production); its stated goal is a unit cost under $20,000, though this target remains unproven at scale. The robots are initially far less capable than Atlas, but the model is deployment-driven: iterate through volume production, improve through field data, scale through cost reduction. Success depends on whether manufacturing learning can close the capability gap faster than engineering-first approaches can reduce costs.
The niche-application model (Figure AI, 1X Technologies, Sanctuary AI): These venture-backed firms target specific commercial applications (warehouses, retail fulfillment, elder care) where even limited capability commands premium pricing. By focusing on narrow use cases rather than general capability, they aim for faster commercialization and revenue generation.
The state-supported model (China): Dozens of startups (Unitree, Fourier Intelligence, UBTech) compete to build low-cost humanoids, leveraging Shenzhen’s dense manufacturing ecosystem for rapid hardware iteration. Local governments provide subsidies and guaranteed purchase orders to accelerate deployment, treating humanoid robotics as strategic industrial policy rather than purely market-driven development.
The compute wildcard: Humanoid robots require real-time, on-device inference for vision, balance control, and manipulation. These tasks are poorly suited to cloud processing due to latency and connectivity constraints. The race to build optimal edge AI chips for embodied systems is contested by Qualcomm (Snapdragon platforms for mobile robotics), Huawei (Ascend AI accelerators), Tesla (derivatives of its FSD and Dojo architectures), and Google (TPU Edge, though its robotics-specific commitment remains unclear). Whoever wins this battleground will control a bottleneck as economically significant as Nvidia’s GPU dominance in training infrastructure.
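To make the latency constraint concrete, here is a rough budget check comparing assumed control-loop rates against assumed cloud and on-device inference latencies. Every figure is an illustrative assumption, not a measurement of any platform or network.

```python
# Back-of-the-envelope check: which control loops can tolerate a cloud round
# trip? All numbers here are illustrative assumptions.

ASSUMED_CLOUD_RTT_MS = 60.0    # wide-area round trip plus server-side inference
ASSUMED_EDGE_LATENCY_MS = 2.0  # on-device inference on an embedded accelerator

control_loops_hz = {
    "balance / whole-body control": 250,
    "manipulation / force control": 100,
    "vision-based local planning": 30,
    "remote monitoring and logging": 10,
}

for task, rate_hz in control_loops_hz.items():
    budget_ms = 1000.0 / rate_hz  # time available per control cycle
    cloud = "ok" if ASSUMED_CLOUD_RTT_MS < budget_ms else "too slow"
    edge = "ok" if ASSUMED_EDGE_LATENCY_MS < budget_ms else "too slow"
    print(f"{task:<32} budget {budget_ms:6.1f} ms   cloud: {cloud:<9} edge: {edge}")
```

Under these assumptions, only the slowest, non-safety-critical loop tolerates a cloud round trip, which is why inference for balance, manipulation, and local planning has to live on the robot.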
Where the new value will accumulate
The shift to embodied AI redistributes economic value, creating four new chokepoints.
Chokepoint 1: Materials and energy
Physical systems require atoms, not just bits. Lithium, cobalt, copper, rare-earth elements, and silicon carbide are the fundamental inputs.
China’s state-directed industrial strategy has given it control over approximately 60-65% of global lithium refining capacity and 85-90% of rare-earth element processing, the critical midstream stages that convert raw ore into battery-grade and magnet-grade materials. While Australia and Chile dominate lithium mining, and the United States has significant rare-earth deposits, China’s two-decade investment in refining infrastructure means it controls the bottleneck between raw materials and finished components.
This translates directly into structural influence over the cost and availability of inputs for every intelligent machine. The U.S. Inflation Reduction Act attempts to rebuild domestic processing capacity, but the gap is substantial: CATL’s current annual battery production capacity (approximately 400 gigawatt-hours) vastly exceeds current U.S. manufacturing capacity and rivals the total planned U.S. capacity by 2030, even accounting for IRA-incentivized investments.
Chokepoint 2: Edge compute and custom silicon
Embodied systems cannot depend on cloud inference. Latency, connectivity reliability, power consumption, and bandwidth constraints demand local processing.
Nvidia dominates AI training infrastructure through CUDA lock-in and architectural leadership, but the race for edge inference remains wide open. The future is heterogeneous and domain-specific: different silicon optimized for different embodied tasks. Qualcomm targets mobile robotics and automotive applications; Huawei’s Ascend platforms integrate with industrial automation; Tesla designs custom inference accelerators for vehicle perception and future robotics applications; Google’s TPU Edge exists but its strategic focus on embodied systems remains ambiguous.
The architecture that wins must be power-efficient, capable of real-time inference, and co-designed with actuation and sensor systems. Whoever captures this layer will extract recurring revenue from every deployed intelligent machine. This is a rent stream potentially comparable to Nvidia’s current position in training, but distributed across vastly higher unit volumes.
Chokepoint 3: Vertical integration and system design
The most defensible advantage is not mastery of any single technology layer but the organizational capability to co-design hardware, software, and manufacturing processes together.
Tesla, BYD, and Huawei exemplify this approach. When Tesla improves its Full Self-Driving neural network architecture, it simultaneously optimizes custom chip design, sensor placement, thermal management, and vehicle dynamics. When BYD advances battery chemistry, it redesigns structural integration, cooling systems, and production tooling in parallel. These firms internalize the coordination costs between interdependent systems, collapsing decision cycles from months to weeks.
This integration advantage explains why incumbents struggle. Traditional automakers must coordinate across dozens of suppliers, each operating on different timelines with misaligned incentives. Vertically integrated firms make critical components in-house, enabling rapid iteration on system-level performance rather than component-level optimization.
Chokepoint 4: The industrial network control layer
Embodied intelligence at scale requires network infrastructure fundamentally different from consumer connectivity.
Critical industrial control applications, such as synchronized multi-robot assembly lines or autonomous vehicle platooning, require time-sensitive networking with bounded latency guarantees, typically sub-10 milliseconds for motion control and coordination tasks. Many other industrial use cases, such as remote diagnostics, video monitoring and over-the-air updates, tolerate higher latency but still demand reliability and determinism unavailable in best-effort consumer networks.
These capabilities require private 5G deployments with deterministic networking extensions (IEEE 802.1 Time-Sensitive Networking and IETF Deterministic Networking standards), not consumer-grade infrastructure.
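As a minimal sketch of what a bounded-latency guarantee means in practice, the snippet below sums hypothetical worst-case per-hop delays along a motion-control path and checks them against the sub-10-millisecond budget cited above. The hop names and figures are invented for illustration; real TSN designs derive such bounds from IEEE 802.1 scheduling and traffic-shaping analysis.

```python
# Hypothetical worst-case latency budget for a synchronized motion-control flow.
# Per-hop figures are invented for illustration, not vendor specifications.

MOTION_CONTROL_BUDGET_MS = 10.0  # bounded end-to-end target cited in the text

worst_case_hop_ms = {
    "sensor sampling and encoding": 1.0,
    "device radio access (private 5G)": 3.0,
    "TSN-scheduled switching fabric": 1.5,
    "edge-compute inference": 2.5,
    "actuator command and execution": 1.0,
}

total_ms = sum(worst_case_hop_ms.values())
headroom_ms = MOTION_CONTROL_BUDGET_MS - total_ms
print(f"worst-case end-to-end: {total_ms:.1f} ms "
      f"(budget {MOTION_CONTROL_BUDGET_MS:.1f} ms, headroom {headroom_ms:+.1f} ms)")
# A best-effort consumer network cannot bound the radio and switching terms at
# all, which is why determinism, not average throughput, is the requirement.
```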
China’s approach, driven by state-owned telecom operators (China Mobile, China Unicom, China Telecom) and equipment manufacturers like Huawei, prioritizes industrial deployments in manufacturing zones, ports, and logistics hubs. Spectrum allocation explicitly supports private industrial networks. Equipment vendors bundle AI processing, edge compute, and networking infrastructure into vertically integrated offerings.
The U.S. approach remains fragmented. Spectrum allocation prioritizes consumer applications. Industrial clients must negotiate with commercial carriers or build private networks independently. Edge compute infrastructure is distributed among hyperscalers with incompatible platforms and business models.
The country that treats machine-to-machine communication as public infrastructure, comparable to roads or electrical grids, will enable faster deployment and tighter coordination of embodied intelligence systems.
The new strategic equation
The long-term winner will be whichever system closes the loop between algorithmic learning and manufacturing learning. Currently, both major economies remain incomplete:
- The U.S. leads in algorithms but lacks manufacturing depth at scale.
- China leads in manufacturing scale but depends on external sources for frontier AI capability.
| Layer | U.S. position | China’s position | Value accrual |
| --- | --- | --- | --- |
| Foundation Models | Dominant | Catching up | Declining (commoditization trend) |
| Training Compute (GPUs) | Dominant (Nvidia) | Limited (Huawei alternatives) | High but increasingly contestable |
| Edge Inference Chips | Emerging (Qualcomm, Tesla) | Emerging (Huawei) | Future rent center |
| Battery Supply Chain | Minimal | Dominant (CATL, BYD, Gotion) | High and growing |
| System Integration | Concentrated (Tesla) | Distributed across many firms | Winner-take-most dynamics |
| Industrial Networks | Fragmented | Coordinated (state-backed) | Infrastructure rent layer |
A new playbook for leaders
Three developments could decisively shift this strategic balance:
A breakthrough in simulation-to-real transfer would massively expand the U.S. advantage in compute and modeling. If physical testing becomes largely optional and if virtual environments can capture the full complexity of real-world physics, materials behavior, and edge cases, then firms like Nvidia, Google, and leading AI labs could design, validate, and optimize embodied systems entirely computationally, then manufacture wherever costs are lowest.
A breakthrough in automated manufacturing learning would exponentially compound China’s scale advantage. If production systems become genuinely self-optimizing (for instance, using AI to continuously improve factory processes, quality control, and supply chain coordination without extensive human expertise) then volume manufacturing would generate capability improvements automatically, reducing dependence on frontier research and widening the gap with lower-volume competitors.
Persistent geopolitical fragmentation would prevent either system from achieving full integration. The U.S. would continue optimizing for high-margin, low-volume production concentrated in frontier applications. China would continue optimizing for low-margin, high-volume deployment focused on cost reduction and scale. Neither would close the learning loop completely. Both would become vulnerable to whichever third system (India, the European Union, or a coalition of aligned nations) manages to combine frontier research capability with manufacturing scale.
The policy imperative
For leaders and policymakers, this new era requires abandoning legacy frameworks that treat manufacturing as low-value-added activity suitable for offshoring.
In an embodied intelligence economy, manufacturing is where learning happens. Losing production capacity means losing the feedback loop that improves both hardware and software. Algorithms that never encounter physical deployment constraints remain untested. Manufacturing processes that never benefit from algorithmic optimization remain static.
Nations that want to lead must:
Treat robotics and automation infrastructure as strategic assets—comparable to semiconductors, telecommunications, or energy infrastructure. Industrial policy must prioritize deployment subsidies, not just research grants.
Rebuild vertical integration in critical industries—or create incentives for tight coordination across fragmented supply chains. Iteration speed depends on internalizing the coordination costs between interdependent system layers.
Reform capital allocation to reward long-horizon industrial learning over short-term financial returns. Manufacturing learning compounds over decades, not fiscal quarters. Private capital markets systematically underinvest in these trajectories without policy support.
Build deterministic, machine-centric network infrastructure—treating machine-to-machine communication as public utility, not consumer luxury. Private 5G deployments with time-sensitive networking capabilities should receive the same infrastructure investment priority as roads and power grids.
Measure and optimize for learning rate, not just innovation announcements. Track how rapidly deployed systems improve in cost, capability, and reliability over time. These metrics matter more than breakthrough publications or patent counts.
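Measuring a learning rate can be as simple as fitting the experience-curve exponent to observed unit cost (or error rate) against cumulative deployed volume. The sketch below uses invented data points purely to show the calculation.

```python
# Fit a learning rate from (cumulative units, unit cost) observations via
# ordinary least squares in log-log space. Data points are invented.
import math

observations = [(100, 52.0), (400, 40.5), (1_600, 31.8), (6_400, 24.9)]

xs = [math.log(units) for units, _ in observations]
ys = [math.log(cost) for _, cost in observations]
n = len(observations)
x_bar, y_bar = sum(xs) / n, sum(ys) / n
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))

learning_rate = 2 ** slope  # cost multiplier per doubling of cumulative volume
print(f"fitted exponent: {slope:.3f}")
print(f"learning rate: cost falls to {learning_rate:.1%} of its previous level per doubling")
```

Tracking this number across deployed robot fleets, vehicle software, and battery lines would show whether an economy is actually closing the learning loop, rather than merely announcing breakthroughs.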
The long view
The cognitive stack made AI valuable. Electrification and energy systems made it scalable. Embodied intelligence makes it inevitable: the infrastructure through which intelligence acts on the physical world.
The competition is no longer about who builds better models. It is about who learns faster from deploying them.
The country that closes the loop between invention and production (breakthrough research feeding volume manufacturing feeding breakthrough research) will capture the compounding returns that define industrial leadership for the next century.
That race has only just begun.
