Multi-instance GPU provisioning helps ‘right-size’ edge AI deployments
Edge AI moves the computational focus out of the data center and closer to the edge of the network, where the users and their data are both located. Nvidia announced Monday improvements to Fleet Command, its cloud platform to manage edge AI apps, to help streamline and manage those deployments. The new features include new just-in-time (JIT) security functions and improvements aimed at optimal GPU utilization.
Managing edge AI can quickly become unwieldy at scale, according to Nvidia. Deployments may span thousands of independent locations, each of which needs computational power at the network edge. Fleet Command provides administrators with a unified interface for edge AI lifecycle management, regardless of the distributed network topology. Early adopters included businesses in healthcare, retail, and logistics.
Administrators set up virtual locations and systems, mapped to physical devices at edge locations. Once deployed, Fleet Command promises centralized edge AI app lifecycle management. It supports over-the-air (OTA) updates, health monitoring, and visualization using remote tools. Fleet Command supports zero-touch networking, to reduce potential security exposure. The platform also sports one-touch provisioning and a simplified interface to manage app deployment and scale.
Nvidia added multi-instance GPU (MIG) provisioning to Fleet Command. MIG provisioning partitions an Nvidia GPU into several independent instances. MIG was introduced with Nvidia’s Ampere GPU architecture, the basis of its data center-friendly A100 and A30 Tensor Core GPUs, and is also supported on the Hopper-based H100. MIG enables a single supported GPU to operate as up to seven independent GPU instances, each fully isolated with its own memory, cache, and compute cores.
This “right-sizing” approach enables administrators to partition GPUs and assign multiple AI applications to operate on the same GPU more efficiently, according to Nvidia.
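To make the right-sizing idea concrete, here is a minimal Python sketch that picks the smallest MIG profile for each application's memory need and checks that the total fits on one GPU. The profile table reflects the instance sizes Nvidia publishes for the A100 40GB; the function names, the app names in the usage note, and the simple slice-counting rule are illustrative assumptions. This is not Fleet Command's provisioning logic, and real MIG placement has additional constraints that the sketch ignores.

```python
# MIG profiles for the A100 40GB: (name, compute slices, memory in GB),
# ordered from smallest to largest.
A100_40GB_PROFILES = [
    ("1g.5gb", 1, 5),
    ("2g.10gb", 2, 10),
    ("3g.20gb", 3, 20),
    ("4g.20gb", 4, 20),
    ("7g.40gb", 7, 40),
]
MAX_COMPUTE_SLICES = 7  # an A100 exposes at most seven compute slices


def smallest_profile(mem_gb):
    """Return (name, slices) of the smallest profile with enough memory."""
    for name, slices, mem in A100_40GB_PROFILES:
        if mem >= mem_gb:
            return name, slices
    raise ValueError(f"no MIG profile offers {mem_gb} GB")


def plan_partitions(apps):
    """Map each app (name -> required GB) to a profile on a single GPU.

    Hypothetical helper: it only counts compute slices and does not model
    the placement rules a real MIG configuration must satisfy.
    """
    plan = {}
    used = 0
    for app, mem_gb in apps.items():
        name, slices = smallest_profile(mem_gb)
        if used + slices > MAX_COMPUTE_SLICES:
            raise ValueError(f"GPU is full; cannot place {app!r}")
        used += slices
        plan[app] = name
    return plan
```

Calling `plan_partitions({"detector": 4, "ocr": 8, "tracker": 18})` assigns the three hypothetical apps to `1g.5gb`, `2g.10gb`, and `3g.20gb` instances, using six of the seven available compute slices on one GPU.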
Nvidia has bolstered Fleet Command’s remote management with new just-in-time (JIT) security features that, by design, reduce security exposure by limiting how long remote sessions stay open. The new features are designed to overcome the limitations of traditional network security tools such as virtual private networks (VPNs), according to Nvidia.
“First, most VPN connections do not have the ability to set time limits or restrictions. Administrators could (and often do) forget to close out a VPN session, leaving an avenue open for malicious actors,” said Nvidia’s Troy Estes in a blog post announcing the new features.
“Second, VPN connections do not easily provide the access controls needed for securely deploying and managing edge AI given the number of different partners, vendors, contractors, and other actors that might need access to parts of the deployment solution,” he added.
To that end, Fleet Command has two features designed to provide secure edge AI app deployment and lifecycle management. The first is remote system access via Fleet Command’s remote console, which Nvidia says eliminates the need for additional ports and traditional VPN connections.
“To ensure the highest security across nodes, Fleet Command infrastructure isolates each of the open nodes in separate sessions and ensures that any issues on one system do not affect other systems,” said Estes.
Fleet Command also provides remote application access, which allows web-based access to edge AI apps without requiring a manual connection to the system or VPN networking.
“Remote application access gives you visibility to the application services, providing full access to all features and functionality of the web applications running on the edge devices,” said Estes.
Administrators can configure time allotments that automatically end remote access sessions, to help streamline remote session management and lower resource utilization.
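The time-allotment idea can be illustrated with a short Python sketch of a session that refuses work once its configured lifetime has passed. The `TimedSession` class and its methods are hypothetical and are not part of Fleet Command's API; the sketch only shows the general pattern of an access session that ends on its own rather than waiting for an administrator to close it.

```python
import time


class TimedSession:
    """A remote-access session that expires after ttl_seconds (illustrative)."""

    def __init__(self, user, ttl_seconds, clock=time.monotonic):
        if ttl_seconds <= 0:
            raise ValueError("ttl_seconds must be positive")
        self.user = user
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._expires_at = clock() + ttl_seconds
        self._closed = False

    def is_active(self):
        """True until the session is closed or its time allotment runs out."""
        return not self._closed and self._clock() < self._expires_at

    def close(self):
        """End the session early."""
        self._closed = True

    def run(self, command):
        """Accept a command only while the session is still active."""
        if not self.is_active():
            raise PermissionError("session has expired or was closed")
        return f"{self.user} ran {command}"
```

Because expiry is checked on every call to `run`, a forgotten session stops accepting commands without anyone having to close it, which is the failure mode Estes describes for VPN connections.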