Later this month, Hewlett Packard Enterprise (HPE) will deliver what appears to be the first server aimed specifically at AI inference for machine learning.
Machine learning is a two-part process: training and inference. Training uses powerful GPUs from Nvidia and AMD, or other high-performance chips, to "teach" the AI system what to look for, such as recognizing objects in images.
Inference applies the trained model to new data to determine whether the input matches what the model learned. A GPU is overkill for this task; a much lower-power chip can do the job.
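Conceptually, inference is just a forward pass through a model whose weights were frozen at the end of training, which is why it needs so much less compute. A minimal sketch in plain Python, with invented weights standing in for the output of a training run:

```python
import math

# Hypothetical weights produced by an earlier training run; at inference
# time they are frozen -- no gradients, no weight updates.
WEIGHTS = [0.8, -0.4, 0.3]
BIAS = 0.1

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def infer(features: list[float]) -> bool:
    """Single forward pass: one dot product and one activation."""
    score = sum(w * f for w, f in zip(WEIGHTS, features)) + BIAS
    return sigmoid(score) > 0.5  # does the input match the trained model?

print(infer([1.0, 0.2, 0.5]))  # a handful of multiply-adds, no GPU required
```

A real vision or language model runs millions of these multiply-adds per input, but the shape of the work is the same, which is what dedicated inference silicon exploits.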
Enter Qualcomm's Cloud AI 100 chip, designed for artificial intelligence at the edge. It has up to 16 "AI cores" and supports the FP16, INT8, INT16, and FP32 data formats, all of which are used in inference. These aren't custom Arm processors; they're entirely new SoCs designed specifically for inference.
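The reduced-precision formats matter because trained FP32 weights can usually be quantized for inference with little accuracy loss, cutting memory traffic and power. A rough sketch of symmetric INT8 quantization — the scheme and values here are illustrative, not Qualcomm's actual pipeline:

```python
def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    """Map FP32 weights onto the signed INT8 range [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127.0
    return [round(w / scale) for w in weights], scale

def dequantize(q: list[int], scale: float) -> list[float]:
    return [v * scale for v in q]

weights = [0.82, -0.41, 0.05, 1.27]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Each weight now fits in one byte instead of four, at the cost of a
# small rounding error bounded by half the quantization step.
print(q)
print(max(abs(a - b) for a, b in zip(weights, restored)))
```

Hardware that supports INT8 natively, as the Cloud AI 100 does, can then execute these narrow integer operations far more cheaply than full FP32 arithmetic.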
The AI 100 is part of the HPE Edgeline EL8000 edge gateway system, which integrates compute, storage, and management into a single edge device. Inference workloads are often large in scale and require low latency and high throughput to deliver real-time results.
The HPE Edgeline EL8000 is a 5U system that supports up to four independent server blades grouped together using dual-redundant in-chassis switches. Its smaller sibling, the HPE Edgeline EL8000t, is a 2U design supporting two independent server blades.
In addition to performance, the Cloud AI 100 offers low power consumption. It comes in two form factors: a PCI Express card and dual M.2 chips mounted on the motherboard. The PCIe card has a power envelope of 75 watts, while the M.2 units draw either 15 or 25 watts. For comparison, a typical server CPU consumes more than 200 watts and a GPU more than 400 watts.
Qualcomm says the Cloud AI 100 supports all major industry-standard model formats, including ONNX, TensorFlow, PyTorch, and Caffe. Pre-trained models can be imported, then compiled and optimized for deployment, and Qualcomm provides a set of tools for porting and preparing models, including support for custom operations.
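"Compiled and optimized for deployment" typically means passes such as constant folding and operator fusion run over the model graph before it reaches the accelerator. A toy illustration of one such pass — fusing a chain of scale and shift operations into a single affine step. The list-of-ops representation is invented for this sketch and is not Qualcomm's toolchain:

```python
# A model "graph" as a simple list of (op, constant) steps -- a stand-in
# for the intermediate representation a deployment compiler works on.
graph = [("scale", 2.0), ("shift", 3.0), ("scale", 0.5)]

def fuse(graph):
    """Collapse a chain of scale/shift ops into one affine op a*x + b."""
    a, b = 1.0, 0.0
    for op, c in graph:
        if op == "scale":
            a, b = a * c, b * c
        elif op == "shift":
            b = b + c
    return [("affine", (a, b))]

def run(graph, x):
    """Execute a graph on a scalar input."""
    for op, c in graph:
        if op == "scale":
            x *= c
        elif op == "shift":
            x += c
        elif op == "affine":
            a, b = c
            x = a * x + b
    return x

optimized = fuse(graph)
print(optimized)                             # three ops collapsed into one
print(run(graph, 4.0), run(optimized, 4.0))  # same result either way
```

Fewer, larger operations mean fewer trips through memory per inference, which is where much of the latency and power savings on edge accelerators come from.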
Qualcomm says the Cloud AI 100 targets manufacturing and industrial customers, as well as those with advanced AI requirements. Use cases for AI inference computing at the edge include computer vision and natural language processing (NLP) workloads.
For computer vision, use cases include quality control and quality assurance in manufacturing, object detection, video surveillance, and loss prevention. For NLP, they include programming code generation, smart-assistant operations, and language translation.
Edgeline servers will be available for purchase or rental through HPE GreenLake later this month.
Copyright © 2022 IDG Communications, Inc.