Inspur Information Leads MLPerf™ v2.0 Inference Across All Data Center Closed Divisions



Significant performance gains of 31.5%, 28.5% and 21.3% were observed in image classification, speech recognition and natural language processing tasks, respectively

SAN JOSE, Calif.–(BUSINESS WIRE)–MLCommons™, a well-known open engineering consortium, has released the results of MLPerf™ Inference v2.0, the leading suite of AI benchmarks. Inspur AI servers set records in all 16 tasks in the closed data center division, exhibiting the best performance in real-world AI application scenarios.

This press release features multimedia. View the full release here: https://www.businesswire.com/news/home/20220408005153/en/

Inference performance improvements from MLPerf v1.1 to v2.0 (Graphic: Business Wire)

MLPerf™ was created by Turing Award winner David Patterson and top academic institutions. It is the world’s leading AI performance benchmark, hosting AI inference and training tests twice a year to track and evaluate the rapidly growing development of AI. MLPerf™ has two divisions: Closed and Open. The Closed division provides an apples-to-apples comparison between vendors as it requires the use of the same model and optimizer, making it a great benchmark.

MLPerf™’s first AI inference benchmark in 2022 aimed to examine the inference speed and capabilities of computer systems from different manufacturers in various AI tasks. The closed division for the data center category is the most competitive division. A total of 926 results were submitted, double the previous benchmark’s submissions.

Inspur AI servers set new records for inference performance

The MLPerf™ AI inference benchmark covers six widely used AI tasks: image classification (ResNet50), natural language processing (BERT), speech recognition (RNN-T), object detection (SSD-ResNet34), medical image segmentation (3D-UNet) and recommendation (DLRM). MLPerf™ benchmarks require an accuracy of over 99% of the original model. For natural language processing, medical image segmentation, and recommendation, two accuracy targets of 99% and 99.9% are defined to examine the impact of a higher accuracy target on computational performance.

To better match real-world usage, MLPerf™ inference tests have two scenarios required for the data center category: offline and server. Offline scenarios mean that all data required for the task is available locally. The server scenario has data delivered online in bursts when requested.
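The difference between the two scenarios is essentially a difference in traffic pattern. The sketch below illustrates it in simplified form; it is not the real MLPerf LoadGen API, just an illustration of how offline queries are all available at once while server queries arrive over time at a target rate:

```python
# Simplified illustration of the two MLPerf data-center scenarios.
# This is NOT the real LoadGen API -- only a sketch of the traffic patterns.
import random

def offline_queries(n_samples):
    """Offline: the whole dataset is available up front and issued as one
    batch; the reported metric is throughput (samples/s)."""
    return [0.0] * n_samples  # every query is available at time zero

def server_queries(n_samples, qps, seed=0):
    """Server: queries arrive one at a time following a Poisson process at a
    target queries-per-second rate; each must also meet a latency bound."""
    rng = random.Random(seed)
    t, arrivals = 0.0, []
    for _ in range(n_samples):
        t += rng.expovariate(qps)  # exponential inter-arrival times
        arrivals.append(t)
    return arrivals

print(offline_queries(4))        # all queries at t = 0.0
print(server_queries(4, qps=2))  # strictly increasing arrival times
```

In the server scenario the system is therefore judged on whether it can sustain the target arrival rate within the latency bound, not just on raw batch throughput.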

Inspur Information Results in MLPerf™ Inference v2.0

Submitter: Inspur Information — Division: Data Center, Closed

| Task                       | Model        | Result    | Units     | Accuracy | Scenario |
|----------------------------|--------------|-----------|-----------|----------|----------|
| Image classification       | ResNet50     | 449,856   | samples/s | 99%      | Offline  |
|                            |              | 400,583   | queries/s | 99%      | Server   |
| Natural language processing| BERT         | 38,776.7  | samples/s | 99%      | Offline  |
|                            |              | 35,487.4  | queries/s | 99%      | Server   |
|                            |              | 19,370.4  | samples/s | 99.9%    | Offline  |
|                            |              | 16,790.5  | queries/s | 99.9%    | Server   |
| Speech recognition         | RNN-T        | 155,811   | samples/s | 99%      | Offline  |
|                            |              | 136,498   | queries/s | 99%      | Server   |
| Object detection (large)   | SSD-ResNet34 | 11,081.9  | samples/s | 99%      | Offline  |
|                            |              | 10,893.4  | queries/s | 99%      | Server   |
| Medical image segmentation | 3D-UNet      | 36.25     | samples/s | 99%      | Offline  |
|                            |              | 36.25     | samples/s | 99.9%    | Offline  |
| Recommendation             | DLRM         | 2,645,980 | samples/s | 99%      | Offline  |
|                            |              | 2,683,620 | queries/s | 99%      | Server   |
|                            |              | 2,645,980 | samples/s | 99.9%    | Offline  |
|                            |              | 2,683,620 | queries/s | 99.9%    | Server   |

The Inspur AI server set a performance record by processing 449,856 frames per second in the ResNet50 model task, equivalent to classifying the 1.28 million images in the ImageNet dataset in just 2.8 seconds. In the 3D-UNet model task, Inspur set a new record of processing 36.25 medical images per second, equivalent to segmenting the 207 3D medical images in the KiTS19 dataset in 6 seconds. In the SSD-ResNet34 model task, Inspur set a new record by detecting and identifying target objects in 11,081.9 frames per second. In the BERT model task, Inspur set a performance record by answering an average of 38,776.7 questions per second. In the RNN-T model task, Inspur set a record of 155,811 speech recognition conversions per second on average, and in the DLRM model task, Inspur set the best record of 2,645,980 click predictions per second on average.
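The "equivalent to" figures above follow from simple arithmetic on the throughput results. A quick sanity check (dataset sizes as cited in the paragraph above):

```python
# Sanity-check the dataset-completion times implied by the throughput records.
imagenet_images = 1_280_000   # ImageNet images cited above
resnet50_rate = 449_856       # ResNet50 samples/s, Offline
kits19_images = 207           # KiTS19 3D medical images cited above
unet_rate = 36.25             # 3D-UNet samples/s, Offline

print(imagenet_images / resnet50_rate)  # ≈ 2.85 s, i.e. roughly 2.8 seconds
print(kits19_images / unet_rate)        # ≈ 5.7 s, i.e. roughly 6 seconds
```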

In the Edge Inference category, Inspur’s AI servers designed for Edge scenarios also performed well. NE5260M5, NF5488A5 and NF5688M6 won 11 titles out of 17 tasks in the Closed Division.

With the continuous development of AI applications, faster inference processing brings higher efficiency and greater AI application capabilities, accelerating the transformation to smart industries. Compared to MLPerf™ Inference v1.1, Inspur AI servers improved performance on image classification, speech recognition and natural language processing tasks by 31.5%, 28.5% and 21.3%, respectively. These results mean that Inspur AI servers can perform various AI tasks more efficiently and quickly in scenarios such as autonomous driving, voice conferencing, smart Q&A, and smart medical care.

Full Stack Optimization Drives Continuous AI Performance Improvement

The outstanding performance of Inspur AI servers in the MLPerf™ benchmarks is due to Inspur Information’s excellent system design capabilities and full stack optimization capabilities in AI computing systems.

The Inspur AI NF5468M6J server can support 12 NVIDIA A100 Tensor Core GPUs with a layered, scalable computing architecture, and set 12 MLPerf™ records. Inspur Information also offers servers supporting 8 x 500W NVIDIA A100 GPUs with liquid or air cooling. Among the high-end mainstream systems in this benchmark adopting 8 NVIDIA GPUs with NVLink, Inspur AI servers performed best in 14 of the 16 tasks in the data center category. Among them, the NF5488A5 supports 8 A100 GPUs with 3rd-generation NVLink and 2 AMD Milan CPUs in a 4U space. The NF5688M6 is an AI server with extreme scalability optimized for hyperscalers; it supports 8 NVIDIA A100 GPUs and 2 Intel Ice Lake CPUs, with up to 13 PCIe Gen4 IO expansion cards.

In the Edge Inference category, the NE5260M5 comes with optimized signaling and power systems, and offers broad compatibility with high-performance processors and a wide range of AI accelerator cards. It features a shock-absorbing and noise-reducing design, and has undergone rigorous reliability testing. With a chassis depth of 430mm, almost half the depth of traditional servers, it is deployable even in space-constrained edge computing scenarios.

Inspur AI servers optimize the data path between CPU and GPU through precise calibration and comprehensive optimization of CPU and GPU hardware. At the software level, by improving round-robin scheduling across multiple GPUs based on GPU topology, the performance of multiple GPUs can be scaled almost linearly over that of a single GPU. For deep learning, based on the computational characteristics of the NVIDIA GPU Tensor Core unit, model performance is optimized through a channel compression algorithm developed by Inspur.
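The multi-GPU scheduling idea above can be sketched in a few lines. The dispatch policy and names here are purely illustrative, not Inspur's actual implementation: requests are handed to GPUs in rotation, so with a balanced topology the aggregate throughput grows nearly linearly with GPU count.

```python
# Hypothetical sketch of round-robin request dispatch across GPUs.
# Device indices and the policy are illustrative only.
from itertools import cycle

def round_robin_dispatch(requests, n_gpus):
    """Assign each incoming request to the next GPU in turn, spreading load
    evenly so throughput scales almost linearly with the number of GPUs."""
    gpus = cycle(range(n_gpus))
    return [(req, next(gpus)) for req in requests]

assignments = round_robin_dispatch(["q0", "q1", "q2", "q3", "q4"], n_gpus=4)
print(assignments)  # [('q0', 0), ('q1', 1), ('q2', 2), ('q3', 3), ('q4', 0)]
```

A topology-aware scheduler would additionally weight this rotation by interconnect locality (e.g. preferring GPUs on the same NVLink or PCIe switch), which is the refinement the paragraph above alludes to.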

To view the full results of MLPerf™ Inference v2.0, please visit:

https://mlcommons.org/en/inference-datacenter-20/

https://mlcommons.org/en/inference-edge-20/

About Inspur Information

Inspur Information is a leading provider of data center infrastructure, cloud computing and artificial intelligence solutions. It is the 2nd largest server manufacturer in the world. Through engineering and innovation, Inspur Information delivers industry-leading hardware design and broad product offerings to address important technology sectors such as open computing, cloud data center, AI and deep learning. Performance-optimized and purpose-built, our world-class solutions enable customers to tackle specific workloads and real-world challenges. To learn more, visit https://www.inspursystems.com.

For more information:

Fiona Liu

PR Manager

Inspur Information

[email protected]

Viviane Kelly

Interprose for Inspur Information

+1 703.509.5412

[email protected]

Source: Inspur Information
