
Optimizing Infrastructure For Neural Recommendation At Scale

  • February 19, 2020
  • relay

We are sharing an in-depth characterization and analysis of the infrastructures used to deliver personalized results with deep neural network-based (DNN) recommendation at scale. Although DNNs are often used to generate search results, provide content suggestions, and power other common internet services, relatively little research attention has been devoted to optimizing the system infrastructure that serves such recommendations at scale. In addition to sharing insights about how this important class of neural recommendation models performs at production scale, we have released the open source workloads and related performance metrics we used, to help other researchers and engineers evaluate their own DNNs.

Notable findings from this analysis include the following:

  • System heterogeneity leads to wide variations in inference latency across three generations of Intel servers.

  • Batching and colocation of recommendation inference can drastically improve latency-bounded throughput.

  • Heterogeneity in recommendation model architectures necessitates different system optimization strategies.
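The batching finding can be illustrated with a toy cost model (all constants here are illustrative assumptions, not measurements from the paper): a fixed per-query overhead is amortized across the batch, so larger batches raise throughput until the latency budget is exhausted.

```python
# Toy model of latency-bounded throughput under batching.
# All constants are illustrative assumptions, not numbers from the paper.

FIXED_OVERHEAD_MS = 2.0   # per-query cost (dispatch, weight loads, etc.)
PER_ITEM_MS = 0.05        # marginal cost of each additional batched item
LATENCY_BUDGET_MS = 10.0  # SLA: each query must finish within this budget

def latency_ms(batch_size: int) -> float:
    """Latency of one batched query under the toy cost model."""
    return FIXED_OVERHEAD_MS + PER_ITEM_MS * batch_size

def throughput(batch_size: int) -> float:
    """Items served per millisecond, if the batch meets the latency budget."""
    lat = latency_ms(batch_size)
    if lat > LATENCY_BUDGET_MS:
        return 0.0  # violates the SLA
    return batch_size / lat

# Largest batch that still meets the latency budget maximizes throughput:
best = max(range(1, 1000), key=throughput)
print(best, throughput(best))
```

Under these made-up constants, batching improves latency-bounded throughput by roughly an order of magnitude over serving one item per query; the real trade-off curves for production models are what the paper characterizes.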

How it works:

To analyze the performance of production-scale recommendation models, we first identified quantitative metrics to evaluate recommendation workloads. We then designed a set of synthetic recommendation models to characterize inference performance on a variety of server-class Intel CPU systems. Our results highlight the unique challenges posed by efforts to increase the efficiency of DNNs used for recommendations, compared with the techniques used to optimize traditional convolutional neural network and recurrent neural network architectures.

For example, we found that the three generations of Intel servers commonly used in data centers (the Haswell, Broadwell, and Skylake architectures) handle inference latency differently when serving production-scale recommendation models. Skylake systems make it easier to accelerate compute-intensive recommendation models, and their exclusive cache hierarchy is less susceptible to latency degradation when multiple models are co-located on the same system. Given the throughput improvement from colocating models, recognizing these characteristics can help data centers schedule recommendation inference queries and improve infrastructure efficiency.

[Figure] Execution flow of deep learning recommendation inference: inputs to the model (N) are a collection of continuous (dense) and categorical (sparse) features. Sparse features, unique to recommendation models, are transformed into a dense representation using embedding tables (shown in blue). The number and size of embedding tables, the number of sparse feature (ID) lookups per table, and the depth and width of the Bottom-FC and Top-FC layers vary by use case.

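As a rough illustration of that execution flow, here is a minimal NumPy sketch of a DLRM-style forward pass. All dimensions, the sum-pooling, and the concatenation-based feature interaction are simplifying assumptions for illustration, not the production model.

```python
import numpy as np

rng = np.random.default_rng(0)

EMB_DIM = 4
# Two embedding tables mapping categorical IDs to dense vectors.
tables = [rng.normal(size=(100, EMB_DIM)) for _ in range(2)]

def mlp(x, dims):
    # Tiny fully connected stack with ReLU; weights are random placeholders.
    for d in dims:
        w = rng.normal(size=(x.shape[-1], d))
        x = np.maximum(x @ w, 0.0)
    return x

def forward(dense, sparse_ids):
    # dense: (batch, n_dense) continuous features.
    # sparse_ids: one (batch, n_lookups) integer array per embedding table.
    bottom = mlp(dense, [8, EMB_DIM])              # Bottom-FC over dense features
    pooled = [t[ids].sum(axis=1)                   # gather rows, then sum-pool
              for t, ids in zip(tables, sparse_ids)]
    x = np.concatenate([bottom] + pooled, axis=1)  # simplified feature interaction
    return mlp(x, [8, 1])                          # Top-FC yields one score per sample

batch = 3
dense = rng.normal(size=(batch, 5))
sparse_ids = [rng.integers(0, 100, size=(batch, 6)) for _ in tables]
scores = forward(dense, sparse_ids)
print(scores.shape)
```

The `t[ids]` gather is the sparse embedding lookup shown in blue in the figure; in production those tables hold millions of rows, which is where the memory-capacity and irregular-access behavior discussed below comes from.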
More generally, we showed that DNN-based recommendation systems differ from traditional neural networks in several important ways:
  • High-quality personalized recommendation requires much larger storage capacity.

  • At-scale recommendation inference execution produces irregular memory accesses.

  • The diversity of recommendation use cases in production can produce a diverse set of operator-level performance bottlenecks.

These resource requirement characteristics stem in part from the prevalence of both sparse and dense features: when ranking videos, for example, each individual user provides sparse input, interacting with only a handful of the thousands or even millions of videos available on a given platform. Engineers need to consider this wide range of performance and resource requirement characteristics when accelerating DNN-based recommendation models, including when designing and optimizing recommendation inference hardware. Additional details of the system-level analysis and architectural insights are available in the paper linked below.
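A back-of-the-envelope sketch makes the storage and access-pattern point concrete (all sizes here are illustrative assumptions, not figures from the paper): embedding tables dominate model size, yet each query reads only a scattered handful of their rows.

```python
# Back-of-the-envelope: embedding tables dominate storage, while each query
# touches only a few scattered rows. All numbers are illustrative assumptions.

BYTES_PER_FLOAT = 4
rows, dim = 10_000_000, 64           # one table: 10M items, 64-dim embeddings
table_bytes = rows * dim * BYTES_PER_FLOAT
print(f"one embedding table: {table_bytes / 2**30:.1f} GiB")

mlp_params = 512 * 256 + 256 * 64    # a couple of dense FC layers, by comparison
print(f"dense FC layers: {mlp_params * BYTES_PER_FLOAT / 2**20:.2f} MiB")

lookups_per_query = 20               # a user interacts with only a few items
touched_bytes = lookups_per_query * dim * BYTES_PER_FLOAT
print(f"bytes touched per query: {touched_bytes}")
```

Even with these modest assumptions, a single table is gigabytes while a query reads only kilobytes from effectively random rows, which is why the workload is capacity-bound and produces irregular, cache-unfriendly memory accesses rather than the dense, regular compute of CNNs and RNNs.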

Why it matters:

Improving infrastructure efficiency for at-scale recommendation inference should contribute to faster and more accurate personalized recommendations for videos, products, and other ranked results. The insights from this analysis can also motivate broader system and architecture optimizations for at-scale recommendation.

This work builds on Facebook's previous release of an advanced deep learning recommendation model, which enables algorithmic experimentation and benchmarking for recommendation systems. We hope that sharing our results and open source synthetic models will shed further light on optimization opportunities for next-generation AI systems and help accelerate innovation across the AI community in the design and modeling of neural recommendation systems.

Read the full paper:

The Architectural Implications of Facebook’s DNN-based Personalized Recommendation


Carole-Jean Wu

Source: Facebook AI Blog


