Optimizing Infrastructure For Neural Recommendation At Scale
We are sharing an in-depth characterization and analysis of the infrastructure used to deliver personalized results with deep neural network (DNN)-based recommendation at scale. Although DNNs are widely used to generate search results, suggest content, and power other common internet services, relatively little research has focused on optimizing the system infrastructure that serves these recommendations at scale. In addition to sharing insights about how this important class of neural recommendation models performs at production scale, we’ve also released the open source workloads and related performance metrics that we used, to help other researchers and engineers evaluate…