
How to Deploy Models at Scale with GPUs

Posted Oct 19
# TransformX 2022
# Breakout Session
SPEAKER
Varun Mohan
CEO and Co-Founder @ Exafunction

Varun Mohan is the CEO and Co-Founder of Exafunction, which builds infrastructure to optimize deep learning workloads. Previously, Varun was a technical lead and senior manager at Nuro, where he saw the power of deep learning and the large challenges of productionizing it at scale. Before that, he received a B.S. and a Master's in Computer Science from MIT.

SUMMARY

Graphics Processing Units (GPUs) are used to train artificial intelligence and deep learning models and to run them for ML inference. However, deploying models at scale on GPUs creates several challenges for ML practitioners. In this session, Varun Mohan, CEO and Co-Founder of Exafunction, shared the best practices he has learned for building an architecture that makes efficient use of GPUs for deep learning workloads. Mohan explained the advantages of using GPUs for ML deployment, as well as the cases where they offer fewer benefits. He also discussed cost, memory, and other factors in the GPU-vs-CPU equation, the inefficiencies that can arise in different scenarios, and issues related to network bandwidth and egress. To address these problems, Mohan offered techniques that include batching workloads and optimizing your models. Finally, he discussed how some companies are using GPUs to run their recommendation and serving systems.
