Scale Events

Top Tips from Netflix, NVIDIA, and Meta on Large Language Models (LLMs)

Posted Oct 21, 2022 | Views 2.3K
# TransformX 2022
# Expert Panel
# Large Language Models (LLMs)
# Natural Language Processing (NLP)
SPEAKERS
Susan Zhang
Research Engineer @ Meta AI

Susan Zhang is a Research Engineer at Meta AI working on developing large language models. She has over 10 years of experience building software systems tackling a wide variety of domains at scale, ranging from quantum simulations at Los Alamos National Laboratory to reinforcement learning systems at OpenAI.

Faisal Siddiqi
Director Machine Learning Platform @ Netflix

Faisal Siddiqi built and leads the Machine Learning Platform org at Netflix, responsible for accelerating the impact of all ML practices, including personalized recommendations, growth, studio, content, streaming, and systems applications. The ML Platform provides opinionated, flexible, and scalable solutions that help research scientists and engineers be productive at every stage of the ML lifecycle, from early exploration through productization. Prior to joining Netflix in 2015, Faisal led the initial platform engineering team at Conviva, a streaming-media quality-of-experience startup. Earlier in his career he worked on network performance and distributed systems, with an entrepreneurial stint at a grad-school dorm startup. Faisal holds an MSEE from Stanford University. Originally from India, he has called the SF Bay Area home since 1998. In his free time he likes to paint oil landscapes, but stands no chance against DALL-E 2, Midjourney, and their ilk.

Bryan Catanzaro
Vice President, Applied Deep Learning Research @ NVIDIA

Bryan Catanzaro is Vice President of Applied Deep Learning Research at NVIDIA, where he leads a team of AI researchers working on chip design, audio and speech, language modeling, and graphics and vision, with the goal of finding practical new ways to use AI in NVIDIA’s products and workflows. DLSS, Megatron, cuDNN, Pascaline, WaveGlow, and DeepSpeech are some of the projects he has helped create. Bryan received his PhD in EECS from the University of California, Berkeley.

Erhan Bas
Machine Learning Engineer @ Scale AI

Erhan Bas is an ML engineer at Scale AI. Before joining Scale, he was a tech lead in the AWS AI Computer Vision group, working on multimodal learning and biomedical computer vision. Before AWS, he was a computer scientist at HHMI, where he built large-scale computational pipelines for bioscience applications. He received his B.Sc. in Electrical and Electronics Engineering and in Physics from Middle East Technical University, Turkey, in 2005, his M.Sc. in Electrical and Computer Engineering from Koc University, Turkey, in 2007, and his Ph.D. in Electrical Engineering from Northeastern University, Boston, in 2011. His interests include machine learning and computer vision, with applications in multimodal search and bioscience. He has served on the IEEE Bio Imaging and Signal Processing Technical Committee and is a member of IEEE, EMBS, and Eta Kappa Nu (HKN).

Elliot Branson
Director of Machine Learning and Engineering @ Scale AI

Elliot Branson is the Director of Machine Learning and Engineering at Scale AI, where he leads the Machine Learning, Platform, Federal, 3D, and Mapping products. Previously, he helped create the Cruise Automation self-driving car and served as the company’s first Head of Perception and AI. His interest in robotics and AI began with national and international robotics competitions in high school and continued through college and grad school, where he published work on field robotics, localization, computer vision, and AI systems. His earlier work includes stints on the Google Project Tango AR platform and Air Force MURI research programs.

SUMMARY

Join this spirited discussion on how best to train, use, and fine-tune foundation models in the enterprise. Elliot Branson, Director of Machine Learning & Engineering at Scale AI, moderates a panel of industry experts from AWS, NVIDIA, Netflix, and Meta.

Erhan Bas, formerly an Applied Scientist at Amazon Web Services and now at Scale, shares his perspective on training large language models (LLMs). Bryan Catanzaro, Vice President of Applied Deep Learning Research at NVIDIA, shares how the GPU manufacturer is targeting foundation models as a core workflow for enterprise customers. Faisal Siddiqi, Director of Machine Learning Platform at Netflix, shares how his company uses foundation models to analyze highly produced video content. Susan Zhang, Researcher at Facebook AI Research (FAIR), a division of Meta, shares insights from training and fine-tuning Meta’s OPT model.

Members of the panel share how they scale training across multiple nodes, avoid overfitting by catching data-quality issues early, and address bias in models trained on large internet-based text corpora. The panelists also discuss the compute cost of training an LLM from scratch, ways to avoid costly and tedious hyperparameter optimization, strategies for mitigating training-failure risk in clusters with thousands of GPUs (including sticking to synchronous gradient descent), and the need for extremely fast storage devices to save and load training checkpoints.
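The synchronous pattern the panelists mention can be sketched in miniature: each worker computes a gradient on its data shard, the gradients are averaged (the all-reduce step that keeps every replica in lockstep), and model state is periodically checkpointed to storage. This is an illustrative single-process simulation, not a distributed implementation; the helper names (`worker_gradient`, `all_reduce_mean`, `save_checkpoint`) are invented for the sketch, and real LLM training relies on frameworks such as Megatron-LM or PyTorch DistributedDataParallel.

```python
# Minimal sketch of synchronous data-parallel SGD with checkpointing.
# All names are illustrative; this simulates 4 "workers" in one process.
import json
import os
import tempfile

def worker_gradient(params, shard):
    # Each worker computes the gradient of a squared error on its shard.
    # Loss per example: (w*x - y)^2  ->  dLoss/dw = 2*x*(w*x - y)
    w = params["w"]
    return sum(2 * x * (w * x - y) for x, y in shard) / len(shard)

def all_reduce_mean(grads):
    # Synchronous step: every worker waits for the average of all
    # gradients, so all replicas apply the identical update.
    return sum(grads) / len(grads)

def save_checkpoint(path, params, step):
    # Checkpoints let a long run resume after a node failure; at LLM
    # scale this is why very fast storage matters.
    with open(path, "w") as f:
        json.dump({"params": params, "step": step}, f)

def load_checkpoint(path):
    with open(path) as f:
        return json.load(f)

# Toy data: y = 3x, sharded across 4 simulated workers.
shards = [[(x, 3.0 * x)] for x in (1.0, 2.0, 3.0, 4.0)]
params = {"w": 0.0}
lr = 0.02
ckpt = os.path.join(tempfile.gettempdir(), "ckpt_sketch.json")

for step in range(200):
    grads = [worker_gradient(params, s) for s in shards]   # parallel in reality
    params["w"] -= lr * all_reduce_mean(grads)             # identical everywhere
    if step % 50 == 0:
        save_checkpoint(ckpt, params, step)                # periodic save

restored = load_checkpoint(ckpt)                           # resume point
```

Because every replica applies the same averaged gradient, the workers never drift apart; that determinism is what makes synchronous gradient descent attractive for fault recovery, at the cost of waiting on the slowest worker each step.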

