Scale Events

Dataset Management: Using the Right Tools for the Job

Posted Oct 21, 2022 | Views 1.6K
# TransformX 2022
# Autonomous Vehicles
# Expert Panel
# Robotics
# Computer Vision
speakers
Mostafa Rohaninejad
Founding Research Scientist @ Covariant

Mostafa completed his undergraduate degrees at the University of California, Berkeley, where he is now pursuing his Ph.D., and he joined BAIR in 2016. In 2018, he joined the founding team at Covariant. He has worked in both academia and industry on meta-learning, generative models, deep unsupervised and reinforcement learning, and computer vision. He has published at top-tier conferences such as ICLR and ICML, and is part of the core team that built the AI stack at Covariant from the ground up.

Ariana Eisenstein
CTO @ Pickle Robot

Eisenstein graduated from MIT in 2016 with a focus on optimized machine vision. She is the CTO of Pickle Robot and runs the engineering team. Prior to Pickle, she worked at LeafLabs developing neurotechnologies for FAANG companies and the NIH.

Louis Tremblay
AI/ML Engineering Leader @ Resideo

Louis Tremblay is an Engineering Leader at Resideo, where he works on creating ML solutions to make homes safer and greener. He gets most excited by working across technology stacks, from hardware to software, to solve challenging problems. He previously worked at FLIR, where he led an ML/CV/Innovation team that used thermal imagery to create multiple camera products addressing problems ranging from driver-assistance systems and fever detectors to maritime navigation and traffic control. He is also currently wondering how to fit a dog into his life.

Jack Guo
Head of Autonomy Platform @ Nuro

Jack Guo is the Head of Autonomy Platform at Nuro, a robotics company that aims to better everyday life through robotics, with its first application in autonomous goods delivery. Autonomy Platform consists of the simulation, evaluation, data platform, data science, ML infrastructure, and ground truth engineering teams, along with the data labeling operations team. Its mission is to build tools, infrastructure, and services that accelerate the development of autonomy. Before joining Nuro, Jack managed the machine learning infrastructure team at Twitter, powering key ML applications like ads prediction and feed ranking. Jack earned a Bachelor's degree from Tsinghua University and a Master's degree in Electrical Engineering from Stanford University.

Russell Kaplan
Director of Engineering @ Scale AI

Russell Kaplan leads Scale Nucleus, the data management platform for machine learning teams. He was previously founder and CEO of Helia AI, a computer vision startup for real-time video understanding, which Scale acquired in 2020. Before that, Russell was a senior machine learning scientist on Tesla's Autopilot team, and he received his M.S. and B.S. from Stanford University, where he was a researcher in the Stanford Vision Lab advised by Fei-Fei Li.

SUMMARY

Machine learning leaders from robotics (Covariant), home automation (Resideo), autonomous delivery (Nuro), and warehouse automation (Pickle Robot) sit down with Russell Kaplan, Scale’s Director of Engineering, to share their approaches to dataset management. Pickle Robot CTO Ariana Eisenstein will share how she balances the amount of data drawn from different sources, such as synthetic and public open datasets, against real-world data when building training datasets. Mostafa Rohaninejad, Founding Research Scientist at Covariant, will describe how the object “picking” problem requires synthetic data for unsafe scenarios and how he also incorporates structured and time-series data; in his view, supervised and unsupervised learning should go hand in hand. Jack Guo, Head of Autonomy Platform at Nuro, will explain why it’s essential to have tools and mechanisms that automatically highlight recorded data that deviates from the norm, especially if it was captured in a new location. Like Rohaninejad, he will stress the importance of simulation as a component of successful reinforcement learning. Louis Tremblay, AI/ML Engineering Leader at Resideo, will explain how security cameras in the home represent an even more unbounded environment than warehouses do.

The group will also discuss why maintaining separate datasets and training pipelines for different customers is costly and incurs additional technical debt over time. They will also cover the importance of testing with fault-tolerant customers before deploying to the wider fleet. Scale’s Kaplan will share how, in his experience, when metrics and anecdotes seem at odds, it makes sense to rethink the metrics and establish new ones.
