
AI + Synthetic Data = Smarter Robots

Posted Oct 06, 2021 | Views 850
# TransformX 2021
# Keynote
SPEAKER
Dr. Danny Lange
Senior Vice President of Artificial Intelligence and Machine Learning @ Unity Technologies

Dr. Danny Lange is Senior Vice President of Artificial Intelligence and Machine Learning at Unity Technologies. As head of machine learning at Unity, Lange leads the company’s innovation around AI (Artificial Intelligence) and Machine Learning, focusing on bringing AI to simulation and gaming. Prior to joining Unity, Lange was the head of machine learning at Uber, where he led efforts to build the world’s most versatile Machine Learning platform to support the company’s hyper-growth. Lange also served as General Manager of Amazon Machine Learning, an AWS product that offers Machine Learning as a Cloud Service. Before that, he was Principal Development Manager at Microsoft, where he led a product team focused on large-scale Machine Learning for Big Data. He holds MS and Ph.D. degrees in Computer Science from the Technical University of Denmark. He is a member of the Association for Computing Machinery (ACM) and the IEEE Computer Society, and has several patents to his credit.

SUMMARY

Advancement in the robotics field has been slow going due to the costs and complications of innovating on a large scale. Progress on perception and intelligence has brought the industry to the verge of real application breakthroughs. Danny Lange, SVP of Artificial Intelligence and Machine Learning at Unity, will demonstrate how synthetic data is now helping robots learn rather than be programmed and the potential this has for advancing the industry. Simulation technology is highly effective and advantageous when testing applications in situations that are dangerous, expensive, or rare. Validating applications in simulation before deploying to a robot shortens iteration time by revealing potential issues early. In this session, you will see a powerful example of a system that learns instead of being programmed, and as it learns from synthetic data, it is able to capture much more nuanced patterns than any programmer ever could. Layering technologies together is making the future vision of what is possible a reality. Discover how AI is now proving the efficiencies possible in training robots.

TRANSCRIPT

Nika Carlson (00:15): Our next speaker is Dr. Danny Lange. Dr. Danny Lange is Senior Vice President of Artificial Intelligence and Machine Learning at Unity Technologies. There he leads the company's innovation around AI and machine learning, focusing on bringing AI to simulation and gaming.

Nika Carlson (00:36): Prior to joining Unity, Lange was the Head of Machine Learning at Uber. Lange also served as General Manager of Amazon Machine Learning and Principal Development Manager at Microsoft. He holds MS and PhD degrees in Computer Science from the Technical University of Denmark. He is a member of the Association for Computing Machinery (ACM) and the IEEE Computer Society, and has several patents to his credit.

Nika Carlson (01:07): Dr. Lange, over to you.

Dr. Danny Lange (01:10): Hello everybody. I'm Danny Lange from Unity Technologies. I've been looking forward to this event, and I would like to thank Alex and Scale AI for inviting me. Today I'm going to talk about how artificial intelligence and synthetic data in combination can really revolutionize robots and autonomous systems. But first, let's take a quick look at Unity Technologies.

Dr. Danny Lange (01:40): We're a realtime 3D platform company. We are in games, AR/VR, film, automotive, robotics. 60% of the top 1,000 games are made with Unity, and we are installed on well over 6 billion unique devices. Every month, more than 3 billion people play a game developed on the Unity platform. We're about 5,000 employees, headquartered in San Francisco. We have a big AI team at Unity, and it's this team's vision to enable organizations of all sizes to create, deploy, and unlock the value of AI. We are focused primarily on the graphical space: computer vision, robotics, and spatial simulation.

Dr. Danny Lange (02:31): So, let me get started here. Games have been used to drive AI research for a long, long time. We have seen how board and trivia games have led to huge breakthroughs in AI. It started back in the late forties, with a paper published in 1950 by Claude Shannon, who wrote the first chess program for a computer, and it has been fast forward ever since over the last 70 years. In '89, Chinook became the best computerized Checkers player in the world. When I was at IBM in 1997, we had Deep Blue defeating the world champion in chess, Garry Kasparov, and later on, IBM Watson became pretty famous for winning Jeopardy.

Dr. Danny Lange (03:30): All these systems up to this point were about very smart humans programming computers with really smart algorithms, like in chess, or with big databases of questions and answers, like in IBM Watson. But late in the 2010s, there was a big change. A revolution took place with systems like DeepMind's AlphaGo and later AlphaZero, systems that deployed machine learning and used self-training to become superior players that would beat humans. This has sparked a whole new interest in using video games in AI. We have seen the Atari games used by both DeepMind and OpenAI, we have seen Minecraft being used by Microsoft, and StarCraft and Dota 2 used by DeepMind and OpenAI. All of these video games have been used because synthetic data, generated data, is really superior in AI.

Dr. Danny Lange (04:42): So, let's look at video games and AI. What is it that makes games so attractive for AI research and development? Well, video games have a visual component; we all know that. There's a lot of action taking place, and we can see it. There's a physical component: we have physics engines in connection with games, so we have gravity, inertia, and collisions. And of course games, to a great extent, have puzzles in them. They have cognitive aspects; you need to solve some kind of problem to move on to the next level in the game.

Dr. Danny Lange (05:18): And finally, there's the social aspect, both within games, where you have multiple NPCs or characters that operate together, and in multiplayer games, where you've got multiple humans playing together. So there's a social aspect too. All of this comes together in a realtime 3D engine. You have the spatial environment, you have graphical rendering, you have multi-sensory systems, which means cameras can move around and you can have multiple viewpoints, and you have the physics engine. So when you look at a game engine like Unity's, that's essentially your private AI biodome in which you can run your experiments.

Dr. Danny Lange (06:03): Today I'm going to focus on simulation, and in particular simulation for building autonomous systems. When we look at autonomy, it's really a matter of two key components. The first is the perception component, known broadly as Computer Vision. It's the sensors, it's machine learning models, it's object detection, and related aspects of what an autonomous system needs to perceive about the world around it.

Dr. Danny Lange (06:38): And then there's the control element. For control and planning in these systems there is a wide range of established methods. Today I'm going to focus on one that is particularly interesting: Reinforcement Learning. It's basically the idea of observe, think, and act, driven by a reward system and an underlying model of reliable physics. There are many other approaches to control, but I'm going to talk about Reinforcement Learning because it is one of the most successful non-programmatic ways of building autonomy.

Dr. Danny Lange (07:21): So let's start with Perception, or Computer Vision. Building Computer Vision models is just very difficult and very costly. You have to manually collect lots of data, and you need to annotate it. In the real world, you may have actual cars or robots moving around, and there is incredible complexity in the multiple technologies at play. And let's not even go into the privacy, safety, and regulatory compliance hurdles when it comes to collecting training data, and doing all of this at scale. That's where we see the synthetic data advantage. I can produce a high volume of very low-cost data, and it's perfectly labeled. There's parallelism and instant elasticity: if, during the machine learning training process, I need more data, I can just request more data, and it's generated in fractions of a second. I only generate the data I actually need. I can even produce intelligence around the data set I generate. Essentially, there's a really, really high ease of use here.
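
To make the data-on-demand idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the "renderer" is faked with NumPy noise, standing in for a call into a simulator such as Unity Perception, and the label schema is invented for illustration. The structural point is that the training loop pulls samples, so only the data actually consumed is ever generated.

```python
import numpy as np

def render_labeled_sample(rng):
    """Pretend-render one image plus its perfect label (all placeholder values)."""
    image = rng.random((224, 224, 3), dtype=np.float32)  # stand-in for a rendered frame
    bbox = rng.integers(0, 200, size=4)                  # stand-in for an auto-generated 2D box
    return image, {"class": "package", "bbox": bbox.tolist()}

def synthetic_stream(seed=0):
    """Infinite, perfectly labeled stream: generate only what the consumer requests."""
    rng = np.random.default_rng(seed)
    while True:
        yield render_labeled_sample(rng)

stream = synthetic_stream()
for _, (image, label) in zip(range(3), stream):  # request exactly 3 samples
    print(image.shape, label)
```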

Dr. Danny Lange (08:50): So, let me show this short video. It shows 2D bounding boxes, 3D bounding boxes, instance and semantic segmentation, and also environment randomization. You can basically change the materials on the cabinetry or the walls, or change the lights. It's an automated way of generating a high variety of data. And here we see some examples from warehouses, grocery stores, and street views.

Dr. Danny Lange (09:33): In this short video, I'll show you an example of one of the tools that we can use to explore these synthetic data sets. We can select different kinds of 2D or 3D bounding boxes, or instance segmentation, for example. But most importantly, here on the right side, you see all the JSON coming up. So you have your image data, and then you have perfect labeling associated with it.
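
As a hedged illustration of what consuming such image-plus-JSON labels might look like, the record below uses an invented schema (a filename plus a list of labeled 2D boxes); the actual Unity Perception output format differs in its details.

```python
import json

# Hypothetical annotation record in the spirit of what the dataset viewer shows.
record = json.loads("""
{
  "filename": "rgb_000123.png",
  "annotations": [
    {"label": "shelf", "bbox": [34, 50, 120, 200]},
    {"label": "box",   "bbox": [300, 80, 64, 64]}
  ]
}
""")

for ann in record["annotations"]:
    x, y, w, h = ann["bbox"]
    print(f'{record["filename"]}: {ann["label"]} at ({x},{y}) size {w}x{h}')
```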

Dr. Danny Lange (10:06): So, I showed you a lot of images. What really matters here is that we're often dealing with robots that move, like autonomous vehicles. The game engine has a unique role to play here: generating not just single images, but sequences of images. So you get streaming data that's all perfectly labeled for training these autonomous systems.

Dr. Danny Lange (10:35): So, that was the Computer Vision side. Let me talk a bit about control, and as I mentioned, specifically Reinforcement Learning. We want robots to come out of the cages they are kept in in manufacturing and collaborate with people, or even have multiple robots collaborate with each other. These robots are not trivial to program; if you were going to write a program to do this, it would be very complicated. So we really want to take nature's learning method, which is Reinforcement Learning, and use that for autonomous systems.

Dr. Danny Lange (11:20): The whole idea in Reinforcement Learning is to observe, take action, and then reap the rewards or take the penalties from those actions, and to do that in a flywheel. This is how humans and animals learn, and we move through this flywheel going from exploration to exploitation. This is how many recommendation systems work; it's something that's very popular at companies like Amazon and Netflix. They show you stuff, you click or you don't click, and that's the reward, and they get to know you better that way. We want to use that in robots.
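
Here is a minimal sketch of that observe-act-reward flywheel: tabular Q-learning on a toy five-cell corridor, written in plain Python. The environment, rewards, and hyperparameters are all invented for illustration; this is not the setup used in the demos that follow.

```python
import random

N_STATES, ACTIONS = 5, (-1, +1)          # cells 0..4; move left or right
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, eps = 0.5, 0.9, 1.0

for episode in range(500):
    s = 0
    while s != N_STATES - 1:
        if random.random() < eps:
            a = random.choice(ACTIONS)                      # explore
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])   # exploit
        s2 = min(max(s + a, 0), N_STATES - 1)               # act
        r = 1.0 if s2 == N_STATES - 1 else -0.01            # reward or penalty
        best_next = max(Q[(s2, act)] for act in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])  # learn from the outcome
        s = s2
    eps = max(0.05, eps * 0.99)           # shift from exploration to exploitation

# Learned policy: which way to move from each non-terminal cell
print({s: max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(N_STATES - 1)})
```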

Dr. Danny Lange (12:14): I want to show this little piece of video that we did at Unity. The question is: can we train a chicken to cross the road? The actions are left, right, forward, backward, and the reward signal is simple: negative for being hit by a car and positive for picking up a gift. You'll see the yellow chicken here in a moment, moving randomly around; training is just starting up right now. The chicken is moving randomly forward, backwards, sideways. But look, in a moment it will accidentally pick up a gift package right there, and then get killed by a car.

Dr. Danny Lange (12:53): But after one hour of training it gets pretty sophisticated. You can see it manages to pick up a lot of gift packages. It also manages to avoid a few vehicles, but then, after two, three, four vehicles, it gets hit and killed. And you have to be aware that all these vehicles are coming in a completely random fashion. After six hours of training, the chicken has become superhuman: it will pick up most gift packages, and it will never get run over by a car. Again, what we managed to do here in six hours was, through Reinforcement Learning, train a policy for the chicken so that it can safely cross the road.

Dr. Danny Lange (13:42): Now you can, of course, use this for many different things. Here we simulate some racket play, with a joint reward that is negative if the ball touches the ground. So these two agents will simply learn to play. And almost magically, what they actually learn here is the physical trajectory of a ball under gravity, which is pretty interesting, I think.

Dr. Danny Lange (14:16): Here we have an example of indoor soccer. These two teams, with four agents on each team, have been trained from scratch; we haven't instructed these agents in any way, shape, or form. You should notice that one of them, at any given moment, has become a defensive player, more like a goalie. And whenever that goalie runs up and becomes offensive, one of the other players pulls back and becomes defensive. Apparently this is the best way to play four-on-four indoor soccer. After millions and millions of games from scratch, the system figured out that it's constructive behavior to have a defensive player, even if that player stands at the other end of the field and just waits.

Dr. Danny Lange (15:04): So, of course, we can take this into training a robotic arm, like this one. Here the arm has a lot of joints, it's pretty complicated, and it needs to learn to touch an object. In Unity, we have actually included an import utility. A lot of robots ship with an XML specification, called URDF, that describes the robot in detail. We can import that description directly into Unity and your robot will appear in realtime, as I'll show you here.
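
For a sense of what such a description contains, here is a hedged sketch that parses a tiny, made-up URDF (the Unified Robot Description Format, which is standard XML) with Python's built-in ElementTree. Importers like Unity's read the same link-and-joint structure, though far more thoroughly.

```python
import xml.etree.ElementTree as ET

# A minimal two-link arm, inlined for illustration only.
URDF = """
<robot name="mini_arm">
  <link name="base"/>
  <link name="upper_arm"/>
  <joint name="shoulder" type="revolute">
    <parent link="base"/>
    <child link="upper_arm"/>
    <limit lower="-1.57" upper="1.57" effort="10" velocity="1"/>
  </joint>
</robot>
"""

robot = ET.fromstring(URDF)
print("robot:", robot.get("name"))
for joint in robot.iter("joint"):
    parent = joint.find("parent").get("link")
    child = joint.find("child").get("link")
    print(f'joint {joint.get("name")} ({joint.get("type")}): {parent} -> {child}')
```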

Dr. Danny Lange (15:51): So that's the first step: you get your robot, and then you run your training phase. In this case, you train the robot to understand an object's pose so that it can grab the object at the right angle, lift it, and put it down. And then we show that with the real physical robot next to it in this video, where we transfer that capability from the simulation to the real world. This is the sim-to-real aspect, which is of course a very important part of building a robot, of building an autonomous system.

Dr. Danny Lange (16:33): A very important part of this is, of course, accurate physics. In Unity, we support NVIDIA's PhysX physics engine, and we also support other engines. Here's an example with Algoryx, which provides high-precision physics for an industrial robot. You can sense from this video how all of these movements are very, very realistic, very natural. And it's very important to have that level of accuracy when you're going to do the sim-to-real transfer.

Dr. Danny Lange (17:17): So let's talk a bit about how to elevate the use of AI here. I have spent many, many years in this business, and what I have noticed over the last couple of decades is that we have our own Moore's law in AI. Remember, Moore's law says that the density of transistor logic doubles every 18 to 24 months in the CPU business, in the silicon business. Well, I've noticed that the amount of data used to train AI systems is roughly doubling every 18 months, and the world is basically running out of data. You cannot keep up with real-world data. You have to move to synthetic data to keep up the growth for these extremely data-hungry AI models that we build.
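
A quick back-of-the-envelope on that doubling claim: demand compounding at 2x every 18 months works out to roughly a hundredfold over a decade.

```python
# If training data demand doubles every 18 months, it grows by 2**(months/18).
months = 10 * 12                      # one decade
growth = 2 ** (months / 18)
print(f"{growth:.0f}x more data needed after 10 years")   # ~101x
```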

Dr. Danny Lange (18:25): I want to show you some examples of that, but first, to give you some context: when you play a game, it runs at 30 to 60 frames per second. More than 30, up to 60, gives people the most fluent experience; you can't really see the individual frames at that point, it's just fluent motion. But when we talk about simulations, you don't run the simulation for a human eye to watch. You run the simulation for another computer. So, why not run at an unlimited number of frames per second?

Dr. Danny Lange (19:11): A third observation, and you may find this fascinating: one year of your life, at 30 frames per second, is about a billion frames. That's your life in realtime, in wall-clock time. But think about it. If I run not at 30, but at 60, 100, or 1,000 frames per second, I'm actually accelerating that kind of human experience. And if I do that in a massively parallel system on some cloud infrastructure, I can generate a billion frames in a matter of minutes, or maybe even seconds. And remember, a billion frames is equivalent to a year's experience.
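
The arithmetic checks out; the sketch below reproduces it, with the simulator count and per-simulator frame rate being invented round numbers for illustration.

```python
# A year of experience at 30 fps is roughly a billion frames; parallel
# simulation compresses the wall-clock time needed to generate it.
FPS = 30
frames_per_year = FPS * 60 * 60 * 24 * 365
print(f"{frames_per_year:,} frames per year")             # ~0.95 billion

# With, say, 1,000 simulators each running at 1,000 fps:
sims, sim_fps = 1_000, 1_000
seconds = frames_per_year / (sims * sim_fps)
print(f"~{seconds / 60:.1f} minutes for a year of experience")   # ~15.8 minutes
```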

Dr. Danny Lange (20:04): So let's take a look at some of the opportunities when I'm no longer tied to the limitations of the real world. I can actually beat nature by making things happen in an accelerated way. And when synthetic data comes in at very high speed, I can even create a training feedback loop: I can adjust the training data on the fly to generate more visual training data for things my model is not good at recognizing, or more difficult situations that my Reinforcement Learning system needs to get better at handling. So, you have power and capabilities that were not previously possible.

Dr. Danny Lange (21:02): So let me take this example from OpenAI. It made a big splash in the media a few years ago. Let me explain what you're seeing here. You're seeing a real robotic hand flipping a cube on demand; it needs to flip the cube to a certain position on demand. There's a camera up on the right, watching the cube and the robotic hand, and that camera feeds into a machine learning model that has been trained to control the hand. The objective is to flip the cube so that it meets the goal: right now, show the green side, and then flip it to show the O and the A. Training a real robotic hand would really take time, so what OpenAI did was use Unity to generate 300 billion simulated frames to build that machine learning model. Think about it: 300 billion frames is 300 years of experience, and they got it in about one month in a distributed environment.

Dr. Danny Lange (22:29): Here's another example from OpenAI. This is a Rubik's cube, which is much more complex. For this, it's the same solution, but in this case they simulated 10 trillion frames. That's 10,000 years of experience in a very short time span, achieved by using thousands of instances in a cloud deployment.
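
The talk's round numbers hold up against the same 30-fps conversion used earlier:

```python
# Sanity check on the two OpenAI figures using the ~1 billion
# frames-per-year-at-30-fps conversion from above.
FRAMES_PER_YEAR = 30 * 60 * 60 * 24 * 365        # ~0.95e9
print(f"{300e9 / FRAMES_PER_YEAR:,.0f} years")   # cube flipping: ~317 years
print(f"{10e12 / FRAMES_PER_YEAR:,.0f} years")   # Rubik's cube: ~10,570 years
```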

Dr. Danny Lange (22:51): So, of course, we want to train all kinds of robots and autonomous vehicles, in all kinds of conditions and in all kinds of scenarios, and we really want to do that at massive scale. This is not about a single simulator for a human; it's about thousands and thousands of simulators for other computers to consume. Now, I have a lot of URLs I'm going to share with you, so you may want to snap some pictures with your phone, or you can go back and Google them. Everything I've shown you is something you can touch yourself if you're a developer, or you can ask someone to do it, or you can learn more about it. So, let's take it from one end.

Dr. Danny Lange (23:40): For developers, we have a bunch of totally free GitHub repositories; it's all open source. We have the Unity Perception package, which allows you to generate training data directly in Unity. We have the Unity Robotics package, which includes the Robot Operating System interface, the standard ROS. It allows you to use your normal robotics software against Unity to run things, like testing your robotic arm or your robotic vehicle, through standard protocols, but in a virtual manner. So it basically allows you to test your robot through simulation.
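
As a hedged sketch of what "your normal robotics software, but against a virtual robot" can mean: below is an ordinary ROS 1 node that streams velocity commands, the kind of standard-protocol traffic a Unity ROS integration can route to a simulated robot. The topic name and message type are conventional choices, not Unity defaults, and a ROS 1 environment with rospy and geometry_msgs is assumed.

```python
import rospy
from geometry_msgs.msg import Twist

def drive_forward():
    rospy.init_node("sim_teleop")
    pub = rospy.Publisher("/cmd_vel", Twist, queue_size=10)
    rate = rospy.Rate(10)                 # 10 Hz command stream
    cmd = Twist()
    cmd.linear.x = 0.2                    # creep forward at 0.2 m/s
    while not rospy.is_shutdown():
        pub.publish(cmd)                  # same protocol, virtual robot on the other end
        rate.sleep()

if __name__ == "__main__":
    drive_forward()
```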

Dr. Danny Lange (24:36): And finally, we have the ML-Agents package. I showed you the Reinforcement Learning examples. ML-Agents has become, I would run the risk of claiming, probably the most popular reinforcement learning framework on the planet now. We have tens of thousands of users, many students, many developers using it, and it provides an API and key reinforcement learning algorithms for you to play with very scalable reinforcement learning environments directly in Unity.
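
A minimal sketch of driving a Unity scene from Python with the ML-Agents low-level API (the mlagents_envs package). Exact names can shift between releases; this follows recent releases and assumes a behavior with continuous actions. Passing file_name=None connects to a scene running in the Unity Editor rather than a built executable.

```python
import numpy as np
from mlagents_envs.environment import UnityEnvironment
from mlagents_envs.base_env import ActionTuple

env = UnityEnvironment(file_name=None)   # connect to the Editor in Play mode
env.reset()
behavior = list(env.behavior_specs)[0]   # first registered agent behavior
spec = env.behavior_specs[behavior]

for _ in range(100):
    decisions, terminals = env.get_steps(behavior)
    n = len(decisions)                   # agents currently awaiting an action
    action = ActionTuple(
        continuous=np.random.uniform(-1, 1, (n, spec.action_spec.continuous_size)))
    env.set_actions(behavior, action)    # random policy, purely for illustration
    env.step()
env.close()
```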

Dr. Danny Lange (25:20): We have also published some data sets: a Home Interior data set and a Retail data set. Both are sample data sets, with a thousand images in each, but you can download them, get an idea of what the data and annotations look like, and see whether it's something that will work for you.

Dr. Danny Lange (25:47): And then we have a couple of public landing pages, one for our Computer Vision work and one for our Robotics Simulation work, where you can read more about what's possible. For the more curious people who want to get under the hood, we have two arXiv papers you can read: one on Unity Perception and one on the ML-Agents work. These papers are of a more academic nature, with a lot of data in them.

Dr. Danny Lange (26:27): And with that, I thank you very much for your attention. Feel free to follow me on LinkedIn or connect on Twitter. I hope you enjoyed this. I certainly did, so thank you very much.
