
An AI-First Approach to Enable Self-Driving at Scale, with Raquel Urtasun of Waabi

Posted Oct 06, 2021 | Views 2.4K
# TransformX 2021
# Keynote
SPEAKER
Raquel Urtasun
Founder and CEO @ Waabi

Raquel Urtasun is Founder and CEO of Waabi, an AI company building the next generation of self-driving technology. Waabi is the culmination of Raquel's 20-year career in AI and 10 years of experience building self-driving solutions. Raquel is also a Full Professor in the Department of Computer Science at the University of Toronto, a co-founder of the Vector Institute for AI, and the recipient of several high-profile awards, including an NSERC E.W.R. Steacie Award, two NVIDIA Pioneers of AI Awards, three Google Faculty Research Awards, an Amazon Faculty Research Award, two Best Paper Runner-Up awards at CVPR (2013 and 2017), and more. In 2018, Raquel was named Chatelaine Woman of the Year and one of Toronto's top influencers by Adweek magazine.

SUMMARY

In this keynote session, Raquel Urtasun, Founder and CEO of Waabi, offers a new approach to developing AI-based systems for autonomous vehicles and explains why it is critical in enabling self-driving autonomy at scale. Raquel explores why autonomous self-driving is a complex problem to solve for AI engineers. She shares an in-depth explanation of the technical issues that must be solved, which commonly lead to scalability and cost challenges for today's AV manufacturers. She critiques current AI approaches and offers a different approach that is better able to learn and scale. How does an AI-first approach improve model interpretability and verification? How can you perform complex reasoning that is generalized across different sensor configurations and geographies? How does this AI-first approach reduce overall cost and complexity? Join this keynote to learn how an AI-first approach differs dramatically from current industry methodologies and positively impacts all parts of the AV software stack, from mapping and perception to prediction, planning, and vehicular control.

TRANSCRIPT

Nika Carlson (00:00): For our next keynote, we are delighted to welcome Raquel Urtasun. Raquel Urtasun is founder and CEO of Waabi, an AI company building the next generation of self-driving technology. Waabi is the culmination of Raquel's 20-year career in AI and 10 years of experience building self-driving solutions. Raquel is also a full professor in the Department of Computer Science at the University of Toronto, a co-founder of the Vector Institute for AI, and the recipient of several high-profile awards, including two NVIDIA Pioneers of AI Awards, three Google Faculty Research Awards, an Amazon Faculty Research Award, two Best Paper Runner-Up awards at CVPR, and more. Over to you, Raquel.

Raquel Urtasun (01:09): Hi, my name is Raquel Urtasun and I am the founder and CEO of Waabi. Self-driving is one of the most exciting and important technologies of our generation, and if you think about it, it is really going to change the way that we live, both from a safety perspective and by changing the landscape of our cities. Now, before I go into what Waabi is doing for self-driving, I want to give you a tutorial on how self-driving works. The first thing that any self-driving vehicle does is sense the environment. This can be with LiDAR, cameras, radar, et cetera. In the image here, I'm showing how the vehicle sees the scene from the LiDAR's perspective, meaning a point cloud seen from the top. Once the vehicle senses the environment, typically the next thing you will see is that the vehicle localizes where it is in the world with a precision of a few centimeters.

Raquel Urtasun (02:08): This is shown here by the fact that we can import these high-definition maps. Once we know where we are, the next step is to perceive the world around us. In particular, here the vehicle is seeing the pedestrians as well as the vehicles around it. But just understanding their presence is not sufficient: we need to be able to predict how the different actors are going to move in the next few seconds. And this is necessary in order to plan a maneuver, such that we can drive safely and comfortably to our destination. Once we do this, we repeat the process every fraction of a second, as you can see here in the video. This is a summary of how almost every software stack in the industry works.
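
To make this loop concrete, here is a minimal Python sketch of the sense-localize-perceive-predict-plan cycle described above. All function names and types are hypothetical placeholders for illustration, not Waabi's or any other company's actual interfaces.

```python
from dataclasses import dataclass

@dataclass
class Trajectory:
    waypoints: list  # (x, y, t) points for the ego vehicle to follow

def localize(sensor_data, hd_map):
    """Estimate the ego pose against the HD map, to within a few centimeters."""
    return (0.0, 0.0, 0.0)  # (x, y, heading) -- stubbed for illustration

def perceive(sensor_data, pose):
    """Detect the vehicles and pedestrians around the ego vehicle."""
    return []  # list of detected actors -- stubbed

def predict(actors, hd_map, horizon_s):
    """Forecast where each actor will be over the next few seconds."""
    return {i: [] for i, _ in enumerate(actors)}  # actor index -> future positions

def plan(pose, forecasts, hd_map):
    """Choose a safe, comfortable maneuver toward the destination."""
    return Trajectory(waypoints=[])

def autonomy_step(sensor_data, hd_map):
    """One tick of the traditional sense -> localize -> perceive -> predict -> plan
    loop, repeated every fraction of a second while the vehicle drives."""
    pose = localize(sensor_data, hd_map)
    actors = perceive(sensor_data, pose)
    forecasts = predict(actors, hd_map, horizon_s=5.0)
    return plan(pose, forecasts, hd_map)
```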

Raquel Urtasun (03:00): Now, the industry has made significant, meaningful progress in the past 20 years. However, if we look at commercial deployment, it is still limited to very small and very simple operational domains. And you're probably wondering why this is the case. Well, on one side, the problem is really hard, and this is something that we need to acknowledge. If you think about yourself as a human driving around, the task of driving actually requires very nuanced decision making that is non-trivial. What makes it extremely difficult is that there are many, many possibilities for how the scene can appear in front of you, and the vehicle has to handle all of these different situations. What makes it even harder is that many of these situations appear very, very rarely, and we still need to be able to handle all of them.

Raquel Urtasun (03:56): Furthermore, it's not just that the problem is hard; how we go about the solution matters too. There has been meaningful progress, as I mentioned before, since the DARPA challenge, which was 17 years ago. However, there are fundamental issues in the way this technology is developed today, as well as in the type of technology itself. In particular, there is a lack of automation, and current approaches require a lot of manual tuning, which prevents them from really scaling. They are also very, very capital intensive; many billions of dollars have been spent in this industry. All of this together limits the ability to scale this technology and really deploy it more broadly than we see today.

Raquel Urtasun (04:46): Now, let's go a little bit deeper into why the current approach to self-driving is problematic in terms of not being able to generalize and scale to the solution that is required. The first thing to note is that when you look at teams developing this technology, you typically have thousands of engineers working on the problem, and each group of engineers is tasked with a very small sub-problem of this very complicated, intertwined task. Think about the analogy of building a puzzle: if you start by just building the individual pieces, it's going to be extremely hard for those pieces to fit together into a whole. The same problem is happening in self-driving. Instead of this bottom-up approach, we need a much more holistic approach that designs the entire system from the top down, and this is really lacking in the industry.

Raquel Urtasun (05:52): One of the other problems that we see is that in the design of the traditional software stacks, there is a lot of manual design, and as humans we cannot handle a lot of data. So what teams typically do is define very, very small interfaces between the different modules. As a consequence, very little information is passed along, and you can think of this as the vehicle becoming more and more blind just as it has to do more and more sophisticated processing. This is the opposite of what you would like: in order to do this complex decision making, you would like to access as much information as possible and process it very efficiently. So that's problematic as well, and it means that if there is a mistake at the beginning of the stack, it's going to be impossible for the self-driving vehicle to correct that mistake later.
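
A hedged sketch of the bottleneck being described: each module compresses its input into a small, hand-designed summary, so downstream modules never see the raw evidence and cannot recover from an upstream mistake. The types and values below are made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class DetectedActor:
    x: float
    y: float
    heading: float
    label: str  # e.g. "vehicle" or "pedestrian"

def perception(lidar_points):
    """Compress millions of raw LiDAR points into a handful of boxes.
    Anything not captured in DetectedActor is lost to the rest of the stack."""
    return [DetectedActor(x=12.0, y=3.5, heading=0.0, label="vehicle")]

def prediction(actors):
    """Sees only the boxes: if perception missed an actor or got its heading
    wrong, prediction has no access to the raw evidence to correct it."""
    return {i: [(a.x + 1.0 * t, a.y) for t in range(5)] for i, a in enumerate(actors)}
```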

Raquel Urtasun (06:50): Now, if you look at the development process as a whole, typically you have a particular software release running on the vehicles, and when you observe a particular error, it is categorized in terms of what kind of mistake it is, and then a team is tasked with solving that type of error. In order to do so, they go and build yet another small model. Think of hundreds and hundreds of these models, forming a very sophisticated, very complicated software stack with all its dependencies. This is one of the reasons why this technology doesn't really scale, and why every time you add one of these models, you pay the cost of needing more maintenance and needing to grow your team even further.

Raquel Urtasun (07:40): On top of this, there is a lot of manual tuning: you have this complicated expert system that essentially says, if this happens in the world you should behave this way, and if that happens you should do that. With these expert systems, we are not going to be able to handle the complexity of the real world. As a consequence, every time there is a change that you want to make in the software stack, you need to tune every single module, one at a time, until you reach the end of the stack. So typically, every time you make one of these changes, it takes almost a quarter or more to land it in production, even if it's a very, very small change. And this is because of the lack of automation and the fact that you need to do this one module at a time, by hand. All of this prevents you from having a scalable way to develop and deploy this technology.

Raquel Urtasun (08:42): An alternative you might think of is: why don't we use AI to automate this process? Now, if you look at current approaches that utilise AI, the problem is that they typically treat AI as a black box: if you give the system enough examples, it's going to learn to handle all possible situations. Unfortunately, there are two issues with this approach. One is that you are going to need too many examples; in particular, you need to observe all of these rare, risky situations in order to handle them. And this is not necessarily something that you are able to do, or that it would be ethical to do. Some of those situations might end in a collision, for example.

Raquel Urtasun (09:29): On top of this, there is no interpretability if you treat the AI system as a black box, so it's going to be very difficult, if not impossible, to verify the system. This is something that regulators are not going to feel comfortable approving for deployment in the real world. So there is a need for something different in order to move forward: a different type of technology that is neither this black-box AI approach nor this hand-engineered, hand-tuned approach that is really difficult to scale. And this is what Waabi is providing. We are providing a new approach that marries the advantages of these two traditional approaches without incurring their disadvantages. It's an AI approach that is able to learn from fewer examples, to scale much better, and to generalize to these long-tail events. I will tell you a bit more about our technology in a second.

Raquel Urtasun (10:26): But it's not just about the technology that we develop; I think there is also a need for change in the way we build this technology. We've seen a lot of consolidation. We've seen teams really being copycats of each other in the way that they work, the experience that they provide, as well as the technology that they build. Instead, we know that to solve a task as difficult as self-driving, we need to look at the problem from many different perspectives in order to really find the best solution. So we need a diversity of solutions, and the best way to bring diversity is to do it with a diverse team, because if we all think the same way, we are not going to challenge each other and come up with better solutions. This is what Waabi is providing: a much more diverse approach to solving this problem.

Raquel Urtasun (11:18): Now let's go a little bit deeper into the technology. I mentioned that we are going to have an AI-first approach, but it's not going to be a black box, and that's very, very important. The fact that this is an AI-first approach gives us automation: the entire system can be trained and tuned just from data. As a consequence, we fully remove the need to tune one module at a time in a serial process. Instead, changes can land in as little as a single day, and progress can be made much faster. Because we have a way to train the system as a whole, we can have much richer interfaces between these modules and exploit the raw sensor data all along our processing. As a consequence, there is no more cascade of mistakes, and very sophisticated decision making can be done at the end of the stack.
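
A hedged sketch of what "trained and tuned just from data" means, in contrast to tuning each module by hand: the whole stack is one differentiable model, so a single gradient step updates perception, prediction, and planning together. This uses PyTorch with trivial stand-in layers and hypothetical names; it is illustrative only, not Waabi's architecture.

```python
import torch
import torch.nn as nn

class PerceptionNet(nn.Module):
    """Stand-in for a LiDAR/camera encoder producing a rich feature map,
    rather than a handful of hand-designed boxes."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(256, 128)
    def forward(self, sensors):
        return torch.relu(self.encoder(sensors))

class PredictionNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.head = nn.Linear(128, 64)
    def forward(self, features):
        return torch.relu(self.head(features))

class PlannerNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.head = nn.Linear(64, 10)  # e.g. 5 waypoints as (x, y) pairs
    def forward(self, features):
        return self.head(features)

stack = nn.Sequential(PerceptionNet(), PredictionNet(), PlannerNet())
optimizer = torch.optim.Adam(stack.parameters(), lr=1e-4)

sensors = torch.randn(8, 256)      # a batch of (fake) sensor snapshots
expert_plan = torch.randn(8, 10)   # what a good driver did in those snapshots

optimizer.zero_grad()
loss = nn.functional.mse_loss(stack(sensors), expert_plan)
loss.backward()   # gradients flow through the entire stack
optimizer.step()  # every module is tuned from data at once
```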

Raquel Urtasun (12:21): This is a much more scalable solution. Through the combination of probabilistic graphical models, deep learning, and complex optimization, we are able to build a new generation of algorithms that can do very, very complex reasoning, similar to how humans actually reason about the world. This technology also enables us to learn from much less data and to generalize across different sensor configurations and geographies. This matters because in industry you will typically see teams operating in a very, very small operational domain, and it takes a very, very long time to expand that domain to any other place. With technology that generalizes much better, we can do this much, much faster. And finally, this technology is much more affordable, because it can be developed in a fraction of the time and at a fraction of the cost.

Raquel Urtasun (13:27): One of the key things in developing this technology is not just how we build the autonomy, the brain of the self-driving car, as I was describing before, but also how we actually train and test the software stack. In particular, Waabi has a breakthrough simulator with an unprecedented level of fidelity that is able to test both the simple scenarios, the ones that happen all the time, as well as the rare cases. And not only can it test sub-modules of the software stack, it can test the entire software stack in closed loop. Furthermore, since it has an unprecedented level of fidelity at scale, we can correlate what you see in simulation with what you will see in the real world in terms of driving. And this is actually very, very important.

Raquel Urtasun (14:23): Since what we see in the simulator is very similar to what we observe in the real world, in terms of not only the scene but also how the vehicle behaves, we don't need to drive millions of miles in the real world as you typically see at other companies. Instead, we can test and train the system largely in this simulator. And this is a total game changer compared to what we see out there.

Raquel Urtasun (14:53): Now, our innovations span all the different aspects of self-driving, starting with building the autonomy system: how to build perception, prediction, and motion planning such that you have systems that can be trained end to end and generalize much better than existing approaches. There is our expertise in simulation, which I'm going to go into in more detail in a second. The team is also expert in building robustness and safety for AI, which is a very key component of modern systems, because AI is and will be a key component of self-driving. We also have expertise in mapping, localization, dealing with large-scale data in an intelligent manner, vehicle-to-vehicle communication, and core AI for developing this new generation of products.

Raquel Urtasun (15:46): Now let's dig a little bit deeper into simulation. If you look at the industry and where simulation is used at scale today, it's really only used to test the motion planner. In particular, scenarios are generated where the input is a simple, abstract representation of the scene, for example bounding boxes representing the actors along with their trajectories, and then only that one module is tested. The problem with this is twofold. First, you don't test the entire software stack, so you can't really verify the safety of your system if you're only testing a single subcomponent. Furthermore, the input to this test, which tries to mimic the output of the perception and prediction systems, doesn't have the same noise characteristics as the real system. So even the subsystem that you're trying to verify, you don't really verify fully.

Raquel Urtasun (16:49): So we need something else. What is it that we need in the industry? We need to build a closed-loop simulator that has the virtual worlds, and that has the scenarios and behaviors of all the actors in a way that really reflects the distribution of things that can happen in the real world. And then we need a way to simulate how the sensors would have observed those scenarios happening in that virtual world, in a way that is super realistic and in real time or near real time. You will not see anything like this in the industry today; however, Waabi has a solution to this. Now, let's look at the three components I mentioned: creation of the virtual worlds, the scenarios and behaviors, and the sensor simulation. And let's compare what you will see in the industry with what Waabi has.
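
A hedged sketch of the difference between open-loop replay and the closed-loop simulation described here: the simulator renders sensor data, the full autonomy stack consumes it, and the resulting action feeds back into the virtual world on the next tick. The world object and its methods are hypothetical placeholders.

```python
def run_closed_loop(world, scenario, autonomy_stack, dt=0.1, duration_s=20.0):
    """Step a virtual world and the full software stack together.

    Unlike open-loop log replay, the ego vehicle's own decisions change what
    the simulated sensors observe on the next tick. `world` is a hypothetical
    simulator object; `autonomy_stack` is the entire driving software under test.
    """
    t = 0.0
    while t < duration_s:
        world.step_actors(scenario, dt)          # move the simulated actors
        sensor_data = world.render_sensors()     # simulate LiDAR, cameras, radar
        command = autonomy_stack(sensor_data)    # run the *entire* stack, not one module
        world.apply_ego_command(command, dt)     # the ego's action feeds back into the world
        t += dt
    return world.metrics()                       # e.g. collisions, comfort, progress
```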

Raquel Urtasun (17:47): To create virtual worlds, what you will typically see done in the industry is that people use artists to build the different worlds, whether it is the background, for example the buildings and the trees, or the different assets, for example vehicles, pedestrians, et cetera. Then procedural modeling is used to take those assets and generate a virtual world. The problem with such an approach is that it doesn't scale. Using artists is a very expensive way to try to build replicas of the entire world. On top of this, it's very difficult to cover the long tail of all the possible vehicles you might observe in the world, and it's very difficult to build realistic worlds this way, just by hand.

Raquel Urtasun (18:44): So Waabi's approach is very, very different. What we do instead is resort to automation, to our favorite thing, which is AI. In particular, the way we build our virtual worlds is automatic. We drive around the world, and from the data captured by our sensors we automatically obtain the background, which is the static part of the environment. We also automatically reconstruct every single vehicle that we observe as we drive, and we can automatically reconstruct every single pedestrian that we have ever seen.

Raquel Urtasun (19:27): What is important, for example for pedestrians, which are complicated because they are articulated, deformable objects, is that we can reconstruct them in such a way that we can then automatically retarget them to perform any activity or action you might want, in order to create different worlds that didn't exist before. And we can do all of this automatically, at scale. So this is a game changer as well: on one side you have this very expensive manual process of using artists, and here we just use automation, and basically every single thing we've ever seen becomes part of our virtual world.
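
A hedged sketch of this kind of automated asset pipeline: every drive log is mined for the static background and for each vehicle and pedestrian observed, and the reconstructed assets go into a library that later scenarios can reuse. The reconstruct_* helpers are trivial stand-ins for learned reconstruction models, not Waabi's actual tooling.

```python
def reconstruct_static_background(log):
    return {"kind": "background", "source": log["name"]}

def reconstruct_vehicle(track):
    return {"kind": "vehicle", "id": track}

def reconstruct_pedestrian(track):
    # Pedestrians are articulated and deformable, so the asset is built in a
    # way that allows retargeting it to new motions later.
    return {"kind": "pedestrian", "id": track}

def build_asset_library(drive_logs):
    """Mine every drive log for reusable simulation assets, automatically."""
    library = {"backgrounds": [], "vehicles": [], "pedestrians": []}
    for log in drive_logs:
        library["backgrounds"].append(reconstruct_static_background(log))
        library["vehicles"] += [reconstruct_vehicle(t) for t in log["vehicle_tracks"]]
        library["pedestrians"] += [reconstruct_pedestrian(t) for t in log["pedestrian_tracks"]]
    return library

# Example: two logged drives become a small asset library.
logs = [{"name": "drive_001", "vehicle_tracks": ["v1", "v2"], "pedestrian_tracks": ["p1"]},
        {"name": "drive_002", "vehicle_tracks": ["v3"], "pedestrian_tracks": []}]
print(len(build_asset_library(logs)["vehicles"]))  # 3
```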

Raquel Urtasun (20:08): The next thing you need to do is model the scenarios and behaviors that you might potentially see in the real world. The current approach in the industry is typically to script these scenarios, for example with procedural models, which are just a bunch of rules that you invoke one at a time with different parameters. Yes, you can generate a lot of scenarios this way, however the way the different actors behave looks very robotic. Instead, we go for a very different approach: we have a multi-agent AI system that is able to create scenarios in a way that looks very naturalistic, very similar to how humans actually drive. What I'm showing you here are examples of simulations we have built in the past, where, as you're going to see, the actors are able to do very complex maneuvers in a way that is very, very naturalistic.

Raquel Urtasun (21:12): For example, in this case we are going to do a lane change in order to pass somebody that is stuck in our lane, and we are going to do U-turns, and there is no script here saying you have to wait for the vehicle to pass in order to turn, et cetera, as you would see with procedural modeling. Instead, this is just a single AI system orchestrating all the different actors at once and then running the simulation forward. It's a very, very different approach, much more scalable, and it can really capture the diversity of the real world.
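
A hedged sketch of that contrast: rather than one scripted rule per situation, a single multi-agent policy observes the whole scene each tick and proposes the next move for every actor at once. The policy below is a trivial random stand-in for a learned model.

```python
import random

def multi_agent_policy(scene):
    """Propose the next (vx, vy) velocity for every actor, given the whole scene.
    A learned model would condition each actor's move on all the others; here a
    small random perturbation stands in for that model."""
    return {actor_id: (vx + random.uniform(-0.2, 0.2), vy)
            for actor_id, (x, y, vx, vy) in scene.items()}

def step_scene(scene, dt=0.1):
    """Advance every actor one tick using the shared policy -- no per-actor script."""
    moves = multi_agent_policy(scene)
    return {aid: (x + vx * dt, y + vy * dt, *moves[aid])
            for aid, (x, y, vx, vy) in scene.items()}

# Run the simulation forward: the same policy orchestrates all actors each tick.
scene = {"car_1": (0.0, 0.0, 10.0, 0.0), "car_2": (20.0, 3.5, 8.0, 0.0)}
for _ in range(50):
    scene = step_scene(scene)
```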

Raquel Urtasun (21:46): Lastly, we need to be able to render the world with high fidelity and in real time. We have created this virtual world where we have placed all the different actors with their different behaviors, and now we need to create how our vehicle would have observed the scene if it were at a particular location, driving around in that virtual world. We do this through a combination of two things: physics as well as AI. The combination of these two is very, very important, because not only does it give you an unprecedented level of fidelity, it also lets you do this rendering in real time. And this is key to really being able to use this at scale in closed-loop simulation: if your rendering takes hours or days, there is no way you can really test the software stack.
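
A hedged sketch of the physics-plus-AI idea: a physics-based raycast gives an approximate LiDAR return per ray, and a small learned network adds the residual detail so the result better matches the real sensor. Both pieces here are trivial stand-ins using PyTorch; this is illustrative only, not Waabi's renderer.

```python
import torch
import torch.nn as nn

class LearnedResidual(nn.Module):
    """Per-ray learned correction on top of the physics estimate (e.g. material
    effects, sensor noise). A trivial MLP stands in for the real model."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Linear(32, 1))
    def forward(self, ray_features):
        return self.net(ray_features)

def physics_raycast(rays, nominal_depth=30.0):
    """Stand-in for a physics engine: returns an approximate depth per ray."""
    return torch.full((rays.shape[0], 1), nominal_depth)

rays = torch.randn(1024, 4)  # per-ray features (direction, incidence angle, ...)
simulated_depth = physics_raycast(rays) + LearnedResidual()(rays)  # physics + learned refinement
```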

Raquel Urtasun (22:39): On the left-hand side you see our LiDAR simulation, and on the right-hand side you see the real LiDAR. What is important is that the output of the perception system, shown in green, is basically the same in the simulation as in the real world. And this is very important, as I mentioned before, because this is how you can mathematically say that the system, in this case the perception system, is seeing and reacting the same way in simulation as in the real world. So there is no need to observe a scenario in the real world; we can just create that particular scenario in simulation.
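
This kind of claim about simulation matching the real world can be checked empirically: run the same perception system on simulated and real sensor data from matched scenes and measure how closely its outputs agree. Below is a hedged, toy sketch of such a check; the detections and the matching-by-index are made up for illustration.

```python
import math

def mean_center_distance(dets_sim, dets_real):
    """Average distance between matched detections, given as (x, y) box centers.
    Matching is by index here; a real evaluation would use a proper assignment
    (e.g. Hungarian matching) plus precision/recall-style metrics."""
    dists = [math.dist(s, r) for s, r in zip(dets_sim, dets_real)]
    return sum(dists) / len(dists) if dists else float("nan")

# Made-up detections from the same perception model on simulated vs. real LiDAR.
dets_on_sim_lidar  = [(12.1, 3.4), (30.0, -1.2)]
dets_on_real_lidar = [(12.0, 3.5), (29.8, -1.0)]
print(mean_center_distance(dets_on_sim_lidar, dets_on_real_lidar))  # small gap -> high fidelity
```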

Raquel Urtasun (23:26): Furthermore, you can go one step further: utilize the virtual world that we have generated and place the actors yourself. In this case, we are going to create a safety-critical case, where we place one actor that forms an occlusion and then place another actor that comes out from behind that occlusion, to test whether the vehicle is able to react to this actor encroaching in front of us. This is a situation that never happened in the real world, but it is created in our simulation. On the right-hand side, you see what the sensors would have observed if the green car were our self-driving vehicle. So you see here that we can create these scenarios, and by doing sensor simulation we're able to test the software stack. In this case, it reacts by slowing down for the vehicle that is cutting in front of us.
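
A hedged sketch of how such a safety-critical test might be composed as data: one actor that blocks the ego vehicle's line of sight, a second actor that emerges from behind it, and pass criteria for the run. All field names and values are hypothetical; the simulator and autonomy stack that would consume this are not shown.

```python
# A declarative description of an occlusion test case (illustrative only).
occlusion_scenario = {
    "map": "suburban_4way",
    "ego_start": {"x": 0.0, "y": 0.0, "speed_mps": 12.0},
    "actors": [
        # A parked truck that blocks the ego vehicle's line of sight.
        {"type": "truck", "x": 18.0, "y": 4.0, "speed_mps": 0.0, "role": "occluder"},
        # A car that emerges from behind the truck and cuts in front of the ego.
        {"type": "car", "x": 22.0, "y": 8.0, "speed_mps": 6.0,
         "route": "cross_in_front_of_ego", "trigger_when_ego_within_m": 25.0},
    ],
    # The run passes if the ego keeps a safe gap without braking excessively hard.
    "pass_criteria": {"min_gap_m": 2.0, "max_decel_mps2": 6.0},
}
```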

Raquel Urtasun (24:33): Now, modern self-driving vehicles have more than one sensor; in this case, we typically have cameras as well as LiDAR. What I'm going to show you now is what we call actor injection. We take logs that were recorded by the self-driving vehicle as it was driving around the world, and then we compose a different world by incorporating vehicles that were not there in the first place into that real log. As a consequence, you have a mixed reality where some things are real and some things are not, and this is a very easy way to test safety-critical cases without observing them in the real world. I'm going to show you each video twice, and what I want you to try to do is guess which of these vehicles are fake and which ones are real.

Raquel Urtasun (25:36): There are four to five fake vehicles in every single one of these videos, and in most cases it's actually really difficult to tell which ones are fake. For example, this vehicle in front of us turns out not to be real; it was not part of the log in the first place. This one is also fake. You see here the level of fidelity, which is super, super high: we can simulate, with all the dynamics and in 3D, how we would have observed the scene if those fake agents had actually been there in the first place. And we can do this very, very fast, as I mentioned before. This is a full game changer, where we can resimulate the sensors in a way that incorporates all this new information into the real-world log.
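
A hedged sketch of actor injection as described: start from a recorded drive log, add synthetic actors that were never there, and re-render the sensors so the stack sees one consistent mixed-reality scene. The helper names and the sensor_simulator callable are hypothetical placeholders.

```python
def inject_actors(real_log, fake_actors, sensor_simulator):
    """Compose a mixed-reality log: the real background and real actors, plus
    injected ones, with the sensors re-rendered so that occlusions and returns
    from the fake actors look consistent with the rest of the scene."""
    mixed_frames = []
    for frame in real_log:
        scene_actors = frame["actors"] + fake_actors       # real + injected actors
        sensors = sensor_simulator(frame["background"], scene_actors)
        mixed_frames.append({"actors": scene_actors, "sensors": sensors})
    return mixed_frames
```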

Raquel Urtasun (26:25): We can do it in a way that there is no difference from the real world with respect to the perception, prediction, and motion planning systems, so we can test our entire software stack at scale. At the same time, we can also use this to train the system to handle these safety-critical and difficult situations. Maybe at the beginning the system hadn't learned enough and wasn't good enough to drive, but after training on all these new situations, the system gets better and better until it can graduate and go drive in the real world.

Raquel Urtasun (27:02): Now, I hope you can really see how this approach significantly reduces the need to drive in the real world. And this is very important. First, it's very capital intensive to have a fleet of hundreds of vehicles driving mile after mile on the road. But it's also not the safest way to develop this technology, because every time you drive in the physical world, there is a risk associated with it. So we have a way to develop this technology that is much safer and much more affordable. And thanks to all the automation that we also have within the software stack, we're able to do this much more efficiently than any other company out there. So it's a much less capital-intensive and much simpler approach.

Raquel Urtasun (27:48): Now, the first application domain that Waabi is tackling is long-haul trucking, and there are really two reasons for this. One is that the business need for automation is very, very large in this domain, because there is an acute driver shortage that is just getting worse and worse, and COVID has accentuated this problem even further. Also, driving trucks is very dangerous; it's one of the most dangerous professions in North America, so there is a need for the increased safety that will come from automation. And from a technical perspective, if you think about it, driving in cities is much more difficult than driving on highways. The highway is a much more structured environment. It's still very difficult, but it's a particular set of situations in which we believe we can deploy safely much faster than, say, robotaxis in the core of our cities.

Raquel Urtasun (29:00): So after this run through self-driving, I hope every one of you is really excited about the fact that self-driving vehicles are coming, and very soon. Today we see them only in very, very small operational domains, but with this new technology, the promise of a better world is closer than ever before. We believe we are building the right technology to help us get there safely. And we also believe that in order to build this technology, we should do it with a diverse team and in a diverse manner, so that we can reflect the diversity of the users this technology is going to have. That is all from me today. We are hiring and we have plenty of open roles; check out our website, waabi.ai, if you're interested in self-driving 2.0 and in making self-driving vehicles a reality. Thank you very much.

