The Week in AI is a roundup of key AI/ML research and news to keep you informed in the age of rapid high-tech change. From accurate prediction of drivers’ and pedestrians’ next moves to AI that tackles climate change, here are this week’s highlights.
Researchers at MIT CSAIL have created a machine learning system called M2I that efficiently predicts the future trajectories of multiple road users, which could allow autonomous vehicles to navigate city streets more safely. Trained on the Waymo Open Motion Dataset, M2I takes two inputs: past trajectories of cars, cyclists, and pedestrians interacting in a traffic setting such as a four-way intersection, and a map with street locations and lane configurations. It then uses a relation predictor to infer which of two interacting agents has the right of way, and a marginal predictor model that guesses the trajectory of the passing agent.
Humans represent one of the biggest roadblocks preventing autonomous vehicles from being deployed without safety drivers on city streets. For a robot to successfully and safely navigate a vehicle through busy cities, it needs to be able to predict what actions road users will take next. Current AI systems are either too simplistic (assuming humans walk in a straight line), too conservative (choosing to leave a car in park to avoid pedestrians), or can only predict the next move of one agent (though roads are typically used by many users at once).
To address these limitations, M2I devises a simple solution that breaks down a complex, multi-agent behavior problem into smaller pieces, allowing a computer to solve it in real time. The framework guesses the relationships between two agents (example: which car has the right of way) and uses these relationships to predict future trajectories for multiple agents.
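That divide-and-conquer idea can be sketched in a few lines. The rules below (the agent closest to the intersection wins the right of way, constant-velocity extrapolation, and a reactor that halves its speed to yield) are hypothetical stand-ins for M2I's learned predictors, not the actual models:

```python
import numpy as np

def predict_relation(traj_a, traj_b):
    """Toy relation predictor: the agent nearer the intersection
    (the origin) is treated as the influencer with right of way."""
    dist_a = np.linalg.norm(traj_a[-1])
    dist_b = np.linalg.norm(traj_b[-1])
    return ("a", "b") if dist_a < dist_b else ("b", "a")

def predict_marginal(traj):
    """Toy marginal predictor: extrapolate the last velocity."""
    v = traj[-1] - traj[-2]
    return traj[-1] + v

def predict_conditional(traj, influencer_future):
    """Toy conditional predictor: the reactor halves its step to
    yield to the influencer's predicted position."""
    v = (traj[-1] - traj[-2]) * 0.5
    return traj[-1] + v

def m2i_style_predict(traj_a, traj_b):
    """Factor the joint two-agent problem: first decide who yields,
    then predict the influencer alone, then the reactor given the
    influencer's predicted future."""
    influencer, reactor = predict_relation(traj_a, traj_b)
    trajs = {"a": traj_a, "b": traj_b}
    inf_future = predict_marginal(trajs[influencer])
    rea_future = predict_conditional(trajs[reactor], inf_future)
    return {influencer: inf_future, reactor: rea_future}
```

The point of the factorization is that each sub-model is cheap: instead of one model searching the joint space of all agents' futures, the relation predictor fixes an ordering and the trajectory predictors run one agent at a time, which is what lets the real system operate in real time with less memory.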
Not only were these trajectory estimates more accurate than those from other ML models when tested on real traffic flow from the Waymo Open Motion Dataset, but the technique even outperformed a model recently published by Waymo. In addition, breaking the problem down to smaller pieces allows the technique to use less memory. A research paper, which also involves scholars from China’s Tsinghua University, will be presented at the Conference on Computer Vision and Pattern Recognition (CVPR).
Even though M2I’s performance helps promote trust in autonomous vehicles in the long run, the system has limitations. Its framework cannot account for cases where two agents are mutually influencing each other, such as when two drivers proceed forward because they aren’t sure who should yield in a confusing four-way stop. The researchers plan to address these limitations during future iterations, as well as use their method to simulate realistic interactions between road users, creating huge amounts of synthetic driving data that can improve model performance.
In celebration of World Quantum Day, Google Quantum AI and Doublespeak Games collaborated to launch the Qubit Game, a playful journey that lets you “build” a quantum computer, no quantum physics required. In addition to broadly introducing people to the world of quantum computing, the Qubit Game tasks players with the kinds of challenges quantum engineers face in their daily work. Successful gamers discover new upgrades to their in-game quantum computer and complete research projects, which the game’s creators hope will spark interest in how quantum computers are built.
Quantum computing has the potential to help practitioners pursue big opportunities, from better understanding the world by simulating quantum systems, to broad industrial applications like efficient energy consumption, public health, and the design of disease-curing medicines. To help solve these problems in the long term, Google Quantum AI joined the National Q-12 Education Partnership, created by the White House Office of Science and Technology Policy to provide access to K-12 quantum learning tools, in a bid to inspire the next generation of quantum leaders. A beta version of the Qubit Game is listed with other resources on the partnership’s QuanTime site.
To further expand access to quantum computing research and to keep the community updated with Google Quantum AI’s quantum journey, the team has released several resources. These include an immersive guide that explains their quantum computing effort and highlights the components of a quantum computer; a glimpse of what the future looks like; conference presentations; and additional games such as Quantum Chess in collaboration with Quantum Realm and Caltech.
Meta AI, in partnership with Carnegie Mellon University’s Department of Chemical Engineering, announced the Open Catalyst Project, which aims to design new ML models that discover catalysts for renewable energy storage. This project is part of a bigger mission by Meta Platforms to leverage ML technologies to combat climate change and increase the efficiency of industrial systems.
The transition to renewable energy such as solar and wind is one of the main strategies known to support the deceleration of climate change, but it depends on available sunlight and wind to generate power. When it’s not sunny or windy for an extended period of time, power generation drops significantly, and battery-based energy systems are needed to store excess energy for distribution during off-peak times. However, according to the researchers, the battery storage process isn’t scalable.
The Open Catalyst Project hopes to solve this problem by speeding up the testing of millions of potential material combinations in the lab to find the highest-efficiency storage technology. Open Catalyst has more than 8 million data points and 40,000 unique simulations across a variety of materials, which give researchers a significant experimental jump-start.
To further address climate change use cases, the Meta AI team ran experiments that optimized large-scale AI models, reducing infrastructure resources used for language translation by a factor of 800. The researchers believe that this algorithmic optimization practice can have a major impact on emissions caused by the use of AI for tasks such as natural language processing and translation.
A team of MIT CSAIL researchers just released the Self-Supervised Transformer with Energy-based Graph Optimization (STEGO) ML model, in collaboration with Cornell University and Microsoft, to help machines better “see” the same way people do. Today, it takes hundreds of hours of hand-labeling images to train AI and computer vision systems to develop a high-fidelity understanding of their surroundings. Specifically, creating computer vision training data involves a human drawing boxes around objects within an image so that the trained AI is able to identify the differences between those objects.
The STEGO algorithm is capable of identifying images down to the individual pixel. Using a technique called semantic segmentation, it assigns a class label to each pixel of an image, giving AI a more accurate view of the world around it. Semantic segmentation labels objects more accurately than the boxing method because it marks only the pixels that constitute an object, whereas a bounding box also sweeps in surrounding pixels that belong to other items. In an example image of a dog sitting on grass, semantic segmentation would label only dog pixels as dog and only grass pixels as grass, better differentiating the two objects.
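The difference between the two labeling schemes is easy to quantify on a toy example. The 6-by-6 “image” below is an invented illustration: a tight bounding box around an irregularly shaped object necessarily tags some background pixels with the wrong class, while a per-pixel mask does not.

```python
import numpy as np

# Toy 6x6 "image": 0 = grass, 1 = dog (an L-shaped dog blob).
labels = np.zeros((6, 6), dtype=int)
labels[2:5, 2:4] = 1
labels[4, 4] = 1

# Bounding-box labeling: everything inside the tightest box around
# the dog gets tagged "dog", including grass pixels caught in the box.
ys, xs = np.nonzero(labels == 1)
box = np.zeros_like(labels)
box[ys.min():ys.max() + 1, xs.min():xs.max() + 1] = 1

dog_pixels = int((labels == 1).sum())       # true per-pixel dog mask
box_pixels = int(box.sum())                 # pixels the box calls "dog"
mislabeled_grass = box_pixels - dog_pixels
print(dog_pixels, box_pixels, mislabeled_grass)  # → 7 9 2
```

Even on this tiny blob the box mislabels two grass pixels as dog; on real images with large, irregular objects, that over-labeling grows accordingly.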
Scalability is, however, a top-of-mind concern for the researchers. Multi-shot supervised systems usually require hundreds of thousands of labeled images for training purposes. And since there are 65,536 pixels in a single 256-by-256-pixel image, the required workload for labeling each pixel grows rapidly. To address this, STEGO looks for familiar objects that appear throughout a dataset and groups them together to construct a consistent view of the world across all the images it learns from.
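A minimal sketch of that grouping idea follows, under loose assumptions: the per-pixel features here are synthetic Gaussian blobs standing in for features from a pretrained backbone, and the grouping is plain k-means rather than STEGO's energy-based graph optimization. The point is that clustering pixels from all images at once yields one consistent label per visual concept without any hand annotation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-pixel features for three tiny images: "road"
# pixels cluster near 0, "sign" pixels near 5 (4-dim features).
road = rng.normal(0.0, 0.3, size=(3, 50, 4))   # 3 images, 50 px each
sign = rng.normal(5.0, 0.3, size=(3, 10, 4))   # 3 images, 10 px each
feats = np.concatenate([road.reshape(-1, 4), sign.reshape(-1, 4)])

# Unsupervised grouping: 2-means over every pixel of every image, so
# the same concept gets the same cluster id across the whole dataset.
centers = feats[[0, -1]]        # init with one pixel from each blob
for _ in range(10):
    assign = np.argmin(((feats[:, None] - centers[None]) ** 2).sum(-1), axis=1)
    centers = np.stack([feats[assign == k].mean(0) for k in (0, 1)])
```

After the loop, every road pixel in all three images shares one cluster id and every sign pixel the other, with zero labels provided, which is the trick that sidesteps annotating 65,536 pixels per image by hand.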
STEGO doubled the performance of previous semantic segmentation techniques, nearing human-level segmentations. In a driverless car dataset, STEGO successfully segmented out roads, people, and street signs with much higher resolution and granularity than previous systems.
Despite its superior performance, STEGO does have limitations. For example, it gets confused by nonsensical images such as a banana sitting on a phone receiver; STEGO can’t tell if that object is food-related or a phone handset. The researchers plan to increase STEGO’s flexibility during future iterations by allowing it to better identify multi-class objects.
This week’s highlights hint that partnerships between big tech and academia will continue to yield favorable returns on AI and ML efforts. The Open Catalyst Project allows researchers to tackle one of humankind’s biggest challenges: energy storage. And since many of these challenges require brute-force computation to accelerate discoveries, quantum computing seems well suited for the job. However, quantum computing will require the early involvement of students to ensure its long-term success.
Moreover, while the academia-based M2I will contribute to the evolution of autonomous vehicles, it was enabled by Waymo’s large traffic-flow dataset. Lastly, computer vision systems such as STEGO have the potential to equip autonomous vehicles with humanlike attention on the road, further promoting trust in these computers on wheels.
Until next time, stay informed and get involved!