The Week in AI is a roundup of high-impact AI/ML research and news to keep you up to date in the fast-moving world of enterprise machine learning. From a tool for calculating the carbon footprint of ML workloads to datasets and ML models that can improve computer vision, here are this week’s highlights.
A group of research scientists from the Allen Institute for AI, the University of Washington, Carnegie Mellon University, the Hebrew University of Jerusalem, Hugging Face, and Microsoft revealed a new method for calculating CO2 emissions from AI models, while providing guidance on ways to reduce emissions during training. Several software packages are available that can estimate the carbon emissions of AI workloads, but recently a team at the Université Paris-Saclay tested a group of these tools and found they’re not reliable in all contexts.
Their new approach, as presented about two weeks ago at the ACM Conference on Fairness, Accountability, and Transparency (FAccT), differs in two aspects: It records server chips’ energy use as a series of measurements, rather than summing their use over the course of training, and it aligns the usage data with a series of data points indicating the local emissions per kilowatt-hour (kWh) of energy used.
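To make the difference concrete, here is a minimal sketch of the time-resolved calculation described above. It is not the researchers' tool: the function name `total_emissions_g` and all of the numbers are hypothetical, and it assumes the energy readings and the grid-intensity readings already cover the same intervals.

```python
def total_emissions_g(energy_kwh, intensity_g_per_kwh):
    """Sum per-interval emissions: energy used (kWh) times the local
    grid intensity (gCO2 per kWh) during that same interval."""
    if len(energy_kwh) != len(intensity_g_per_kwh):
        raise ValueError("both series must cover the same intervals")
    return sum(e * c for e, c in zip(energy_kwh, intensity_g_per_kwh))


# Hypothetical readings for four consecutive intervals (illustrative only).
energy = [0.2, 0.4, 0.4, 0.2]             # kWh drawn by the GPUs per interval
intensity = [200.0, 400.0, 600.0, 400.0]  # gCO2 per kWh on the local grid

# Time-resolved estimate: each interval is weighted by its own intensity.
time_resolved = total_emissions_g(energy, intensity)

# Simpler estimate: total energy times the average intensity.
average_based = sum(energy) * (sum(intensity) / len(intensity))

print(f"time-resolved: {time_resolved:.1f} g, average-based: {average_based:.1f} g")
```

In this made-up case the two estimates differ because the heaviest energy use coincides with the dirtiest interval, which a single summed total cannot capture.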
In a preliminary experiment, the team found that a server’s GPUs used 74% of its energy. CPUs and memory used a minority, and they support many workloads simultaneously, so the team focused on GPU usage. Other measurements left out included the energy used to collect data or run trained models and the energy used to build the computing equipment, to cool or build the data center, and to transport engineers to and from the facility.
The researchers trained 11 ML models of different sizes to process language or images and obtained carbon emissions per kWh of energy used in 2020 in five-minute chunks for 16 geographical regions. Preliminary testing revealed that powering the GPUs to train the smallest models emitted about as much carbon as charging phones, while GPUs that completed 13% of training for the largest model containing 6 billion parameters emitted carbon levels comparable to powering a U.S. home for a year.
As for emission reduction, geographical region was the biggest measured factor: grams of CO2 per kWh ranged from 200 to 755 across regions. In addition to changing location, the researchers tested two CO2-reduction techniques. The first, Flexible Start, delays the start of training by up to 24 hours; it saved between 10% and 80% of carbon emissions for a small model but only 1% for a large model. The second, Pause and Resume, halts training at times of high emissions; it cut the largest model's emissions by 10% to 30% in half of the geographical regions but saved only a few percent for the small model.
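The two scheduling ideas can be sketched roughly as follows. This is a simplified illustration, not the researchers' implementation: the function names, the constant energy draw per interval, the assumption that grid intensity is known in advance (as with retrospective data), and the sample numbers are all hypothetical.

```python
def flexible_start(intensity, job_len, max_delay, kwh_per_interval=1.0):
    """Pick the start interval (delayed by at most max_delay intervals) that
    minimizes emissions for a job running job_len consecutive intervals."""
    last_start = min(max_delay, len(intensity) - job_len)
    if last_start < 0:
        raise ValueError("job does not fit in the forecast horizon")
    costs = {s: sum(intensity[s:s + job_len]) * kwh_per_interval
             for s in range(last_start + 1)}
    best = min(costs, key=costs.get)
    return best, costs[best]


def pause_and_resume(intensity, job_len, window, kwh_per_interval=1.0):
    """Run only during the job_len lowest-intensity intervals within the
    first `window` intervals, pausing during the rest."""
    window = min(window, len(intensity))
    if job_len > window:
        raise ValueError("window is too short for the job")
    by_intensity = sorted(range(window), key=lambda i: intensity[i])
    chosen = sorted(by_intensity[:job_len])
    return chosen, sum(intensity[i] for i in chosen) * kwh_per_interval


# Hypothetical grid intensity (gCO2 per kWh) for six consecutive intervals.
grid = [500, 300, 200, 400, 250, 600]

baseline = sum(grid[:2]) * 1.0  # start immediately, run two intervals
start, flex_cost = flexible_start(grid, job_len=2, max_delay=3)
slots, pause_cost = pause_and_resume(grid, job_len=2, window=6)

print(baseline, (start, flex_cost), (slots, pause_cost))
```

A short job fits entirely inside one low-intensity dip, which is consistent with Flexible Start helping small models most, while a job spanning many intervals can only benefit by skipping the worst ones.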
While the researchers found these optimization techniques interesting when using retrospective data, they assert that real-time predictions of emissions per kWh will happen in the near future.
Scientists from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and Stanford University recently trained robots to manipulate soft, deformable material into various shapes using only visual inputs, work that could one day enable better home assistants. Modeling and manipulating objects with many degrees of freedom is an essential capability for robots that need to perform complex industrial and household interaction tasks, such as stuffing dumplings, rolling sushi, and making pottery.
The new system learns directly from visual inputs to let a robot with a two-fingered gripper see, simulate, and shape doughy objects. “RoboCraft” could reliably plan a robot’s behavior to pinch and release children’s modeling compound to make various letters, including ones it had never seen. With just 10 minutes of data, the two-fingered gripper rivaled human counterparts who teleoperated the machine, performing on par with them, and at times even better, on the tested tasks.
Traditionally, researchers used complex physics simulators to model and understand the forces and dynamics applied to objects, but RoboCraft uses only visual data. It turns the images into graphs of small particles and, with a graph neural network as its dynamics model coupled with planning algorithms, makes more accurate predictions about how the material changes shape.
The system relies on three parts to shape the modeling compound into the letters B, R, T, X, and A. The first part, perception, is about learning to “see”: raw visual sensor data collected from the environment is turned into clouds of particles that represent shapes. The second, a graph-based neural network, uses the particle data to learn to “simulate” the object’s dynamics, or how it moves. The third, a set of planning algorithms, plans the robot’s behavior so it learns to “shape” a blob of dough, armed with training data from the many pinches.
Looking ahead, the scientists envision using RoboCraft to assist with household tasks and chores, which could be of particular help to the elderly or those with limited mobility.
Google’s AI researchers claim they have achieved quantum advantage in the field of ML, meaning a quantum computer performed tasks that aren’t possible with a classical machine. This is the first demonstration of a provable exponential advantage in learning about quantum systems that is robust even on today’s noisy hardware.
Google reached the milestone on its Sycamore quantum computer, which completed a series of learning tasks using a quantum learning algorithm that analyzes the output of quantum sensors. A classical ML system could not directly access and learn from quantum information, but the quantum learning agent can access and directly interact with the quantum data, providing a level of analysis not possible in any other way.
In a new paper published in Science, the team described how their quantum learning agent performed exponentially better than classical ML on other tasks not involving quantum data. To achieve this, the researchers had to scale up the number of qubits and improve quantum error correction.
Even though these results were obtained in lab conditions, businesses are already studying the ways quantum ML techniques could be deployed in the real world. At the AI Summit in London, for example, data scientists from Barclays mentioned several realistic use cases for quantum ML in financial services, including fraud detection.
Overall, researchers forecast a continued growth in interest in extending the recent success of quantum ML advantage to more meaningful tasks such as predicting properties of physical systems, performing quantum principal component analysis on noisy states, or learning approximate models of physical dynamics.
Visibility into carbon footprint data from heavy AI workloads unlocks many possibilities for sustainable technologies. Given such information, users might decide to train at different times or in different places, buy carbon offsets, or train a different model—or no model at all. Before running experiments, practitioners should think about resource optimization techniques and verify whether ML is even needed to solve a particular problem.
Currently, training state-of-the-art ML models is not only computationally very expensive, but also potentially harmful to nature—and even with the latest hardware, the training times can reach up to several weeks. Researchers’ hope is that quantum advantage will continue to cut both time and costs when performing tasks that cannot be completed by classical ML systems.
Until next time, stay informed and get involved!