The Week in AI is a roundup of high-impact AI/ML research and news to keep you up to date in the fast-moving world of enterprise machine learning. From an ML model that estimates people’s age based on video selfies to a superchip aiming to democratize AI, here are this week’s highlights.
Meta is partnering with Yoti, a company that specializes in digital identity, to estimate people’s age on Instagram by looking at their face. This happens without the need to physically check documents and without human intervention, all while protecting users’ privacy. Previously, any Instagram user who attempted to change their date of birth so that they would be over the age of 18 was required to verify their age by uploading an ID.
Now, Meta is testing two other options: taking a video selfie or having three followers vouch for you. The video selfies are evaluated using Yoti’s technology to make age estimations. Yoti offers verification services for several industries, including social media, gaming, and age-restricted e-commerce. Its ML algorithms were verified by the Age Check Certification Scheme.
Meta says that once Yoti shares the estimates with Meta, both parties delete the videos to protect users’ privacy.
For its training datasets, Yoti uses anonymous images of diverse people from around the world who have given permission to the company, with the option to delete their data upon request. For people under the age of 13, Yoti collected data only with explicit consent from parents or guardians.
Results from a recent Age Estimation whitepaper reveal that the accuracy of Yoti’s main ML model, measured as mean absolute error, is 2.96 years for 6-to-70-year-olds, 1.52 years for 13-to-19-year-olds, and 1.56 years for 6-to-12-year-olds.
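Mean absolute error is the average, in years, of the absolute gap between predicted and true ages. A minimal sketch of the metric, with purely illustrative sample values (not Yoti’s data):

```python
def mean_absolute_error(true_ages, predicted_ages):
    """Average absolute difference between true and predicted ages, in years."""
    assert len(true_ages) == len(predicted_ages)
    return sum(abs(t - p) for t, p in zip(true_ages, predicted_ages)) / len(true_ages)

# Illustrative values only.
true_ages = [14, 17, 21, 35]
predicted_ages = [15, 16, 24, 33]
print(mean_absolute_error(true_ages, predicted_ages))  # → 1.75
```

An MAE of 1.52 years for 13-to-19-year-olds means the model’s age estimates for that group are off by about a year and a half on average.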
To protect people from the technology’s potential negative effects, Yoti will continue to minimize gender and skin-tone biases that may be embedded in its models.
Meanwhile, Meta plans to leverage AI to determine whether someone is a teen or an adult, which in turn blocks adults from messaging teens and prevents teens from accessing Facebook Dating or receiving restricted ad content.
Through a collaboration called BigScience, an international team of around 1,000 largely academic volunteers recently revealed BLOOM, an open-source natural language AI. It challenges big tech’s models by addressing the biases that ML systems inherit from the texts on which they train.
BLOOM, a 176-billion-parameter model hosted by Hugging Face and on par with OpenAI’s GPT-3, was trained with $7 million worth of publicly funded computing time. It is the first model of its scale to be multilingual (the name derives from “BigScience Large Open-science Open-access Multilingual Language Model”).
The team, which launched an early version of the model on June 17, hopes to help reduce the harmful outputs of natural language AI systems. Large models that recognize and generate language, which are increasingly used in applications such as chatbots and translators, can sound very human. However, these models may display flaws, particularly human biases, because they tend to train on language collected from the web, including sites such as Reddit.
To reduce bias, the BigScience researchers handpicked nearly two-thirds of their 341 billion-word dataset from 500 sources. Among them was Semantic Scholar, an AI-backed search engine for academic publications that also includes content such as Nature news articles. The sources were suggested during a series of workshops, including with community groups, such as the African natural language–processing community Masakhane, LatinX in AI, and Machine Learning Tokyo.
Researchers will be able to download the fully trained BLOOM model to experiment and train it on new data for specific applications. However, running the model will require significant hardware capacity. To address this, the BigScience team will publish smaller, less hardware-intensive versions of BLOOM, as well as create a distributed system that allows labs to share the model across their servers.
In addition, Hugging Face will soon release a web application that will allow anyone to query BLOOM without the need to download it. Besides being a tool for exploring AI, BLOOM will be open for a range of research use cases, such as extracting information from historical texts and making classifications in biology.
Scientists from Stanford University and the University of Wisconsin-Madison have created an AI system comprising two models: EG3D, which generates random 3D images of faces and other objects in high resolution while running in real time on a laptop, and GiraffeHD, which lets users edit precise features for desired scenes.
The models’ near-photorealistic 3D scenes aim to speed up and facilitate animated film development, help artists working on games, improve CGI in films, and make it easier to create hyper-realistic avatars.
Like its predecessors, EG3D is powered by a generative adversarial network (GAN) to produce images, leveraging features from existing high-resolution 2D architectures along with additional components that convert images into 3D space. The researchers thereby solve two problems at once: computational efficiency and backwards compatibility with existing architectures. While the 3D images EG3D produces are impressive, they can be difficult to edit in design software because GANs are a black box.
In response to the lack of tools for editing 3D images, the researchers launched GiraffeHD, an ML model that can extract manipulable features. In other words, GiraffeHD, which is trained on millions of images of a specific type, such as cars, looks for latent factors: hidden, controllable image features corresponding to categories such as car shape, color, or camera angle, which can be used to edit 3D-generated images.
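The core idea of editing via latent factors is that if different slices of a model’s latent vector control different attributes, you can change one attribute by replacing only its slice. A hedged sketch of that mechanic; the factor layout, slice sizes, and `edit_factor` helper are hypothetical illustrations, not GiraffeHD’s actual interface:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layout: disjoint slices of one latent vector tied to factors.
FACTOR_SLICES = {"shape": slice(0, 64), "color": slice(64, 128), "camera": slice(128, 192)}

def edit_factor(latent, factor, new_code):
    """Return a copy of `latent` with only the chosen factor's slice replaced."""
    edited = latent.copy()
    edited[FACTOR_SLICES[factor]] = new_code
    return edited

z = rng.standard_normal(192)
z_recolored = edit_factor(z, "color", rng.standard_normal(64))

# Everything outside the color slice is untouched, so the generated
# object would keep its shape and camera angle but change color.
print(np.array_equal(z[:64], z_recolored[:64]))  # → True
```

In a real disentangled GAN, the edited latent vector would then be passed back through the generator to render the modified image.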
EG3D and GiraffeHD were recently unveiled at the 2022 Computer Vision and Pattern Recognition (CVPR) conference.
Chip startup Cerebras has developed a piece of silicon the size of a dinner plate (far larger than the average chip, which is measured in millimeters) that aims to make training AI cheap and easy. Inside a conference room at a Silicon Valley data center, the company recently demonstrated how its technology allows people to shift between deploying different versions of an AI natural language model in a matter of moments, a task that usually takes hours or days.
Cerebras’s chip can train an entire 20-billion-parameter model on a single, nearly foot-wide silicon wafer called the Wafer Scale Engine (WSE). Traditionally, scaling an AI model from 1 billion to 20 billion parameters requires adding more server hardware and reconfiguring racks inside a data center. By making training cheaper, Cerebras claims, its chip allows a natural language AI model to deliver performance far superior to Nvidia’s flagship graphics processor-based systems.
The company aims to give researchers and organizations with tiny budgets—in the range of tens of thousands of dollars—access to AI training tools that were previously available only to much larger organizations with more money to spend.
Cerebras focuses on two main interests: the growing challenges of AI compute, and the production of useful chips the size of a wafer in collaboration with Taiwan Semiconductor Manufacturing Company (TSMC).
Its current chip generation, called WSE-2, can offer considerable performance improvements compared to stringing together roughly 80 graphics processors to achieve the computational horsepower required to train some of the largest AI models. WSE-2’s speed advantage is due to data moving faster across a single chip than across a network of dozens of chips.
Verifying age online is a complex, industrywide challenge. Teens in particular don’t always have access to the forms of ID that make age verification clear and simple. Big tech companies such as Meta are cautiously using AI while consulting with government agencies and subject-matter experts in their industry to set clear standards for age verification online, with people's privacy in mind.
And despite the high performance of large language models such as BLOOM, which promises to reduce embedded bias, researchers advise proceeding with caution when using the model, urging practitioners to read the accompanying documentation outlining its capabilities and limitations. BLOOM could also find uses in research outside AI: linguists hope to use it to extract information from collections of historical texts that are too large to go through by hand and that can’t be retrieved using a search engine.
Until next week, stay informed and get involved!