Google researchers showed back in 2017 that Transformer models could generate convincing text from a prompt. In 2020, an overlapping team successfully classified images with a Vision Transformer (ViT). In early 2022, OpenAI released DALL·E 2, announced in a blog post demonstrating a model that could reliably generate high-resolution, sophisticated images in a wide variety of learned styles from a simple text prompt. In that post, the OpenAI team described three different ways they used data selection to reduce certain harmful content. They used it to:
This is a great framework for generally improving content moderation of large ML models. Read on to learn how each of those three steps could be supplemented with synthetic data to improve the results even further.
DALL·E 2’s developers used an active learning approach to create classifiers that find examples of the image categories they would like to label—for example, images not safe for work (NSFW). They started with small datasets of both positive and negative examples (a few hundred of each), but accurate, robust models typically require closer to 1,000 samples per class.
At this stage, procedurally generated synthetic examples of both classes would make these classification models far more robust before even proceeding to the active learning stage. In an ideal scenario, the classifiers start out with far higher accuracy this way, and the active learning stage can yield the same results in fewer iterations. That means faster turnaround at lower cost.
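To make the idea concrete, here is a minimal sketch of that bootstrapping step. Everything here is hypothetical: real feature vectors are stand-ins for images, Gaussian jitter stands in for a real procedural-generation pipeline, and a nearest-centroid classifier stands in for the real model. The `synthesize_around` helper is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def synthesize_around(seeds, n_new, scale=0.1, rng=rng):
    """Procedurally generate synthetic feature vectors by jittering real
    seed examples with Gaussian noise (a toy stand-in for a real
    synthetic-image pipeline)."""
    idx = rng.integers(0, len(seeds), size=n_new)
    return seeds[idx] + rng.normal(0.0, scale, size=(n_new, seeds.shape[1]))

# A few hundred real seeds per class (here: toy 2-D features).
pos_seeds = rng.normal(loc=1.0, scale=0.3, size=(200, 2))   # "unsafe" class
neg_seeds = rng.normal(loc=-1.0, scale=0.3, size=(200, 2))  # "safe" class

# Expand each class toward the ~1,000 samples a robust model wants.
pos = np.vstack([pos_seeds, synthesize_around(pos_seeds, 800)])
neg = np.vstack([neg_seeds, synthesize_around(neg_seeds, 800)])

# Nearest-centroid classifier as a minimal stand-in for the real model.
pos_centroid, neg_centroid = pos.mean(axis=0), neg.mean(axis=0)

def classify(x):
    """Label x by whichever class centroid it lies closer to."""
    if np.linalg.norm(x - pos_centroid) < np.linalg.norm(x - neg_centroid):
        return "unsafe"
    return "safe"
```

The point of the sketch is only the shape of the workflow: start from a few hundred real examples per class, expand each class synthetically before active learning begins, and train on the combined set.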
Furthermore, humans are arguably better at intuitively defining what counts as not-safe-for-work than at combing a large database of images to find the NSFW needles in the haystack. With active learning, humans can reinforce the algorithm’s signal as to what is in bounds and what is out: additional examples of positive and negative classifications help the model improve in accuracy, eliminating the need to assess images one at a time from the (massive) dataset that OpenAI trained DALL·E 2 on. OpenAI’s active learning approach consisted of two main steps:
Even with exploration through cross-validation, however, some of the clusters identified in feature space might have been ideally suited to synthetic generation. That additional step could have reduced the need for human labelers to exhaustively search through sexual or violent data.
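One way that cluster-targeted step could look, as a hypothetical sketch: cluster the feature space, find the cluster with the fewest samples, and generate synthetic probes around its center for labelers to confirm, instead of having them search the raw data. The tiny `kmeans` below is a toy implementation, and the toy 2-D features again stand in for real image embeddings.

```python
import numpy as np

rng = np.random.default_rng(1)

def kmeans(X, k, iters=20, rng=rng):
    """Minimal k-means for illustration; returns (centers, labels)."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Distance from every point to every center, then assign.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels

# Toy feature space: two dense regions and one sparse, underrepresented one.
X = np.vstack([
    rng.normal([0.0, 0.0], 0.2, size=(300, 2)),
    rng.normal([3.0, 3.0], 0.2, size=(300, 2)),
    rng.normal([0.0, 3.0], 0.2, size=(20, 2)),   # sparse cluster
])

centers, labels = kmeans(X, k=3)
counts = np.bincount(labels, minlength=3)
sparse = counts.argmin()  # cluster most in need of synthetic support

# Generate synthetic probes around the sparse cluster's center for
# human labelers to confirm, rather than searching the whole dataset.
probes = centers[sparse] + rng.normal(0.0, 0.2, size=(50, 2))
```

Labelers then review 50 targeted probes instead of scanning 620 raw samples, which is the cost reduction the paragraph above is after.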
The OpenAI team next assigned a loss score to every image in the training set. For each image, they calculated the ratio of the likelihood that it came from the unfiltered dataset versus the filtered dataset. A high ratio implied that the filtered dataset lacked proper representation of that image’s cluster, so they weighted that sample’s loss more heavily. (In their blog post, they mentioned that images of women were underrepresented in their filtered dataset.)
However, rather than artificially inflating the loss of a specific image, synthetic data could be generated around high-ratio samples to create additional points in the nearby cluster. We could convert the ratio into a probability and use it to decide whether or not to sample one of the X nearest neighbors of that image from a synthetic dataset. This would reduce the risk of overfitting to particular examples with very high loss, and it would also debias the model by providing more examples of the underrepresented class.
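A hedged sketch of that ratio-to-probability idea follows. The sigmoid-on-log-ratio squashing and the helper names (`augmentation_probability`, `maybe_sample_synthetic_neighbor`) are assumptions of mine, not anything OpenAI describes; the ratio of 1.0 (balanced representation) maps to a 50% chance of augmentation, and higher ratios push that chance toward 1.

```python
import numpy as np

rng = np.random.default_rng(2)

def augmentation_probability(ratio, temperature=1.0):
    """Map the unfiltered/filtered likelihood ratio to a probability of
    drawing a synthetic neighbor instead of upweighting the loss.
    Hypothetical choice of squashing: a sigmoid on the log-ratio."""
    return 1.0 / (1.0 + np.exp(-np.log(ratio) / temperature))

def maybe_sample_synthetic_neighbor(image_vec, synthetic_bank, ratio, k=5, rng=rng):
    """With probability derived from the ratio, return one of the k
    nearest synthetic neighbors of image_vec; otherwise return None."""
    if rng.random() >= augmentation_probability(ratio):
        return None
    d = np.linalg.norm(synthetic_bank - image_vec, axis=1)
    nearest_k = np.argsort(d)[:k]
    return synthetic_bank[rng.choice(nearest_k)]
```

Sampling stochastically from a pool of k neighbors, instead of always reusing the single highest-loss image, is precisely what spreads the corrective signal over a region of feature space rather than one point.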
The next step the OpenAI team took was to deduplicate their dataset. Using random sampling, they examined subsets of the dataset and created clusters based on five parameter sets; those clusters were then searched for duplicate images. However, simply discarding the duplicates isn’t necessarily ideal. It should be feasible to maintain the dataset’s size by replacing each duplicate with a synthetic equivalent: assess the class of the image, use a random seed, and procedurally generate a new image of that class. Even if the classification is of low certainty, mitigating the reduction in dataset size while adding a new, synthetic image has a double benefit: the resulting dataset contains only unique images, yet it keeps its original size. Additionally, setting up a generative adversarial network (GAN) to synthesize images that the trained model judges as in-class versus impostors might further sharpen the boundaries between safe and unsafe content.
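The replace-instead-of-discard step can be sketched as follows. This is a toy over feature vectors, not images, and the `synth_fn` jitter is a stand-in for the real "classify, seed, and procedurally regenerate" pipeline described above; exact-match distance below a tolerance stands in for real near-duplicate detection.

```python
import numpy as np

rng = np.random.default_rng(3)

def dedupe_with_synthetic_replacement(X, synth_fn, tol=1e-6):
    """Replace near-duplicate rows with synthetic equivalents so the
    dataset keeps its original size but every row is unique.
    synth_fn(x) procedurally generates a fresh sample near x."""
    kept, out = [], []
    for x in X:
        is_dup = any(np.linalg.norm(x - k) < tol for k in kept)
        if is_dup:
            out.append(synth_fn(x))   # same class, new random sample
        else:
            kept.append(x)
            out.append(x)
    return np.array(out)

# Toy example: row 2 duplicates row 0.
X = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 0.0]])
jitter = lambda x: x + rng.normal(0.0, 0.05, size=x.shape)
X_clean = dedupe_with_synthetic_replacement(X, jitter)
```

After the pass, `X_clean` has the same number of rows as `X`, but the duplicate row has been swapped for a nearby synthetic sample.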
Looking back, OpenAI’s DALL·E 2 was a breakthrough for highly convincing image synthesis, and perhaps more importantly, it introduced three groundbreaking techniques to improve harmful content filtering via tweaks or maybe even “hacks” to their training data:
What’s somewhat surprising about these three particular innovations is that synthetic data provides an opportunity to improve on all of them. Since, generally speaking, large machine learning models can be manipulated or even enhanced through the data you feed them, synthetic data is a cheap way to tune model performance to suit your needs. And particularly for “unsafe” classes, synthetic data reduces the need for human annotators to curate, or even be exposed to, these harmful images.