We are delighted that Figure Eight is a launch partner for Google’s Cloud AutoML service. As more of the world’s software developers are embedding machine learning (ML) into their applications, Cloud AutoML is a great addition to the toolkit on Google Cloud Platform (GCP), and it fits in alongside the existing services including Computer Vision, Natural Language Processing, and Translation.
Cloud AutoML is a new kind of machine learning offering from Google: it is their first service where customers can adapt the machine learning to their own use case without having to build it from the ground up. Their previous computer vision service could identify a predefined set of labels in an image. For example, it might identify a ‘shoe’ in an image, but nothing more fine-grained. With Cloud AutoML, a customer who cares a lot about shoes can identify very specific labels, like the type of shoe (“dress”, “athletic”, etc.), the type of fastening (“laces”, “buckles”, etc.), the type of heel, the brand, the specific color, or many other labels that could be applied to shoes. This is a very powerful extension to the existing machine learning products on GCP, as a customer can use the labels that are the most important for their business. For example, if you are trying to determine what styles of shoes are popular across thousands of fashion websites, you would need to identify each of these specific features of the shoes, not just whether or not a shoe exists.
I demoed exactly this use case at Google Cloud Next for the launch of AutoML: how does WGSN, “The World’s Trend Authority”, identify the current trends in footwear and other fashion products?
The key to adapting machine learning to your labels for computer vision is to provide human-labeled examples of those images. Typically, you need thousands of examples of each label for a computer vision algorithm to learn how to correctly identify a given label. So, if you want to identify laces vs buckles on shoes, you need humans to manually label thousands of examples of each type of shoe. AutoML does a great job on the algorithm side, selecting the right kind of machine learning algorithm and parameters to get the most out of that data, but the need for human labels remains the most important “last mile” in machine learning.
That’s where Figure Eight comes in. Figure Eight provides the software to create training data with the specific labels that matter to the customers and allows them to collect the labels they need to adapt their models to their specific use cases. Here’s an example from Figure Eight customer WGSN:
The configurable interface of our annotation platform allows WGSN’s machine learning experts to incorporate detailed instructions and pop-up photographs to ensure that people are providing the correct labels.
Figure Eight also has thousands of people in our marketplace who are experts in annotating fashion images. The same people who provide image labels for WGSN might also be labeling similar images for eBay (also a joint customer of Figure Eight and Google Cloud AutoML) or for dozens of other Figure Eight customers in the fashion industry, so WGSN is able to draw on their expertise while getting labels fast.
WGSN also takes advantage of the other quality controls in the Figure Eight platform, like giving the same task to multiple people so that the errors of one person don’t propagate, and allowing customers to limit the annotator pool to certain regions and languages. They also use our most important quality control: our patented method for embedding known answers as test questions to automatically quiz potential annotators and track their accuracy over time, to truly scale the accurate collection of human labels.
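To make the two core controls concrete, here is a minimal sketch of how redundant judgments and test questions could work together. It is an illustration only, not our patented method or the platform's API: the function names, the `(annotator, item, label)` tuple layout, the sample data, and the `min_accuracy` threshold of 0.7 are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def annotator_accuracy(judgments, gold):
    """Return each annotator's fraction of correct answers on test questions."""
    correct = defaultdict(int)
    seen = defaultdict(int)
    for annotator, item, label in judgments:
        if item in gold:  # only test questions have a known answer
            seen[annotator] += 1
            correct[annotator] += int(label == gold[item])
    return {a: correct[a] / seen[a] for a in seen}


def aggregate_labels(judgments, gold, min_accuracy=0.7):
    """Majority-vote each non-test item, counting only trusted annotators.

    Annotators below min_accuracy (a hypothetical threshold) or with no
    test-question history are ignored. Ties go to the label seen first.
    """
    accuracy = annotator_accuracy(judgments, gold)
    votes = defaultdict(Counter)
    for annotator, item, label in judgments:
        if item in gold:
            continue  # test questions are for scoring, not for output
        if accuracy.get(annotator, 0.0) >= min_accuracy:
            votes[item][label] += 1
    return {item: counts.most_common(1)[0][0] for item, counts in votes.items()}


# Hypothetical data: two test questions and two real shoe images.
gold = {"gold1": "laces", "gold2": "buckles"}
judgments = [
    ("ann", "gold1", "laces"), ("ann", "gold2", "buckles"),
    ("ann", "shoe1", "laces"), ("ann", "shoe2", "buckles"),
    ("bob", "gold1", "laces"), ("bob", "gold2", "buckles"),
    ("bob", "shoe1", "laces"), ("bob", "shoe2", "laces"),
    ("cat", "gold1", "buckles"), ("cat", "gold2", "laces"),
    ("cat", "shoe1", "buckles"), ("cat", "shoe2", "laces"),
    ("dan", "gold1", "laces"), ("dan", "gold2", "buckles"),
    ("dan", "shoe1", "laces"), ("dan", "shoe2", "buckles"),
]

accuracy = annotator_accuracy(judgments, gold)
labels = aggregate_labels(judgments, gold)
print(accuracy)
print(labels)
```

In this toy data, the annotator who misses both test questions is excluded, so their judgments cannot swing the vote on the real images. A production system would also track accuracy over time rather than in a single pass.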
With these quality controls in place, the annotation can be scaled seamlessly, as we have more than 100,000 people regularly annotating in our marketplace. For a Google-internal team last week, we were providing more than 250,000 labels per hour, delivering in a matter of hours a volume of labels that would have taken one person years to complete.
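As a rough back-of-the-envelope check on that comparison, the sketch below estimates how long one person would need to produce the same volume. Only the 250,000 labels per hour figure comes from the numbers above; the eight-hour run, the 10 seconds per label, and the 8-hour, 250-day working year are assumptions chosen purely for illustration.

```python
def solo_labeling_years(total_labels, seconds_per_label,
                        hours_per_day=8, days_per_year=250):
    """Estimate the working years one person needs to produce total_labels."""
    hours = total_labels * seconds_per_label / 3600
    return hours / (hours_per_day * days_per_year)


# Assumed scenario: the crowd sustains 250,000 labels/hour for 8 hours.
crowd_labels = 250_000 * 8
years = solo_labeling_years(crowd_labels, seconds_per_label=10)
print(f"{crowd_labels:,} labels would take one person about {years:.1f} working years")
```

Under these assumptions, a single working day of crowd output corresponds to nearly three working years for one annotator. The exact figure shifts with the time each label takes.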
Together, this means that when WGSN takes their labeled data to Google to build a machine learning algorithm, they are confident that the labels are correct and that they have collected enough labels to make their computer vision models accurate.
What I personally love about our integration with Cloud AutoML is how many use cases this same Human-in-the-Loop Machine Learning can power. Today, we have customers building very similar computer vision algorithms but with very different data. For example: we are helping the United Nations estimate population densities from aerial photographs to track refugees; we are helping farming drones distinguish a weed from a seedling; and we are helping companies like Opendoor disrupt the housing industry. Opendoor is another joint customer with Cloud AutoML, and I was happy to speak alongside them at Google Cloud Next ’18! Even at the microscopic level, we’re helping companies label cells and diagnose diseases with this exact same combination of machine learning and human annotation.
Ultimately, we expect most customers of Cloud AutoML to use this as one tool among many in their application stack. We see similar behavior today in cutting-edge companies like TalkIQ who are using a range of different off-the-shelf and customizable machine learning tools on Google Cloud, especially in cases where customers are combining many different kinds of data, like images, text, audio, and video.
Integrations like Figure Eight and Cloud AutoML open the promise of machine learning to smaller businesses, particularly organizations without existing ML teams, and even to large corporations looking to experiment with machine learning for the first time. High-quality training data, the right models, and easy deployment are the most important ingredients for successful machine learning. So, we are glad to be part of the easiest-to-use combination of core machine learning components that exists today!