
So you want to train a computer vision algorithm? Here are a few essential resources

From annotating satellite imagery down to labeling microscopic images and everything in between, we’ve seen our fair share of computer vision projects here at CrowdFlower. And over the past month or so, we’ve been sharing what we’ve learned over the years, specifically about curating high-quality training datasets for real-world use cases.

This is just a quick post to compile some resources for computer vision practitioners and their teams. We’ll be hosting a class on the subject in November too, but since we’re still working out the venue and a handful of other details, we’ll tell you about that once the plans are firm. For now? Here’s a quick roundup of valuable materials to help make your CV project more successful:

What we learned labeling 1 million images

A free eBook compiling some tips and lessons we’ve learned labeling images for clients and academics. You’ll learn how to prevent overfitting, which tool works best for your project, why image quantity might be even more important than image quality, and a whole lot more.

Making a computer into a super-recognizer

The first in a three-part series by Tyler Schnoebelen, this piece looks at the most typical kinds of image processing tasks, the kind of team you need to succeed, and how to keep your image project from going off the rails.

A whirlwind tour of image processing use cases

Often, we break image tasks down by the tools they use or their specific endgame. But what if we looked at image processing on a more holistic level? How would we break it down? And what exactly are we trying to do with image processing anyway?

An art forger’s guide to image processing

The last of Tyler’s pieces looks at how art forgers succeed and what lessons we can take from them and apply to image processing. We cover why even the most thorough art connoisseurs can be duped, how Generative Adversarial Networks might help, and how we can all see more clearly by combining human and machine intelligence.

Data for Everyone library

We’d be remiss if we didn’t mention that our Data for Everyone library has some free image sets to help you train your models. There are plenty of NLP sets, data collections, and more here as well.


Speaking of datasets, Stanford Vision Lab’s ImageNet is perhaps the biggest and best out there. It comes in at over 14 million images and you can access them in a variety of ways.

CrowdFlower image annotation webinar

Join CrowdFlower platform expert Andy Butkovic as she explains how best to use our software and our contributor base to make your annotation project successful. Andy shares what makes projects successful, and what doesn’t, as well as plenty of other tips from the trenches.



Justin Tenuto


Justin writes things at Figure Eight. He enjoys baking bread and reading books about robots and aliens, and he is pretty sure he can beat you at ping-pong.