Data in the Fourth Dimension

November 28, 2017

We often make the mistake of thinking all data is static. And sure, while a picture of a cat playing a guitar will always be a picture of a cat playing a guitar, no matter how many times we look at it, the vast majority of data is actually alive. Sentiment about a sports team or politician fluctuates with recent championships or scandals. Trend forecasters can’t use blogs from 2005 about MySpace’s popularity. A person’s address or shirt size can be different today than it was yesterday.

Companies rely on living data far more often than static data. Logistics chains, aerial imagery, sales trends, you name it: they’re in constant flux. They evolve. And those businesses need a strategy to deal with constantly changing datasets. After all, you can’t make data-based decisions on stale data. Not if you want them to be smart ones, that is.

Take InsideView, for example. InsideView modernizes marketing and sales organizations with targeting intelligence and business data. But business data is living, and old data isn’t actionable for their clients. Take something as simple as a list of sales prospects. Anyone who has ever used an internal one knows that prospects don’t stay put. Your average person changes jobs a dozen times in their life, and a list of email addresses and contact information you pulled a year or two back is likely full of null entries. In fact, InsideView discovered that B2B data decays somewhere between 2 and 3 percent every month just through employee churn. That means you’re looking at databases where roughly 25% of your information is inaccurate a year from today.
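
For the curious, here is a quick back-of-the-envelope check on that yearly figure. It is only a sketch: the function name `fraction_stale` is made up for illustration, and it assumes the decay compounds, meaning that each month the same share of the still-accurate records goes stale. InsideView’s own methodology isn’t described here, so treat this as one plausible reading of the numbers rather than their calculation.

```python
def fraction_stale(monthly_decay: float, months: int = 12) -> float:
    """Share of records expected to be out of date after `months`.

    Assumption (ours, not InsideView's): decay compounds, so each month
    `monthly_decay` of the records that are still accurate go stale.
    """
    if not 0.0 <= monthly_decay <= 1.0:
        raise ValueError("monthly_decay must be between 0 and 1")
    # (1 - monthly_decay) ** months is the share still accurate.
    return 1.0 - (1.0 - monthly_decay) ** months


# The 2 to 3 percent monthly range quoted above, plus the midpoint.
for rate in (0.02, 0.025, 0.03):
    print(f"{rate:.1%} per month -> {fraction_stale(rate):.1%} stale after a year")
```

Under that assumption, a year of churn leaves somewhere between about a fifth and just under a third of the records out of date, which brackets the 25% figure.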

When accurate data is a key part of your business, you need a plan. InsideView uses a multi-method approach, drawing on partners like AWS, in-house expertise, proprietary technology, user-generated and editorial content, machine learning classifiers, and a whole lot more. They use us here at CrowdFlower to train and inform some of their models, as well as to validate data and update some of that living data we referred to before. Quality data, collected or validated quickly, is a real advantage for both InsideView and their 20,000+ customers.

Living data often requires an approach like this. It demands diverse approaches to problem solving and a smart team who can aggregate the best results from each tactic. Because rarely is a single methodology going to lead to maximum accuracy. And with any kind of data, living or static, accuracy is always of paramount importance.

There’s no silver bullet for which combination of techniques works for any given business. It takes iteration, trial and error, and constant evaluation. Which, in a way, is a lot like dealing with living data. The longer you rest on what you have, the more likely it is that something has gone missing or is unaccounted for. And the more you treat your data like a static entity, the more your business starts to follow suit.

Justin Tenuto


Justin writes things at Figure Eight. He enjoys books about robots and aliens, likes baking bread, and is pretty sure he can beat you at ping-pong.