How TalkIQ created the transcription and NLP training data they needed to make their models a success
When we changed to Figure Eight, within a few weeks, we saw that labeler accuracy go up to 88% and it stayed in the high 80s and 90s for us ever since, even across a large diversity of models. That’s been a really, really big win.
– Etienne Manderscheid,
Head of Data Science, Co-Founder TalkIQ
TalkIQ improves conversations with data. They collect telephone audio, transcribe those dialogs with in-house speech recognition models, and use natural language processing algorithms to make sense of every conversation. They use this universe of one-on-one conversations to identify what each rep, and the company at large, is doing well and what they aren’t, all with the goal of making every call a success.
Every company is different. Each has its own vernacular, its own target market, and its own goals for the conversations it has with its customers. That means every one of TalkIQ’s clients needs a robust, unique set of training data for the solution to work as well as it possibly can. Each training set informs a model that makes sense of that specific company’s conversations.
Creating these individual sets is a core part of TalkIQ’s offering. After all, they’re the fuel that drives the accuracy of the solution. TalkIQ had worked with a competitor of Figure Eight for six months but was having trouble reaching the accuracy threshold its models needed to succeed. Put simply, the labels they received weren’t good. They topped out at about 70% accuracy.
TalkIQ needed a change, so they turned to Figure Eight.
It took just a couple of weeks for the change to bear fruit.
“When we changed to Figure Eight, within a few weeks, we saw that labeler accuracy go up to 88% and it stayed in the high 80s and 90s for us ever since, even across a large diversity of models. That’s been a really, really big win.”
Since TalkIQ’s solution is built on their recognition, transcription, and comprehension models, the accuracy of those models is critical. After all, almost every engineer knows the old adage, “garbage in, garbage out.” Less colloquially, it boils down to this: bad labels mean bad training data, and bad training data means bad models.
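The case study does not say how labeler accuracy was measured. One common approach is to score submitted labels against a small set of items whose correct answers are already known. The following is a minimal sketch of that idea, not TalkIQ’s or Figure Eight’s actual method; the function name, call IDs, and key-moment categories are all hypothetical.

```python
def labeler_accuracy(labels, gold):
    """Fraction of gold-standard items whose submitted label matches the known answer.

    labels: dict mapping item id -> label submitted by contributors
    gold:   dict mapping item id -> known-correct label
    Only items present in both dicts are scored.
    """
    scored = [item for item in gold if item in labels]
    if not scored:
        raise ValueError("no overlap between submitted labels and gold items")
    correct = sum(1 for item in scored if labels[item] == gold[item])
    return correct / len(scored)


# Hypothetical gold answers for five call segments (illustrative categories only).
gold = {
    "call_01": "pricing",
    "call_02": "objection",
    "call_03": "next_steps",
    "call_04": "pricing",
    "call_05": "greeting",
}

# Hypothetical submitted labels; call_03 disagrees with the gold answer.
labels = {
    "call_01": "pricing",
    "call_02": "objection",
    "call_03": "pricing",
    "call_04": "pricing",
    "call_05": "greeting",
}

accuracy = labeler_accuracy(labels, gold)
print(f"Labeler accuracy: {accuracy:.0%}")
```

Tracking a number like this per job makes a drop in label quality visible before the labels reach model training.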
TalkIQ runs Figure Eight jobs to create the data that drives these models. Most of that work supports their transcription models: transcribing audio, categorizing audio into key moments and other important data classifications, and verifying both internal transcriptions and the outputs of their models. They even use Figure Eight’s geolocation tools to make sure that idiomatic speech from the U.K. is labeled by British contributors.
TalkIQ has been acquired by Dialpad, but they plan to keep scaling their operation with custom training data. They aim to label training datasets for hundreds of customer organizations and to generate and validate paraphrases, in addition to the key moments at which they already excel. The Figure Eight platform will be able to scale with them, maintaining the accuracy they’ve gotten used to every step of the way.