Audio and Speech-to-Text

Figure Eight powers transcription, utterance collection, time stamping, and categorization for myriad audio use cases and machine learning models.

As home agents and other voice-controlled devices enjoy widespread adoption, it’s imperative that they can actually understand their users. But regional accents, dialects, and background noise can make accurate audio models hard to train. Figure Eight can help. Here’s what we can do:

Transcription from audio

Understanding audio often requires that the audio be transformed into written text. Figure Eight provides free-text transcription of audio utterances through a multi-step workflow for maximum accuracy and quality. Additionally, because our contributor base and business process outsourcing (BPO) partners are globally distributed, we can do transcription (or recording) in whatever language you need.

Audio utterance collection

Collecting audio snippets and instructions is an important step in both creating and fine-tuning audio interfaces. Figure Eight gives you access to a distributed, multilingual contributor base to collect just this kind of data. You can ask for specific phrases or, more commonly, provide a prompt and have contributors record an utterance in their own words for the events your use case requires.

Audio categorization

Create a customizable job and ontology to uncover deeper information about the audio you need analyzed. Our human-in-the-loop platform can help you determine the emotion behind a statement, categorize the topics in a conversation, or identify any important event in a short snippet of recorded audio.

