In the world of speech to text and automated speech recognition (ASR), the accuracy of English transcription is reaching very high levels. The best voice recognition software can reportedly achieve accuracy of around 95% (and the numbers could be even higher today). Google is definitely among the top transcribers of English (read more in this Leaddesk article).
Transcription accuracy is often lower for languages that have fewer speakers or are used less often. Finnish, for example, is morphologically very complex and highly inflected, so a single word can take many forms. This makes speech to text processing of Finnish much more difficult. Add background noise and the other disruptive factors of phone call recordings, and you have a tough job on your hands.
Feelingstream has decided to take on the mission of transcribing phone calls and creating a contact analysis platform focusing on Nordic languages. We have put a lot of work into our Finnish model, transcribing phone calls and improving the accuracy.
Now we have decided to put our speech to text model to the test.
When you want to understand what you have, why not test it against the greats?
As mentioned earlier, Google is one of the front-runners in the world when it comes to speech to text accuracy. Azure is another strong competitor, and both offer options for free testing of their ASR.
Most speech to text recognition does not revolve around real-world phone calls with kids screaming in the background, traffic noise, or people talking over each other. Since this is what Feelingstream specializes in – transcribing phone calls for analysis – we decided to test our Finnish speech to text model against the publicly available Google and Azure models to see how we do.
- We tested the transcription with 20 recordings (agent and customer recorded separately)
- Demo calls made by different Feelingstream employees to various Finnish service providers (transportation, banks, pension, telecom, insurance).
- Native-level speakers of Finnish
- The median duration of calls was 151 seconds
- Calls were spontaneous with no prepared scripts (general questions regarding whichever topic suited the contact center)
- Calls contained various background noises and were made in an office environment on personal phones, without any special equipment.
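This post does not state which metric was used to score the three systems. A common choice for comparisons like this is word error rate (WER): the number of word substitutions, deletions, and insertions needed to turn the ASR output into a human reference transcript, divided by the number of words in the reference. The Python sketch below is an assumption about how such a comparison could be scored, not Feelingstream's actual evaluation code. The function names and example sentences are invented for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from the output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def relative_error_reduction(baseline_wer: float, new_wer: float) -> float:
    """Share of the baseline's errors that the other system avoids."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Invented example: one human reference and two imagined ASR outputs.
reference = "haluaisin tarkistaa laskun joka tuli eilen"
system_a = "haluaisin tarkistaa laskun tuli eilen"       # one word dropped
system_b = "haluaisin tarkastaa lasku joka tuli eilen"   # two words wrong

wer_a = word_error_rate(reference, system_a)
wer_b = word_error_rate(reference, system_b)
print(f"WER A: {wer_a:.3f}, WER B: {wer_b:.3f}")
print(f"Absolute difference: {wer_b - wer_a:.3f}")
print(f"Relative error reduction of A over B: {relative_error_reduction(wer_b, wer_a):.3f}")
```

A statement such as "X% more accurate" can refer either to the absolute difference in accuracy or to the relative reduction in errors. The two numbers can differ a lot, so it helps to say which one is being reported.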
The results are here!
As you probably already realized when reading the headline, our experiment showed great results for our Finnish speech to text accuracy. Compared to Azure, our transcription of spontaneous speech and phone calls was 20% more accurate. Compared to Google, the difference was even greater: a whopping 25%.
So to make it loud and clear –
Why use ASR for your call center contacts?
Large customer service and call centers handle thousands of calls each day. Analyzing those calls, whether to improve customer service quality, improve services, or reduce costs, is difficult for managers without statistical data.
Automated speech recognition software makes it possible to transcribe all calls to text. This makes the data analyzable and gives greater visibility into the day-to-day work of the call center. Publicly available Finnish ASR models are moderately accurate, but they cannot be adapted specifically to your company’s needs. Also, since the contents of the calls are often confidential, using public models or infrastructure is frequently not an option.
When working with our existing customers, we are often surprised by how they come up with new use cases on top of the transcription technology.
- Management teams use Feelingstream’s analytics platform in meetings to look at process timelines and call center statistics. They also reflect on the business decisions made based on statistical data from the platform.
- One telecom company tracked the reasons for contacts regarding one subcontractor, to monitor the number of calls and the reasons behind issues with that subcontractor's services.
Why is Feelingstream the best ASR provider for the Finnish market?
- High accuracy
- Specially trained for customer service calls in the Finnish language
- Available for all dialects in Finland (for example, Uusimaa and Tampere)
- Our own ASR model
- Adaptable for your company
- Two speakers separated in the transcript – enabling speaker-based searches.
- Punctuation – sentences are separated, making the text easier to read
- Capitalisation – names and the beginnings of sentences are capitalised, which helps the next processing step in business analysis
- API available
- Real-time transcription available (https://speaker.feelingstream.com/)
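To make the speaker separation point more concrete, here is a minimal Python sketch of what a speaker-based search over a transcript could look like. It is purely illustrative: the `Segment` structure, the `agent` and `customer` labels, the `search_by_speaker` function, and the example sentences are all hypothetical and do not describe Feelingstream's actual transcript format or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One utterance in a transcript (hypothetical structure)."""
    speaker: str    # assumed labels: "agent" or "customer"
    start_s: float  # start time in seconds from the beginning of the call
    text: str       # punctuated, capitalised text


def search_by_speaker(segments, speaker, keyword):
    """Return the segments spoken by `speaker` whose text contains `keyword`.

    This is a plain case-insensitive substring match, not a linguistic search.
    """
    needle = keyword.lower()
    return [
        seg for seg in segments
        if seg.speaker == speaker and needle in seg.text.lower()
    ]


# Invented example call.
transcript = [
    Segment("customer", 0.0, "Hei, haluaisin tarkistaa laskun."),
    Segment("agent", 3.2, "Totta kai, katsotaan laskun tiedot."),
    Segment("customer", 7.5, "Kiitos, lasku tuli eilen."),
]

for seg in search_by_speaker(transcript, "customer", "lasku"):
    print(f"{seg.start_s:>5.1f}s  {seg.speaker}: {seg.text}")
```

Because each segment carries its speaker label, the same keyword can be searched separately in what the customer said and in what the agent said.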
Make the most of your contact center data and listen to your customers
If you wish to see the business signals in your calls, there is no better way to proceed than to start transcribing the calls. We constantly work on our Finnish ASR model to strive for greater accuracy and find ways to improve. We’re also keeping our eyes on the Finnish Speech Donation project to collaborate and enhance our services in the future.
Feelingstream provides a speech recognition API to companies that want to transcribe their customer service calls. We also offer a business intelligence layer on top of our ASR technology (read examples in our blog – automated after-call memos, reducing repetitive calls, and much more). We also adapt our language model for individual clients if needed. We can install Feelingstream’s ASR model on the client’s own server (on-premises) or in a private cloud, which keeps the setup secure.
Contact us for a demo to see how to create greater call center visibility, improve customer service quality, search for sales opportunities, and automate your after-call work – all with Feelingstream.