From chatbots to predictive text, all kinds of applications are using AI to navigate language barriers and facilitate communication across different communities. Many of these applications focus on text, but there is more to language than written words. Sometimes even fluent speakers of a second language will experience challenges when communicating face-to-face with native speakers. One of the best ways to overcome these challenges is to practice pronunciation.
Markus Koy (@MarkusK) is an IT projects analyst with 18 years of experience across various industries. He is also a native German speaker living in an English-speaking part of Canada, and a regular visitor to C2C’s AI and ML coffee chats, which are hosted in the U.S. Koy’s experiences working in English-speaking countries as a non-native English speaker inspired him to create thefluent.me, an AI-powered app that tests speech samples and scores them based on how well they correspond to standard English pronunciation.
On thefluent.me, users record themselves reading samples of English text (usually about 400 characters long), and then post them either publicly or privately on the app’s website. Within about 30 seconds, the app delivers results, reproducing the text and indicating which words were pronounced well and which could be pronounced better. Even native English speakers may find room to improve their pronunciation, sometimes more than someone who speaks English as a second language.
We recently approached Koy with some questions about thefluent.me, Google Cloud products, and his experience with the C2C Community. Here’s what we learned:
What inspired you to develop thefluent.me?
Koy began working on thefluent.me after contributing to a research project with an international language school. As a second-language English speaker himself, he had already taken the International English Language Testing System (IELTS) exam, and he had found pronunciation to be the hardest part of the process.
“Immediate feedback after reading a text is usually only available from a teacher and in a classroom setting,” he says. Teachers only listen to a speaker’s pronunciation once, and will likely not provide feedback on every word. Tracking progress systematically is just not feasible in a classroom setting, and sometimes non-native speakers will feel intimidated when speaking English in front of other students.
Koy continued his research on AI speech-recognition programs and also completed Google’s TensorFlow in Practice and IBM’s Applied AI specialization programs. He decided to build thefluent.me to help students struggling to overcome these challenges.
What makes thefluent.me unique?
There are many apps on the market for students studying English as a second language, and thefluent.me is not the only app of this kind that uses AI for scoring. However, each app combines different features to support distinct learning needs. Koy kept these needs in mind when designing and building the following features for thefluent.me:
Immediate pronunciation feedback: The application delivers AI-powered scoring for the entire recording and word-level scoring on an easy-to-understand scale.
Immediate feedback on reading speed: Besides pronunciation, the application provides feedback on the reading speed for each word.
Own content: Users can add their own posts to practice instead of relying only on content published by the platform. They can immediately listen to the AI read their post before practicing.
Progress tracking and rewards: Users can track their activities and progress. They can revisit previous recordings and scores, check their average score, and earn badges.
Group learning experience: By default, user posts are not accessible to others. However, users can also make their posts public and invite others to try, or they can compete for badges.
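The word-level scoring and reading-speed feedback described above can be illustrated with a small sketch. This is not how thefluent.me computes its scores, which the interview does not describe. It assumes a speech recognizer has already returned a start time, end time, and confidence value for each word (Speech-to-Text can return word time offsets and word confidence when those options are enabled). The `Word` class, the 0.8 threshold, and the sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float       # seconds from the start of the recording
    end: float         # seconds from the start of the recording
    confidence: float  # 0.0 to 1.0, assumed to come from a recognizer


def words_per_minute(words):
    """Average reading speed across the whole recording."""
    if not words:
        return 0.0
    duration = words[-1].end - words[0].start
    if duration <= 0:
        return 0.0
    return len(words) / duration * 60.0


def word_durations(words):
    """Seconds spent on each word, as a rough per-word speed signal."""
    return {w.text: round(w.end - w.start, 2) for w in words}


def needs_practice(words, threshold=0.8):
    """Words whose confidence falls below a hypothetical threshold."""
    return [w.text for w in words if w.confidence < threshold]


def overall_score(words):
    """Mean confidence scaled to 0-100, a stand-in for a recording score."""
    if not words:
        return 0.0
    return round(100 * sum(w.confidence for w in words) / len(words), 1)


# Hypothetical recognizer output for a short reading.
sample = [
    Word("the", 0.0, 0.2, 0.97),
    Word("weather", 0.2, 0.7, 0.91),
    Word("is", 0.7, 0.9, 0.95),
    Word("thoroughly", 0.9, 1.8, 0.62),
    Word("pleasant", 1.8, 2.4, 0.85),
]

print(needs_practice(sample))
print(overall_score(sample))
print(round(words_per_minute(sample), 1))
```

Using recognizer confidence as a proxy for pronunciation quality is a simplification chosen for illustration; a production scoring model would likely be more involved.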
How do you use the Google Cloud Platform? Do you have a favorite Google Cloud product?
Koy runs thefluent.me on App Engine Flexible. He likes how easy the deployment process is, especially when managing traffic between different versions. Two key application programming interfaces (APIs) Koy uses are Speech-to-Text and Text-to-Speech. He says the WaveNet voices available in Text-to-Speech sound more natural, and he likes that both APIs allow him to choose different accents for the AI speech. Koy is also using Cloud SQL and Cloud Storage, which he finds easy to integrate.
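To make the accent choice concrete, here is a minimal sketch of the JSON body that the Text-to-Speech REST method `text:synthesize` accepts, built for three English accents. It is not taken from thefluent.me. The helper name `build_synthesize_request` and the sample sentence are hypothetical, and the voice names are examples only; the voices available at any given time should be checked against the API's voice list. The code only builds the request bodies and does not call the service.

```python
import json


def build_synthesize_request(text, language_code, voice_name):
    """Build a request body for the Text-to-Speech text:synthesize method."""
    return {
        "input": {"text": text},
        "voice": {"languageCode": language_code, "name": voice_name},
        "audioConfig": {"audioEncoding": "MP3"},
    }


# Example WaveNet voice names per accent (assumed; verify before use).
ACCENT_VOICES = {
    "en-US": "en-US-Wavenet-D",
    "en-GB": "en-GB-Wavenet-A",
    "en-AU": "en-AU-Wavenet-A",
}

requests_by_accent = {
    code: build_synthesize_request("Practice makes perfect.", code, name)
    for code, name in ACCENT_VOICES.items()
}

# Inspect the body that would be sent for the British English voice.
print(json.dumps(requests_by_accent["en-GB"], indent=2))
```

Sending one of these bodies requires an authenticated POST to the Text-to-Speech endpoint, which is left out here to keep the sketch self-contained.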
What do you plan to do next?
“There are many other items for horizontal and vertical scaling on my roadmap,” Koy assures us. He is planning to add more languages and enhance the app’s group features. He has also been approached by multiple companies that want to use thefluent.me for education and training. Koy plans to publish APIs to accommodate these requests in the coming weeks.
Why did you choose to join the C2C community?
Like so many of our members, Koy joined the C2C community to meet people and collaborate, but his experience here has informed his work on thefluent.me beyond friendly conversation. Recently, a community member expressed to Koy that thefluent.me is an ideal tool to use when preparing for a job interview—a user can rehearse answers to interview questions to learn to pronounce them better. For Koy, this is not just nice feedback; it is also a use case he can add to his roadmap.
Still, community itself is enough of a reason for Koy to return on a weekly basis. “Mondays are just not the same anymore without our AI and ML coffee chats,” he says.