Speech Technology Is on the Road to Becoming More Accessible

Google recently launched a new research project to improve voice recognition software for people with disorders that impact speech. They recognize that voice-enabled technologies like Google Assistant, Siri, and Amazon Alexa are designed to respond to the majority of voices but may be off limits to the minority of people with speech impairments.

Project Euphonia is part of Google’s AI for Social Good Program. Google believes that artificial intelligence can provide new ways of approaching problems and meaningfully improve people’s lives. This initiative applies Google research and engineering efforts to projects with positive societal impact and empowers the community with tools and resources.

Project Euphonia is focused on improving how voice-enabled technology recognizes and analyzes impaired speech, optimizing algorithms so that mobile phones and computers can more reliably transcribe words spoken by people with speech difficulties. Researchers turn recorded voice samples into visual representations of the sound and use those representations to train the system to better recognize impaired speech. It’s exciting that a company with advanced tech know-how and deep pockets, like Google, is putting resources into this initiative.
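For readers curious what a “visual representation of the sound” looks like in practice, here is a minimal Python sketch that turns a single voice recording into a spectrogram. It is only an illustration of the general idea, not Google’s actual tooling; the file name and analysis settings below are placeholders chosen for the example.

```python
# Minimal sketch: converting a recorded voice sample into a spectrogram,
# the kind of visual representation of sound that speech models learn from.
# "sample.wav" is a placeholder file; this is not Project Euphonia's pipeline.
import numpy as np
from scipy.io import wavfile
from scipy.signal import spectrogram
import matplotlib.pyplot as plt

# Load a WAV recording (placeholder path).
sample_rate, audio = wavfile.read("sample.wav")
if audio.ndim > 1:                      # mix stereo down to mono if needed
    audio = audio.mean(axis=1)

# Compute the spectrogram: time on the x-axis, frequency on the y-axis,
# intensity as color. Window and overlap values here are typical defaults.
freqs, times, power = spectrogram(audio, fs=sample_rate,
                                  nperseg=512, noverlap=384)

# Plot on a log scale so quieter parts of the speech remain visible.
plt.pcolormesh(times, freqs, 10 * np.log10(power + 1e-10), shading="auto")
plt.xlabel("Time (s)")
plt.ylabel("Frequency (Hz)")
plt.title("Spectrogram of a voice sample")
plt.show()
```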

Their first goal is to gather enough voice recordings from people with speech impairments to train speech recognition models to better understand impaired speech. That’s why they are soliciting speech samples. If you are 18 years old or older and would like to participate, you can click this link to register. They are using a tool called Chit Chat to record voice samples. The initial phrase set contains 30 phrases and takes 5-10 minutes to record. The full phrase set contains about 1,500 phrases and may take 4-7 hours to complete. You do not need to record these in one sitting. All phrases will be saved, and you can always pick up where you left off.

There is another project also collecting voice samples to support research efforts. You may remember that in February, we announced the collection of voice samples through the Uncommon Voice program run by the Center for Cognitive Ubiquitous Computing at Arizona State University. To make existing speech systems more inclusive of different voices, the program aims to make data representing these voice disorders freely available. Under their model, any researcher or developer interested in making voice recognition systems more inclusive will have access to this dataset of speech samples. You can also participate in this data collection by clicking this link and choosing Get Started.

While better speech recognition software is likely years away, we can’t help being excited about the possibilities.