Speech Therapy Application

The goal of this project is to create a speech visualization tool that analyzes speech and provides both visual feedback and an error report. The tool should have a modular design that is easy to extend in the future.

Problem Definition
Using CMUSphinx, an open-source automatic speech recognition (ASR) toolkit, we will analyze speech files to detect errors. We will also develop a user interface in Java.

Background
Many recent projects have analyzed pronunciation errors, and many of them have used CMUSphinx. Our project differs in two ways. First, it is aimed specifically at child speech. Second, we aim to improve on the current state of the art.

CMUSphinx
CMUSphinx is the ASR toolkit we are using for this project. It implements state-of-the-art algorithms built on decades of research at Carnegie Mellon University (CMU).

Speech Recognition
Speech recognition consists of multiple steps. First, feature extraction converts the audio signal into feature vectors. These vectors are then matched against the acoustic model.
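The two steps described above can be illustrated with a deliberately simplified sketch. It is not how CMUSphinx works internally: CMUSphinx uses cepstral features and statistical acoustic models, while this toy version computes one log-energy value per frame and picks the nearest stored template. The class name `FeatureSketch`, the frame size, the hop size, and the templates are all hypothetical choices made for illustration.

```java
import java.util.Arrays;

// Toy illustration only: log-energy features plus nearest-template matching.
// This is an assumed stand-in for real feature extraction and acoustic models.
final class FeatureSketch {

    // Step 1: split the signal into overlapping frames and compute
    // the log energy of each frame, giving one feature value per frame.
    static double[] logEnergyFeatures(double[] samples, int frameSize, int hop) {
        if (samples.length < frameSize) {
            return new double[0];
        }
        int frames = 1 + (samples.length - frameSize) / hop;
        double[] features = new double[frames];
        for (int f = 0; f < frames; f++) {
            double energy = 0.0;
            for (int i = 0; i < frameSize; i++) {
                double s = samples[f * hop + i];
                energy += s * s;
            }
            // Small constant avoids log(0) on silent frames.
            features[f] = Math.log(energy + 1e-10);
        }
        return features;
    }

    // Mean squared difference over the frames both vectors share.
    static double distance(double[] a, double[] b) {
        int n = Math.min(a.length, b.length);
        if (n == 0) {
            return Double.POSITIVE_INFINITY;
        }
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum / n;
    }

    // Step 2: "match" the features by returning the index of the
    // closest template (a crude substitute for an acoustic model).
    static int closestTemplate(double[] features, double[][] templates) {
        int best = -1;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (int t = 0; t < templates.length; t++) {
            double d = distance(features, templates[t]);
            if (d < bestDistance) {
                bestDistance = d;
                best = t;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] quiet = new double[400];
        double[] loud = new double[400];
        double[] probe = new double[400];
        Arrays.fill(quiet, 0.01);
        Arrays.fill(loud, 0.5);
        Arrays.fill(probe, 0.45);

        double[][] templates = {
            logEnergyFeatures(quiet, 160, 80),
            logEnergyFeatures(loud, 160, 80)
        };
        int match = closestTemplate(logEnergyFeatures(probe, 160, 80), templates);
        System.out.println("Closest template index: " + match);
    }
}
```

In a real recognizer each frame yields a multi-dimensional vector rather than a single number, and matching is probabilistic rather than a nearest-neighbor lookup, but the overall shape (frame the signal, extract features, compare against a model) is the same.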

Microphones
Several factors went into the choice of microphone. The program is ultimately intended for use in schools and in people's homes. Since the majority of users will not have access to expensive audio equipment, the team set a soft budget of $50 for a microphone. Using an inexpensive microphone lets the team test the program with audio quality that a typical user could reasonably achieve. The final decision was to use the Blue Snowball microphone.

The Snowball offers a good balance of audio quality and price. It includes a cardioid recording mode, which picks up sound mainly from directly in front of the microphone and so helps keep unwanted sounds out of recordings. USB connectivity means the microphone should work with nearly any computer. It also comes with a small tripod, making it much more portable than a microphone that requires a boom arm mounting system.

This setup provides a portable recording system that produces quality audio for training and testing the program's acoustic model.

Speech Disorders
Speech sound disorders (SSDs) affect 10-15% of preschoolers and 6% of school-aged children. Most cases have no known cause.

There are two major categories of speech disorders: phonological disorders and motor speech disorders. In phonological disorders, patients are physically capable of producing the correct sounds. In motor speech disorders, patients physically cannot produce the correct sounds and must be shown how. Our software will be more helpful to the former category.

Relevant data for diagnosis includes omission and distortion of phonemes, stress errors, speed of speech, and consistency.
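As one possible way to turn omissions and other phoneme errors into an error report, the sketch below aligns an expected phoneme sequence with a recognized one using edit distance and lists the differences. It assumes the recognizer can output a phoneme sequence, which is not confirmed by the design so far. The class name `PhonemeErrorReport`, the message format, and the ARPAbet-style symbols in the example are illustrative only. A reported substitution is at best a rough proxy for distortion, since a distorted sound may still be recognized as the intended phoneme.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: compares expected and recognized phoneme sequences.
final class PhonemeErrorReport {

    // Returns human-readable differences, ordered by position in the expected sequence.
    static List<String> compare(String[] expected, String[] actual) {
        int n = expected.length;
        int m = actual.length;

        // cost[i][j] = edit distance between the first i expected
        // phonemes and the first j recognized phonemes.
        int[][] cost = new int[n + 1][m + 1];
        for (int i = 0; i <= n; i++) {
            cost[i][0] = i;
        }
        for (int j = 0; j <= m; j++) {
            cost[0][j] = j;
        }
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                int mismatch = expected[i - 1].equals(actual[j - 1]) ? 0 : 1;
                int substitute = cost[i - 1][j - 1] + mismatch;
                int omit = cost[i - 1][j] + 1;
                int insert = cost[i][j - 1] + 1;
                cost[i][j] = Math.min(substitute, Math.min(omit, insert));
            }
        }

        // Walk back from the end of both sequences to recover the edits.
        List<String> errors = new ArrayList<>();
        int i = n;
        int j = m;
        while (i > 0 || j > 0) {
            boolean same = i > 0 && j > 0 && expected[i - 1].equals(actual[j - 1]);
            if (i > 0 && j > 0 && cost[i][j] == cost[i - 1][j - 1] + (same ? 0 : 1)) {
                if (!same) {
                    errors.add("substitution at " + (i - 1) + ": "
                            + expected[i - 1] + " -> " + actual[j - 1]);
                }
                i--;
                j--;
            } else if (i > 0 && cost[i][j] == cost[i - 1][j] + 1) {
                errors.add("omission at " + (i - 1) + ": " + expected[i - 1]);
                i--;
            } else {
                errors.add("insertion before " + i + ": " + actual[j - 1]);
                j--;
            }
        }
        Collections.reverse(errors);
        return errors;
    }

    public static void main(String[] args) {
        String[] expected = {"S", "P", "UW", "N"};
        String[] recognized = {"P", "UW", "N"};
        for (String error : compare(expected, recognized)) {
            System.out.println(error);
        }
    }
}
```

Stress errors, speech rate, and consistency would need other inputs (for example timing information) and are not covered by this sketch.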

It's important to test phonemes alone, in syllables, in phrases, and in spontaneous speech. The results often differ between these contexts, and that information can be valuable to diagnosis.

Design
We originally used Pocketsphinx with Python, but then found that Sphinx4 with Java ran 200x faster for us. This is most likely due to optimizations implemented after the release of Pocketsphinx.

Flowchart:



The Graphical User Interface will be written in Java and will look something like this:



Team Information
{| class="wikitable"
|-
! style="text-align: center;" | Member
! style="text-align: center;" | Biography
! style="text-align: center;" | Discipline
|- align="center"
| Simon Barnes
| Simon Barnes is a senior in Computer Engineering at the University of Idaho. His interest in the project comes from his mother's work with children who have speech disabilities, as well as from the programming involved in speech recognition. His hobbies include cars, technology, and Super Smash Bros. Melee for the Nintendo GameCube.
| Computer Engineering
|- align="center"
| Emma Bateman
| Emma Bateman is a senior Computer Science major at the University of Idaho. She is interested in machine learning and wants to learn more about speech recognition.
| Computer Science
|- align="center"
| Joshua Bonn
| Joshua Bonn is a senior Computer Science major at the University of Idaho. He is interested in this project because of his own experience with speech difficulties and speech therapy, as well as his interest in machine learning. His other interests include video games and music.
| Computer Science