Modern hearing aids (HAs) have greatly improved the quality of life for people with impaired hearing. However, HAs still largely fall short in noisy environments. The main issue is that a hearing aid amplifies all sounds, including the unwanted ones. This is known as the Cocktail Party Problem, which was introduced as early as 1953. Luckily, sensor technology has come a long way since then. It would be desirable for the HA to have increased awareness of sound sources and other landmarks in the surroundings. This knowledge could be used to separate the wanted sounds from the unwanted ones, perhaps simply by gazing at the desired target. With state-of-the-art eye trackers, inertial measurement units, cameras and microphones, the idea of using sensor fusion for hearing aid control is promising.
The project was divided into three subsystems, one for each of the project goals, and the project members were divided among them.
One of the major advantages of a simulation environment is
the ability to develop algorithms more efficiently, since it
simplifies both trial-and-error experimentation and the
iteration process. A robust simulation environment makes future
development easier for several reasons. Firstly, a simulation
environment gives total control of the setup and the noise
involved. It enables researchers to simulate scenarios that
would be difficult, or costly, to implement in practice.
Secondly, a simulation environment makes the researchers
practically independent of the hardware during software development.
This also makes it possible to estimate the impact of additional
sensors before buying them.
Two different techniques for distance estimation have been
investigated. The first is based on the eye tracking provided
by the glasses. Depending on the distance to the object being
looked at, the relative angle between the gaze vectors of the
two eyes changes. This is called vergence and can be used to
estimate the distance to targets at close range.
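
As an illustration of the vergence idea, the following is a minimal sketch and not the project's implementation. It assumes that the eye tracker delivers one gaze direction vector per eye, that the target lies straight ahead of the midpoint between the eyes (symmetric fixation), and that the interpupillary distance is known. The function name `vergence_distance` and the default value of 63 mm are hypothetical choices made for this example.

```python
import math


def vergence_distance(gaze_left, gaze_right, ipd_m=0.063):
    """Estimate the distance (in metres) to the fixated target.

    gaze_left, gaze_right: 3D gaze direction vectors, one per eye
    (assumed input format). ipd_m: interpupillary distance in metres
    (assumed value). Assumes symmetric fixation straight ahead.
    """
    def normalize(v):
        norm = math.sqrt(sum(c * c for c in v))
        return [c / norm for c in v]

    gl, gr = normalize(gaze_left), normalize(gaze_right)
    # The vergence angle is the angle between the two gaze vectors.
    cos_angle = max(-1.0, min(1.0, sum(a * b for a, b in zip(gl, gr))))
    angle = math.acos(cos_angle)
    if angle < 1e-9:
        # Parallel gaze vectors: the target is effectively at infinity.
        return math.inf
    # Each eye is rotated inwards by half the vergence angle.
    return (ipd_m / 2.0) / math.tan(angle / 2.0)
```

Under these assumptions the vergence angle is roughly 3.6 degrees for a target at 1 m but only about 0.7 degrees at 5 m, so small errors in the measured gaze directions translate into large distance errors for far targets. This is consistent with the method being useful mainly at close range.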
The second technique is based on identifying objects of roughly
known size, such as faces, in the video stream and calculating
the distance to them from their apparent size in the camera image.
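
The size-based technique can be sketched with the pinhole camera model, in which the apparent width of an object in pixels is inversely proportional to its distance. The sketch below is an assumption-laden example rather than the project's code: the function name `distance_from_apparent_size`, the assumed face width of 0.15 m and the focal length in pixels are all hypothetical, and the face detection step that produces the pixel width is not shown.

```python
def distance_from_apparent_size(pixel_width, focal_length_px,
                                real_width_m=0.15):
    """Estimate the distance (in metres) to an object of known size.

    pixel_width: width of the detected object in the image, in pixels.
    focal_length_px: camera focal length in pixels (from calibration).
    real_width_m: assumed physical width of the object; 0.15 m is a
    rough, hypothetical value for a human face.
    """
    if pixel_width <= 0:
        raise ValueError("pixel_width must be positive")
    # Pinhole model: pixel_width / focal_length_px = real_width_m / distance
    return focal_length_px * real_width_m / pixel_width
```

Because the real size is only roughly known, the relative error in the assumed width carries over directly to the estimated distance: a face that is 10 % wider than assumed appears 10 % closer.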
The purpose of SLAM (simultaneous localization and mapping) is to
give a better understanding of the environment in which the user
is located. SLAM is an algorithm that estimates the positions of
landmarks in an environment together with the observer's own
position and orientation relative to those landmarks. A graphical
interface is used to display the user's position and the
speakers' positions relative to the user. This situational
awareness feature could be a cornerstone in creating
sensor-fusion-controlled HAs.
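
The display step, showing the speakers' positions relative to the user, can be illustrated with a small coordinate transform. This is only a sketch of that final step and not of SLAM itself. It assumes a simplified 2D map in which the estimated user pose is given as (x, y, heading) and each speaker as an (x, y) landmark in the same map frame; the function names and the convention that the user's x axis points forward and the y axis to the left are assumptions made for this example.

```python
import math


def to_user_frame(user_pose, landmark_xy):
    """Express a map-frame landmark in the user's own frame.

    user_pose: (x, y, heading) with heading in radians, measured
    counter-clockwise from the map x axis (assumed convention).
    Returns (forward, left) coordinates in metres.
    """
    ux, uy, heading = user_pose
    dx, dy = landmark_xy[0] - ux, landmark_xy[1] - uy
    c, s = math.cos(heading), math.sin(heading)
    # Rotate the offset by -heading to align it with the user's axes.
    return c * dx + s * dy, -s * dx + c * dy


def range_and_bearing(user_pose, landmark_xy):
    """Return the distance and the bearing (radians, positive to the left)."""
    forward, left = to_user_frame(user_pose, landmark_xy)
    return math.hypot(forward, left), math.atan2(left, forward)
```

A range and bearing per speaker is one plausible form of input for a gaze-controlled HA, since the bearing can be compared directly with the user's gaze direction.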
By combining distance perception and SLAM, a map of the user and speakers in an environment has been successfully created. This works for both static and semi-dynamic scenarios, on real data as well as on data from the simulation environment. Future efforts could focus on the real-time aspects, ensuring that the algorithms run smoothly during a live test or when fed live data from the simulation environment.
Responsible for the Software