IBM 5 in 5 2012: Hearing

Computers will hear what matters

Editor’s note: This 2012 IBM 5-in-5 article is by IBM Master Inventor Dimitri Kanevsky.

Imagine knowing the meaning behind your child’s cry, or maybe even your pet dog’s bark, through an app on your smartphone. In the next five years, you will be able to do just that thanks to algorithms embedded in cognitive systems that will understand any sound.

Each of a baby’s cries, from pain, to hunger, to exhaustion, sounds different – even if it’s difficult to tell. But some of my colleagues and I patented a way to take the data from typical baby sounds, collected at different ages by monitoring brain, heart and lung activity, to interpret how babies feel. Soon, a mother will be able to translate her baby’s cries in real time into meaningful phrases, via a baby monitor or smartphone.

Predicting the sound of weather

Sensors already help us with everything from easing traffic, to conserving water. These same sensors can also be used to interpret sounds in these environments. What does a tree under stress during a storm sound like? Will it collapse into the road? Sensors feeding the information to a city datacenter would know, and be able to alert ground crews before the collapse.

Scientists at our Research lab in São Paulo are using IBM Deep Thunder to make these kinds of weather predictions in Brazil.

These improvements in auditory signal processing sensors can also apply to hearing aids or cochlear implants to better detect, extract, and transform sound information into codes the brain can comprehend – helping with focus, or the cancellation of sounds.

Forget to hit “mute” while on that conference call at work? Your phone will know how to cancel out background noise – even if that “noise” is you carrying on a separate conversation with another colleague!

Ultrasonics to bridge the distance between sounds

Sound travels through air at roughly 340 meters per second across thousands of frequencies. IBM Research also wants to translate the information in ultrasonic frequencies that we humans can’t hear into audio that we can. So, in theory, an ultrasonic device could allow us to understand animals such as dolphins or that pet dog.
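The article does not say how ultrasonic information would be brought into the audible range. One long-established way to do it, used in handheld bat detectors, is heterodyning: multiply the ultrasonic signal by a reference tone, then low-pass filter the result so that only the difference frequency remains. The sketch below is an illustration under stated assumptions, not IBM's method; the function names, the 40 kHz test tone, the 38 kHz reference tone, the 192 kHz sample rate, and the 32-sample averaging window are all hypothetical choices made for the example.

```python
import math


def heterodyne(samples, sample_rate, lo_hz):
    """Mix the signal with a local-oscillator cosine at lo_hz.

    The product contains the sum and the difference of the two frequencies.
    """
    return [s * math.cos(2 * math.pi * lo_hz * n / sample_rate)
            for n, s in enumerate(samples)]


def moving_average(samples, width):
    """Crude low-pass filter: running mean over the last `width` samples."""
    out = []
    acc = 0.0
    for i, s in enumerate(samples):
        acc += s
        if i >= width:
            acc -= samples[i - width]
        out.append(acc / width)
    return out


def estimate_frequency(samples, sample_rate):
    """Estimate the dominant frequency by counting upward zero crossings."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a < 0 <= b)
    return crossings * sample_rate / len(samples)


# Hypothetical demo values (assumptions, not taken from the article).
SAMPLE_RATE = 192_000   # samples per second
TONE_HZ = 40_000        # inaudible ultrasonic test tone
LO_HZ = 38_000          # reference tone; the difference is 2 kHz

# A tenth of a second of the ultrasonic tone.
tone = [math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE)
        for n in range(SAMPLE_RATE // 10)]

mixed = heterodyne(tone, SAMPLE_RATE, LO_HZ)

# A 32-sample average suppresses the 78 kHz sum component at this sample
# rate; the first 32 outputs are dropped because the window is not yet full.
audible = moving_average(mixed, 32)[32:]

print(round(estimate_frequency(audible, SAMPLE_RATE)))
```

The printed estimate should land close to the 2 kHz difference between the test tone and the reference tone, which is well inside the range of human hearing. A real device would need a proper filter and would have to cope with signals that are not pure tones.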

And what if a sound you want or need to hear could cut through the noise? The same device that transforms and translates ultrasonics could work in reverse. So, imagine wanting to talk with someone who, while only a short distance away, is still too far away to yell (say, from across a crowded room). A smartphone, associated with an ultrasonic system, could turn the speaker’s voice into an ultrasonic frequency that cuts through sounds in the room to be delivered to, and re-translated for, only the recipient of the message (who will hear the message as if the speaker were standing close by – no receiving device needed).

This ultrasonic capability could also help a police officer warn a pedestrian not to cross a busy road, without shouting over the traffic noise. And parents could “call” their children to come in from playing in the neighborhood when it’s time for dinner – without worrying whether their children’s cellphones were on or not.

If you think cognitive systems will most likely gain the ability to hear before augmenting the other senses, vote for it here.

IBM thinks these cognitive systems will connect to all of our other senses. You can read more about sight, smell, taste, and touch technology in this year’s IBM 5 in 5.

Note: In February 2013, as part of earning the Tan Chin Tuan Exchange Fellowship in Engineering, Dr. Kanevsky lectured about and demonstrated his transcription technology at Nanyang Technological University - Singapore. You can watch his lecture “Why I Care About Hessian-Free Optimization” in its entirety, here.


  1. Wouldn't these innovations be useful in hearing aids? Although technology in this area has come a long way, there is still very much room for improvement.

  2. "Sound travels at 340 miles per second", looks like your improvements to sounds processing are also drastically increasing the speed of sound. Seriously though, the same app that tells the mother if their baby's cries are of joy or of pain will be brethren to the current apps out there telling which breast the baby nursed on last. I'm sorry but mothers know these things without smart phones by simple feeling either of weight in a breast to hair standing up on their necks from a cry.

    1. I've seen lots of mothers that have no clue what their babies are crying for, so no, mothers don't have inherent special powers that make them know these things.

  3. 340 METERS per second, not miles per second.

  4. Thank you for catching the typo about the speed of sound. It is now corrected.

  5. This info really helps me with my thesis.

  6. I am really interested in recognizing animal language through ultrasound, can you give more info about this, thank you so much :)

  7. If this tech can go into hearing aids, I'm all ears, pardon the pun, as I am almost deaf. The price of hearing aids is not only obscene, they are also not insured, thus creating real hardship when buying tech for thousands of dollars while the hardware costs pennies to make. Looks interesting for sure.