NextGenLog: #MEMS: "Speech Controls Home Automation System"

Wednesday, May 02, 2012

The Distant Speech Interaction for Robust Home Applications (DIRHA) project in the European Union will install MEMS microphones from STMicroelectronics throughout new houses, enabling occupants to simply speak their commands to the smart house from wherever they are in it. Capitalizing on the progress that neural networks have made in recent years in recognizing many of the world's languages, the multilingual houses will respond to spoken commands in several languages, with German, Greek, Italian and Portuguese targeted first: R. Colin Johnson

Here is what DIRHA says about the Distant Speech Interaction for Robust Home Applications: The DIRHA project addresses the development of voice-enabled automated home environments based on distant-speech interaction in different languages. A distributed microphone network is installed in the rooms of a house in order to selectively monitor acoustic and speech activities observable inside any space, and eventually to run a spoken dialogue session with a given user in order to implement a service or to give access to appliances and other devices. The multi-microphone front-end is based on arrays of analog microphones or digital Micro Electro-Mechanical Systems (MEMS) microphones. The targeted system analyses the given multi-space acoustic scene in a coherent way, by processing in parallel simultaneous activities that occur in different rooms and, where needed, by simultaneously supporting interaction with users who may speak in different areas of the house.

These very challenging objectives require advances in different scientific and technical fields. Based on the given network of microphone arrays, multi-microphone front-end processing includes, among others, tasks such as speaker localization, acoustic echo cancellation, speech enhancement, and acoustic event segmentation and classification. It is then necessary to have robust technologies for distant-speech recognition and speaker identification (and verification). Effective solutions for language modeling in the selected languages, speech understanding, and concurrent management of spoken dialogue interaction, together with the user interface and the integration of the resulting technological components, will also be fundamental to implementing the proposed smart home interface. The final prototype will be integrated in an automated home and evaluated by real users.
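
As a rough illustration of one task listed above, speaker localization, the sketch below estimates the time difference of arrival between two microphones by brute-force cross-correlation and converts it to a far-field bearing. It is not taken from the DIRHA project: the function names, the 16 kHz sample rate and the 0.2 m microphone spacing are all assumptions chosen for the example.

```python
import math
import random

SPEED_OF_SOUND = 343.0  # m/s in air at roughly room temperature (assumed)


def estimate_delay(sig_a, sig_b, max_lag):
    """Return the lag in samples at which sig_b best lines up with sig_a.

    A positive lag means sig_b is a delayed copy of sig_a.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, a in enumerate(sig_a):
            j = i + lag
            if 0 <= j < len(sig_b):
                score += a * sig_b[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def angle_of_arrival(lag_samples, sample_rate, mic_spacing):
    """Convert a two-microphone delay to a far-field bearing in degrees."""
    path_difference = SPEED_OF_SOUND * lag_samples / sample_rate
    # Clamp so measurement error cannot push the ratio outside asin's domain.
    ratio = max(-1.0, min(1.0, path_difference / mic_spacing))
    return math.degrees(math.asin(ratio))


# Hypothetical test signal: noise reaching microphone B three samples late.
rng = random.Random(0)
mic_a = [rng.uniform(-1.0, 1.0) for _ in range(256)]
mic_b = [0.0] * 3 + mic_a[:-3]

lag = estimate_delay(mic_a, mic_b, max_lag=8)
bearing = angle_of_arrival(lag, sample_rate=16000, mic_spacing=0.2)
print(f"delay: {lag} samples, bearing: {bearing:.1f} degrees")
```

Real rooms add reverberation and competing sound sources, so a practical system would likely need a more robust correlation measure and more than two microphones.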

Here is what ST says about DIRHA: The world-class expertise in MEMS microphone and audio processing technologies of STMicroelectronics (NYSE: STM), a global semiconductor leader serving customers across the spectrum of electronics applications, will play a pivotal role in the European research project on ‘Distant Speech Interaction for Robust Home Applications’ (DIRHA). The three-year program aims to investigate and prototype solutions for natural voice-enabled interaction between humans and machines in tomorrow’s smart homes.

The DIRHA project sets out to address the challenge of distant speech interaction in the multi-noise, multi-speaker situations of a home environment. The goal is to create a pervasive, always-listening sound space, where users needn’t speak into a microphone to be recognized and understood; instead, the system reaches out, acoustically, to the speakers regardless of their position within the home.

The physical and acoustic parameters of ST’s MEMS microphones perfectly fit the challenging requirements of distant-speech interaction systems. The small form factor allows the researchers to easily embed entire arrays of microphones in the walls, desks, or speech-enabled appliances of the automated home, while the microphones’ excellent acoustic characteristics, coupled with sophisticated signal-processing technologies, will make it possible to identify and capture an individual speaker from several meters away, in a crowded room with music playing.
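
One textbook way to pick out a talker with a microphone array is delay-and-sum beamforming: shift each channel so the wanted speech lines up, then average, so the speech adds coherently while uncorrelated noise partly cancels. The sketch below is a minimal illustration under assumed conditions (four hypothetical microphones, known integer sample delays, synthetic noise); the source does not say which algorithms ST or the DIRHA partners actually use.

```python
import math
import random


def delay_and_sum(channels, delays):
    """Advance each channel by its delay (in samples) and average them."""
    n = len(channels[0])
    out = [0.0] * n
    for signal, delay in zip(channels, delays):
        for i in range(n):
            j = i + delay
            if 0 <= j < n:
                out[i] += signal[j]
    return [value / len(channels) for value in out]


def mean_squared_error(estimate, reference, length):
    """Average squared difference over the first `length` samples."""
    return sum((estimate[i] - reference[i]) ** 2 for i in range(length)) / length


# Hypothetical scene: one tone reaching four microphones at different delays,
# each microphone adding its own independent noise.
rng = random.Random(1)
n = 400
speech = [math.sin(2 * math.pi * 5 * i / n) for i in range(n)]
delays = [0, 2, 4, 6]
channels = []
for delay in delays:
    delayed = [0.0] * delay + speech[: n - delay]
    channels.append([s + rng.uniform(-0.5, 0.5) for s in delayed])

enhanced = delay_and_sum(channels, delays)
valid = n - max(delays)  # samples where all four channels contribute
print("single microphone error:", mean_squared_error(channels[0], speech, valid))
print("beamformed error:       ", mean_squared_error(enhanced, speech, valid))
```

With independent noise on each channel, averaging four aligned channels should cut the noise power to roughly a quarter, which is why larger arrays help at greater distances.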

The distant-speech interaction capability will not only dramatically change the way people interact with technology, but can make a real difference for those who can’t easily move around, such as the elderly or the motor-impaired. In addition to the home scenarios, the distant-speech interaction systems can find use in robotics, telepresence, surveillance and industry automation.

The DIRHA program is organized into a number of work packages, spanning a total duration of 36 months, and the total cost of the project is 4.8 million euros. The main fields of research include multi-channel acoustic processing, distant-speech recognition and understanding, speaker identification/verification, and spoken-dialogue management in four languages - German, Greek, Italian and Portuguese. The final prototypes will be integrated in pilot households and evaluated by real users.

The DIRHA participants are Fondazione Bruno Kessler, Italy (project coordinator); Athena Research and Innovation Center in Information Communication & Knowledge Technologies, Greece; DomoticArea, Italy; INESC ID - Instituto de Engenharia de Sistemas e Computadores, Investigação e Desenvolvimento em Lisboa, Portugal; NewAmuser, Italy; STMicroelectronics, Italy; and Technische Universitaet Graz, Austria.