The US laboratory of the Chinese search engine Baidu has developed a speech recognition system that sometimes understands Mandarin more accurately than a human being. Most of the developers themselves do not speak Chinese.
Baidu has become a leading provider of voice software and is leaving Google and Apple behind. Technology Review magazine reports on this in its current issue, 5/2016 (on newsstands now).
China offers the best conditions for voice interfaces to triumph. With 691 million users, smartphones are far more widespread there than conventional computers. But thousands of different characters make text input on touch screens frustratingly cumbersome.
Baidu has made particularly impressive progress in speech recognition. People in other countries are likely to benefit from this as well. “I see speech technology approaching the point where it is so reliable that you simply use it without thinking about it,” says Andrew Ng, Stanford professor and chief scientist at Baidu. Powerful speech technology would also make it easier to interact with all sorts of other devices, Ng believes, for example with robots or home appliances.
Last November, the Baidu lab in Silicon Valley reached an important milestone: a new speech recognition system called Deep Speech 2. It is based on a deep neural network and learns from millions of transcribed speech examples how audio signals relate to the corresponding words. Deep Speech 2 now recognizes spoken words in Mandarin, sometimes even more accurately than a human, although the language is phonetically very complex. This achievement appears even more impressive given that only a few of the Californian developers speak Chinese at all. Deep Speech 2 is thus essentially a universal speech system that learns English just as well if it is given enough examples.
Most voice queries to Baidu’s search engine are simple and concern the weather or air pollution. The system usually handles them with impressive accuracy. To cope with more complicated questions, Baidu launched its own voice assistant, named Duer, last year and integrated it into its main app. It can, for example, tell users the start times of movies or reserve a table at a restaurant. Eventually, Duer should be able to carry on a meaningful conversation and respond to new information that comes up in it. To this end, a research group in Beijing wants to use neural networks like those in Deep Speech 2. In addition, Baidu has set up a team that analyzes the requests to Duer and corrects errors, so that the system learns and gradually improves. (grh)