New Computers Translate as Users Speak
PITTSBURGH — Sloppy speakers, rejoice.
Scientists in an international consortium are developing computer technology that translates conversations into six languages.
But the new machines don’t merely translate; they also clean up grammar and omit those awkward “ums” and “urs” that bog down sentences.
“It can even recognize noises like lip smacks,” said Alex Waibel, director of the Interactive Systems Laboratories at the Carnegie Mellon University School of Computer Science.
The speech translation technology was demonstrated recently during a video conference with people in Japan, Italy, Korea and Germany.
“What time is it in Japan?” CMU graduate student Chad Langley asked.
Within seconds, Langley’s sentence was translated into Japanese and heard by a scientist posing as a travel agent in Kyoto. The agent responded in Japanese, and the computer translated his response to English: “It’s 1 a.m. in Japan.”
Langley--who doesn’t speak a word of Japanese--went on to inquire about weather conditions, book a flight and reserve a hotel.
The international Consortium for Speech Translation Advanced Research--C-STAR--has a system with a vocabulary of more than 10,000 words that allows spontaneous speech over the Web.
A translation mistake can be easily corrected before it is transmitted, because a screen shows the translation as it is made, before it is heard at the other end.
In some cases, the user can even choose the degree of speech formality, and scientists surprised their Japanese counterpart by using the local Kyoto dialect.
“It’s sort of like a Japanese imitating a Southern drawl,” Waibel said.
For now, the program is limited to information related to travel, such as flight schedules and hotels.
“We want to go from a narrow domain to a broad domain to no restrictions whatsoever,” said Jaime Carbonell, who directs the Language Technologies Institute at CMU.
Even better, scientists hope to improve voice synthesis, so the computer will translate not only words but also the expression and inflection that come with them--“so it’s not just the content, but also the emotion,” Carbonell said.
The translating device consists of a headset, a backpack with a laptop computer and a forearm computer the size of a paperback book. Waibel said the consortium’s next goal is to develop smaller versions of the translators to mass-produce and sell commercially.
“All of these things that we are showing you will be coming to a credit card near you soon,” he said.