This is Eddy Carroll's Typepad Profile.
Eddy Carroll
Recent Activity
While accurate, speaker-independent voice recognition with no constraints on vocabulary or context is still a long way off, there have been enough improvements over the last 10-15 years to make speech input genuinely useful in more specific scenarios. I've been working on voice control for Windows Media Center for the past couple of years, and find it works very well indeed. The key benefits I've found are:
- The ability to choose a single musical artist, film, TV program, etc. from a decent-sized media collection that may include thousands of alternatives, without needing to drill down through menus.
- The ability to issue commands without interfering with onscreen action (e.g. changing the music while a slideshow is running).
- Instant access for things like jumping to a particular point in a movie ("skip to 47 minutes").
- An audio input device that only listens when users are issuing speech commands, rather than trying to make sense of all the random sounds it hears (we use an accelerometer to intelligently unmute the mic when needed).
- The alternative, for our target audience, is a normal remote control; most living room users don't have a keyboard and mouse conveniently to hand for controlling their TV experience.
During our development, it became very clear that the single most important thing needed for good speech recognition is a high-quality microphone system. Most PC mics are lousy for this (limited bandwidth, etc.), which makes the voice recognition engine work much harder. Garbage in, garbage out. (Bluetooth is even worse, due to limited bandwidth.) I think this is why most users who dabble with speech recognition find it generally poor, even though the quality of, say, Windows 7's built-in speech recognizer is actually pretty decent.
It will be interesting to see how much Microsoft's Kinect (aka Project Natal) pushes the speech-in-the-living-room experience forward: with 3D cameras that can accurately identify a speaker's position in the room, coupled with an array microphone that can focus on that position, the potential is there for very good recognition.
And another thumbs-up for Jeff Hawkins' book On Intelligence, mentioned by an earlier poster: well worth a read for anyone interested in all types of machine recognition.
Jun 21, 2010