One of the great skills in Computer Science, and part of what makes it fun, is knowing when you need to create a highly sophisticated general-purpose algorithm and when you can get away with a much simpler one, perhaps by manipulating the problem domain so that you have an easier problem to solve.
This was part of the genius of the PalmPilot team: they realised that if you slightly modified the letters people used for handwriting, they became much easier to recognise, and so Graffiti was born. Actually, I suspect the real credit in this case goes to whoever got the idea past the business and marketing guys. Can you imagine the conversation?
“We’ve got this really cool device, and all that the customers will need to do is learn a new alphabet before they can use it. It’s really not very different from the ABC they’ve been using from the age of four…” The VCs must have just loved that idea!
In computer vision, simplifying the problem domain like this is particularly valuable: the algorithms can get very complicated very quickly, and that extra complexity demands a lot more processing power and often makes them less reliable in the real world.
So I was particularly impressed by the TAFFI (Thumb And ForeFinger Interface) developed by Andy Wilson at Microsoft Research in Redmond. He’s come up with a great way to avoid the need for complex hand-tracking algorithms. Have a look:
You can read more about it in his paper from the UIST conference, “Robust Computer Vision-Based Detection of Pinching for One and Two-Handed Gesture Input”.
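As I understand the paper, the trick is that a thumb touching a forefinger closes off a hole, so instead of tracking the hand you only need to look for a patch of background that is completely surrounded by hand pixels. Here is a minimal sketch of that idea in plain Python. It is my own illustration, not Wilson's code: it assumes the hand has already been segmented into a binary mask, and the names `find_holes`, `is_pinching` and the `min_area` threshold are all hypothetical.

```python
from collections import deque


def find_holes(mask):
    """Return the areas (in pixels) of background regions enclosed by the hand.

    mask: list of equal-length rows, 1 = hand pixel, 0 = background pixel.
    Assumes segmentation has already been done (not shown here).
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    holes = []
    for r0 in range(rows):
        for c0 in range(cols):
            if mask[r0][c0] or seen[r0][c0]:
                continue
            # Flood-fill one 4-connected component of background pixels.
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            area = 0
            touches_border = False
            while queue:
                r, c = queue.popleft()
                area += 1
                if r in (0, rows - 1) or c in (0, cols - 1):
                    touches_border = True
                for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not mask[nr][nc] and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            # A component that never reaches the image edge is a hole.
            if not touches_border:
                holes.append(area)
    return holes


def is_pinching(mask, min_area=2):
    """Report a pinch if any enclosed hole is at least min_area pixels.

    min_area is an assumed noise threshold, not a value from the paper.
    """
    return any(area >= min_area for area in find_holes(mask))


# Toy masks: an open "C" shape versus a closed ring.
OPEN_HAND = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
PINCH = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

print(is_pinching(OPEN_HAND), is_pinching(PINCH))
```

The open shape leaves its inner background connected to the image edge, so no hole is reported; closing the ring cuts that region off, which is all the detector needs to see. A real system would still have to deal with segmentation noise and lighting, which this toy example ignores.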