I went to a computer vision conference today. It reminded me of how much I like computer vision. It has hard math. It requires maintaining an interesting dichotomy between the exactitude of computer science and the fuzziness of the real world. It is a crucial stepping stone on the path to artificial intelligence. It bridges the gap between the physical and virtual worlds. It involves making demos with pretty pictures. What's not to like?
Here are some of the coolest things I saw:
Computer vision and fashion:
like.com and covet.com
These services are an unlikely but highly innovative merging of computer vision and women's fashion. When you select items you like, they use image recognition algorithms to find aesthetically similar alternatives and accessories. Unsurprisingly, they work best on things with patterns (e.g. floral print dresses) and tend to confuse different styles that have the same color. It's still very impressive though. On covet.com, they have a "get to know your style" app where you repeatedly pick which of two clothing styles you like better. An algorithm then analyzes the clothing that you chose for pattern, shape, and texture. The trouble is that all the photos are of Hollywood actresses wearing what I usually regard as fairly ugly stuff. I told them they need a "neither" button and more style variety. Still, they've managed to do very well in a bad economy. Although they are a website, they are a feeder for online retailers, and thus can make a ton of money off affiliate fees instead of depending on advertising. It's a great place to be.
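To get a feel for why same-colored items get confused, here is a minimal sketch of similarity search using nothing but a coarse color histogram. This is not how like.com or covet.com actually work (they reportedly look at pattern, shape, and texture as well); the function names, the toy "images", and the choice of cosine similarity are all assumptions made for illustration.

```python
import math

def color_histogram(pixels, bins=4):
    """Coarse RGB histogram normalized to sum to 1.
    pixels is a list of (r, g, b) tuples with values in 0..255."""
    hist = [0.0] * (bins ** 3)
    step = 256 // bins
    for r, g, b in pixels:
        idx = (r // step) * bins * bins + (g // step) * bins + (b // step)
        hist[idx] += 1.0
    total = sum(hist)
    return [h / total for h in hist]

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def most_similar(query_pixels, catalog):
    """Return catalog item names ordered from most to least similar to the query."""
    q = color_histogram(query_pixels)
    scored = [(cosine_similarity(q, color_histogram(px)), name)
              for name, px in catalog.items()]
    return [name for _, name in sorted(scored, reverse=True)]

# Hypothetical toy "images": flat lists of pixels, no spatial layout at all.
red_floral_dress = [(200, 30, 40)] * 80 + [(250, 250, 250)] * 20
red_plain_jacket = [(210, 20, 30)] * 100
blue_floral_dress = [(30, 40, 200)] * 80 + [(250, 250, 250)] * 20

catalog = {
    "red plain jacket": red_plain_jacket,
    "blue floral dress": blue_floral_dress,
}
ranking = most_similar(red_floral_dress, catalog)
print(ranking)
```

With a color-only feature, the red jacket ranks above the blue floral dress even though the dress is the closer match in style. A real system would need pattern and shape features to avoid this.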
SnapTell
SnapTell is one of the most useful iPhone applications I've seen. You can take a photograph of any book, video game, CD, or DVD, and it will recognize it within a few seconds. The recognition is done on a remote server. Once the item is recognized, you can see the product on Amazon, Barnes & Noble, and various retailers' sites, as well as read reviews and other useful information. The company was bought by Amazon just in the last couple of weeks.
Watching dreams with MRIs.
(research page here)
A Berkeley researcher has built a map of the human visual system that can, with very high accuracy, figure out what part of what movie you're watching from a dataset of 10,000 hours of video. It builds up a model of your visual system by watching you look at pictures in an MRI for a while. Then this model can be run in reverse to find pictures that correspond to your current visual activity. It can't tell in detail what you're looking at, but it's very good at finding similar scenes. One of the things the researcher wants to do in the future is use the device to decipher the contents of people's dreams as they are having them. The technology could potentially be used in the future to read your verbal thoughts. It's unclear how far away these sci-fi goals are, but the amount of progress that's been made gives me goosebumps.
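The "build a model, then run it in reverse" idea can be sketched in a few lines. This is a toy illustration, not the researcher's actual method: the two image features, the three "voxels", the linear response model, and all the numbers below are made up. The sketch fits each voxel's response to known pictures by least squares, then identifies an unseen picture by checking which library clip the model predicts would produce brain activity closest to what was observed.

```python
def fit_voxel(features, responses):
    """Least-squares fit of response = w0*f0 + w1*f1 for one voxel,
    using the closed-form solution of the 2x2 normal equations."""
    a = sum(f[0] * f[0] for f in features)
    b = sum(f[0] * f[1] for f in features)
    d = sum(f[1] * f[1] for f in features)
    p = sum(f[0] * r for f, r in zip(features, responses))
    q = sum(f[1] * r for f, r in zip(features, responses))
    det = a * d - b * b
    return ((d * p - b * q) / det, (a * q - b * p) / det)

def predict(weights, f):
    """Predicted response of every voxel to a picture with features f."""
    return [w0 * f[0] + w1 * f[1] for w0, w1 in weights]

def identify(weights, observed, candidates):
    """Pick the candidate whose predicted activity is closest to the observed activity."""
    def error(name):
        pred = predict(weights, candidates[name])
        return sum((pr - ob) ** 2 for pr, ob in zip(pred, observed))
    return min(candidates, key=error)

# Hypothetical "true" brain, used only to simulate scanner data.
true_weights = [(1.0, 0.0), (0.0, 1.0), (0.5, -0.5)]

# Training phase: the subject looks at pictures with known features.
train_features = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
train_responses = [predict(true_weights, f) for f in train_features]

# Fit one small linear model per voxel.
fitted = [fit_voxel(train_features, [r[v] for r in train_responses])
          for v in range(len(true_weights))]

# Identification phase: a library of clips and activity from an unseen picture.
library = {
    "beach": (2.0, 0.5),
    "city street": (0.5, 2.0),
    "forest": (1.0, 1.0),
}
observed = predict(true_weights, (0.6, 1.8))  # not in the library, but city-like
best_match = identify(fitted, observed, library)
print(best_match)
```

Note that the unseen picture is not in the library, so the sketch can only return the most similar clip, which matches the behavior described above: good at finding similar scenes, not at recovering exact detail.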
I was also impressed by the various efforts at scene recognition. Researchers are getting a lot better at labeling the various objects in a scene, which is something that even a mouse can do easily but computers have a very hard time with. There's been a lot of progress in the last three years.