New segmentation algorithms at CVPR 2008
23 06 2008 By Christian Laforte
CVPR 2008, the leading computer vision research conference, is starting tomorrow (June 24th) in Alaska. I won’t be attending this year, but fortunately, http://gmazars.info/conf/cvpr2008.html has the full list of selected papers.
What is segmentation and why it matters
In this blog post I will explore advances in one of the most popular vision research topics: segmentation. Segmentation consists of dividing an image into several regions, each of which looks similar throughout. Think of it as cutting the contour of an object with scissors, or in Photoshop, separating a foreground object from the background.
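To make "regions that look similar" concrete, here is a minimal sketch of the simplest possible segmenter: it grows regions by joining neighbouring pixels whose brightness differs by at most a tolerance, much like a magic-wand tool. The function name `segment_by_similarity` and the `tol` parameter are made up for this illustration, and none of the papers discussed in this post work this way.

```python
from collections import deque

import numpy as np


def segment_by_similarity(img, tol=0.1):
    """Label 4-connected regions whose neighbouring pixels differ by <= tol."""
    h, w = img.shape
    labels = np.full((h, w), -1, dtype=int)  # -1 means "not visited yet"
    next_label = 0
    for sy in range(h):
        for sx in range(w):
            if labels[sy, sx] != -1:
                continue
            # Start a new region here and flood-fill it breadth-first.
            labels[sy, sx] = next_label
            queue = deque([(sy, sx)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if not (0 <= ny < h and 0 <= nx < w):
                        continue
                    if labels[ny, nx] != -1:
                        continue
                    if abs(float(img[ny, nx]) - float(img[y, x])) <= tol:
                        labels[ny, nx] = next_label
                        queue.append((ny, nx))
            next_label += 1
    return labels


# Toy image: a bright 2x2 square on a dark background.
img = np.zeros((6, 6))
img[2:4, 2:4] = 1.0
labels = segment_by_similarity(img)
```

On a clean toy image this separates the square from the background. On a real photograph like the tiger below, texture and lighting break the "similar neighbours" assumption almost immediately, which is why the research is needed.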
Segmentation is a critical step in many vision problems such as recognizing an object in a cluttered room, or a person in a crowd. A classical example is isolating a tiger from a jungle scene.
Sumatran Tiger in the wild, photograph from Richard Ness
If you’re not a computer vision scientist you may think this segmentation stuff is overly complicated and useless. But say you want to photoshop yourself riding this tiger to impress your online friends. (After making your face prettier of course.) The first thing you’ll need to do (after buying, learning and practicing Photoshop) is to separate the tiger from the background. Now imagine your online friends call the bluff and ask you to post a video of the scene, and you’ll understand why teaching a computer to do it for you would be pretty helpful.
We humans can segment the tiger effortlessly (in our mind at least), since we have a mental model of what a tiger and a jungle look like, built through years of experience, and our eyes and our brains have evolved through natural selection. Individuals who couldn’t spot the tiger in the jungle didn’t survive too long. Replicating this intelligence in a computer algorithm is still an active research area. Although humans have an easier time, even Photoshop experts sometimes have trouble with harder images that involve repeating patterns, transparency, motion blur and depth of field:
Bengal tiger cub from National Geographic
Manually segmenting the cub from the other tiger is challenging, and doing it automatically is not yet possible, but several new papers show possible ways forward.
Using Contours to Detect and Localize Junctions in Natural Images (PDF)
Michael Maire, Pablo Arbeláez, Charless Fowlkes, Jitendra Malik
The paper provides a state-of-the-art solution for the related problems of finding contours (segmentation curves) and finding junctions (points where multiple contours meet). The contours are found by combining local and global information. The local cues are combined into a multiscale oriented signal that includes brightness, color and texture gradients. The global information is taken from the first 9 generalized eigenvectors, from which a signal is extracted with Gaussian directional derivatives at multiple orientations. The local and global signals are then linearly combined, resulting in a globalized probability of boundary, which claims the top spot in the standard Berkeley segmentation benchmark.
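The local-plus-global recipe can be sketched in a few lines of NumPy. This is a heavily simplified illustration and not the authors' implementation: the local cue is plain brightness gradient (no color, texture or multiple scales), the affinity matrix is a simple brightness similarity between nearby pixels, ordinary gradients stand in for the oriented Gaussian derivatives, and the combination weights are arbitrary rather than learned. All function names and parameters below are assumptions made for the example.

```python
import numpy as np


def local_boundary(img):
    """Crude local cue: normalised brightness-gradient magnitude."""
    gy, gx = np.gradient(img.astype(float))
    g = np.hypot(gx, gy)
    return g / g.max() if g.max() > 0 else g


def spectral_boundary(img, n_vec=9, sigma=0.3, radius=2):
    """Global cue: gradient magnitude of the leading generalized eigenvectors."""
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    ys, xs, vals = ys.ravel(), xs.ravel(), img.ravel().astype(float)
    # Affinity W (assumed form): nearby pixels of similar brightness are linked.
    near = (np.abs(ys[:, None] - ys) <= radius) & (np.abs(xs[:, None] - xs) <= radius)
    W = np.exp(-((vals[:, None] - vals) ** 2) / (2 * sigma ** 2)) * near
    d = W.sum(axis=1)
    # (D - W) v = lambda * D v is equivalent to the symmetric problem
    # (I - D^-1/2 W D^-1/2) u = lambda * u, with v = D^-1/2 u.
    d_isqrt = 1.0 / np.sqrt(d)
    L_sym = np.eye(h * w) - d_isqrt[:, None] * W * d_isqrt[None, :]
    _, evecs = np.linalg.eigh(L_sym)  # eigenvalues come back in ascending order
    out = np.zeros((h, w))
    # Skip index 0, the trivial constant vector (a choice made in this sketch).
    for k in range(1, min(n_vec, h * w - 1) + 1):
        v = (d_isqrt * evecs[:, k]).reshape(h, w)
        gy, gx = np.gradient(v)
        out += np.hypot(gx, gy)
    return out / out.max() if out.max() > 0 else out


def globalized_boundary(img, w_local=0.5, w_global=0.5):
    """Linear combination of the local and global signals (weights are arbitrary)."""
    return w_local * local_boundary(img) + w_global * spectral_boundary(img)


# Toy image: dark left half, bright right half.
img = np.full((8, 8), 0.2)
img[:, 4:] = 0.8
gpb = globalized_boundary(img)
```

The dense affinity matrix has one row per pixel, so this only works for tiny images. A real implementation would need a sparse eigensolver.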
Original image
A set of so-so contour lines. Too many lines in the textured areas.
A near perfect set of contour lines, produced by Maire’s algorithm
Maire and his colleagues then proceed to leverage this superior contour detection algorithm to identify junctions, using an EM-style approach. Open contours can therefore be extended to their likely junction points. The results of this approach are compared with those of a novel Harris operator, along with human-provided expected results. As you can see in the example below, the approach yields smooth, nice contours. Junctions are detected in their expected locations even on heavily textured boundaries.
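For readers who haven't met the Harris operator, here is a minimal sketch of the classic Harris corner response. The post doesn't say which variant the paper uses, so this is the textbook form with assumed choices: a 3x3 box window instead of a Gaussian, and the commonly used constant k = 0.04. The helper names are made up for the example.

```python
import numpy as np


def box3(a):
    """Sum over each pixel's 3x3 neighbourhood (zero padding at the border)."""
    p = np.pad(a, 1)
    h, w = a.shape
    return sum(p[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3))


def harris_response(img, k=0.04):
    """Classic Harris response R = det(M) - k * trace(M)^2 at every pixel."""
    iy, ix = np.gradient(img.astype(float))
    # Entries of the structure tensor M, accumulated over the 3x3 window.
    sxx, syy, sxy = box3(ix * ix), box3(iy * iy), box3(ix * iy)
    return sxx * syy - sxy ** 2 - k * (sxx + syy) ** 2


# Toy image: a bright square whose four corners should light up.
img = np.zeros((16, 16))
img[4:12, 4:12] = 1.0
r = harris_response(img)
```

The response is positive where gradients point in two different directions (the corners of the square), negative along straight edges, and zero in flat areas.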
Original image
Resulting contours and junctions
One question that remains unanswered is… how fast are these algorithms? How well do they deal with very large images? Since they don’t mention performance, I wouldn’t be surprised if it took several seconds or minutes to process one image.
Other CVPR papers related to segmentation
Edge preserving spatially varying mixtures for image segmentation (PDF)
Giorgos Sfikas, Christophoros Nikou, Nikolaos Galatsanos
Proposes a hierarchical Bayesian model based on Gaussian mixture models with a prior enforcing spatial smoothness. I skimmed through it very quickly, so I can’t offer an intelligent review. Unlike the first approach, it reportedly doesn’t require tweaking parameters, but the results aren’t as compelling IMHO.
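As a point of reference, here is a minimal sketch of the plain building block this paper starts from: a Gaussian mixture model fitted to pixel brightness with EM, where each pixel is assigned to its most likely component. It deliberately leaves out the paper's actual contribution (the hierarchical Bayesian model and the spatial smoothness prior), so every pixel is classified independently of its neighbours. The function name and parameters are assumptions made for the example.

```python
import numpy as np


def gmm_segment(img, k=2, n_iter=50):
    """Fit a 1-D Gaussian mixture to pixel brightness with EM and label pixels."""
    x = img.ravel().astype(float)
    mu = np.quantile(x, np.linspace(0.1, 0.9, k))  # deterministic initial means
    var = np.full(k, x.var() + 1e-6)
    pi = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each pixel (log domain).
        log_p = (-0.5 * (x[:, None] - mu) ** 2 / var
                 - 0.5 * np.log(2 * np.pi * var) + np.log(pi))
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means and variances.
        nk = resp.sum(axis=0) + 1e-12
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
        pi = nk / x.size
    return resp.argmax(axis=1).reshape(img.shape), mu


# Toy image: dark left half, bright right half, plus a little noise.
rng = np.random.default_rng(0)
img = np.full((8, 8), 0.2)
img[:, 4:] = 0.8
img += rng.normal(0.0, 0.05, img.shape)
labels, means = gmm_segment(img)
```

Without a spatial prior, a single noisy pixel can flip to the wrong label on its own, which is the kind of speckle a smoothness prior is meant to suppress.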
Segmentation by transduction (PDF)
Olivier Duchenne, Jean-Yves Audibert, Renaud Keriven, Jean Ponce, Florent Segonne
Olivier Duchenne and his colleagues describe a semi-interactive background segmentation technique, inspired by GrabCut and Graph Cuts. Such semi-interactive techniques rely on hints provided by the user to help the computer segment an object from the rest of the image, like brush strokes on the foreground and background, or tracing a rectangle surrounding the foreground object. The best way to explain it is with an example image from Duchenne’s paper:
This particular paper produces decent results pretty fast (2 seconds to 3 minutes on a standard computer with a single thread). Some more results:
Such techniques will eventually make it simple for anyone to cut a picture cleanly. In the meantime, don’t throw away that old Photoshop magic wand.













Great post! Really nice job of explaining the problem of segmentation. I attended the conference and posted a few of my favorites:
cvpr wrap up and selected papers
I’m glad I found your site. Any other good “Computer Vision” blogs you read?
Thanks Shawn, your reviews are interesting, now I’ll have to read this “Recovering Consistent Video Depth Maps via Bundle Optimization” paper. I also added your blog to my feed.
As for other computer vision blogs… there are very few out there. If anyone else has a computer vision blog, by all means, post it in the comments and eventually we’ll list them in one place.