Create 3D Models from Photos

12 09 2008

By Joshua Koopferstock

If creating 3D models were as easy as taking photos, it is safe to say that the use of 3D would be far more widespread than it is today.  From e-commerce to virtual tourism to casual games, reducing the cost and complexity of creating 3D models would affect many industries.

Feeling Software is making that possible.  Over the last 2 years, we have worked to develop a technology that allows anyone to create 3D models with little effort and no training.  Our goal: simplicity.  You take a bunch of photos with a regular camera from any angle you please, and we automatically create a 3D model.  The demo video below discusses our project in detail.


Feeling Software Demo from joshk on Vimeo.

We have thought of a variety of ways that this technology can be applied to solve problems for consumers.  For our readers, imagine that you could take photos of an object or scene, press a button, and instantly have a high-quality 3D model of that object or scene.  If this technology were available today, how would you use it?




Presto3D Launched to the Public!

8 09 2008

By Joshua Koopferstock

After months of development, Presto3D left closed beta last week and has opened to the public.  If you missed the post a few weeks ago, Presto3D is a 3D model marketplace that automatically creates 3D previews of the content that is submitted.

The reception has been positive, and we have gotten press coverage on several major animation and game development sites.  If you haven’t taken a look at the site yet, come check out what we’ve been up to for the last few months.  If you already visited the site during the closed beta, it’s worth going back just to see the fullscreen 3D previews that we’ve added in the latest release.

All feedback is appreciated!




Facial Recognition + Search = Cool and Creepy

3 09 2008

By Joshua Koopferstock

You tag this photo:

Source: www.bt.dk

Polar Rose finds this photo:

Source: www.newprophecy.net

With facial recognition launched in Picasa Web Albums yesterday, an exciting computer vision application once again butts heads with privacy concerns.  On the one hand, automatic tagging of photos through facial recognition can be a useful time saver; if I have an album of 100 pictures of my family vacation with the same 5 people, I would much rather have my computer tag them for me than sift through them one by one adding tags.  On the other hand, I might not necessarily want all photos of me to be so easily found by anyone.

Picasa Web Albums seems to handle this issue acceptably, at least for now.  Users only tag their own albums, and, as far as I can tell, the information gathered is not used to search Google Images and automatically tag images of the people you tag in your web albums.

This is not the case with every player stepping into this industry.  Polar Rose, announced in late 2006, uses facial recognition on user-tagged photos to search for more photos of an individual in any publicly available images.  The service is designed as a browser plug-in and an embeddable widget for photo-sharing sites, and rumour has it that a partnership with one of the major sites is imminent.  Users tag photos of people that they find anywhere online, and Polar Rose uses that information to find that same face across other images.

The example at the beginning of this post should illustrate why this may be a concern.  For Paris Hilton, perhaps her image benefits when scandalous pictures surface, but this is not the case for most of us.  Should photos of people really be that searchable?

In reality, though, the point is moot.  Using Google Image search, I probably could have found the same 2 pictures shown above; with facial recognition, this just becomes more efficient.  If you have been using Facebook for the past few years, you have probably already come to terms with the fact that people can quickly find many pictures of you, including ones that others took without your consent (though in fairness to Facebook, you can untag pictures you don’t like and make them unsearchable).

Conclusion: from a computer vision standpoint, it is neat to see these technologies reach the mass market.  From a privacy outlook, more (visual) information about us is going to become accessible without our direct consent, but only information that was publicly available in the first place, and this development is probably inevitable.

On a final note, there is one feature proposed by Polar Rose which I can’t help but find far more creepy than useful: personal photo RSS feeds.  Basically, get instantly updated by RSS each time a new photo of a targeted person is found.  Stalkers rejoice!




DARPA Binoculars Use Haze to See Farther

28 08 2008

By Joshua Koopferstock

Being able to see through haze would be neat enough in its own right, but DARPA-funded scientists are going one step beyond that and using the haze to actually see farther than they would be able to see if it weren’t there.  More specifically, these researchers believe that the shimmering of heat “waves” can in fact be used as a lens, with the right image recognition technology.

Photo by Keirn

The goal? 90% accurate facial recognition at 1km with a 6cm lens.  As the system combines the data from multiple images, you will not be able to see 1km away in real time; the aim is 1 frame per second.

What I like about this approach is that it takes something that is impeding the goal — haze blocks your ability to see far — and turns it into an improvement — haze helps you see farther!  More problems should be solved this way.

View the technical presentation

Source: New Scientist




Amazing Inspiration for Computer Vision

27 08 2008

By Joshua Koopferstock

These product concept designs by Mac Funamizu, a Japanese graphic designer, are among the most amazing applications of computer vision technology that I have come across. What I found especially inspiring is that some of the technology that we are working on at Feeling Software will be key to making a concept such as this a reality. And without too much effort, I believe that just about everyone reading this blog can see how their own work in 3D or computer vision will be a necessary building block to make this possible.  I hope Mr. Funamizu’s concept work fires the imagination of many people regarding the possibilities and usefulness of augmented reality.

“Future of Internet Search: Mobile Version” Product Concept


This Photosynth-esque approach (above) shows you other photos of the same scene you are looking at, from the same angle.  Here, the designer demonstrates looking at the scene in front of you over time through historical photographs.

Image recognition + mobile internet + Wikipedia?

Text recognition + mobile internet + babel fish?  I wish I had that when I was trying to decipher menus in Slovakia!

See more possibilities for this concept at Mac Funamizu’s blog, petitinvention.




Make way for remote surgery!

19 08 2008

By Joshua Koopferstock

As if getting the required precision for surgery weren’t hard enough in person, surgeons can now perform surgery remotely from thousands of miles away.  While it is far from mainstream, robotic surgery, including remote robotic surgery, has advanced by leaps and bounds in the past decade.  At SIGGRAPH, I came across a technology that might give robotic surgery another shove forward.

Butterfly Haptics, launched earlier this year as a spinoff company from Carnegie Mellon University, is hard at work trying to commercialize a magnetic levitation haptic device (pictured below).  The grey handle in the center of the device floats in a magnetic field.

What Sets it Apart

The device allows 6 degrees of freedom (translation in any direction, rotation in any direction) like many haptic devices.  However, the Butterfly Haptics device separates itself from the rest in two ways.

First, despite the fact that the device is floating in a magnetic field, it can still very effectively stop your motion completely.  In one demo, you could control objects in 3D space; when you ran the object into a wall, the feeling of hitting a solid object was extremely convincing.

Second, no static friction is present as the device is floating in a magnetic field and is not mechanical.  The surface texture demo illustrating this property sold me completely on the benefits of maglev haptics.  In this demo, you were presented on screen with different surface textures that you could run over with the device, such as a solid wavy surface, a tiny ridged surface which felt something like running your fingernail over the paper edge of a closed book and, most impressively, the “ice” (frictionless surface).  When you pushed down on the surface, it was completely solid, but as you moved along it in the other two dimensions, it felt absolutely frictionless, like perfect ice.

Butterfly Haptics Device

Back to Surgery

It goes without saying that in surgery, maximum freedom of movement and realistic force feedback are desirable, if not necessary.  And it is on these two fronts that Butterfly Haptics excels.  Not only would this technology be beneficial in remote robotic surgery, but also in surgical training simulations.  With immersive 3D displays like those used in currently available robotic surgery devices and realistic force feedback, surgeons-in-training can perform highly realistic surgeries on “humans” (anatomically correct 3D models) without ever making a true incision!

The Future for Butterfly Haptics

While the maglev haptic device is currently more academic than commercial, the fact that Butterfly Haptics has been spun out of academia into the business world suggests to me that these devices may find exciting real-world applications in the near future.  What exactly those applications may be is uncertain, but the company suggests on its site that beyond medicine, the devices may be used for CAD applications, data visualization, and character animation.  The medical applications appear most promising to me, but in any case, this is a company and technology well worth keeping an eye on in the next few years.

This is only one of the many interesting technologies and research papers that we came across during SIGGRAPH last week. Expect to find more blog posts about what we saw at SIGGRAPH in the upcoming days now that I am back in beautiful Montreal.




Feeling Software launches Presto3D in closed beta

8 08 2008

By Joshua Koopferstock & Christian Laforte

For the last few months, we have been quietly working on Presto3D, a 3D model marketplace which integrates our 3D web viewer to display user-generated 3D models in 3D within the browser.  Finally, it is out the door and in closed beta!  If you want to check it out, go to www.presto3d.com and enter this beta referral key: “presto0845” (without quotes).


Presto3D Tutorial from joshk on Vimeo.

What is Presto3D?

Presto3D is a marketplace where 3D models can be bought or sold.  All 3D assets are user-generated and user-priced.  When a model is uploaded, we convert the 3D model into COLLADA to create a 3D preview for our web viewer.  With a small, one-time plugin download, potential buyers can view the models in 3D and rotate and zoom them within the browser.

To ensure optimal performance and to keep the models safe from petty theft, we automatically reduce the resolution of textures, then compress and encrypt all the data.

Why is Presto3D so exciting?

To the best of our knowledge, there exists nothing on the web that allows such openness for the display of user-generated 3D content.  Due to our automatic conversion, on Presto3D, users can upload files in any Maya (.ma, .mb) or 3dsMax (.max, .3ds) format and see them in 3D in the browser (.dae, .obj, & .fbx are also supported).  Even outside of 3D marketplace websites, other sites will require that you use file formats specially created for the specific web viewer, or create the files within their proprietary platform (such as within some web games and virtual worlds).

The goals of Presto3D are twofold.  First, we aim to drastically improve the experience of buying and selling 3D content.  Second, we will create the most direct path to display 3D content online, regardless of the software used to create it.

Go give Presto3D a try, and tell us what you think!




A computer vision system arrested my wife

3 08 2008

By Christian Laforte

Panic and surprise

Three days ago I received a panicked call from my wife. She had been arrested while driving on the highway near the office in Montreal, Canada with our 10-month-old daughter. I ran to the scene and was told by the policeman that my wife drove safely, but we had neglected to renew our license plate on time. We had to accompany him to the police station and pay $600 in fines and towing charges.

Not the actual scene, but you get the idea.

How could this happen? We always pay our bills right away. We notified the government of our new address before moving apartment last year. But more interesting to the readers of this blog, how did the police identify my wife’s car out of the dozens that pass every minute on the highway?

The policeman — let’s call him Joe — gave me a lift to the traffic authorities, and explained how this all works. A real-time license plate scanner is installed on a patrol car on the side of the highway. Using an active light source and high-speed cameras, it tracks every license plate that passes and compares it against an on-board database, updated once a week through a USB key. The device costs $25,000.

“Isn’t that expensive?”, I asked Joe.

“Listen to the radio… They just arrested a guy who already lost his permit. He was driving a car with an expired license plate and he was wanted for petty crimes and unpaid parking tickets. He’s looking at a fine of at least $900, plus the old parking tickets. We would have never caught the guy otherwise. No wonder the big boss wants to equip at least 100 cars with the device by the end of year.”

Frustration instantly switched into interest (and a bit of envy)

That’s a market of $2.5M for a small city like Montreal. A great market for a computer vision technology, with a lot of potential growth in years to come.

Until now, I’ve always been an optimistic proponent of computer vision technologies. I wasn’t too worried about privacy. Being arrested certainly gave me a fresh perspective, especially because, as it turns out, the government admitted to having a bug in their address change software, which explained why we never got the license plate renewal notice.

Anyway, I still love computer vision and this is a cool technology, so let’s explore how it works and how it could be improved.

Description of the system

I haven’t seen the system, but on the spot I asked Joe a lot of questions to get a better idea. The device is bolted on the roof of another patrol car, stopped on the highway. It has two cameras and one intense red light source, like those used in barcode scanners. The cameras and the light source are tuned to focus on highly reflective surfaces, like a clean license plate. It can be fooled if the license plate is dirty or if there are other highly reflective surfaces in the field of view, e.g. a policeman’s badge or, I assume, the sun reflecting toward the camera. Otherwise the system appears quite robust: it works night and day, it can deal with partial occlusions of the license plate, and it can read multiple license plates in the same image.
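To make the matching step concrete, here is a minimal sketch of how a plate string read by the cameras might be checked against the on-board snapshot. Joe did not describe the software, so everything in it is an assumption made for illustration: the `snapshot` dictionary, the `normalize_plate` and `check_plate` helpers, and the sample plates and dates are all hypothetical.

```python
from datetime import date


def normalize_plate(raw):
    """Uppercase and drop spaces/dashes so small reading variations compare equal."""
    return "".join(ch for ch in raw.upper() if ch.isalnum())


def check_plate(raw, snapshot, today):
    """Classify one scanned plate as 'unknown', 'expired' or 'valid'.

    snapshot is a hypothetical dict mapping a normalized plate to its
    registration expiry date, as of the last weekly update.
    """
    expiry = snapshot.get(normalize_plate(raw))
    if expiry is None:
        return "unknown"
    return "expired" if expiry < today else "valid"


# Made-up snapshot and scans, for illustration only.
snapshot = {
    "ABC123": date(2008, 6, 30),
    "XYZ789": date(2009, 1, 31),
}
today = date(2008, 7, 31)
scans = ["abc 123", "XYZ-789", "QQQ000"]
results = {plate: check_plate(plate, snapshot, today) for plate in scans}
print(results)
```

Note that a lookup like this only knows what was in the snapshot at the last update: a plate renewed since then would still come back as expired.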

Limitations of the system

- The database is only updated once a week. People can get arrested more than once even though they paid the fine.

- The device only scans plates. It cannot recognize a stolen car with a valid plate. As Joe explained, organized criminals are smart: they wouldn’t risk getting arrested with a false or expired plate.

- The device, I presume, can be fooled easily by adding a filter (e.g. transparent film or grease) on the plate to absorb the red wavelength, or by adding a mirror next to it to distract the cameras. To the human eye, the plate would look fine, but it would no longer be detected by the device.

- Joe explained that, if the driver were to speed away, he probably couldn’t do anything. The police no longer engage in high-speed chases since they are too dangerous for the police and the general public. They have a hard time tracking dangerous drivers that speed away.

Clearly, recognizing a license plate is too simplistic. Pretty soon, criminals will know how to fool the system and only honest people like my wife will be apprehended.

A better solution

For a device like this to be truly useful, it would first need to be connected to the central station database. Just plug it into a cellular network, e.g. using an iPhone or Android phone. With a fast enough connection, the video stream could be uploaded, recorded and processed in a central server farm. This could vastly reduce the size and cost of the device and increase the recognition capability of the overall system. The cheaper system could be installed on every patrol car or traffic light. A dangerous driver speeding away could be tracked across the city and apprehended when he finally stops.

Using high-resolution cameras, it would be pretty easy to recognize a car’s color, brand and year from the video stream: all you need is a database of logos and a good feature detector. Getting this to run in real time would be challenging, but I’m confident this can be achieved given a year or two of development. Looking at the car as a whole would help identify stolen vehicles.

Pushing this further, cars could be tracked across an entire city, e.g. London with its networks of surveillance cameras. Criminals could be followed to their lair hours after a crime is reported. Hopefully the people in charge will re-think the overall process so honest people aren’t harassed or tracked without a good reason.

(Note: this is a draft of the post. I haven’t had the time to research the solution, but I’m posting it early anyway since the Washington Post and Slashdot just featured a similar story.)




Feeling Software Going to SIGGRAPH

25 07 2008

By Joshua Koopferstock

SIGGRAPH

Many of us from Feeling Software, including Christian and me, will be attending SIGGRAPH in Los Angeles in a couple of weeks. If any of you would like to sit down and brainstorm some computer vision or 3D-related project ideas, we will be happy to schedule some time during our trip to meet up with you. Send an e-mail to enlighten3d@feelingsoftware.com, which both of us will receive.

Also, if you are exhibiting at SIGGRAPH and read our blog regularly, let us know, we’ll try to come by and say hello!




Advanced real-time facial tracking ready to leave the labs

24 07 2008

By Christian Laforte and Joshua Koopferstock

The goal of facial tracking is to recognize elements of a face in an image, and to follow them in a series of images. It may sound simple, but in reality it’s so complex that the human brain evolved a special area just for this task. Some unfortunate folks are born without that area and can’t even recognize themselves in the mirror.

Is this me? Welcome to Prosopagnosia
(original image)

AAM (Active Appearance Model) is the best family of facial tracking algorithms out there right now. The technique was first published by Cootes, Edwards, and Taylor in 1998 (an expanded journal version followed in 2001), then heavily optimized and extended by a group of CMU researchers, primarily Iain Matthews, Simon Baker and Ralph Gross.

Here’s why this algorithm rocks:

  • It’s fast: 300Hz on a regular desktop PC.
  • It’s robust. It can deal with occlusions, e.g. sunglasses.
  • It’s relatively straightforward to implement.
  • It requires no special calibration for the user.

First, statistical model…

To do its magic, AAM must be taught what faces look like in various conditions. To achieve this, hundreds of images of faces must be annotated by human operators. These faces display a wide range of conditions including different races, different expressions, illumination, etc. For each image, someone must manually mark special points, e.g. the tip of the nose, to build a mesh.

This training data is converted into a statistical model of face shapes and appearances. This is a tedious process, but once it works for a few faces, the rest of the algorithm can be used to “bootstrap” other faces, so adding new examples becomes faster with time.
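As a rough illustration of what “statistical model” means for the shape half of AAM, the sketch below computes a mean shape and the single strongest direction of variation from a few annotated faces. It is a toy under stated assumptions, not the published method: real AAMs first align the shapes, keep many modes, and also model appearance (pixel values). The three landmark sets and the helper names `mean_shape`, `first_mode` and `synthesize` are made up for this example.

```python
import math


def mean_shape(shapes):
    """Average each landmark coordinate over all annotated shapes."""
    count = len(shapes)
    return [sum(s[i] for s in shapes) / count for i in range(len(shapes[0]))]


def first_mode(shapes, iterations=100):
    """Return (mean, unit vector) for the strongest mode of shape variation.

    Uses power iteration on the covariance of the centered shapes, which
    approximates the first principal component without a linear algebra library.
    """
    mean = mean_shape(shapes)
    centered = [[x - m for x, m in zip(s, mean)] for s in shapes]
    dim = len(mean)
    v = [1.0 / math.sqrt(dim)] * dim  # arbitrary starting direction
    for _ in range(iterations):
        # w = C v with C = sum_k d_k d_k^T, computed without forming C.
        w = [0.0] * dim
        for d in centered:
            proj = sum(a * b for a, b in zip(d, v))
            for i in range(dim):
                w[i] += proj * d[i]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # no variation along the current direction
        v = [x / norm for x in w]
    return mean, v


def synthesize(mean, mode, weight):
    """Generate a new shape: mean plus a weighted amount of the mode."""
    return [m + weight * v for m, v in zip(mean, mode)]


# Hypothetical annotations: (x, y) of left eye, right eye and mouth,
# flattened to [x0, y0, x1, y1, x2, y2]. Only the mouth height varies.
faces = [
    [0.0, 0.0, 2.0, 0.0, 1.0, -1.0],
    [0.0, 0.0, 2.0, 0.0, 1.0, -1.5],
    [0.0, 0.0, 2.0, 0.0, 1.0, -2.0],
]
mean, mode = first_mode(faces)
new_face = synthesize(mean, mode, 0.25)
print(mean, mode, new_face)
```

With a model like this, a face shape is described by a handful of weights rather than by every landmark, which is what keeps the later fitting step small.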

… then track the face.

Once we have our statistical model, tracking can be performed in real time by fitting the model to an image, such as a frame from a live video. Technically speaking, this is a non-linear optimization problem that consists of minimizing the error between the image and the model. Because the problem is non-linear, we need a good first estimate and a robust fitting algorithm; otherwise the tracking gets stuck in the wrong part of the image, and a face will be detected in some guy’s ear.

Having a good first estimate used to be the hardest part. Basically, we need to tell AAM roughly where to look. Five years ago this would have required a special face detection algorithm, making the system twice as complicated to implement. Now that AAM is super fast, it’s probably simpler to just run it from random starting positions in the image until we catch a face.

AAM improved by CMU researchers

Prior to the CMU papers, AAM was promising but not robust or fast enough for practical applications. The CMU researchers invented a fitting technique called the inverse compositional image alignment algorithm. Basically, they inverted one key step of the original algorithm (comparing the image against the model) which allowed them to compute some expensive calculations much less frequently. The end result was a much faster and more robust algorithm, capable of running hundreds of times per second.
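To show why inverting that step saves work, here is a toy sketch of inverse compositional alignment on the simplest possible case: recovering a 1D shift between a template and an image. This is an illustration under assumptions, not the CMU implementation; the Gaussian “template”, the `TRUE_SHIFT` value and the `align` function are made up. The point to notice is that the gradient and the Hessian are computed once from the template, outside the loop, while each iteration only resamples the image.

```python
import math


def template(x):
    """Toy 1D 'model': a smooth bump centred at x = 10."""
    return math.exp(-((x - 10.0) ** 2) / (2 * 3.0 ** 2))


TRUE_SHIFT = 1.5  # hypothetical offset we want to recover


def image(x):
    """Toy 'image': the same bump, shifted by TRUE_SHIFT."""
    return template(x - TRUE_SHIFT)


xs = [float(i) for i in range(21)]
T = [template(x) for x in xs]

# Precomputed ONCE from the template (the inverse compositional trick):
# central-difference gradient and the 1x1 Gauss-Newton Hessian.
h = 0.5
grad = [(template(x + h) - template(x - h)) / (2 * h) for x in xs]
hessian = sum(g * g for g in grad)


def align(p=0.0, iterations=20, tol=1e-9):
    """Estimate the shift p such that image(x + p) matches template(x)."""
    for _ in range(iterations):
        # Only this resampling and subtraction depend on the current estimate.
        error = [image(x + p) - t for x, t in zip(xs, T)]
        dp = sum(g * e for g, e in zip(grad, error)) / hessian
        # Compose the current warp with the inverse of the incremental warp.
        p -= dp
        if abs(dp) < tol:
            break
    return p


estimate = align()
print(estimate)
```

In a real AAM the single shift becomes a vector of shape and appearance parameters and the Hessian becomes a matrix, but the split is the same: expensive quantities derived from the model are computed up front, and each iteration only has to resample the image.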

The CMU researchers then further improved AAM to deal with occlusions (i.e. partially hidden faces) and to track a 3D face instead of a 2D approximation.

Regular AAM on the left,
improved AAM to deal with occlusions on the right
(video)

Surprisingly, when the CMU researchers extended AAM in 3D, using one or more cameras, the extension not only produced precise 3D results, it also ran faster and more robustly!

3D AAM in action (video)

From lab to the real world

Tons of cool applications could leverage this algorithm. So why hasn’t it happened yet? I think the primary reason is that most people just don’t know it’s possible.

The time has come for AAM to leave the lab and make the real world a more technologically advanced place. If you have a good application idea and funding to bring this to market, give us a call and we’ll be happy to help and pitch in!
