Dual Degree Ph.D. Student Recreates Indoor Scenes in 3D Using Photographs
Co-written with Yasutaka Furukawa, the paper shows that focusing on the dominant structure of the scene (the floorplan and walls, which are usually piecewise planar) yields simpler reconstructions. These keep the information relevant to applications such as navigation while remaining visually appealing, since viewers are left to infer the geometry of clutter objects themselves.
This differs from most existing methods, which focus primarily on producing 3D models of every single location with millimetric precision. The paper argues that such models are never perfect and therefore trigger noticeable rendering artifacts, which viewers find unappealing. This is a well-studied phenomenon in the field of human aesthetics: a robot or computer animation that looks and moves almost, but not exactly, like a natural human being is unpleasant to human observers.
According to the student, who is co-advised by João Paulo Costeira at Instituto Superior Técnico of the Universidade de Lisboa (IST-UL) and Fernando De la Torre at Carnegie Mellon University (CMU), “this work can be applied in architecture and civil engineering, but it can also be used to locate things and find directions in critical infrastructures, such as airports, hospitals, or malls.” “The fact that our method is fully automatic allows companies to use this technology to build models and maps of indoor locations at a global scale,” he adds. “Because this work was done during my internship at Google, an opportunity that came to be because of the CMU Portugal program, we were able to use their massive photo database,” Ricardo Cabral explains.
Computer vision, the field this paper addresses, is relatively young, and over the past 40 years its “holy grail was making computers look at images and see the same way humans do,” he clarifies. “In our research at CMU and IST-UL we have been trying to develop large-scale systems that are able to process the increasing amount of imagery available on the Internet,” he adds. This has led to solutions for various problems in medicine (diagnostic aids and imaging technologies), entertainment (e.g., Microsoft Kinect and special effects in movies), robotics (e.g., self-driving cars), and manufacturing (e.g., automatic inspection).
CVPR is the premier annual computer vision event, comprising the main CVPR conference and several co-located workshops and short courses. With its high quality and low cost, it provides exceptional value for students, academics, and industry researchers.
You can learn more about the project here.