Apple researchers have developed an AI model that can rebuild detailed 3D objects from just one image, capturing complex lighting effects like reflections and highlights that change with viewing angles. This innovation addresses a major challenge in 3D reconstruction, where prior methods typically required multiple images or failed to produce realistic view-dependent lighting.

The breakthrough is grounded in Apple’s new system, LiTo (Surface Light Field Tokenization), which creates a latent 3D representation combining both object geometry and how light interacts with its surface. Unlike earlier techniques that focused mainly on shape or static textures, LiTo models surface light fields, allowing it to recreate specular highlights and Fresnel reflections even under complex lighting scenarios.
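To make "view-dependent" concrete, the sketch below uses two standard graphics formulas, Schlick's Fresnel approximation and the Blinn-Phong specular term, to show how the same surface point looks different as the camera moves. It is only an illustration of the effects a surface light field has to capture, not LiTo's method; the function names and the default values (`f0=0.04`, `shininess=64`) are assumptions chosen for the example.

```python
import math

def normalize(v):
    """Return v scaled to unit length (zero vectors are returned unchanged)."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v) if length > 0 else tuple(v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def schlick_fresnel(cos_theta, f0=0.04):
    """Schlick's approximation: reflectance rises toward 1 at grazing angles.

    cos_theta is the cosine of the angle between the view direction and the
    surface normal; f0 is the reflectance when looking straight at the surface
    (0.04 is an assumed, typical value for non-metals).
    """
    cos_theta = max(0.0, min(1.0, cos_theta))
    return f0 + (1.0 - f0) * (1.0 - cos_theta) ** 5

def specular_highlight(normal, light_dir, view_dir, shininess=64.0):
    """Blinn-Phong specular term: brightest when the view mirrors the light."""
    n, l, v = normalize(normal), normalize(light_dir), normalize(view_dir)
    half = normalize(tuple(a + b for a, b in zip(l, v)))  # half-vector
    return max(0.0, dot(n, half)) ** shininess

if __name__ == "__main__":
    normal, light = (0, 0, 1), (1, 0, 1)
    for view in [(-1, 0, 1), (0, 0, 1), (1, 0, 1)]:
        cos_theta = dot(normalize(normal), normalize(view))
        print(view,
              "specular:", specular_highlight(normal, light, view),
              "fresnel:", schlick_fresnel(cos_theta))
```

The highlight is strongest when the view direction mirrors the light about the normal and fades quickly elsewhere, which is why a static texture cannot reproduce it.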

The LiTo model relies on latent space representations, which encode visual data as compact numerical vectors in a high-dimensional space. By encoding only a small subset of surface light information, LiTo condenses both shape and lighting into a single latent vector. This enables realistic rendering of objects from new perspectives, all from one initial photo.

The training process involved thousands of objects, each rendered from 150 different angles and under various lighting conditions. Rather than seeing the full dataset at once, the model was trained on randomized samples, forcing it to infer complete geometry and lighting from limited data. Afterwards, a separate component was trained to predict these latent codes from single images, bridging the gap from real-world photos to full 3D reconstructions.
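The randomized-sampling idea can be sketched in a few lines. The example below draws a small random subset of the 150 rendered views for each training step; the subset size of 8 and the function name `sample_view_subset` are hypothetical, since the article does not say how many samples the model saw at a time or how they were chosen.

```python
import random

NUM_VIEWS = 150  # renders per object, as reported in the article

def sample_view_subset(num_views=NUM_VIEWS, subset_size=8, rng=None):
    """Pick a random subset of view indices without replacement.

    subset_size=8 is an assumed value for illustration only.
    """
    rng = rng or random.Random()
    return sorted(rng.sample(range(num_views), subset_size))

if __name__ == "__main__":
    rng = random.Random(0)  # seeded so repeated runs give the same subsets
    for step in range(3):
        print(f"step {step}: views {sample_view_subset(rng=rng)}")
```

Because each step sees a different handful of views, a model trained this way cannot memorize a fixed set of inputs and has to fill in the unseen angles itself.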

Diagram of LiTo model architecture showing encoder and decoder in latent space

LiTo’s ability to generate accurate 3D models with realistic lighting from just a single capture sets it apart from competing approaches such as TRELLIS, which struggle with natural reflections and highlights. Apple’s study includes interactive examples showcasing these differences, reinforcing this model’s potential for applications in augmented reality, digital content creation, and beyond.

By efficiently merging geometry and lighting data in latent space, this AI innovation could pioneer new ways to digitize objects quickly without complex multi-angle setups. It also hints at how future Apple devices might leverage on-device AI for sophisticated 3D scanning and rendering.

Source: 9to5mac
