We present a neural field-based framework for depth estimation from single-view RGB images. Rather than representing a 2D depth map as a single-channel image, we define it as the iso-surface of a scalar field in an implicit space, which we introduce as the Pseudo 3D Space. We convert a 3D Depth Field into a 2D depth image using an efficient and differentiable sphere tracing rendering algorithm. We introduce two further innovations. First, we present a Field Warping technique that recasts depth field estimation as a classification problem, which is far more efficient to learn than regressing a signed distance function (SDF). Second, we design the 3D Pseudo Normal from the 2D depth map, which is closely related to the actual 3D surface normal and can be computed from the depth field's implicit representation with an uncalibrated camera. Experiments validate the performance of our method. Our Pseudo 3D Space simplifies implicit field learning and offers a consistent framework for advancing shape reconstruction from multiple cues.
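The abstract mentions rendering the depth field with differentiable sphere tracing. The following is a minimal NumPy sketch of the basic sphere tracing loop, not the paper's implementation: `field`, `origins`, and `dirs` are hypothetical placeholders, and the real method operates in the Pseudo 3D Space with a warped (classification-style) field rather than a plain SDF.

```python
import numpy as np

def sphere_trace(field, origins, dirs, n_steps=50, eps=1e-4):
    """March each ray until the scalar field reaches zero (the iso-surface).

    field:   callable mapping (N, 3) points to (N,) signed distance-like values
    origins: (N, 3) ray origins;  dirs: (N, 3) unit ray directions
    Returns the per-ray depth t along the ray (i.e. the rendered depth map).
    """
    t = np.zeros(origins.shape[0])
    for _ in range(n_steps):
        pts = origins + t[:, None] * dirs   # current sample points on each ray
        d = field(pts)                      # field value = safe step size
        t = t + d                           # advance each ray by its field value
        if np.all(np.abs(d) < eps):         # all rays converged to the surface
            break
    return t
```

Because every step is a differentiable arithmetic operation on the field values, the same loop can be written in an autodiff framework to backpropagate through the rendered depth.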


Set up the dataset path

Suppose your dataset is placed under a directory such as bts_nyu_data/.


Add the following to your ~/.bashrc:

export PNNET_NYU2_DATASET=/absolute_path/bts_nyu_data/

Train with:

python -c configs/train_bts_nyu_nd3_tb_vis.json

This configuration includes the pseudo normal and total bending losses.