In this paper we propose a method to estimate a global depth order between the objects of a scene using information from a single image coming from an uncalibrated camera. The method we present stems from early vision cues such as occlusion and convexity and uses them to infer both a local and a global depth order. Monocular occlusion cues, namely, T-junctions and convexities, contain information suggesting a local depth order between neighbouring objects. A combination of these cues is more suitable, ...
In this paper we propose a method to estimate a global depth order between the objects of a scene using information from a single image coming from an uncalibrated camera. The method we present stems from early vision cues such as occlusion and convexity and uses them to infer both a local and a global depth order. Monocular occlusion cues, namely, T-junctions and convexities, contain information suggesting a local depth order between neighbouring objects. A combination of these cues is more suitable, because, while information conveyed by T-junctions is perceptually stronger, they are not as prevalent as convexity cues in natural images. We propose a novel convexity detector that also establishes a local depth order. The partial order is extracted in T-junctions by using a curvature-based multi-scale feature. Finally, a global depth order, i.e., a full order of all shapes that is as consistent as possible with the computed partial orders that can tolerate conflicting partial or ders is computed. An integration scheme based on a Markov chain approximation of the rank aggregation problem is used for this purpose. The experiments conducted show that the proposed method compares favorably with the state of the art.
+