+/* Ensure paper preview SVG stays visible in dark mode */
+.dark-mode .paper-preview {
+background-color:#fff;
+}
 
 #theme-toggle {
 position: fixed;
@@ -257,20 +279,18 @@ <h2>CVPR 2025</h2>
 
 <h2>Abstract</h2>
 <p>
-Can objects that are not visible in an image—but are in the
-vicinity of the camera—be detected? This study introduces
-the novel tasks of 2D, 2.5D and 3D unobserved object de-
-tection for predicting the location of nearby objects that are
-occluded or lie outside the image frame. We adapt several
-state-of-the-art pre-trained generative models to address
-this task, including 2D and 3D diffusion models and vision–
-language models, and show that they can be used to infer the
-presence of objects that are not directly observed. To bench-
-mark this task, we propose a suite of metrics that capture
-different aspects of performance. Our empirical evaluation
-on indoor scenes from the RealEstate10k and NYU Depth
-V2 datasets demonstrate results that motivate the use of
-generative models for the unobserved object detection task.
+Can objects that are not visible in an image—but are in the vicinity of the
+camera—be detected? This study introduces the novel tasks of 2D, 2.5D and 3D
+unobserved object detection for predicting the location of nearby objects
+that are occluded or lie outside the image frame. We adapt several
+state-of-the-art pre-trained generative models to address this task,
+including 2D and 3D diffusion models and vision–language models, and show
+that they can be used to infer the presence of objects that are not
+directly observed. To benchmark this task, we propose a suite of metrics
+that capture different aspects of performance. Our empirical evaluation on
+indoor scenes from the RealEstate10k and NYU Depth V2 datasets demonstrate
+results that motivate the use of generative models for the unobserved
+object detection task.
 </p>
 
 <h2>Task Definition</h2>
@@ -284,11 +304,15 @@ <h2>Task Definition</h2>
 />
 </div>
 <div class="task-text">
-<p>
-The task of <strong>unobserved object detection</strong> is to detect objects that are present in the scene but not captured within the camera frustum. In this paper, we address this by predicting a conditional distribution over a bounded spatial region and a set of semantic labels from a single RGB image. We refer to this distribution as a <strong>spatio-semantic distribution</strong> visualized as a heatmap.
-</p>
-
-
+<p>
+The task of <strong>unobserved object detection</strong> is to detect
+objects that are present in the scene but not captured within the
+camera frustum. In this paper, we address this by predicting a
+conditional distribution over a bounded spatial region and a set of
+semantic labels from a single RGB image. We refer to this distribution
+as a <strong>spatio-semantic distribution</strong>, visualized as a