feat(benchmark): Benchmark page, OpenLayers map, and README screenshot by jwitcoski · Pull Request #246 · fieldsoftheworld/ftw-inference-app

jwitcoski · 2026-04-13T14:05:55Z

Add /benchmark route, BenchmarkView, BenchmarkMap (GT vs predictions vs footprint)
useBenchmarkCatalog; MapView link; scroll and map lifecycle fixes
docs/images/benchmark-map-output.png and README Benchmark section

m-mohr · 2026-04-14T09:19:51Z

Thanks for the PR.

Could you please resolve the merge conflicts?

Resolve MapView.vue conflicts by preserving upstream's header/mode-switch updates while keeping the benchmark navigation link. Made-with: Cursor

m-mohr

Thanks. I wasn't able to try it yet because I think the public server doesn't have this endpoint yet, so this is purely a code review.

Edit: The CI reports errors; those should also be resolved.

m-mohr · 2026-04-14T15:12:30Z

+import { ref, shallowRef } from 'vue'
+import { generateJWT } from '../functions/generate-jwt'
+
+const base = import.meta.env.VITE_API_BASE_URL || '/v1/'


We default to nothing/empty in all other places.
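
A minimal sketch of what an empty default could look like, assuming the base URL is passed in rather than read directly from `import.meta.env`. The names `resolveApiBase` and `apiUrl` are hypothetical and do not exist in the PR.

```typescript
// Hypothetical helper: falls back to an empty string instead of '/v1/'
// when no base URL is configured.
function resolveApiBase(configured: string | undefined): string {
  return configured || ''
}

// Hypothetical helper: joins the base and a path the same way the diff does
// with a template string.
function apiUrl(base: string, path: string): string {
  return `${base}${path}`
}
```

In the composable this would presumably be called as `resolveApiBase(import.meta.env.VITE_API_BASE_URL)`, so an unset variable yields relative request paths.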

m-mohr · 2026-04-14T15:17:11Z

+          <p class="text-caption text-medium-emphasis mb-4">
+            Gold outline = chip footprint; green = ground truth; blue = model predictions. Requires a
+            deployed API that returns <code>map_geojson</code> and at least one scored chip.
+          </p>


Suggested change

-          <p class="text-caption text-medium-emphasis mb-4">
-            Gold outline = chip footprint; green = ground truth; blue = model predictions. Requires a
-            deployed API that returns <code>map_geojson</code> and at least one scored chip.
-          </p>

This is already in the legend, so it doesn't make a lot of sense to duplicate it.
And the second sentence describes our obligation to fulfill; it is not interesting to a user.

m-mohr · 2026-04-14T15:19:53Z

+  const headers = { Authorization: `Bearer ${generateJWT()}` }
+  for (let i = 0; i < 800; i++) {
+    const res = await fetch(`${base}benchmarks/runs/${id}`, { headers })
+    const data = await res.json()


There are many places where we fetch and duplicate the header generation, I think we should consolidate this.
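
One possible shape for such a consolidation, as a sketch only: a small factory that owns the base URL and the `Authorization` header so call sites pass just a path. `createApiClient`, `TokenProvider`, and `FetchLike` are hypothetical names; the token provider stands in for the app's `generateJWT()` and the fetch implementation for the global `fetch`.

```typescript
// Hypothetical types: a token source (e.g. generateJWT) and a minimal fetch signature.
type TokenProvider = () => string
type FetchLike = (
  url: string,
  init?: { headers?: Record<string, string> },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>

// Hypothetical factory: builds the Authorization header in one place.
function createApiClient(base: string, getToken: TokenProvider, fetchImpl: FetchLike) {
  return async function apiGet(path: string): Promise<unknown> {
    const res = await fetchImpl(`${base}${path}`, {
      // A fresh token is requested for every call, as in the diff.
      headers: { Authorization: `Bearer ${getToken()}` },
    })
    if (!res.ok) {
      throw new Error(`Request to ${path} failed with status ${res.status}`)
    }
    return res.json()
  }
}
```

A call site could then shrink to something like `const data = await apiGet('benchmarks/runs/' + id)`, with `apiGet` created once per module.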

m-mohr · 2026-04-14T15:20:59Z

+    }
+    // Poll quickly at first so the progress UI reflects running before the job finishes.
+    const delayMs = i < 60 ? 350 : 2000
+    await new Promise((r) => setTimeout(r, delayMs))


Suggested change

-    await new Promise((r) => setTimeout(r, delayMs))
+    await new Promise((resolve) => setTimeout(resolve, delayMs))

Ambiguous whether r means resolve or reject.
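
Going one step further, the wait and the delay schedule could each get a named helper so the polling loop reads without comments. This is a sketch; `sleep` and `pollDelayMs` are hypothetical names, and the numbers are copied from the diff above.

```typescript
// Resolves after `ms` milliseconds; the executor argument is spelled out as
// `resolve` so it cannot be mistaken for `reject`.
function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms))
}

// Mirrors the schedule in the diff: 350 ms for the first 60 attempts,
// then 2000 ms for every later attempt.
function pollDelayMs(attempt: number): number {
  return attempt < 60 ? 350 : 2000
}
```

Inside the loop, the two lines from the diff would then collapse to `await sleep(pollDelayMs(i))`.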

m-mohr · 2026-04-14T15:21:43Z

+    <v-container class="py-6" max-width="960">
+      <p class="text-body-2 mb-2">
+        Select models and countries. <strong>Real scores</strong> need the full FTW benchmark tree on
+        the API host (<code>BENCHMARK__DATA_ROOT</code>): chips GeoParquet, <code>data_config</code>
+        (STAC URLs), and <code>label_masks/instance/</code> per country.
+      </p>
+      <p class="text-caption text-medium-emphasis mb-6">
+        With <code>BENCHMARK__AUTO_DOWNLOAD=true</code>, the API looks for
+        <code>chips_{country}.parquet</code> on Source Cooperative (not the older
+        <code>boundaries_*</code>-only drops). You still need STAC URLs and label masks for a full run.
+        Use <code>BENCHMARK__ALLOW_DEMO=true</code> for placeholder scores without data.
+        Runs are <strong>sequential per chip</strong> (download + inference + scoring); tens of seconds
+        to a few minutes per chip is normal. Matching uses a default IoU of <strong>0.25</strong> because
+        auto-picked Sentinel scenes are not the benchmark’s original imagery (overlaps are often &lt;0.5).
+      </p>


Why do we describe here what the server needs? Isn't it our task to get the server running correctly, rather than bothering the user with this?

m-mohr · 2026-04-14T15:24:36Z

 <template>
  <div class="map-view">
    <header id="title">
      <img
        src="https://fieldsofthe.world/static/images/brand/logos/ftw-logo-light.svg"
        alt="Fields of The World (FTW) App"
        class="logo"
        @click="aboutDialog = true"
      />
      <v-item-group
        selected-class="bg-primary"
        mandatory
        v-model="modeValue"
        @update:model-value="setModeValue"
        class="mode-switch"
      >
        <v-item v-for="tab in availableModes" :key="tab.id" v-slot="{ selectedClass, toggle }">
          <v-card :class="['d-flex align-center', selectedClass]" @click="toggle">
            <div class="mode-switch-btn flex-grow-1 text-center">{{ tab.label }}</div>
          </v-card>
        </v-item>
      </v-item-group>
+      <RouterLink id="benchmark-link" to="/benchmark">Benchmark</RouterLink>


We need to find a different solution for this header. The more we add to it, the wider it gets, which breaks the responsive behaviour and hides more field boundaries or wastes more space.

How does someone get back to the normal map view from the benchmark view? (I have only looked at the screenshot so far.)

m-mohr · 2026-04-14T15:26:07Z

+  // Dev server must use base "/" or http://localhost:5173/ returns 404. Do not rely on
+  // `mode === 'development'` (e.g. `vite --mode production` still uses serve command).
+  // Build + `vite preview` keep the GitHub Pages subpath; `isPreview` is true for preview only.
+  const base =
+    command === 'serve' && !isPreview ? '/' : '/ftw-inference-app/'
+
  return {
+    base,


I don't understand this change. Can you elaborate on why this was changed, please?

m-mohr · 2026-04-14T15:31:31Z

+    source: new XYZ({
+      url: 'https://tile.openstreetmap.org/{z}/{x}/{y}.png',
+      attributions: '© OpenStreetMap contributors',
+    }),


You could use the OSM source instead: https://openlayers.org/en/latest/apidoc/module-ol_source_OSM-OSM.html

m-mohr · 2026-04-14T15:34:37Z

+  map.updateSize()
+  requestAnimationFrame(() => map.updateSize())


Why is this needed?

m-mohr · 2026-04-14T15:39:02Z

+  requestAnimationFrame(() => map.updateSize())
+}
+
+async function syncMapToData() {


Does this run multiple times for the same data? It's triggered by mounted and immediately when the watcher below fires. The nextTicks also look suspicious.
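
If the function does turn out to run twice for the same data, one option is a small guard that skips the call when the data reference has not changed. The sketch below assumes the map data is replaced, not mutated in place, whenever it changes; `createSyncOnce` is a hypothetical name and not part of the PR.

```typescript
// Hypothetical guard: invokes `sync` only when it receives a different
// data reference than on the previous call.
function createSyncOnce<T>(sync: (data: T) => void): (data: T) => void {
  let initialized = false
  let last: T | undefined
  return (data: T) => {
    // Same reference as last time: skip the duplicate work.
    if (initialized && Object.is(last, data)) return
    initialized = true
    last = data
    sync(data)
  }
}
```

Alternatively, Vue's `watch` accepts `{ immediate: true }`, which may make a separate call from the mounted hook unnecessary in the first place.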

jwitcoski · 2026-04-14T19:20:57Z

@m-mohr Thanks a lot for the thoughtful review and for taking the time to go through this in detail.

I agree with your feedback. But so that you don't spend too much time on this, we should avoid polishing the prototype-style code I submitted about two weeks ago. It was intentionally a quick, vibe-coded experiment to test whether the FTW benchmark datasets could help evaluate how well predictions match what is actually on the ground for agricultural fields (FTW dataset on Source Cooperative).

Given the improvements from the last two weeks, I think this benchmark/data workflow can now be integrated much more cleanly into the app, likely in an “Evaluate” section (or similar), where users can tune model choices based on the region they’re mapping. Right now, I don't see your app utilizing the FTW dataset.

Just for fun, I ran it on the Brazil dataset and found the model does a good job. That said, I’m not yet running all the variable options from the original app (e.g., time range, cloud coverage, etc.).

m-mohr · 2026-04-14T19:30:46Z

Okay, I've converted this to draft then. Thanks.

Btw, this is not my app, I'm just contributing to it ;-)

Address local/CI lint failures by fixing BenchmarkMap prefer-const, simplifying BenchmarkView table bindings to avoid template parser errors, and aligning benchmark API base default handling. Made-with: Cursor

jwitcoski · 2026-04-14T19:35:58Z

Awesome, I fixed the error, and the npm run should now work correctly. @m-mohr

Merge upstream/main into feature/benchmark-ui.

22c3df2

m-mohr self-requested a review April 14, 2026 15:11

m-mohr reviewed Apr 14, 2026

m-mohr requested a review from ahocevar April 14, 2026 15:42

m-mohr marked this pull request as draft April 14, 2026 19:29

fix(ci): resolve benchmark lint and template parsing issues

f8a8fc2
