Move GPS tables to a separate database by Rub21 · Pull Request #7111 · openstreetmap/openstreetmap-website

Rub21 · 2026-05-26T17:43:00Z

This PR moves the three GPS tables (gpx_files, gpx_file_tags, gps_points) out of the main openstreetmap database into a new openstreetmap_gps database.

What the PR does:

Adds a GpsRecord abstract model connected to the gps database. Trace, Tracepoint, and Tracetag inherit from it.
Adds db/gps_migrate/ and db/gps_structure.sql for the GPS schema.
Keeps the foreign keys between the three GPS tables, but the link to users.id is now app-level only, since Postgres can't enforce foreign keys across databases.
Updates CI, Docker, and the devcontainer to set up the second database.
Feeds controller: removed a join that crossed both databases (Postgres can't do that). Now it loads the trace and the user in two steps.
Trace image/icon attachments: skip blob analysis. Active Storage tries to analyze the file by loading the parent record, but the parent (Trace) is now in the gps DB while the attachment is in the main DB, so the analyzer crashes. Skipping seems safe, because we don't use the analyzed metadata for those files.
I tested this with the docker-compose dev setup and the trace tests still pass.
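The two-step load described for the feeds controller can be sketched in plain Ruby. This is only an illustration under stated assumptions: the `Trace` and `User` structs, the `TRACES` and `USERS` in-memory tables, and the helper names are hypothetical stand-ins, not the PR's code. In Rails, `preload` (as opposed to a joining `includes`/`eager_load`) follows the same pattern of one query per table.

```ruby
# Plain-Ruby sketch of loading traces and users in two steps, with no join.
# TRACES and USERS are hypothetical stand-ins for tables in two separate databases.
Trace = Struct.new(:id, :user_id, :name, :user)
User  = Struct.new(:id, :display_name)

TRACES = [
  Trace.new(1, 10, "morning-ride.gpx"),
  Trace.new(2, 11, "hike.gpx"),
  Trace.new(3, 10, "commute.gpx")
].freeze

USERS = {
  10 => User.new(10, "alice"),
  11 => User.new(11, "bob")
}.freeze

# Step 1: read from the GPS database only.
def load_traces
  TRACES.map(&:dup)
end

# Step 2: read from the main database once, for the distinct user ids seen in step 1.
def load_users(user_ids)
  USERS.slice(*user_ids.uniq)
end

# Attach each user to its trace in application code instead of in SQL.
def traces_with_users
  traces = load_traces
  users = load_users(traces.map(&:user_id))
  traces.each { |trace| trace.user = users[trace.user_id] }
end

traces_with_users.each do |trace|
  puts "#{trace.name} by #{trace.user.display_name}"
end
```

The cost is one extra round trip, but neither query needs to see the other database.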

Some files were auto-updated by Rails when the tables moved: model annotations, db/structure.sql, and db/gps_structure.sql. These changes were automatic.

The two GIFs are drawn at fixed sizes by libgd in lib/gpx.rb, so ActiveStorage has nothing to learn from analyzing them. Skipping the analyze job also avoids a race against the disk upload now that the trace tables live in a separate database from active_storage_*. The race was breaking Api::TracesControllerTest#test_create, where the analyze job sometimes ran before the file was written and raised FileNotFoundError, which the trace importer treated as an import failure and destroyed the trace.

github-actions · 2026-05-26T17:43:49Z

	2 Warnings
⚠️	Number of updated lines of code is too large to be in one PR. Perhaps it should be separated into two or more?
⚠️	Merge commits are found in PR. Please rebase to get rid of the merge commits in this PR, see CONTRIBUTING.md.

Generated by 🚫 Danger

gravitystorm

In general I'm neutral on this idea. I've read the operations task and I haven't seen convincing evidence that this is the best way to solve the operational problems. From a codebase point of view, this adds significant complexity to all deployments, so we need to make really sure that it's worth doing.

For this PR, in addition to my inline notes, I would say:

There's no task included to migrate the data from the old database to the new one. This is critical, particularly for non-OSMF deployments that you are not personally managing.
There's no migration strategy. Do we expect all the traces to disappear from the site, then slowly reappear as the migration progresses? Or is there going to be a (several-day?) outage during the migration?
Or should the general strategy be more like the "renaming a table" strategy documented in strong_migrations?
When this PR runs, the site will instantly use the new database for new uploads. This will then immediately conflict with any data that is being migrated from the old database to the new (e.g. primary key reuse).
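The primary key reuse concern in the last point can be shown with a small simulation. This is a sketch, not the real schema: the `Table` class and its methods are hypothetical, and seeding the sequence above the old maximum is shown only as an illustration of how the collision disappears, not as a plan from this thread.

```ruby
# Hypothetical in-memory table with a sequence-backed primary key.
class Table
  def initialize(next_id: 1)
    @rows = {}
    @next_id = next_id
  end

  # Insert a row with a sequence-assigned id, as a new upload would.
  def insert(attrs)
    id = @next_id
    @next_id += 1
    @rows[id] = attrs
    id
  end

  # Copy a row keeping its original id, as a data migration would.
  def copy_with_id(id, attrs)
    raise KeyError, "duplicate primary key #{id}" if @rows.key?(id)

    @rows[id] = attrs
  end
end

# The old database already holds ids 1..3.
old_db = Table.new
3.times { |i| old_db.insert(name: "old-#{i}") }

# A fresh database starts its sequence at 1, so a new upload takes id 1 ...
new_db = Table.new
new_db.insert(name: "new-upload")

# ... and copying the old row with id 1 then collides.
begin
  new_db.copy_with_id(1, name: "old-0")
rescue KeyError => e
  puts e.message
end

# If the new sequence starts above the old maximum, both can coexist.
seeded_db = Table.new(next_id: 4)
seeded_db.insert(name: "new-upload")
seeded_db.copy_with_id(1, name: "old-0")
```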

gravitystorm · 2026-05-27T10:06:03Z

+  rescue_from ActiveRecord::ConnectionNotEstablished,
+              ActiveRecord::DatabaseConnectionError,
+              ActiveRecord::NoDatabaseError, :with => :gps_database_unavailable
+


This isn't correct, those errors could be for other reasons

Yes, you are right, the current rescue_from catches the error for the two databases, not only GPS.

I added this rescue to protect the main site when the GPS DB is unavailable. In openstreetmap/chef#846 (comment), I set a short connect_timeout for it, so the connection fails fast and the request does not hang. That way if a user opens /traces while the GPS DB is unavailable, we show a warning instead of blocking the process.

I updated the gps_database_unavailable method so it only handles GPS errors; otherwise it re-raises the exception.
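A minimal plain-Ruby sketch of that "handle GPS errors, re-raise everything else" shape follows. The `ConnectionError` class, its `database` attribute, and the `handle` wrapper are hypothetical stand-ins; how the actual handler tells a GPS failure apart from a main-database failure is not shown in this thread.

```ruby
# Hypothetical stand-in for a connection error that records which database failed.
class ConnectionError < StandardError
  attr_reader :database

  def initialize(message, database:)
    super(message)
    @database = database
  end
end

GPS_DATABASE = "openstreetmap_gps"

# Returns a warning for GPS failures; re-raises anything else so that
# main-database errors are not masked.
def gps_database_unavailable(error)
  raise error unless error.database == GPS_DATABASE

  "GPS traces are temporarily unavailable"
end

# Stand-in for a controller action wrapped by rescue_from.
def handle
  yield
rescue ConnectionError => e
  gps_database_unavailable(e)
end

puts handle { raise ConnectionError.new("connection refused", database: GPS_DATABASE) }
```

An error tagged with any other database name passes straight through `handle` and reaches the caller unchanged.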

gravitystorm · 2026-05-27T10:10:09Z

@@ -0,0 +1,6 @@
+# frozen_string_literal: true
+
+class GpsRecord < ApplicationRecord


We should consider the naming of this abstract class (and the database name) more carefully.

When OSM was created, "GPS" and satellite positioning were more or less interchangeable. However, time has moved on, and the American GPS system is only one of 4 different GNSS systems.

GNSS is, however, another acronym, and not very self-explanatory - nor is it in common use.

Was the decision to call the database "gps" and the model "GpsRecord" deliberate? Or would something involving "trace" be more consistent with e.g. the model naming?

I agree, GPS is technically only the American system, and "trace" naming is more consistent with the existing Trace model.

I can rename all the required values and files, e.g.:

GpsRecord → TraceRecord

DB openstreetmap_gps → openstreetmap_traces

db/gps_migrate/ → db/traces_migrate/

db/gps_structure.sql → db/traces_structure.sql

We also need to rename things in chef. openstreetmap/chef#846 already uses gps (props, the database.yml.erb section, etc.), so we should rename there too to keep both sides in sync.

@tomhughes wdyt?

gravitystorm · 2026-05-27T10:15:21Z

+
+# Drop the GPS tables (gps_points, gpx_file_tags, gpx_files) from the main database.
+# These tables now live in a separate GPS database (see GpsRecord model).
+class DropGpsTablesFromMainDb < ActiveRecord::Migration[8.1]


If anyone runs this migration, they will lose all of their traces in their development database.

It's also something that we couldn't merge as-is, since anyone deploying this PR would lose all the trace data from their production database.

Yes, that migration would have deleted trace data for anyone deploying this PR. I removed it.

The plan from the ops call is to skip the data copy. Tom suggested reusing the replicas as the new GPS DB. So possible steps would be:

Promote one replica, drop the non-GPS tables, point Rails to it as gps.

Verify in production.

Open a new PR with drop migration for the old GPS tables.

For deployers outside OSMF (no replicas), we could add a rake task to copy the GPX data, and also document the migration steps so they can follow them, but I think that's for another ticket.

That's one possible plan that I suggested - no definitive decision has been made about anything.

gravitystorm · 2026-05-27T10:17:15Z

+  def change
+    create_enum :gpx_visibility_enum, %w[private public trackable identifiable]
+
+    create_table :gpx_files, :id => :bigint do |t|


If we're going to create a new database, then I think this is an opportunity to fix the table naming, to match the model names.

I agree. The new gps DB will come from a promoted replica, so renaming directly in the database looks like a good option.

I listed what needs to be renamed: https://gist.github.com/Rub21/fd7987d1d8dc5d6d0809eff90cf6c2eb

But this is more a question for Tom. On my side, I can go through the list more carefully and test which ones actually need renaming.

To repeat myself, that is one possible option which I suggested - nothing has been decided yet.

Left a question on the renaming at https://gist.github.com/Rub21/fd7987d1d8dc5d6d0809eff90cf6c2eb?permalink_comment_id=6179520#gistcomment-6179520

pnorman · 2026-05-29T11:34:48Z

> In general I'm neutral on this idea. I've read the operations task and I haven't seen convincing evidence that this is the best way to solve the operational problems

We discussed splitting the DBs on the ops call and talked about moving the traces to the old DB servers. The only other gain from splitting would be that we could have different cluster-level settings and WAL/basebackup.

It doesn't add anything new to pg_dump since that can already exclude tables.

Personally I'm not convinced this is the biggest trace problem. The API results in ORDER BY ... OFFSET ... queries which are horrible for performance.
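To illustrate why OFFSET paging gets slower on deep pages, here is a plain-Ruby sketch that counts rows read. Everything in it is hypothetical: `ROWS` stands in for an index ordered by id, and keyset (cursor) paging is shown only as a commonly used contrast, not as something proposed in this thread.

```ruby
# Ids sorted descending, standing in for an ordered index on the traces table.
ROWS = (1..1000).to_a.reverse.freeze

# OFFSET-style paging: every skipped row is still read before any row is returned.
def page_by_offset(offset, limit)
  visited = 0
  page = []
  ROWS.each do |id|
    visited += 1
    next if visited <= offset

    page << id
    break if page.size == limit
  end
  [page, visited]
end

# Keyset-style paging: seek straight to the first id below the cursor,
# then read only the rows that are returned.
def page_by_keyset(before_id, limit)
  start = ROWS.bsearch_index { |id| id < before_id } || ROWS.size
  page = ROWS[start, limit]
  [page, page.size]
end

offset_page, offset_reads = page_by_offset(900, 20)
keyset_page, keyset_reads = page_by_keyset(101, 20)

puts "same page: #{offset_page == keyset_page}"
puts "rows read with OFFSET: #{offset_reads}, with keyset: #{keyset_reads}"
```

The row count for the keyset variant ignores the handful of comparisons made by the index seek itself.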

tomhughes · 2026-05-29T12:00:21Z

One thing it does do is reduce the time to dump the main database, which affects planet generation.

pnorman · 2026-05-29T17:29:12Z

> One thing it does do is reduce the time to dump the main database, which affects planet generation.

Ya, but we could do that without splitting the DB by adjusting which tables pg_dump exports.

pablobm · 2026-06-02T11:19:23Z

> skipping is safe, because we don't use the analyzed metadata for those files

Is this something that we should have been skipping all along? If so, I think it can go in a separate PR to help make this one smaller, however slightly.

Rub21 added 20 commits March 25, 2026 14:29

Add a new gps database for Rails

1d6ab81

Drop gps tables from main database

6a03dbf

Add gpx_visibility_enum ENUM for gps_db

7a799c2

Drop gpx_visibility_enum using safety_assured

bebda85

Use preload instead of includes to avoid JOIN across databases

6e88afd

Handle gps db connection errors to avoid breaking the main site

a019d39

Handle DatabaseConnectionError when gps-db is down adn the site is st…

0ddad1b

…arting

Use traces_count column instead of traces.size to avoid querying gps-…

ab7d2f5

…db on user profile page

Merge remote-tracking branch 'upstream/master' into gps_db

4f3eea9

Configure CI to set up the GPS database

cec922a

Update structure files and model anotations after gps tables move to …

8d6d0c7

…a separate db

Add GPS database setup to lint workflow

71647e6

Use hash rockets syntax in GPS files

ed8a27a

Exclude GPS migration from Rails/ThreeStateBooleanColumn

2c371a3

Add foreign keys for intra-gps trace associations

a258973

Add gps foreign keys with validate: false for strong_migrations

71fd585

Annotate trace models with new foreign keys

c25f93d

Fix cross-database join in traces feeds controller

6fb0f51

Add gps database to docker, devcontainer, and example config

15d5e03

github-actions Bot added big-pr merge-commits labels May 26, 2026

Rub21 mentioned this pull request May 26, 2026

Split GPS points table from main database openstreetmap/operations#1358

Open

gravitystorm requested changes May 27, 2026

View reviewed changes

gravitystorm marked this pull request as draft May 27, 2026 10:29

Only handle GPS database errors in rescue_from

d1f504d

Remove gps tables drop migration

9174cc8
