Skip to content

Move GPS tables to a separate database#7111

Draft
Rub21 wants to merge 22 commits into
openstreetmap:masterfrom
Rub21:gps_db
Draft

Move GPS tables to a separate database#7111
Rub21 wants to merge 22 commits into
openstreetmap:masterfrom
Rub21:gps_db

Conversation

@Rub21

@Rub21 Rub21 commented May 26, 2026

Copy link
Copy Markdown
Contributor

From: openstreetmap/operations#1358

This PR moves the three GPS tables (gpx_files, gpx_file_tags, gps_points) out of the main openstreetmap database into a new openstreetmap_gps database.

What the PR does:

  • Adds a GpsRecord abstract model connected to the gps database. Trace, Tracepoint, and Tracetag inherit from it.
  • Adds db/gps_migrate/ and db/gps_structure.sql for the GPS schema.
  • Keeps the foreign keys between the three GPS tables. But the link to users.id is now app-level only, since Postgres can't enforce FKs across databases.
  • Updates CI, Docker, and the devcontainer to set up the second database.
  • Feeds controller: removed a join that crossed both databases (Postgres can't do that). Now it loads the trace and the user in two steps.
  • Trace image/icon attachments: skip blob analysis. Active Storage tries to analyze the file by loading the parent record, but the parent (Trace) is now in the gps DB while the attachment is in the main DB, so the analyzer crashes. seems skipping is safe, because we don't use the analyzed metadata for those files.
  • I tested this with the docker-compose dev setup and the trace tests still pass.

Some files were auto-updated by Rails when the tables moved: model annotations, db/structure.sql, and db/gps_structure.sql. These changes were automatic.

Rub21 added 20 commits March 25, 2026 14:29
The two GIFs are drawn at fixed sizes by libgd in lib/gpx.rb, so
ActiveStorage has nothing to learn from analyzing them. Skipping
the analyze job also avoids a race against the disk upload now that
the trace tables live in a separate database from active_storage_*.

The race was breaking Api::TracesControllerTest#test_create, where
the analyze job sometimes ran before the file was written and
raised FileNotFoundError, which the trace importer treated as an
import failure and destroyed the trace.
@github-actions

Copy link
Copy Markdown
2 Warnings
⚠️ Number of updated lines of code is too large to be in one PR. Perhaps it should be separated into two or more?
⚠️ Merge commits are found in PR. Please rebase to get rid of the merge commits in this PR, see CONTRIBUTING.md.

Generated by 🚫 Danger

@gravitystorm gravitystorm left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I'm neutral on this idea. I've read the operations task and I haven't seen convincing evidence that this is the best way to solve the operational problems. From a codebase point of view, this adds significant complexity to all deployments, so we need to make really sure that it's worth doing.

For this PR, in addition to my inline notes, I would say:

  • There's no task included to migrate the data from the old database to the new one. This is critical, particularly for non-OSMF deployments that you are not personally managing.
  • There's no migration strategy. Do we expect all the traces to disappear from the site, then slowly reappear as the migration progresses? Or is there going to be a (several-day?) outage during the migration?
  • Or should the general strategy be more like the "renaming a table" strategy documented in strong_migrations?
  • When this PR runs, the site will instantly use the new database for new uploads. This will then immediately conflict with any data that is being migrated from the old database to the new (e.g. primary key reuse).

rescue_from ActiveRecord::ConnectionNotEstablished,
ActiveRecord::DatabaseConnectionError,
ActiveRecord::NoDatabaseError, :with => :gps_database_unavailable

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't correct, those errors could be for other reasons

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are right, the current rescue_from catches the error for the two databases, not only GPS.

I added this rescue to protect the main site when the GPS DB is unavailable. In openstreetmap/chef#846 (comment), I set a short connect_timeout for it, so the connection fails fast and the request does not hang. That way if a user opens /traces while the GPS DB is unavailable, we show a warning instead of blocking the process.

I updated the gps_database_unavailable method so it only handles GPS errors; otherwise it re-raises the exception.

Comment thread app/models/gps_record.rb
@@ -0,0 +1,6 @@
# frozen_string_literal: true

class GpsRecord < ApplicationRecord

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should consider the naming of this abstract class (and the database name) more carefully.

When OSM was created, "GPS" and satellite positioning was more or less interchangeable. However, time has moved on, and the American GPS system is only one of 4 different GNSS systems.

GNSS is, however, another acronym, and not very self-explanatory - nor is it in common use.

Was the decision to call the database "gps" and the model "GpsRecord" deliberate? Or would something involving "trace" be more consistent with e.g. the model naming?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, GPS is technically only the American system, and "trace" naming is more consistent with the existing Trace model.

I can rename all the required values and files, e.g.:

  • GpsRecord → TraceRecord
  • DB openstreetmap_gps → openstreetmap_traces
  • db/gps_migrate/ → db/traces_migrate/
  • db/gps_structure.sql → db/traces_structure.sql

We also need to rename things in chef. openstreetmap/chef#846 already uses gps (props, the database.yml.erb section,etc ), so we should rename there too to keep both sides in sync.

@tomhughes wdyt?


# Drop the GPS tables (gps_points, gpx_file_tags, gpx_files) from the main database.
# These tables now live in a separate GPS database (see GpsRecord model).
class DropGpsTablesFromMainDb < ActiveRecord::Migration[8.1]

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If anyone runs this migration, they will lose all of their traces in their development database.

It's also something that we couldn't merge as-is, since anyone deploying this PR would lose all the trace data from their production database.

@Rub21 Rub21 May 29, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that migration would have deleted trace data, if anyone deploying this PR. I removed it.

The plan from the ops call is to skip the data copy. Tom suggested reusing the replicas as the new GPS DB. So posible steps would be:

  • Promote one replica, drop the non-GPS tables, point Rails to it as gps.
  • Verify in production.
  • Open a new PR with drop migration for the old GPS tables.

For deployers outside OSMF (no replicas), we could add a rake task to copy the GPX data, and also document the migration steps so they can follow the, but i think that's for another ticket.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's one possible plan that I suggested - no definitive decision has been made about anything.

def change
create_enum :gpx_visibility_enum, %w[private public trackable identifiable]

create_table :gpx_files, :id => :bigint do |t|

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're going to create a new database, then I think this is an opportunity to fix the table naming, to match the model names.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. The new gps DB will come from a promoted replica, so renaming directly in the database looks like a good option

I listed what needs to be renamed: https://gist.github.com/Rub21/fd7987d1d8dc5d6d0809eff90cf6c2eb

But this is more a question for Tom. on my side, I can go through the list more carefully and test which ones actually need renaming.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To repeat myself, that is one possible option which I suggested - nothing has been decided yet.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gravitystorm gravitystorm marked this pull request as draft May 27, 2026 10:29
@pnorman

pnorman commented May 29, 2026

Copy link
Copy Markdown
Contributor

In general I'm neutral on this idea. I've read the operations task and I haven't seen convincing evidence that this is the best way to solve the operational problems

We discussed splitting the DBs on the ops call and talked about moving the traces to the old DB servers. The only other gain from splitting would be that we could have different cluster-level settings and WAL/basebackup

It doesn't add anything new to pg_dump since that can already exclude tables.

Personally I'm not convinced this is the biggest trace problem. The API results in ORDER BY ... OFFSET ... queries which are horrible for performance.

@tomhughes

Copy link
Copy Markdown
Member

One thing it does do is reduce the time to dump the main database, which affects planet generation.

@pnorman

pnorman commented May 29, 2026

Copy link
Copy Markdown
Contributor

One thing it does do is reduce the time to dump the main database, which affects planet generation.

Ya, but we could do that without splitting the DB by adjusting which tables pg_dump exports.

@pablobm

pablobm commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

skipping is safe, because we don't use the analyzed metadata for those files

Is this something that we should have been skipping all along? If so, I think it can go in a separate PR to help making this one smaller, however slightly so.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants