Move GPS tables to a separate database#7111
Conversation
…db on user profile page
The two GIFs are drawn at fixed sizes by libgd in lib/gpx.rb, so ActiveStorage has nothing to learn from analyzing them. Skipping the analyze job also avoids a race against the disk upload now that the trace tables live in a separate database from active_storage_*. The race was breaking Api::TracesControllerTest#test_create, where the analyze job sometimes ran before the file was written and raised FileNotFoundError, which the trace importer treated as an import failure and destroyed the trace.
Generated by 🚫 Danger |
gravitystorm
left a comment
There was a problem hiding this comment.
In general I'm neutral on this idea. I've read the operations task and I haven't seen convincing evidence that this is the best way to solve the operational problems. From a codebase point of view, this adds significant complexity to all deployments, so we need to make really sure that it's worth doing.
For this PR, in addition to my inline notes, I would say:
- There's no task included to migrate the data from the old database to the new one. This is critical, particularly for non-OSMF deployments that you are not personally managing.
- There's no migration strategy. Do we expect all the traces to disappear from the site, then slowly reappear as the migration progresses? Or is there going to be a (several-day?) outage during the migration?
- Or should the general strategy be more like the "renaming a table" strategy documented in strong_migrations?
- When this PR runs, the site will instantly use the new database for new uploads. This will then immediately conflict with any data that is being migrated from the old database to the new (e.g. primary key reuse).
| rescue_from ActiveRecord::ConnectionNotEstablished, | ||
| ActiveRecord::DatabaseConnectionError, | ||
| ActiveRecord::NoDatabaseError, :with => :gps_database_unavailable | ||
|
|
There was a problem hiding this comment.
This isn't correct, those errors could be for other reasons
There was a problem hiding this comment.
Yes, you are right, the current rescue_from catches the error for the two databases, not only GPS.
I added this rescue to protect the main site when the GPS DB is unavailable. In openstreetmap/chef#846 (comment), I set a short connect_timeout for it, so the connection fails fast and the request does not hang. That way if a user opens /traces while the GPS DB is unavailable, we show a warning instead of blocking the process.
I updated the gps_database_unavailable method so it only handles GPS errors; otherwise it re-raises the exception.
| @@ -0,0 +1,6 @@ | |||
| # frozen_string_literal: true | |||
|
|
|||
| class GpsRecord < ApplicationRecord | |||
There was a problem hiding this comment.
We should consider the naming of this abstract class (and the database name) more carefully.
When OSM was created, "GPS" and satellite positioning was more or less interchangeable. However, time has moved on, and the American GPS system is only one of 4 different GNSS systems.
GNSS is, however, another acronym, and not very self-explanatory - nor is it in common use.
Was the decision to call the database "gps" and the model "GpsRecord" deliberate? Or would something involving "trace" be more consistent with e.g. the model naming?
There was a problem hiding this comment.
I agree, GPS is technically only the American system, and "trace" naming is more consistent with the existing Trace model.
I can rename all the required values and files, e.g.:
- GpsRecord → TraceRecord
- DB openstreetmap_gps → openstreetmap_traces
- db/gps_migrate/ → db/traces_migrate/
- db/gps_structure.sql → db/traces_structure.sql
We also need to rename things in chef. openstreetmap/chef#846 already uses gps (props, the database.yml.erb section,etc ), so we should rename there too to keep both sides in sync.
@tomhughes wdyt?
|
|
||
| # Drop the GPS tables (gps_points, gpx_file_tags, gpx_files) from the main database. | ||
| # These tables now live in a separate GPS database (see GpsRecord model). | ||
| class DropGpsTablesFromMainDb < ActiveRecord::Migration[8.1] |
There was a problem hiding this comment.
If anyone runs this migration, they will lose all of their traces in their development database.
It's also something that we couldn't merge as-is, since anyone deploying this PR would lose all the trace data from their production database.
There was a problem hiding this comment.
Yes, that migration would have deleted trace data, if anyone deploying this PR. I removed it.
The plan from the ops call is to skip the data copy. Tom suggested reusing the replicas as the new GPS DB. So posible steps would be:
- Promote one replica, drop the non-GPS tables, point Rails to it as gps.
- Verify in production.
- Open a new PR with drop migration for the old GPS tables.
For deployers outside OSMF (no replicas), we could add a rake task to copy the GPX data, and also document the migration steps so they can follow the, but i think that's for another ticket.
There was a problem hiding this comment.
That's one possible plan that I suggested - no definitive decision has been made about anything.
| def change | ||
| create_enum :gpx_visibility_enum, %w[private public trackable identifiable] | ||
|
|
||
| create_table :gpx_files, :id => :bigint do |t| |
There was a problem hiding this comment.
If we're going to create a new database, then I think this is an opportunity to fix the table naming, to match the model names.
There was a problem hiding this comment.
I agree. The new gps DB will come from a promoted replica, so renaming directly in the database looks like a good option
I listed what needs to be renamed: https://gist.github.com/Rub21/fd7987d1d8dc5d6d0809eff90cf6c2eb
But this is more a question for Tom. on my side, I can go through the list more carefully and test which ones actually need renaming.
There was a problem hiding this comment.
To repeat myself, that is one possible option which I suggested - nothing has been decided yet.
There was a problem hiding this comment.
Left a question on the renaming at https://gist.github.com/Rub21/fd7987d1d8dc5d6d0809eff90cf6c2eb?permalink_comment_id=6179520#gistcomment-6179520
We discussed splitting the DBs on the ops call and talked about moving the traces to the old DB servers. The only other gain from splitting would be that we could have different cluster-level settings and WAL/basebackup It doesn't add anything new to pg_dump since that can already exclude tables. Personally I'm not convinced this is the biggest trace problem. The API results in ORDER BY ... OFFSET ... queries which are horrible for performance. |
|
One thing it does do is reduce the time to dump the main database, which affects planet generation. |
Ya, but we could do that without splitting the DB by adjusting which tables pg_dump exports. |
Is this something that we should have been skipping all along? If so, I think it can go in a separate PR to help making this one smaller, however slightly so. |
From: openstreetmap/operations#1358
This PR moves the three GPS tables (gpx_files, gpx_file_tags, gps_points) out of the main openstreetmap database into a new openstreetmap_gps database.
What the PR does:
Some files were auto-updated by Rails when the tables moved: model annotations, db/structure.sql, and db/gps_structure.sql. These changes were automatic.