Transit model refactor for easier integration with RT-updaters and transit model maintenance #4002

@t2gran

The transit model needs some love and care. The RT-update code is error-prone, inefficient, and almost impossible to maintain. The Raptor integration is unnecessarily complex.

Goals

  • Make a better model that serves RT updates and Raptor routing well, prioritizing these over serving the index API.
  • Improve readability and maintainability
  • Make an encapsulated model with immutable data structures supporting concurrent updates
  • We will keep the original concurrency strategy
  • Avoid "per request" data preparation, which costs time on every request
  • Keep only the latest state from realtime updates
  • Provide a general interface for real-time updates, so updaters/producers (gtfs-rt, siri, etc.) only do mapping. The domain logic should be encapsulated in the transit model.
  • A consistent way of dealing with illegal input data. We should fail early, not late, and illegal data should prevent other data from being used/imported. Data errors should be reported (issue report or log). Dropped entities cascade to other entities referencing them.
  • Error code design #5070: strategy for error handling and error codes. On API calls/RT updates we should be able to list all possible ERRORS that could occur, but we should not pollute the domain models with a global ERROR enum (like this code).
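The "dropped entities cascade" goal can be illustrated with a small sketch. This is not OTP code: `ImportSketch`, `Route`, `Trip`, `Result` and `validate` are hypothetical names, and the issue list is a plain list of strings standing in for whatever issue report the real import would use. It shows one possible reading of the goal, where an invalid entity is dropped and reported, and everything referencing it is dropped as well.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch: drop invalid entities at import time and cascade the drop. */
final class ImportSketch {
    record Route(String id, String agencyId) {}
    record Trip(String id, String routeId) {}
    record Result(List<Route> routes, List<Trip> trips, List<String> issues) {}

    static Result validate(Set<String> agencyIds, List<Route> routes, List<Trip> trips) {
        List<String> issues = new ArrayList<>();

        // Step 1: keep only routes that reference a known agency.
        List<Route> okRoutes = new ArrayList<>();
        Set<String> okRouteIds = new HashSet<>();
        for (Route route : routes) {
            if (agencyIds.contains(route.agencyId())) {
                okRoutes.add(route);
                okRouteIds.add(route.id());
            } else {
                issues.add("Route " + route.id() + " dropped: unknown agency " + route.agencyId());
            }
        }

        // Step 2: cascade. A trip on a dropped (or unknown) route is dropped too.
        List<Trip> okTrips = new ArrayList<>();
        for (Trip trip : trips) {
            if (okRouteIds.contains(trip.routeId())) {
                okTrips.add(trip);
            } else {
                issues.add("Trip " + trip.id() + " dropped: route " + trip.routeId() + " not available");
            }
        }
        return new Result(List.copyOf(okRoutes), List.copyOf(okTrips), List.copyOf(issues));
    }
}
```

Every dropped entity produces exactly one issue entry, so the report explains why data is missing instead of failing later during routing.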

Milestones

This is a very big task, so we decided on adding a few milestones to track progress.

  • M1 - Initial design
  • M2 - Basic model infrastructure refactor. Migrate the most important transit entities to new immutable model and at the same time develop the infrastructure for it. Entities/services: TransitService with Agency, Operator, Route, Trip and all types these entities reference.
    Otp2 transit model - Part 1 [changelog skip] #4150, Otp2 arch test [changelog skip] #4151, Transit model - Part 2 [changelog skip] #4172, Transit model - Part 3 #4176 and Introduce TransitService [changelog skip] #4222
  • M3 - Migrate StopLocationService to new model.
  • M4 - Migrate remaining transit entities to new model.
  • M5 - Create new model needed for Raptor routing. Includes: TripPattern, Timetable, TransitCalendar, RaptorDataProviders. This is done when existing Raptor module tests can be run successfully using the new model. OTP will still not work on the new model.
  • M6 - Make OTP work for the simplest possible routing request. This includes extracting interfaces for the transit model and letting both the new and the old model implement them, so that itinerary mapping, filtering and APIs can work on either model. Another approach is to duplicate some of the code, but this might become a maintenance issue since we will live with this for a while.
  • M7 - Siri RT updater - views/poc. Copy one RT-updater and make it work with the new model, designing the views needed to safely update the transit model. Focus on thread-safety and on making the views as "developer friendly" as possible. Use-case: update departure and/or arrival-times.
  • M8 - Siri RT updater - remaining use-cases.
  • M9 - GTFS RT updater. Copy the existing GTFS updater and port it to the new model.
  • M10 - Migrate constrained transfers to new model
  • M11 - Migrate frequency-based trips to new model
  • M12 - Migrate other updaters if needed
  • M13 - Final cleanup for new model - go through the checklist below and fix remaining model issues.
  • M14 - Acceptance test
  • M15 - Remove old code

Strategy: We will try to refactor/migrate as much as possible, but in some cases this will be too difficult to be cost-effective. So, at some point we will create an OTPFeature flag to switch between the new and the old model. Note! The graph will be different, so the feature flag must be set to the same value for transit graph build and serve. The street graph should not be affected.

Checklist - features we should remember to support

This is the list of features we should use to "functionally" verify our model. We do not need to implement these, but the model should take them into account so we CAN implement them in the future.

  • Allow for multiple new shapes for a changed trip/pattern (from @optionsome). For example, a shape on Trip(Pattern)ForDate overrides the shape on the scheduled pattern.
  • All times should be stored in a server time zone (not the time zone of the agency or feed). We will use a time zero; a timetable can then be stored as an offset plus an array of relative times. The smallest time in the array will be zero (this can be either the first arrival or the first departure time).
  • Remove the constraint that the first departure time and the last arrival time are known.
  • The scheduled data structure should be immutable
  • Create TripSearch takes 7-8% of total time in SpeedTest; we should be able to get rid of that. #4070
  • Prepare for a better heuristic search in Raptor
  • Use transitServiceStart (build-config) as time zero - we can start with this, and see how it looks.
  • Configure the time zone offset in router-config.json. Make all times in the transit model relative to this offset and to transitServiceStart. Use 32-bit operations for calculating time. Store as int, or less (arrays of short or byte). We must use an absolute time zone offset (+/-HH:MM) without DST.
  • Refactor TripSearch out of Raptor: Refactor trip search [changelog skip] #4069
  • Introduce and use Dagger dependency-injection
  • Use the new transit model in module tests instead of a custom test model, possibly with a builder/factory to simplify the creation of test data.
  • Migrate all usage of FeedScopedId to ID, and use a configurable factory/mapper in the import/APIs. This will allow deployments which do not want to use the FeedScopedId to use another strategy.
  • Remove model class headers: /* This file is based on code copied from project OneBusAway, ... */
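The "offset + array of relative times" idea from the checklist can be sketched as follows. This is only an illustration under assumptions: `RelativeTimes` and its methods are hypothetical names, times are 32-bit seconds counted from the shared time zero (for example transitServiceStart), and the compact `short`/`byte` storage mentioned above is left out.

```java
/** Hypothetical sketch: stop times stored as one offset plus times relative to it. */
final class RelativeTimes {
    private final int offsetSeconds; // seconds after time zero for the smallest time
    private final int[] relative;    // relative[i] >= 0; the smallest element is 0

    private RelativeTimes(int offsetSeconds, int[] relative) {
        this.offsetSeconds = offsetSeconds;
        this.relative = relative;
    }

    /** Build from times given in seconds since time zero; the input is not modified. */
    static RelativeTimes of(int[] secondsSinceTimeZero) {
        if (secondsSinceTimeZero.length == 0) {
            throw new IllegalArgumentException("At least one time is required");
        }
        int min = Integer.MAX_VALUE;
        for (int t : secondsSinceTimeZero) {
            min = Math.min(min, t);
        }
        int[] rel = new int[secondsSinceTimeZero.length];
        for (int i = 0; i < rel.length; i++) {
            rel[i] = secondsSinceTimeZero[i] - min;
        }
        return new RelativeTimes(min, rel);
    }

    int offsetSeconds() { return offsetSeconds; }

    int relative(int index) { return relative[index]; }

    /** Time in seconds since time zero, computed with plain int arithmetic. */
    int time(int index) { return offsetSeconds + relative[index]; }

    int size() { return relative.length; }
}
```

Because only the offset differs between trips that follow the same running-time profile, the relative array could in principle be shared between them; whether the real model does this is not stated in the issue.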

Model refactoring

New Transit Model

  • The real-time information will be part of the TransitModel, not a separate Snapshot. So a trip will have both scheduled and real-time data. The real-time updaters will copy parts of the model, change them and post them back to the service, which applies the changes in a thread-safe way.
  • The data needed for routing will be put in a data structure optimized for routing. The entities will not hold this data as fields, but will be views of the data in the optimized model. Fields/attributes not needed for trip routing will be part of the TransitModel objects.
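A minimal sketch of the "copy, change, post back" flow described above, using immutable values and an atomic reference swap. All names (`TripTimesSketch`, `TransitModelSketch`, `withDelay`, `update`) are hypothetical, and the compare-and-set loop is an assumption chosen to keep the example short; the issue says the original concurrency strategy will be kept, and this sketch does not claim to reproduce it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

/** Hypothetical immutable trip times: scheduled departures plus real-time delays. */
final class TripTimesSketch {
    private final int[] scheduled; // seconds, never modified after construction
    private final int[] delays;    // real-time delay per stop, in seconds

    TripTimesSketch(int[] scheduled) {
        this(scheduled.clone(), new int[scheduled.length]);
    }

    private TripTimesSketch(int[] scheduled, int[] delays) {
        this.scheduled = scheduled;
        this.delays = delays;
    }

    /** Copy-on-write: returns a new instance and leaves this one untouched. */
    TripTimesSketch withDelay(int stopIndex, int delaySeconds) {
        int[] copy = delays.clone();
        copy[stopIndex] = delaySeconds;
        return new TripTimesSketch(scheduled, copy);
    }

    int departure(int stopIndex) { return scheduled[stopIndex] + delays[stopIndex]; }
}

/** Hypothetical service holding the current immutable state of the model. */
final class TransitModelSketch {
    private final AtomicReference<Map<String, TripTimesSketch>> current;

    TransitModelSketch(Map<String, TripTimesSketch> initial) {
        this.current = new AtomicReference<>(Map.copyOf(initial));
    }

    /** Readers (for example routing requests) get a consistent, immutable view. */
    Map<String, TripTimesSketch> snapshot() { return current.get(); }

    /** The updater supplies only the change; the swap is done here, atomically. */
    boolean update(String tripId, UnaryOperator<TripTimesSketch> change) {
        while (true) {
            Map<String, TripTimesSketch> old = current.get();
            TripTimesSketch before = old.get(tripId);
            if (before == null) {
                return false; // unknown trip: nothing is changed
            }
            Map<String, TripTimesSketch> next = new HashMap<>(old);
            next.put(tripId, change.apply(before));
            if (current.compareAndSet(old, Map.copyOf(next))) {
                return true;
            }
            // Another update won the race; retry against the newer state.
        }
    }
}
```

A request that took a snapshot before the update keeps seeing the old times, while later requests see the new ones; only the latest real-time state is retained in the service.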

Package structure

This is just a starting point/suggestion; while working with this we will probably refactor several times.

  • transit
    • model
      • lang // Shared/base types used across all packages in the transit/model like FeedScopedId and TransitEntity
      • stop // Stop and station related
      • network // Route, shapes, stop patterns
      • timetable or schedule // Trip, StopTimes
      • realtime^[1] // RealTimeSnapshot
      • support // Supporting functionality that does not fit in one of the above packages (depend on more than one of the packages)

[1] Might be a sub package of timetable

Package dependencies

[Figure: transit package dependency diagram]
This is WIP; we will see if this works out well. It might be that we also want to add a dependency from [network] to [schedule] (Pattern ->* Trip).

Labels: Technical Debt, Roadmap, Stale
