This document describes how the GTFS format is read into the Navitia Transit Model. To improve readability, the specification describes the transformation of a GTFS feed into an NTFS feed (a set of CSV files that mirrors the in-memory Navitia Transit Model).
If, at any point during the conversion, the GTFS does not conform to the GTFS specification, the conversion should stop immediately with an error, unless otherwise specified.
At the end of the conversion, a sanitizing operation is started on the final model. See common NTFS rules for more information.
As explained in common NTFS rules, a prefix is added to all identifiers during the conversion in order to guarantee the uniqueness of object IDs. In the following chapters, identifiers may be prepended with this prefix using the pattern `<prefix>:<object_id>`. The use of this specific pattern is shown explicitly by the value ID in the column Constraint in the tables below.
To reinforce the uniqueness some objects might have a sub-prefix (generated automatically) in addition to their prefix.
The pattern is the following <prefix>:<sub_prefix>:<object_id>.
Objects concerned by this sub-prefix in this connector are: calendars, trips, trip_properties, frequencies, comments, comment_links, geometries, equipments.
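As an illustration of the two identifier patterns above, here is a minimal sketch. The function name `prefixed_id` and its signature are hypothetical and are not part of the converter.

```python
from typing import Optional


def prefixed_id(prefix: str, object_id: str, sub_prefix: Optional[str] = None) -> str:
    """Build '<prefix>:<object_id>' or '<prefix>:<sub_prefix>:<object_id>'."""
    parts = [prefix]
    if sub_prefix:
        # Only some objects (calendars, trips, ...) carry a sub-prefix.
        parts.append(sub_prefix)
    parts.append(object_id)
    return ":".join(parts)
```

For example, `prefixed_id("fr", "stop1")` gives `fr:stop1`, while `prefixed_id("fr", "trip1", "0")` gives `fr:0:trip1`.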
In addition, the NTFS format introduces 2 objects to enable the manipulation of several datasets: contributors and datasets. Those two objects are described in common NTFS rules.
Two parameters can be specified in the configuration of the converter in order to determine if on demand transport (ODT) data should be considered when reading the input GTFS (in particular, when reading the stop_times.txt file):
- a boolean parameter `odt`, set to `false` by default, indicating whether the GTFS should be considered as containing ODT information
- a string `odt_comment` setting the message associated with an ODT comment.
| GTFS object | NTFS object(s) |
|---|---|
| agency | network and company |
| route | line, route, physical_mode, commercial_mode |
| trip | route and trip |
| stop_time | stop_time |
| transfer | transfer |
| shape | geometry |
| frequency | trip and stop_time |
The field "agency_id" may not be provided in the GTFS as it's an optional field.
- If there is only one agency, the "agency_id" is considered to be "1".
- If there are several agencies, the program will raise an exception as it won't be able to choose the right agency for the routes.
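The rule for a missing "agency_id" could be sketched as follows. The function `resolve_agency_ids` is hypothetical, and the sketch assumes that the error is only raised when an identifier is actually missing while several agencies are present.

```python
from typing import List, Optional


def resolve_agency_ids(agency_ids: List[Optional[str]]) -> List[str]:
    """Return the agency IDs to use, applying the default "1" when allowed."""
    if all(agency_ids):
        # Every agency already has an identifier: nothing to do.
        return [a for a in agency_ids if a]
    if len(agency_ids) == 1:
        # A single agency without identifier gets the default "1".
        return ["1"]
    # Several agencies and at least one without identifier:
    # routes could not be attached to the right agency.
    raise ValueError("missing agency_id with several agencies")
```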
If 2 networks with the same ID are specified, the conversion should stop immediately with an error.
| NTFS file | NTFS field | Constraint | GTFS file | GTFS field | Note |
|---|---|---|---|---|---|
| networks.txt | network_id | ID | agency.txt | agency_id | See above when not specified |
| networks.txt | network_name | Required | agency.txt | agency_name | |
| networks.txt | network_url | Optional | agency.txt | agency_url | |
| networks.txt | network_timezone | Optional | agency.txt | agency_timezone | |
| networks.txt | network_lang | Optional | agency.txt | agency_lang | |
| networks.txt | network_phone | Optional | agency.txt | agency_phone | |
"Source" complementary code :
A complementary object_code is added to each network with the following properties:
- `object_type`: the fixed value `network`
- `object_id`: the value of the `network_id` field
- `object_system`: the fixed value `source`
- `object_code`: the unmodified value of `agency_id` (or `1` if the value is not provided as stated above)
If 2 companies with the same ID are specified, the conversion should stop immediately with an error.
| NTFS file | NTFS field | Constraint | GTFS file | GTFS field | Note |
|---|---|---|---|---|---|
| companies.txt | company_id | ID | agency.txt | agency_id | 1 if the value is not provided (same rule as networks) |
| companies.txt | company_name | Required | agency.txt | agency_name | |
| companies.txt | company_url | Optional | agency.txt | agency_url | |
| companies.txt | company_phone | Optional | agency.txt | agency_phone | |
Like the GTFS, the NTFS groups stop_points and stop_areas in one file: stops.txt. If 2 stop_points have the same ID, the conversion should stop immediately with an error. Likewise for the stop_areas.
| NTFS file | NTFS field | Constraint | GTFS file | GTFS field | Note |
|---|---|---|---|---|---|
| stops.txt | stop_id | ID | stops.txt | stop_id | All slashes / will be removed |
| stops.txt | stop_code | Optional | stops.txt | stop_code | Additionally, this GTFS property is stored as an associated code for this stop. See (2) for complementary properties. |
| stops.txt | stop_name | Required | stops.txt | stop_name | |
| stops.txt | stop_lat | Required | stops.txt | stop_lat | |
| stops.txt | stop_lon | Required | stops.txt | stop_lon | |
| stops.txt | location_type | Optional | stops.txt | location_type | The value is set to 0 if the input value is 0 or unspecified or invalid, 1 if the input value is 1, 3 if the input value is 2, 4 if the input value is 3 and 5 if the input value is 4 |
| stops.txt | parent_station | Optional | stops.txt | parent_station | All slashes / are removed (1) |
| stops.txt | stop_timezone | Optional | stops.txt | stop_timezone | |
| stops.txt | fare_zone_id | Optional | stops.txt | zone_id | Only for stop_point (location_type = 0) |
| comments.txt | comment_value | Optional | stops.txt | stop_desc | See (3) for additional properties |
| equipments.txt | wheelchair_boarding | Optional | stops.txt | wheelchair_boarding | If value is not one of 0, 1 or 2, then set to 0. See (4) for detailed info. |
(1) If the parent_station field of a stop_point (location_type = 0 or empty) is missing or empty, then a stop_area should be created, using the following properties :
- `stop_id`: the stop_id of the stop_point, with the following pattern: `Navitia:<stop_id of the stop_point>`
- `stop_name`: the stop_name of the stop_point
- `stop_lat`: the stop_lat of the stop_point
- `stop_lon`: the stop_lon of the stop_point
- `location_type`: fixed value "1" (to specify it's a stop_area)

The `parent_station` of the stop_point should then contain the generated `stop_area.id`.
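A minimal sketch of rule (1), assuming stops are handled as plain dictionaries keyed by their stops.txt column names. The helper `ensure_stop_area` is hypothetical.

```python
from typing import Dict, Optional


def ensure_stop_area(stop_point: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Create a stop_area for a stop_point without parent_station.

    Returns the generated stop_area, or None when a parent already exists.
    The stop_point is updated in place to reference the generated stop_area.
    """
    if stop_point.get("parent_station"):
        return None
    stop_area = {
        "stop_id": f"Navitia:{stop_point['stop_id']}",
        "stop_name": stop_point["stop_name"],
        "stop_lat": stop_point["stop_lat"],
        "stop_lon": stop_point["stop_lon"],
        "location_type": "1",  # marks the object as a stop_area
    }
    stop_point["parent_station"] = stop_area["stop_id"]
    return stop_area
```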
(2) The stop_code field is added as a complementary object_code with the following properties:
- `object_type`: `stop_point` or `stop_area` according to the `location_type` value
- `object_id`: NTFS `stop_id`
- `object_system`: fixed value `gtfs_stop_code`
- `object_code`: value of the `stop_code` property

The `gtfs_stop_code` complementary `object_code` is kept here for backward compatibility reasons. It will be removed in the future.
(3) The comment object is a complex type with additional properties :
- `comment_id`: specify an identifier with the pattern `stop:<stop_id of GTFS>`
- `comment_type`: specify the fixed value "information"
(4) The equipment object is a complex type with additional properties :
- `equipment_id`: should be generated by the reader.
- `wheelchair_boarding`: possible values are the same in both GTFS and NTFS.

Be careful to only create necessary equipments and avoid duplicates.
"Source" complementary code :
A complementary object_code is added to each stop with the following properties:
- `object_type`: the fixed value `stop_point` or `stop_area` (depending on the object)
- `object_id`: the value of the `stop_id` field
- `object_system`: the fixed value `source`
- `object_code`: the unmodified GTFS value of `stop_id`
The standard values of the route_type field are directly mapped to the NTFS modes. Extended GTFS route types are read by category, each category being mapped to its most prominent mode. The priority is used to choose the commercial mode when creating a Line that groups routes with different route_types. This prioritization follows the NeTEx specification, chapter 6.2.3 (it is also indicated in the NTFS specification).
| GTFS route_type | NTFS physical_mode ID (1) | NTFS commercial_mode ID (2) | NTFS commercial_mode name | Priority |
|---|---|---|---|---|
| 0, 9XX | Tramway | Tramway | Tramway | 3 |
| 1, 4XX, 5XX, 6XX | Metro | Metro | Metro | 4 |
| 2, 1XX, 3XX | Train | Train | Train | 2 |
| 3, 7XX, 8XX | Bus | Bus | Bus | 8 |
| 4, 10XX, 12XX | Ferry | Ferry | Ferry | 1 |
| 5 | Funicular | CableCar | Cable car | 6 |
| 6, 13XX | SuspendedCableCar | SuspendedCableCar | Suspended cable car | 7 |
| 7, 14XX | Funicular | Funicular | Funicular | 5 |
| 2XX | Coach | Coach | Coach | 8 |
| 11XX | Air | Air | Airplane | 0 |
| 15XX | Taxi | Taxi | Taxi | 8 |
| 16XX, 17XX | Bus | UnknownMode | Unknown mode | 8 |
(1) The physical_mode ID is a mapping with a specific value as described in the NTFS format specification. This value must not be prefixed.

(2) The commercial_mode IDs are standardized when converting from GTFS. This value must not be prefixed.
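The mapping table can be expressed as a lookup, sketched below. The values are copied from the table above; the names `STANDARD`, `EXTENDED` and `modes_for_route_type` are hypothetical, and raising an error for a route_type that the table does not list is an assumption, since this document does not specify that case.

```python
from typing import Dict, Tuple

# (physical_mode ID, commercial_mode ID, commercial_mode name, priority)
Mode = Tuple[str, str, str, int]

TRAM: Mode = ("Tramway", "Tramway", "Tramway", 3)
METRO: Mode = ("Metro", "Metro", "Metro", 4)
TRAIN: Mode = ("Train", "Train", "Train", 2)
BUS: Mode = ("Bus", "Bus", "Bus", 8)
FERRY: Mode = ("Ferry", "Ferry", "Ferry", 1)
CABLE_CAR: Mode = ("Funicular", "CableCar", "Cable car", 6)
SUSPENDED: Mode = ("SuspendedCableCar", "SuspendedCableCar", "Suspended cable car", 7)
FUNICULAR: Mode = ("Funicular", "Funicular", "Funicular", 5)
COACH: Mode = ("Coach", "Coach", "Coach", 8)
AIR: Mode = ("Air", "Air", "Airplane", 0)
TAXI: Mode = ("Taxi", "Taxi", "Taxi", 8)
UNKNOWN: Mode = ("Bus", "UnknownMode", "Unknown mode", 8)

# Standard GTFS route_type values.
STANDARD: Dict[int, Mode] = {
    0: TRAM, 1: METRO, 2: TRAIN, 3: BUS,
    4: FERRY, 5: CABLE_CAR, 6: SUSPENDED, 7: FUNICULAR,
}

# Extended route types, keyed by category (route_type // 100): 9 stands for 9XX.
EXTENDED: Dict[int, Mode] = {
    1: TRAIN, 2: COACH, 3: TRAIN, 4: METRO, 5: METRO, 6: METRO,
    7: BUS, 8: BUS, 9: TRAM, 10: FERRY, 11: AIR, 12: FERRY,
    13: SUSPENDED, 14: FUNICULAR, 15: TAXI, 16: UNKNOWN, 17: UNKNOWN,
}


def modes_for_route_type(route_type: int) -> Mode:
    """Return the NTFS modes and priority for a GTFS route_type."""
    if route_type in STANDARD:
        return STANDARD[route_type]
    category = route_type // 100
    if category in EXTENDED:
        return EXTENDED[category]
    # Assumption: unlisted values are rejected.
    raise ValueError(f"unsupported route_type {route_type}")
```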
All physical_mode are enhanced with CO2 emission and fallback modes, following
the documentation in common NTFS rules.
A Route is created for each direction of existing trips. If 2 routes with the same ID are specified, the conversion should stop immediately with an error. Warning : If the GTFS route has no trips, the Navitia Route should NOT be created and a warning should be logged.
| NTFS file | NTFS field | Constraint | GTFS file | GTFS field | Note |
|---|---|---|---|---|---|
| routes.txt | route_id | ID | routes.txt | route_id | append a _R suffix for the Route grouping trips with direction_id = 1 (no suffix for 0 or undefined direction_id) |
| routes.txt | route_name | Required | routes.txt | route_long_name | (1) |
| routes.txt | direction_type | Optional | | | (2) |
| routes.txt | line_id | Required | | | corresponding line.id (see Line construction below) |
| routes.txt | destination_id | Optional | | | This field contains the stop_area.id of the most frequent destination of the contained trips (i.e. the parent_station of the most frequent last stop of trips) |
| comments.txt | comment_value | Optional | routes.txt | route_desc | See (3) for additional properties |
(1) if only one route is created (only one direction in included trips), use
route_long_name or, if empty, use route_short_name. In case of multiple
routes created (multiple directions in included trips), see common NTFS rules
for generating the route_name.
(2) the field direction_type contains backward when grouping GTFS Trips with direction_id = 1, forward otherwise
(3) The comment object is a complex type with additional properties :
- `comment_id`: specify an identifier with the pattern `<prefix>:route:<route_id of GTFS>`
- `comment_type`: specify the fixed value "Information"
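The `_R` suffix rule and footnote (2) can be sketched together. The function `ntfs_route` is hypothetical and assumes that direction_id is read as a raw string from trips.txt.

```python
from typing import Optional, Tuple


def ntfs_route(gtfs_route_id: str, direction_id: Optional[str]) -> Tuple[str, str]:
    """Return (NTFS route_id, direction_type) for a group of GTFS trips."""
    if direction_id == "1":
        return gtfs_route_id + "_R", "backward"
    # direction_id equal to 0 or undefined: no suffix, forward direction.
    return gtfs_route_id, "forward"
```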
A Navitia Line is created to group one or several Navitia Routes when they are
created with the same gtfs agency_id and the same route_short_name (or
route_long_name if the latter is empty). If 2 lines with the same ID are
specified, the conversion should stop immediately with an error.
| NTFS file | NTFS field | Constraint | GTFS file | GTFS field | Note |
|---|---|---|---|---|---|
| lines.txt | network_id | Required | | | This field should contain the network.id corresponding to the agency_id of the routes; if no agency_id is specified in the route, use the ID of the unique network; if no network or multiple networks are available, the conversion should stop immediately with an error |
| lines.txt | line_id | ID | routes.txt | route_id | Use the smallest route_id of the grouped gtfs Route |
| lines.txt | line_code | Optional | routes.txt | route_short_name | |
| lines.txt | line_name | Required | routes.txt | | The Navitia route_name of the Route with the smallest route_id (as a string) is used. |
| lines.txt | line_color | Optional | routes.txt | route_color | if several values are available, a warning is logged and the color of the smallest route_id is used; if color format is incorrect, the value is dropped |
| lines.txt | line_text_color | Optional | routes.txt | route_text_color | same as line_color; if color format is incorrect, the value is dropped |
| lines.txt | line_sort_order | Optional | routes.txt | route_sort_order | |
| lines.txt | commercial_mode_id | Required | routes.txt | route_type | See "Mapping of route_type with modes" chapter (1). |
(1) When several GTFS Routes with different route_types are grouped together, the commercial_mode_id with the smallest priority should be used (as specified in chapter "Mapping of route_type with modes").
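A minimal sketch of how a Line could pick its identifier and commercial mode from the grouped GTFS Routes. The `build_line` helper and the dictionary layout are hypothetical; the priority values are assumed to come from the "Mapping of route_type with modes" table.

```python
from typing import Dict, List


def build_line(routes: List[Dict]) -> Dict:
    """routes: dicts with 'route_id', 'commercial_mode_id' and 'priority'."""
    # The smallest route_id, compared as a string, identifies the line.
    line_id = min(r["route_id"] for r in routes)
    # The commercial mode with the smallest priority value wins.
    best = min(routes, key=lambda r: r["priority"])
    return {"line_id": line_id, "commercial_mode_id": best["commercial_mode_id"]}
```

Note that comparing identifiers as strings means that `"10"` sorts before `"9"`.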
GTFS services are transformed into lists of active dates as if using a single NTFS
file calendar_dates.txt. The resulting NTFS files might be different following an
optimization operation applied at the end of the conversion, but the result should be
functionally identical.
- In case both files `calendar.txt` and `calendar_dates.txt` are present in the input dataset, the days of the week of the specified services within the date range [`start_date` - `end_date`] are transformed into explicit active service dates, taking into account the dates when service exceptions occur. Note that the generated (`service_id`, `date`) pairs must be unique.
- In case the file `calendar.txt` is empty or not present in the input dataset, the active service dates are loaded as is.
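The expansion of one service into explicit dates can be sketched as follows. The function `active_dates` and its parameters are hypothetical; the exception values follow the GTFS convention for `exception_type` (1 adds a date, 2 removes it).

```python
from datetime import date, timedelta
from typing import Dict, List, Set


def active_dates(
    start_date: date,
    end_date: date,
    weekdays: Set[int],
    exceptions: Dict[date, int],
) -> List[date]:
    """Expand a calendar.txt entry and its calendar_dates.txt exceptions.

    weekdays uses Python's convention: 0 is Monday, 6 is Sunday.
    """
    dates: Set[date] = set()  # a set keeps (service_id, date) pairs unique
    day = start_date
    while day <= end_date:
        if day.weekday() in weekdays:
            dates.add(day)
        day += timedelta(days=1)
    for exception_date, exception_type in exceptions.items():
        if exception_type == 1:
            dates.add(exception_date)
        elif exception_type == 2:
            dates.discard(exception_date)
    return sorted(dates)
```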
If 2 trips with the same ID are specified, the conversion should stop immediately with an error.
| NTFS file | NTFS field | Constraint | GTFS file | GTFS field | Note |
|---|---|---|---|---|---|
| trips.txt | route_id | Required | trips.txt | route_id | cf. NTFS route_id definition above to specify the proper reference. |
| trips.txt | service_id | Required | trips.txt | service_id | |
| trips.txt | trip_id | Required | trips.txt | trip_id | |
| trips.txt | trip_headsign | Optional | trips.txt | trip_short_name, or if empty trip_headsign | |
| trips.txt | block_id | Optional | trips.txt | block_id | |
| trips.txt | company_id | Required | routes.txt | agency_id | The company corresponding to the agency_id of the trip's route_id |
| trips.txt | physical_mode_id | Required | | | Use the route_type. See "Mapping of route_type with modes" chapter |
| trips.txt | trip_property_id | Optional | trips.txt | (1) | |
| trips.txt | dataset_id | Required | | | The dataset_id provided (cf. gtfs2ntfs.md) |
| trips.txt | geometry_id | Optional | trips.txt | shape_id | All slashes / are removed |
(1) The trip_property object is a complex type with additional properties :
- `trip_property_id`: should be generated by the reader.
- `wheelchair_accessible`: possible values are the same in both GTFS and NTFS; if value is not one of `0`, `1` or `2`, then set to `0`.
- `bike_accepted`: corresponding to the GTFS `bikes_allowed` property. Possible values are the same in both GTFS and NTFS; if value is not one of `0`, `1` or `2`, then set to `0`.

Be careful to only create necessary `trip_properties` and avoid duplicates.
"Source" complementary code :
A complementary object_code is added to each vehicle journey with the following properties:
- `object_type`: the fixed value `trip`
- `object_id`: the value of the `trip_id` field
- `object_system`: the fixed value `source`
- `object_code`: the unmodified GTFS value of `trip_id`
| NTFS file | NTFS field | Constraint | GTFS file | GTFS field | Note |
|---|---|---|---|---|---|
| stop_times.txt | trip_id | Required | stop_times.txt | trip_id | All slashes / are removed; if the corresponding trip doesn't exist, the conversion should stop immediately with an error |
| stop_times.txt | arrival_time | Optional | stop_times.txt | arrival_time | If not specified, see (1) |
| stop_times.txt | departure_time | Optional | stop_times.txt | departure_time | If not specified, see (1) |
| stop_times.txt | stop_id | Required | stop_times.txt | stop_id | If the corresponding stop doesn't exist, the conversion should stop immediately with an error |
| stop_times.txt | stop_sequence | Required | stop_times.txt | stop_sequence | |
| stop_times.txt | stop_headsign | Optional | stop_times.txt | stop_headsign | |
| stop_times.txt | pickup_type | Optional | stop_times.txt | pickup_type | If invalid unsigned integer, default to 0. If 2, see (3) for the generation of comments. |
| stop_times.txt | drop_off_type | Optional | stop_times.txt | drop_off_type | If invalid unsigned integer, default to 0. If 2, see (3) for the generation of comments. |
| stop_times.txt | stop_time_precision | Optional | stop_times.txt | timepoint | GTFS and NTFS values are inverted when no ODT information is considered. See (2). If invalid unsigned integer, default to 1 |
(1) GTFS arrival_time and departure_time should contain values.
- if both of them are empty:
  - if the stop_time is the first or the last of the trip, an error is returned
  - if not, the time should be interpolated (see below).
- if one of them is empty, a warning should be logged and the value of the other field should be copied to the empty one.
Interpolation: if a stop_time needs to be interpolated:
- collect the nearest preceding stop_time and the nearest following stop_time containing a valid time value
- apply a simple distribution for all the intermediate stop_times

For example:

| GTFS passing time | NTFS interpolated time |
|---|---|
| 9:00 | 9:00 |
| - | 9:30 |
| - | 10:00 |
| 10:30 | 10:30 |
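The example above can be reproduced with the sketch below. It assumes that "simple distribution" means an even spacing between the two surrounding known times, and that times are handled as seconds since midnight; the `interpolate` function is hypothetical.

```python
from typing import List, Optional


def interpolate(times: List[Optional[int]]) -> List[Optional[int]]:
    """Fill missing passing times (None) by even spacing between known ones."""
    if not times or times[0] is None or times[-1] is None:
        # The first and last stop_times of a trip must carry a time.
        raise ValueError("first and last stop_times need a time")
    result = list(times)
    previous = 0  # index of the nearest preceding known time
    for index in range(1, len(times)):
        if times[index] is None:
            continue
        gap = index - previous
        step = (times[index] - times[previous]) / gap
        for offset in range(1, gap):
            result[previous + offset] = round(times[previous] + offset * step)
        previous = index
    return result
```

With 9:00 expressed as 32400 and 10:30 as 37800, `interpolate([32400, None, None, 37800])` fills in 34200 (9:30) and 36000 (10:00).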
(2) Depending on the value of the parameter `odt`, the GTFS `timepoint` conversion rules for NTFS `stop_time_precision` are:
- if `odt` is set to `false` or empty:
  - if `timepoint` is unspecified => `stop_time_precision` equals 0
  - if `timepoint` equals 1 => `stop_time_precision` equals 0
  - if `timepoint` equals 0 => `stop_time_precision` equals 1
- if `odt` is set to `true`:
  - if `timepoint` is unspecified => `stop_time_precision` equals 0
  - if `timepoint` equals 1 => `stop_time_precision` equals 0
  - if `timepoint` equals 0 => `stop_time_precision` equals 2
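The conversion rules of (2) reduce to a small function, sketched here. The name `stop_time_precision` used as a function is hypothetical, and the handling of invalid `timepoint` values is deliberately left out by rejecting them.

```python
from typing import Optional


def stop_time_precision(timepoint: Optional[int], odt: bool = False) -> int:
    """Map a GTFS timepoint (None, 0 or 1) to an NTFS stop_time_precision."""
    if timepoint is None or timepoint == 1:
        return 0
    if timepoint == 0:
        # Approximate times become ODT times when odt is enabled.
        return 2 if odt else 1
    raise ValueError(f"unexpected timepoint {timepoint}")
```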
(3) A comment associated to the stop_time is created in the files comments.txt and comment_links.txt as follows:
| NTFS file | NTFS field | Constraint | Value/Note |
|---|---|---|---|
| comments.txt | comment_id | Required | The value of stop_time_id is used as the concatenation of trip_id and stop_sequence separated by -. Note that this field is prefixed as explained in common NTFS rules. |
| comments.txt | comment_type | Optional | on_demand_transport |
| comments.txt | comment_name | Required | The message set for the parameter odt_comment. |
| comment_links.txt | object_id | Required | The value of stop_time_id is used as the concatenation of trip_id and stop_sequence separated by -. Note that this field is prefixed as explained in common NTFS rules. |
| comment_links.txt | object_type | Required | stop_time |
| comment_links.txt | comment_id | Required | The value of stop_time_id is used as the concatenation of trip_id and stop_sequence separated by -. Note that, as this field references the comment in file comments.txt, it should be prefixed as explained in common NTFS rules. |
- If 2 transfers with the same ID are specified, the conversion should stop immediately with an error
- If a line of the file does not conform to the specification, then the line is ignored
| NTFS file | NTFS field | Constraint | GTFS file | GTFS field | Note |
|---|---|---|---|---|---|
| transfers.txt | from_stop_id | Required | transfers.txt | from_stop_id | All slashes / are removed; if the stop_id doesn't exist in stops.txt, the transfer is ignored |
| transfers.txt | to_stop_id | Required | transfers.txt | to_stop_id | All slashes / are removed; if the stop_id doesn't exist in stops.txt, the transfer is ignored |
| transfers.txt | min_transfer_time | Optional | transfers.txt | see (1) | |
| transfers.txt | real_min_transfer_time | Optional | transfers.txt | see (1) | |
| transfers.txt | equipment_id | Optional | transfers.txt | | |
(1) NTFS min_transfer_time and real_min_transfer_time are calculated as
follows. Note that if the GTFS transfer_type value is not one of 0, 1, 2 or 3, then it is set to 0.
| GTFS transfer_type | NTFS min_transfer_time | NTFS real_min_transfer_time | Note |
|---|---|---|---|
| 0 | time between 2 stop points | time between 2 stop points + 2 minutes | The time is calculated with the distance as the crow flies and a walking speed of 0.785 m/s. Speed value is lowered because effective transit is not straight |
| 1 | 0 | 0 | |
| 2 | GTFS min_transfer_time | GTFS min_transfer_time | Log a warning message if the min_transfer_time is empty |
| 3 | 86400 | 86400 | |
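The table can be read as the following sketch. The function `transfer_times` is hypothetical; it takes the crow-fly distance in meters as an input rather than computing it, rounds to whole seconds (an assumption), and returns `None` values when transfer_type 2 has no min_transfer_time (also an assumption, the document only requires a warning).

```python
import logging
from typing import Optional, Tuple

WALKING_SPEED = 0.785  # m/s, as given in the table above


def transfer_times(
    transfer_type: int,
    distance_m: float,
    gtfs_min_transfer_time: Optional[int] = None,
) -> Tuple[Optional[int], Optional[int]]:
    """Return (min_transfer_time, real_min_transfer_time) in seconds."""
    if transfer_type not in (0, 1, 2, 3):
        transfer_type = 0  # invalid values fall back to 0
    if transfer_type == 0:
        walking = round(distance_m / WALKING_SPEED)
        return walking, walking + 120  # 2 extra minutes
    if transfer_type == 1:
        return 0, 0
    if transfer_type == 2:
        if gtfs_min_transfer_time is None:
            logging.warning("transfer_type 2 without min_transfer_time")
        return gtfs_min_transfer_time, gtfs_min_transfer_time
    return 86400, 86400  # transfer_type 3
```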
| NTFS file | NTFS field | Constraint | GTFS file | GTFS field | Note |
|---|---|---|---|---|---|
| geometries.txt | geometry_id | ID | shapes.txt | shape_id | All slashes / are removed |
| geometries.txt | geometry_wkt | Required | shapes.txt | shape_pt_lat, shape_pt_lon, shape_pt_sequence | A WKT LINESTRING geometry is created from the 3 input fields. |
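A minimal sketch of the WKT construction, assuming the points of one shape are given as `(shape_pt_sequence, shape_pt_lat, shape_pt_lon)` tuples. The function `shape_to_wkt` is hypothetical; the coordinates are written as longitude then latitude, following the usual x y order of WKT.

```python
from typing import Iterable, Tuple


def shape_to_wkt(points: Iterable[Tuple[int, float, float]]) -> str:
    """Build a WKT LINESTRING from (sequence, lat, lon) tuples."""
    # Points are ordered by shape_pt_sequence before being joined.
    ordered = sorted(points, key=lambda point: point[0])
    coordinates = ", ".join(f"{lon} {lat}" for _, lat, lon in ordered)
    return f"LINESTRING({coordinates})"
```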
Frequencies are transformed into explicit passing times by creating new trips that operate on regular times within the specified period. For each line of the GTFS frequencies.txt file, the referenced trip and its stop_times are used as a sample to create the new trips whose stop_times are calculated based on the given headway.
A new trip is created, departing from the first stop every headway_secs seconds within the time period between start_time and end_time. Stop times of the referenced trip are used to calculate the time interval between two stop departures.
The departure time at the first stop of the last trip should not be later than the end_time value. In case both values for start_time and end_time are equal or end_time is smaller than start_time, the frequency is ignored (no new trip is created).
Note that the referenced trip (and its stop_times) is only used as a sample and is deleted in the resulting data. In case the referenced trip and/or its associated stop_times do not exist, the frequency is ignored (no new trip is created).
The identifier for each new trip is generated using the following pattern: `<trip_id>:<auto-incremented integer>`. Each new trip keeps the rest of the attributes of the sample trip. That is, all new trips are assigned to the same route as the route of the sample trip, have the same service_id, etc.
A complementary object_code is added to each new trip with the following properties:
- `object_type`: the fixed value `trip`
- `object_id`: the value of the `trip_id` field
- `object_system`: the fixed value `source`
- `object_code`: the unmodified initial GTFS value of `trip_id`
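The expansion of one frequencies.txt line could be sketched as follows. The function `expand_frequency` is hypothetical, all times are seconds since midnight, and starting the auto-incremented integer at 0 is an assumption.

```python
from typing import List, Tuple


def expand_frequency(
    trip_id: str,
    start_time: int,
    end_time: int,
    headway_secs: int,
    sample_departures: List[int],
) -> List[Tuple[str, List[int]]]:
    """Return (new trip_id, departure times) for each generated trip.

    sample_departures holds the departure times of the sample trip,
    in stop_sequence order.
    """
    if headway_secs <= 0:
        raise ValueError("headway_secs must be positive")
    if end_time <= start_time or not sample_departures:
        return []  # the frequency is ignored
    # Time intervals between the first stop and each following stop.
    offsets = [t - sample_departures[0] for t in sample_departures]
    trips = []
    first_departure = start_time
    counter = 0
    while first_departure <= end_time:
        times = [first_departure + offset for offset in offsets]
        trips.append((f"{trip_id}:{counter}", times))
        first_departure += headway_secs
        counter += 1
    return trips
```

For instance, a frequency from 3600 to 4200 with a headway of 300 seconds produces three trips, departing from the first stop at 3600, 3900 and 4200.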