Contains tools for fetching, building and deploying fresh opentripplanner data server and opentripplanner images for consumption by Digitransit-maintained OTP version 2.x instances.
The actual data builder application. This is a Node.js application that fetches and processes new GTFS/OSM data. It is built around gulp, and every step of the data building process can also be called directly from the source tree. The only required external dependency is Docker, which is used to launch external commands that perform, for example, data manipulation.
install gulp cli:
yarn global add gulp-cli
install app deps:
yarn
update osm data:
ROUTER_NAME=hsl gulp osm:update
download new gtfs data for waltti:
ROUTER_NAME=waltti gulp gtfs:dl
It is possible to change the behaviour of the data builder by defining environment variables.
- `ROUTER_NAME` defines which router the data is updated for.
- `DOCKER_USER` defines the username for authenticating to Docker Hub.
- `DOCKER_AUTH` defines the password for authenticating to Docker Hub.
- (Optional, default v3 and tag based on date) `DOCKER_TAG` defines the updated Docker tag of the data server images in the remote container registry.
- (Optional, default hsldevcom) `ORG` defines which organization the images belong to in the remote container registry.
- (Optional, default v3) `SEED_TAG` defines which version of the data storage is used for seeding.
- (Optional, default v2) `OTP_TAG` defines which version of OTP is used for testing, building graphs and deploying a new OTP image (postfixed with the router name).
- (Optional, default v3) `TOOLS_TAG` defines which version of the otp-data-tools image is used for testing.
- (Optional, default dev) `BUILDER_TYPE` is used as a postfix to the Slack bot name.
- (Optional) `SLACK_CHANNEL_ID` defines which Slack channel the messages are sent to.
- (Optional) `SLACK_ACCESS_TOKEN` is the bearer token for Slack messaging.
- (Optional, default {}) `EXTRA_SRC` defines GTFS src values that should be overridden, or a completely new src that should be added with a unique id.
  - Example format: `{"FOLI": {"url": "https://data.foli.fi/gtfs/gtfs.zip", "fit": false, "rules": ["router-waltti/gtfs-rules/waltti.rule"]}}`
  - You can remove a src by including `"remove": true`, for example `{"FOLI": {"remove": true}}`
- (Optional, default {}) `EXTRA_UPDATERS` defines router-config.json updater values that should be overridden, or a completely new updater that should be added with a unique id.
  - Example format: `{"turku-alerts": {"type": "real-time-alerts", "frequencySec": 30, "url": "https://foli-beta.nanona.fi/gtfs-rt/reittiopas", "feedId": "FOLI", "fuzzyTripMatching": true}}`
  - You can remove an updater by including `"remove": true`, for example `{"turku-alerts": {"remove": true}}`
- (Optional, default {}) `EXTRA_OSM` can redefine OSM source URLs. For example: `{"hsl": "https://tempserver.com/newhsl.pbf"}`
- (Optional) `VERSION_CHECK` is a comma-separated list of feedIds. For each listed feed, the `feed_version` field of the GTFS data's `feed_info.txt` file is parsed into a date object, and it is checked whether the data has been updated within the last 8 hours. If not, a message is sent to stdout (and to Slack, Monday to Friday only) to inform about the use of "old" data.
- (Optional) `SKIPPED_SITES` defines a comma-separated list of sites from OTPQA tests that should be skipped. Example format: `"turku.digitransit.fi,reittiopas.hsl.fi"`
- (Optional) `DISABLE_BLOB_VALIDATION` should be included if blob (OSM) validation should be disabled temporarily.
- (Optional) `NOSEED` should be included (together with `DISABLE_BLOB_VALIDATION`) when data loading for a new configuration is run for the first time and no seed image is available.
- (Optional) `NOCLEANUP` can be used to disable removal of historical data in storage.
- (Optional) `JAVA_OPTS` defines Java parameters for running OTP.
- (Optional) `SPLIT_BUILD_TYPE` is an enum used to configure whether the build should be split. The values are:
  - `ONLY_BUILD_STREET_GRAPH` to only build the street graph
  - `USE_PREBUILT_STREET_GRAPH` to use the prebuilt street graph to finish a complete graph build
  - All other values default to `NO_SPLIT_BUILD`, which indicates that the build is run as normal
- (Optional) `USE_SEEDED_OSM` skips OSM updating and uses the existing seed version.
- (Optional) `SKIP_OSM_PREPROCESSING` skips OSM preprocessing even if an instruction file is defined.
- (Optional) `SKIP_OTP_TESTS` skips OTP tests.
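The override behaviour described for `EXTRA_SRC` and `EXTRA_UPDATERS` (override an existing entry, add a new one under a unique id, or drop one with `"remove": true`) can be sketched roughly as follows. This is only an illustration, not the builder's actual code: the helper name `applyExtras`, the `{ id, ... }` shape of the base entries, the shallow merge and the `example.invalid` URLs are all assumptions.

```javascript
// Hypothetical sketch: apply EXTRA_SRC-style overrides to a list of sources.
// `applyExtras` and the { id, ... } entry shape are assumptions for illustration.
function applyExtras(sources, extraJson) {
  const extras = JSON.parse(extraJson || "{}"); // documented default is {}
  const byId = new Map(sources.map((src) => [src.id, { ...src }]));

  for (const [id, override] of Object.entries(extras)) {
    if (override.remove === true) {
      // {"FOLI": {"remove": true}} drops the entry entirely.
      byId.delete(id);
    } else {
      // Shallow-merge onto an existing entry, or add a new one under this id.
      byId.set(id, { ...(byId.get(id) || {}), ...override, id });
    }
  }
  return [...byId.values()];
}

// Made-up base configuration used only for this example.
const base = [
  { id: "FOLI", url: "https://example.invalid/old.zip", fit: true },
  { id: "OULU", url: "https://example.invalid/oulu.zip", fit: false },
];
const extra = '{"FOLI": {"url": "https://data.foli.fi/gtfs/gtfs.zip", "fit": false}}';

console.log(applyExtras(base, extra));
console.log(applyExtras(base, '{"OULU": {"remove": true}}'));
```

Whether the real builder merges fields shallowly or replaces the whole entry is not stated here, so treat the merge strategy as a placeholder.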
- `seed` downloads previous data from storage (the env variable `SEED_TAG` can be used to customize which storage location is used) and then extracts OSM, DEM and GTFS data and places it in the `data/seed` and `data/ready` directories. The old data acts as backup in case fetching/validating new data fails. The command uses the zipped contents of the latest build that built a complete graph (from prebuilt data or from a normal build).
- `dem:update` downloads required DEM information, after which data is copied to the `data/downloads/dem` directory.
- `osm:update` downloads required OSM packages from configured locations, tests the files with OTP, and if the tests pass, data is copied to the `data/downloads/osm` directory.
- `gtfs:update`
  - `gtfs:dl` downloads a GTFS package from a configured location and data is copied to the `data/fit` directory. The resulting zip file is named `<feedid>.zip`.
  - `gtfs:fit` runs configured map fits. Copies data to the `data/filter` directory.
  - `gtfs:filter` runs configured filters. Copies data to the `data/id` directory.
  - `gtfs:id` sets the GTFS feed id to `<id>` and copies data to the `data/test/gtfs` directory.
  - `gtfs:test` tests the file with OTP and if the test passes, data is copied to the `data/ready/gtfs` directory.
- `router:buildGraph`
  - `router:copy` copies files needed for the build.
  - `buildOTPGraphTask(config.router)` builds a new graph with all the new data sets (and maybe seeded data sets if there were issues with new data).
- `router:buildStreetOnlyGraph`
  - `router:copyStreetOnlyGraphData` copies files needed for the street only build.
  - `buildOTPStreetOnlyGraphTask(config.router)` builds a new street only graph with all the new data sets (and maybe seeded data sets if there were issues with new data).
- `router:buildWithPrebuiltStreetGraph`
  - `router:copyForPrebuiltStreetGraphDataBuild` copies files needed for the build from prebuilt data.
  - `buildOTPGraphTask(config.router)` builds a new graph from prebuilt street only data with new GTFS data sets (and maybe seeded data sets if there were issues with new data).
- `test.sh` runs the routing quality test bench defined in the `hsldevcom/OTPQA` repository. OTPQA test sets are associated with GTFS packages. If there are quality regressions, a comma-separated list of failed GTFS feed identifiers is written to the local file `failed_feeds.txt`.
- `router:store` stores the new data in storage (which can be a mounted storage volume).
- `router:storeForPrebuiltStreetGraphDataBuild` stores the new data in storage (which can be a mounted storage volume). Also copies the `report` directory from the street only build to the output directory under the name `street-report`.
- `deploy.sh` deploys a new opentripplanner-data-server image with the `DOCKER_TAG` env variable (default `v3`) postfixed with the router name, and pushes the image to Dockerhub.

  Normally, when the application is running as a container, the script `index.js` is run to execute all steps. The end result of the build is a data server image uploaded to Dockerhub.

  Each data server image runs an HTTP server listening on port `8080`. It serves a data bundle required for building a graph, and a prebuilt graph. For example, in the HSL case: http://localhost:8080/router-hsl.zip and `graph-hsl-$OTPVERSION.zip`. The image does not include the data; the data needs to be mounted when running the container.
- `deploy-otp.sh` tags an OTP image using the `OTP_TAG` env variable (default `v2`) postfixed with the router name and pushes the image to Dockerhub. This new OTP image will automatically use the graph and configuration from the storage location where the build's end result was stored.
- `storage:cleanup` keeps the 10 latest versions of the data in storage and removes the rest.
- `storage:cleanupStreetOnlyGraphData` keeps the 10 latest versions of the street only build data in storage and removes the rest.
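The retention rule used by the two cleanup steps (keep the 10 latest versions, remove the rest) can be illustrated with a small sketch. It assumes that version names sort chronologically as plain strings, for example date-based names; the actual storage layout is not described here, and `splitByRetention` is a hypothetical helper.

```javascript
// Hypothetical sketch of "keep the 10 latest versions, remove the rest".
// Assumes version names sort chronologically as strings (e.g. date-based names).
function splitByRetention(versionNames, keepCount = 10) {
  const newestFirst = [...versionNames].sort().reverse();
  return {
    keep: newestFirst.slice(0, keepCount),
    remove: newestFirst.slice(keepCount),
  };
}

// Twelve made-up version names: 2024-01-01 ... 2024-01-12.
const versions = Array.from({ length: 12 }, (_, i) =>
  `2024-01-${String(i + 1).padStart(2, "0")}`
);
const { keep, remove } = splitByRetention(versions);

console.log("keep:", keep.length, "remove:", remove);
```

Copying the input before sorting keeps the caller's list untouched, which matters if the same listing is reused for logging.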
Normal build:
- `seed`
- `dem:update`
- `osm:update`
- `gtfs:update`
  - `gtfs:dl`
  - `gtfs:fit`
  - `gtfs:filter`
  - `gtfs:id`
- `router:buildGraph`
  - `router:copy`
  - `buildOTPGraphTask(config.router)`
- `test.sh`
- `router:store`
- `deploy.sh`
- `deploy-otp.sh`
- `storage:cleanup`

Street only build:
- `seed`
- `dem:update`
- `osm:update`
- `router:buildStreetOnlyGraph`
  - `router:copyStreetOnlyGraphData`
  - `buildOTPStreetOnlyGraphTask(config.router)`
- `router:store`
- `storage:cleanupStreetOnlyGraphData`

Build with prebuilt street graph:
- `seed`
- `gtfs:update`
  - `gtfs:dl`
  - `gtfs:fit`
  - `gtfs:filter`
  - `gtfs:id`
- `router:buildWithPrebuiltStreetGraph`
  - `router:copyForPrebuiltStreetGraphDataBuild`
  - `buildOTPGraphTask(config.router)`
- `test.sh`
- `router:storeForPrebuiltStreetGraphDataBuild`
- `deploy.sh`
- `deploy-otp.sh`
- `storage:cleanup`
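One way to picture how `SPLIT_BUILD_TYPE` could select between the three step orders above is a simple lookup with a fallback. The step names are taken from the lists above, but the mapping of enum values to step orders and the `stepsFor` helper are assumptions made for this sketch; the actual `index.js` may be organised differently.

```javascript
// Hypothetical sketch: choose a top-level step order from SPLIT_BUILD_TYPE.
// The mapping table and `stepsFor` are assumptions, not the builder's real code.
const STEP_ORDERS = {
  NO_SPLIT_BUILD: [
    "seed", "dem:update", "osm:update", "gtfs:update", "router:buildGraph",
    "test.sh", "router:store", "deploy.sh", "deploy-otp.sh", "storage:cleanup",
  ],
  ONLY_BUILD_STREET_GRAPH: [
    "seed", "dem:update", "osm:update", "router:buildStreetOnlyGraph",
    "router:store", "storage:cleanupStreetOnlyGraphData",
  ],
  USE_PREBUILT_STREET_GRAPH: [
    "seed", "gtfs:update", "router:buildWithPrebuiltStreetGraph", "test.sh",
    "router:storeForPrebuiltStreetGraphDataBuild", "deploy.sh", "deploy-otp.sh",
    "storage:cleanup",
  ],
};

function stepsFor(splitBuildType) {
  // Any missing or unrecognised value falls back to a normal build.
  const known = Object.prototype.hasOwnProperty.call(STEP_ORDERS, splitBuildType);
  return STEP_ORDERS[known ? splitBuildType : "NO_SPLIT_BUILD"];
}

console.log(stepsFor(process.env.SPLIT_BUILD_TYPE).join(" -> "));
```

Running it with `SPLIT_BUILD_TYPE=ONLY_BUILD_STREET_GRAPH` set in the environment would print the street only order; with the variable unset it prints the normal build order.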
Contains tools, such as the OneBusAway gtfs filter, for gtfs manipulation. It uses the opentransitsoftwarefoundation/onebusaway-gtfs-transformer-cli as the base image. These tools are packaged inside a docker container and are used during the data build process.
OSM preprocessing is done if a bash script is defined for a specific config and a specific OSM file. See hsl.sh for an example.
When creating OSM preprocessing instructions you should:
- Name the bash file as follows: `<osm_id>.sh`. Valid file names can be e.g. `hsl.sh` or `southFinland.sh`.
- Place the file in the `osm-preprocessing` directory of the config you want to use.
- Make sure that the name of the output file is the same as the input file. The file has to be named `<osm_id>.pbf`, e.g. `hsl.pbf` or `southFinland.pbf`.
- Make sure that you do not reuse input and output filenames in commands:
  - INCORRECT `osmfilter hsl.o5m -o=hsl.o5m ...`
  - CORRECT `osmfilter hsl.o5m -o=hsl2.o5m ...`
- Test the script by running it locally and verifying that the output makes sense.
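The rule about not reusing input and output filenames can be checked mechanically before running a script. The following is a rough sketch of such a check, not part of the builder: `findReusedOutputs` is a hypothetical helper, and the parsing is deliberately naive (it splits on whitespace and only understands the `-o=<file>` form shown above).

```javascript
// Hypothetical lint for OSM preprocessing scripts: flag commands whose
// "-o=<file>" output is also one of the command's non-option arguments.
// Naive parsing: whitespace split, no support for quoting or other flags.
function findReusedOutputs(scriptText) {
  const problems = [];
  scriptText.split("\n").forEach((line, index) => {
    const tokens = line.trim().split(/\s+/);
    const outputs = tokens
      .filter((token) => token.startsWith("-o="))
      .map((token) => token.slice(3));
    const others = tokens.filter((token) => !token.startsWith("-"));
    for (const out of outputs) {
      if (others.includes(out)) {
        problems.push({ line: index + 1, file: out });
      }
    }
  });
  return problems;
}

// The two command shapes from the list above, without their elided arguments.
const script = [
  "osmfilter hsl.o5m -o=hsl.o5m",
  "osmfilter hsl.o5m -o=hsl2.o5m",
].join("\n");

console.log(findReusedOutputs(script));
```

A check like this only catches the simple case; running the script locally and inspecting the output, as the last item above recommends, is still needed.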