A predictive engine for the Alan Turing Institute project DemoLand
Either clone the repository and install it with pip, or install it with pip directly from git.
See the notebooks in the docs folder.
Top-level overview:
- Generate the required files using the DemoLand pipeline Docker container
- Commit engine-related files to the demoland-engine repository
- Commit app-related files to the demoland-web repository
The rest should be automatic.
Important
The pipeline requires a decent amount of RAM (about 12GB at peak, though we have seen up to 22GB when running on Apple Silicon under emulation) and a fast internet connection (it needs to download 1.3GB of OSM data). Depending on the size of the area and the machine, it can take anywhere from 15 minutes to a few hours.
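As a rough pre-flight check against the memory figures above, the sketch below reads the host's total physical RAM using only the Python standard library. The function name and the 12 GiB threshold constant are illustrative, not part of the pipeline. It relies on os.sysconf, which is unavailable on Windows, and it reports host memory, which may be more than Docker Desktop actually grants to containers.

```python
import os

# Typical peak reported above; emulated runs have needed considerably more.
PEAK_RAM_GIB = 12


def total_ram_gib():
    """Return total physical RAM in GiB, or None if it cannot be determined."""
    try:
        page_size = os.sysconf("SC_PAGE_SIZE")
        page_count = os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        # os.sysconf or these names are unavailable (for example on Windows).
        return None
    if page_size <= 0 or page_count <= 0:
        return None
    return page_size * page_count / 1024**3


ram = total_ram_gib()
if ram is None:
    print("Could not determine total RAM; check it manually.")
elif ram < PEAK_RAM_GIB:
    print(f"Only {ram:.1f} GiB RAM detected; the pipeline may run out of memory.")
else:
    print(f"{ram:.1f} GiB RAM detected.")
```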
The process of generating data for a new area is provided in the form of a Docker container. You will need to provide four pieces of information:
- AREA_NAME: Name that will be visible in the app to the user.
- NAME: Name that is used as a key within the demoland_engine.
- AOI_FILE_PATH: Path to a file relative to the current working directory (avoid ../) containing polygon geometries defining the extent of the area of interest. Can be any file readable by geopandas.read_file.
- GTFS_FILE_PATH: Zip file with GTFS data covering the region of interest. Go to https://data.bus-data.dft.gov.uk/downloads/, register, and download timetable data for your region. Pass the file without any changes.
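Before starting a long pipeline run, it can help to check the two file inputs described above. The following is a minimal sketch using only the standard library; check_inputs is a hypothetical helper, not part of demoland_engine, and it only checks path shape, existence, and that the GTFS file is a ZIP archive. It does not verify that geopandas can read the AOI file.

```python
import zipfile
from pathlib import Path


def check_inputs(aoi_file_path, gtfs_file_path, workdir="."):
    """Return a list of problems with the two input files (empty if none)."""
    problems = []
    workdir = Path(workdir)

    # The AOI path must be relative to the working directory and avoid "../".
    aoi = Path(aoi_file_path)
    if aoi.is_absolute() or ".." in aoi.parts:
        problems.append(
            "AOI_FILE_PATH must be relative to the working directory, without '..'"
        )
    elif not (workdir / aoi).is_file():
        problems.append(f"AOI_FILE_PATH not found: {aoi_file_path}")

    # The GTFS download should be passed unchanged, i.e. still a ZIP archive.
    gtfs = workdir / gtfs_file_path
    if not gtfs.is_file():
        problems.append(f"GTFS_FILE_PATH not found: {gtfs_file_path}")
    elif not zipfile.is_zipfile(gtfs):
        problems.append("GTFS_FILE_PATH is not a ZIP archive")
    return problems


# File names taken from the example in this README.
for problem in check_inputs("geography.geojson", "itm_north_east_gtfs.zip"):
    print(problem)
```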
The best option is to create a folder with the two required files, navigate to the folder and run the container.
The container is not public, so you need to ensure you are logged in to ghcr.io within Docker. Follow the GitHub docs to do that. Note that to be able to pull the container, you need at least read permission for the demoland-engine repository.
Example:
docker pull ghcr.io/urban-analytics-technology-platform/demoland_pipeline:latest
docker run \
--rm \
-ti \
-e AREA_NAME="Tyne and Wear" \
-e NAME="tyne_and_wear_v1" \
-e AOI_FILE_PATH="geography.geojson" \
-e GTFS_FILE_PATH="itm_north_east_gtfs.zip" \
-v "${PWD}:/app/data" \
--user=$UID \
ghcr.io/urban-analytics-technology-platform/demoland_pipeline
The container generates two ZIP files. One shall be used in demoland_engine, and the other shall be used to deploy the app.
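If you prefer to drive the run from a script, the docker run call above can be assembled in Python. This is a sketch only: docker_run_command is a hypothetical helper, the image name and the /app/data mount point are copied from the example, and the --user flag is added only where os.getuid exists.

```python
import os
import shlex

IMAGE = "ghcr.io/urban-analytics-technology-platform/demoland_pipeline"


def docker_run_command(area_name, name, aoi_file_path, gtfs_file_path, data_dir):
    """Build the argument list mirroring the `docker run` example above."""
    env = {
        "AREA_NAME": area_name,
        "NAME": name,
        "AOI_FILE_PATH": aoi_file_path,
        "GTFS_FILE_PATH": gtfs_file_path,
    }
    cmd = ["docker", "run", "--rm", "-ti"]
    for key, value in env.items():
        cmd += ["-e", f"{key}={value}"]
    # Mount the folder holding the input files where the container expects them.
    cmd += ["-v", f"{data_dir}:/app/data"]
    if hasattr(os, "getuid"):
        # Run as the current user so the output files are not owned by root.
        cmd.append(f"--user={os.getuid()}")
    cmd.append(IMAGE)
    return cmd


cmd = docker_run_command(
    "Tyne and Wear",
    "tyne_and_wear_v1",
    "geography.geojson",
    "itm_north_east_gtfs.zip",
    os.getcwd(),
)
# Print a copy-pasteable shell command; nothing is executed here.
print(shlex.join(cmd))
```

Passing the list to subprocess.run(cmd, check=True) would execute it; because of the -ti flags this needs to happen from an interactive terminal.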
The file engine.zip contains files to be added to the data folder of the demoland-engine repository. Use the information in hashes.json to update data.py in the demoland_engine code. The result should look like PR #7.
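After copying the files and updating the hashes, a quick consistency check can catch a mismatched or missing file before deployment. The sketch below rests on two assumptions that this README does not confirm: that hashes.json is a flat mapping of file name to digest, and that the digests are SHA-256 hex strings. Adjust both to the actual file if they differ. The helper names are hypothetical.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mismatched_files(data_dir, hashes_path):
    """Return names that are missing from data_dir or whose digest differs.

    ASSUMPTION: hashes.json looks like {"file name": "sha256 hex digest", ...}.
    """
    expected = json.loads(Path(hashes_path).read_text())
    bad = []
    for name, digest in expected.items():
        path = Path(data_dir) / name
        if not path.is_file() or sha256_of(path) != digest:
            bad.append(name)
    return bad


# Example call (paths are illustrative):
# print(mismatched_files("data", "hashes.json"))
```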
The file app.zip contains all the necessary files to generate the webapp. Note that the new version of demoland_engine, with all the files from engine.zip and the correct hashes, needs to be deployed before the app. See the dedicated documentation on the app deployment. There's no need to pay much attention to the contents of each file, as all of them are autogenerated by the pipeline.
To successfully build the container, navigate to the root of the repository and copy the necessary data files there as well. The required files:
- air_quality_model.joblib
- grid_adjacency_binary.parquet
- grid_complete.parquet
- house_price_model.joblib
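A small check for the files listed above can be run from the repository root before building. This is a sketch; missing_build_files is a hypothetical helper and only tests that each file exists, not that its contents are valid.

```python
from pathlib import Path

# File names as listed in this README.
REQUIRED_FILES = (
    "air_quality_model.joblib",
    "grid_adjacency_binary.parquet",
    "grid_complete.parquet",
    "house_price_model.joblib",
)


def missing_build_files(repo_root="."):
    """Return the required data files that are absent from repo_root."""
    root = Path(repo_root)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


missing = missing_build_files()
if missing:
    print("Missing before docker build:", ", ".join(missing))
else:
    print("All required data files are present.")
```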
All four need to be present in the repository at the time of building as they are copied to the container. You can then build the container and upload it to GHCR as:
docker build -t ghcr.io/urban-analytics-technology-platform/demoland_pipeline -f Dockerfile.pipe .
docker push ghcr.io/urban-analytics-technology-platform/demoland_pipeline:latest
The repository includes a GitHub Action which automatically deploys the main branch to Azure Functions.
For manual deployment steps, see the Developer Notes section in the DemoLand book.