Analysis of High Frequency Positioning (HFP) data from public transport vehicles.
The main goal is to identify and study patterns in vehicle behavior from high-frequency GPS positioning data, focusing on:
- Anomaly Detection in Vehicle Movement
  - Detect sudden deceleration events
  - Identify unusual stopping patterns
  - Analyze speed variations in critical zones
- Spatial Analysis
  - Study vehicle behavior at intersections
  - Analyze movement patterns at roundabouts
  - Identify potential conflict points with other traffic
- Statistical Analysis
  - Develop statistical methods for pattern recognition
  - Perform time series analysis on speed and acceleration data
  - Create baseline models for normal vehicle behavior
- Data Visualization
  - Create interactive map visualizations
  - Generate time series plots of vehicle movements
  - Develop dashboards for pattern analysis
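As one possible starting point for the baseline models mentioned above, the sketch below scores speed observations against a robust baseline (median and median absolute deviation) and flags values that deviate strongly. This is an illustration, not the project's implementation: the function names, the sample speeds, and the 3.5 cutoff are assumptions chosen for the example.

```python
from statistics import median


def robust_z_scores(values: list[float]) -> list[float]:
    """Modified z-scores based on the median and the median absolute deviation (MAD)."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # More than half of the values are identical, so the MAD carries no
        # scale information; this sketch simply reports no deviation.
        return [0.0 for _ in values]
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score
    # for normally distributed data.
    return [0.6745 * (v - med) / mad for v in values]


def flag_outliers(values: list[float], cutoff: float = 3.5) -> list[int]:
    """Return the indices whose modified z-score exceeds the cutoff in magnitude.

    ASSUMPTION: 3.5 is a commonly used cutoff, not a value tuned for HFP data.
    """
    return [i for i, z in enumerate(robust_z_scores(values)) if abs(z) > cutoff]


# Made-up speeds (m/s) for vehicles passing one hypothetical zone.
zone_speeds = [8.0, 8.2, 7.9, 8.1, 8.0, 2.0, 8.3]
flagged = flag_outliers(zone_speeds)
print(flagged)
```

The median and MAD are used instead of the mean and standard deviation so that the anomalies being searched for do not distort the baseline they are compared against.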
The project uses High Frequency Positioning (HFP) data from public transport vehicles. The data is stored in text files captured directly from MQTT topics and is available at:
Download one or more of the files and place them in the data/raw/ directory.
This project uses Python 3.12 and uv for dependency management. Follow these steps to set up your development environment:
Install uv if you haven't already. You can use brew:
brew install uv
or download the install script:
curl -LsSf https://astral.sh/uv/install.sh | sh
Create a new virtual environment and install dependencies:
uv venv venv
source venv/bin/activate # On Windows use: venv\Scripts\activate
uv pip install -e ".[dev,test]"
This will install:
- Core dependencies for data analysis and visualization
- Development tools (ruff, pre-commit, jupyter)
- Testing frameworks (pytest with coverage)
This project uses pre-commit hooks for linting and code quality checks.
Install the hooks in your local repository:
pre-commit install
- Data Collection
  - Download HFP data files from the server
  - Parse MQTT message format
  - Convert to structured data format
- Data Preprocessing
  - Clean GPS coordinates
  - Calculate derived metrics (speed, acceleration)
  - Filter relevant geographic areas
  - Handle missing or erroneous data
- Analysis
  - Time series analysis of vehicle movements
  - Statistical pattern recognition
  - Anomaly detection in speed/acceleration profiles
  - Spatial clustering of events
- Visualization
  - Interactive maps showing vehicle paths
  - Time series plots of movement patterns
  - Statistical distribution visualizations
  - Dashboard for pattern analysis
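The preprocessing and analysis steps above can be illustrated with a minimal sketch that derives speed from consecutive GPS fixes, derives acceleration from consecutive speeds, and flags sudden decelerations. It uses only the standard library to stay self-contained; a pandas-based version would follow the same logic. The `Fix` type, the function names, and the -3.0 m/s² threshold are assumptions made for the example, not values taken from the project.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


@dataclass(frozen=True)
class Fix:
    t: float    # timestamp in seconds
    lat: float  # latitude in degrees
    lon: float  # longitude in degrees


def haversine_m(a: Fix, b: Fix) -> float:
    """Great-circle distance between two fixes in metres."""
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = phi2 - phi1
    dlmb = math.radians(b.lon - a.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def speeds(fixes: list[Fix]) -> list[tuple[float, float]]:
    """Return (time, speed in m/s) per consecutive pair of fixes sorted by time."""
    out = []
    for prev, cur in zip(fixes, fixes[1:]):
        dt = cur.t - prev.t
        if dt <= 0:
            continue  # skip duplicate or out-of-order timestamps
        out.append((cur.t, haversine_m(prev, cur) / dt))
    return out


def sudden_decelerations(
    speed_series: list[tuple[float, float]], threshold: float = -3.0
) -> list[tuple[float, float]]:
    """Return (time, acceleration in m/s^2) where acceleration <= threshold.

    ASSUMPTION: -3.0 m/s^2 is an illustrative threshold, not a calibrated one.
    """
    events = []
    for (t0, v0), (t1, v1) in zip(speed_series, speed_series[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue
        acc = (v1 - v0) / dt
        if acc <= threshold:
            events.append((t1, acc))
    return events


def fix_at(t: float, metres_north: float) -> Fix:
    """Synthetic fix on a fixed meridian, a given distance north of 60 degrees N."""
    return Fix(t=t, lat=60.0 + math.degrees(metres_north / EARTH_RADIUS_M), lon=25.0)


# Synthetic track: about 10 m/s for two seconds, then about 4 m/s.
track = [fix_at(0, 0), fix_at(1, 10), fix_at(2, 20), fix_at(3, 24)]
speed_series = speeds(track)
events = sudden_decelerations(speed_series)
print(events)
```

In this synthetic track the speed drops by roughly 6 m/s within one second, so a single event should be flagged at the last fix. Real GPS data is noisy, so positions would normally be cleaned or smoothed first; otherwise position jitter shows up as spurious acceleration spikes.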
Core Analysis:
- pandas: Data manipulation and analysis
- numpy: Numerical computations
- scipy: Scientific computing and statistics
Geospatial Analysis:
- geopandas: Spatial data operations
- shapely: Geometric operations
- folium: Interactive maps
Visualization:
- matplotlib: Basic plotting
- seaborn: Statistical visualizations
- plotly: Interactive plots
- dash: Web-based dashboards
Machine Learning:
- scikit-learn: Statistical analysis and machine learning
- statsmodels: Time series analysis
- Clone the repository
- Set up the virtual environment as described above
- Download the HFP data using the provided scripts
- Run the analysis notebooks in the notebooks/ directory
├── data/ # Data storage
│ ├── raw/ # Original HFP data files
│ ├── processed/ # Cleaned and preprocessed data
│ └── interim/ # Intermediate processing results
├── notebooks/ # Jupyter notebooks for analysis
├── src/ # Source code
│ ├── data/ # Data processing scripts
│ ├── features/ # Feature calculation
│ └── visualization/ # Visualization tools
├── tests/ # Unit tests
└── reports/ # Generated analysis reports
- Fork the repository
- Create a feature branch
- Make your changes
- Run tests
- Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.