ForexFactoryScrapper is a Python-based web scraping tool designed to extract financial event data from the ForexFactory website. This project provides a simple and effective way to scrape calendar events, forecast data, actual values, and other relevant information for forex trading analysis.
- Scrape calendar events, including date, time, currency, event name, forecast, actual, and previous values.
- Export or process extracted data in structured formats suitable for analysis.
- Simple and customizable scraping logic using BeautifulSoup.
- Includes examples for extracting data and creating basic reports.
- Python 3.9 or newer
- See requirements.txt for dependency versions used during development and testing.
- Create and activate a virtual environment:
python -m venv .venv
source .venv/bin/activate
- Install dependencies:
pip install -r requirements.txt
Start the application locally:
python app.py
By default this starts the app on 0.0.0.0:5000. Example endpoints you can call:
- GET /api/hello
- GET /api/health
- GET /api/forex/daily?day=1&month=1&year=2020
(Adjust host/port or endpoint parameters as needed in main.py.)
Below are simple example requests you can use to interact with the running application. Replace localhost:5000 with the host/port where your app is listening if different.
Curl:
curl -sS http://localhost:5000/api/hello
Expected JSON response (HTTP 200):
{
"message": "Hello, World!",
"status": "success"
}
Curl:
curl -sS http://localhost:5000/api/health
Expected JSON response (HTTP 200):
{
"status": "ok"
}
- Missing parameters (HTTP 400):
curl -sS http://localhost:5000/api/forex/daily
Response body:
{ "error": "Missing one or more required parameters: day, month, year" }
- Invalid (non-integer) parameters (HTTP 400):
curl -sS "http://localhost:5000/api/forex/daily?day=aa&month=bb&year=cc"
Response body:
{ "error": "Parameters day, month and year must be integers" }
- Out-of-range parameters (HTTP 400):
curl -sS "http://localhost:5000/api/forex/daily?day=99&month=99&year=3000"
Response body:
{ "error": "Parameters out of reasonable range" }
Curl (example):
curl -sS "http://localhost:5000/api/forex/daily?day=1&month=1&year=2020"
Expected JSON response (HTTP 200): a pagination wrapper containing metadata and a list of records. Example response format:
{
"total": 1,
"offset": 0,
"limit": null,
"results": [
{
"Time": "01/01/2020 00:00",
"Currency": "USD",
"Event": "NFP",
"Forecast": "100k",
"Actual": "120k",
"Previous": "90k"
}
]
}
Note: The total, offset, and limit fields in the response wrapper provide pagination metadata. The results field contains the list of records.
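As a minimal sketch of processing the wrapper in a structured format, the snippet below writes the results list to CSV text using only the standard library. The helper name results_to_csv is hypothetical and not part of the project, and the column names are copied from the example response above; the live scraper may return different fields.

```python
import csv
import io

# Column names copied from the example response; the live scraper may differ.
FIELDS = ["Time", "Currency", "Event", "Forecast", "Actual", "Previous"]


def results_to_csv(payload: dict) -> str:
    """Render the `results` list of a pagination wrapper as CSV text.

    Hypothetical helper for illustration only.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=FIELDS, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for record in payload.get("results", []):
        # Keys missing from a record are written as empty cells.
        writer.writerow(record)
    return buffer.getvalue()


sample = {
    "total": 1,
    "offset": 0,
    "limit": None,
    "results": [
        {
            "Time": "01/01/2020 00:00",
            "Currency": "USD",
            "Event": "NFP",
            "Forecast": "100k",
            "Actual": "120k",
            "Previous": "90k",
        }
    ],
}
print(results_to_csv(sample))
```

Unknown keys are ignored rather than raising, which keeps the export tolerant of extra fields in a record.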
Python requests example:
import requests
resp = requests.get(
'http://localhost:5000/api/forex/daily',
params={'day': 1, 'month': 1, 'year': 2020},
)
print(resp.status_code)
print(resp.json())
This project added optional paging support to the /api/forex/daily endpoint via two query parameters: limit and offset.
- offset (optional): integer >= 0, default 0. Skip this many records from the start.
- limit (optional): integer >= 0, default is unlimited. Return at most this many records after applying the offset.
Behavior and validation:
- Both limit and offset must be integers. Non-integer values return HTTP 400.
- Negative values return HTTP 400.
- If offset is greater than or equal to the number of available records, the endpoint returns an empty list and HTTP 200.
- limit=0 returns an empty list (valid request).
- If the scraper returns a non-list structure, paging is not applied and the raw response is returned.
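The rules above can be sketched as a small pure function. This is an illustration under stated assumptions, not the project's actual code: the names parse_non_negative and paginate are hypothetical, the 400 error text is invented for the example, and validating before the non-list check is an assumed ordering.

```python
from typing import Any, Optional, Tuple


def parse_non_negative(raw: Optional[str], default: Optional[int]) -> Optional[int]:
    """Parse a query-string value as an integer >= 0, or return the default when absent."""
    if raw is None:
        return default
    value = int(raw)  # raises ValueError for non-integer input such as "abc"
    if value < 0:
        raise ValueError("negative value")
    return value


def paginate(data: Any, raw_offset: Optional[str], raw_limit: Optional[str]) -> Tuple[Any, int]:
    """Return (body, http_status) following the paging rules listed above."""
    try:
        offset = parse_non_negative(raw_offset, 0)
        limit = parse_non_negative(raw_limit, None)  # None means unlimited
    except ValueError:
        # Error wording is an assumption; the README only states the status code.
        return {"error": "limit and offset must be non-negative integers"}, 400
    if not isinstance(data, list):
        # Non-list scraper output is passed through without paging.
        return data, 200
    end = None if limit is None else offset + limit
    page = data[offset:end]  # slicing past the end yields an empty list
    return {"total": len(data), "offset": offset, "limit": limit, "results": page}, 200


body, status = paginate(list(range(10)), "4", "3")
print(status, body["results"])
```

Python slicing already handles an offset beyond the end of the list and limit=0, so neither case needs a special branch.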
Examples:
- First 10 records:
curl -sS "http://localhost:5000/api/forex/daily?day=1&month=1&year=2020&limit=10"
- Start from the 5th record and return up to 3 records:
curl -sS "http://localhost:5000/api/forex/daily?day=1&month=1&year=2020&offset=4&limit=3"
- Non-integer or negative paging params (example, HTTP 400):
curl -sS "http://localhost:5000/api/forex/daily?day=1&month=1&year=2020&limit=abc"
curl -sS "http://localhost:5000/api/forex/daily?day=1&month=1&year=2020&offset=-1"
The project also exposes a Cryptocraft-specific scraping endpoint that follows the same parameter and paging semantics as the Forex endpoint.
- Missing parameters (HTTP 400):
curl -sS http://localhost:5000/api/cryptocraft/daily
Response body:
{ "error": "Missing one or more required parameters: day, month, year" }
- Invalid (non-integer) parameters (HTTP 400):
curl -sS "http://localhost:5000/api/cryptocraft/daily?day=aa&month=bb&year=cc"
Response body:
{ "error": "Parameters day, month and year must be integers" }
- Success (pagination wrapper):
Curl example:
curl -sS "http://localhost:5000/api/cryptocraft/daily?day=1&month=1&year=2020"
Expected JSON response (HTTP 200): a pagination wrapper containing metadata and a list of records. Example response format:
{
"total": 1,
"offset": 0,
"limit": null,
"results": [
{
"Time": "01/01/2020 00:00",
"Impact": "high",
"Event": "Protocol Upgrade",
"Forecast": "n/a",
"Actual": "n/a",
"Previous": "n/a"
}
]
}
Note on Impact field:
- For the cryptocraft endpoint there is no Currency field in results. Instead, each record contains an Impact field describing the expected market impact of the event. Typical values are low, medium, or high (strings). Consumers should rely on Impact to assess severity rather than a currency code.
- Paging examples (limit & offset):
First 10 records:
curl -sS "http://localhost:5000/api/cryptocraft/daily?day=1&month=1&year=2020&limit=10"
Start from the 5th record and return up to 3 records:
curl -sS "http://localhost:5000/api/cryptocraft/daily?day=1&month=1&year=2020&offset=4&limit=3"
Notes:
- Behavior and validation match the Forex endpoint: limit and offset must be integers and non-negative; limit=0 is valid and returns an empty results array.
- Responses use the pagination wrapper { "total": N, "offset": X, "limit": Y, "results": [...] } for list data.
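Because Cryptocraft records carry Impact instead of Currency, a consumer might filter on it as in the sketch below. The helper filter_by_impact and the ranking table are hypothetical; only the three typical values named in this README are ranked, and records with any other Impact value are dropped.

```python
from typing import Dict, List

# Ordering of the typical Impact values; anything else is treated as unknown.
IMPACT_RANK = {"low": 0, "medium": 1, "high": 2}


def filter_by_impact(records: List[Dict[str, str]], minimum: str = "medium") -> List[Dict[str, str]]:
    """Keep records whose Impact is at least `minimum` (hypothetical helper)."""
    threshold = IMPACT_RANK[minimum]
    kept = []
    for record in records:
        # Normalise case so "High" and "high" compare equal.
        rank = IMPACT_RANK.get(str(record.get("Impact", "")).lower())
        if rank is not None and rank >= threshold:
            kept.append(record)
    return kept


events = [
    {"Event": "Protocol Upgrade", "Impact": "high"},
    {"Event": "Minor Listing", "Impact": "low"},
]
print(filter_by_impact(events, minimum="medium"))
```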
The project also exposes a MetalsMine-specific scraping endpoint that follows the same parameter and paging semantics as the Forex and Cryptocraft endpoints.
- Missing parameters (HTTP 400):
curl -sS http://localhost:5000/api/metalsmine/daily
Response body:
{ "error": "Missing one or more required parameters: day, month, year" }
- Invalid (non-integer) parameters (HTTP 400):
curl -sS "http://localhost:5000/api/metalsmine/daily?day=aa&month=bb&year=cc"
Response body:
{ "error": "Parameters day, month and year must be integers" }
- Success (pagination wrapper):
Curl example:
curl -sS "http://localhost:5000/api/metalsmine/daily?day=1&month=1&year=2020"
Expected JSON response (HTTP 200): a pagination wrapper containing metadata and a list of records. Example response format:
{
"total": 1,
"offset": 0,
"limit": null,
"results": [
{
"Time": "01/01/2020 00:00",
"Currency": "XAU",
"Event": "Gold Inventory Release",
"Forecast": "n/a",
"Actual": "n/a",
"Previous": "n/a"
}
]
}
- Paging examples (limit & offset):
First 10 records:
curl -sS "http://localhost:5000/api/metalsmine/daily?day=1&month=1&year=2020&limit=10"
Start from the 5th record and return up to 3 records:
curl -sS "http://localhost:5000/api/metalsmine/daily?day=1&month=1&year=2020&offset=4&limit=3"
The project also exposes an EnergyExch-specific scraping endpoint that follows the same parameter and paging semantics as the Forex, Cryptocraft, and MetalsMine endpoints.
- Missing parameters (HTTP 400):
curl -sS http://localhost:5000/api/energyexch/daily
Response body:
{ "error": "Missing one or more required parameters: day, month, year" }
- Invalid (non-integer) parameters (HTTP 400):
curl -sS "http://localhost:5000/api/energyexch/daily?day=aa&month=bb&year=cc"
Response body:
{ "error": "Parameters day, month and year must be integers" }
- Success (pagination wrapper):
Curl example:
curl -sS "http://localhost:5000/api/energyexch/daily?day=1&month=1&year=2020"
Expected JSON response (HTTP 200): a pagination wrapper containing metadata and a list of records. Example response format:
{
"total": 1,
"offset": 0,
"limit": null,
"results": [
{
"Time": "01/01/2020 00:00",
"Currency": "USD",
"Event": "Energy Report",
"Forecast": "n/a",
"Actual": "n/a",
"Previous": "n/a"
}
]
}
- Paging examples (limit & offset):
First 10 records:
curl -sS "http://localhost:5000/api/energyexch/daily?day=1&month=1&year=2020&limit=10"
Start from the 5th record and return up to 3 records:
curl -sS "http://localhost:5000/api/energyexch/daily?day=1&month=1&year=2020&offset=4&limit=3"
Notes:
- Behavior and validation match the Forex endpoint: limit and offset must be integers and non-negative; limit=0 is valid and returns an empty results array.
- Responses use the pagination wrapper { "total": N, "offset": X, "limit": Y, "results": [...] } for list data.
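Since all four daily endpoints share the same day, month and year semantics, the three 400 cases could be handled by one helper, sketched below. The function name validate_date_params is hypothetical, the error strings are the ones shown in this README, and the numeric bounds are assumptions because the real limits are not stated here.

```python
from typing import Mapping, Optional, Tuple

REQUIRED = ("day", "month", "year")


def validate_date_params(args: Mapping[str, str]) -> Tuple[Optional[dict], int]:
    """Return (error_body, status); error_body is None when the parameters are acceptable."""
    if any(name not in args for name in REQUIRED):
        return {"error": "Missing one or more required parameters: day, month, year"}, 400
    try:
        day, month, year = (int(args[name]) for name in REQUIRED)
    except ValueError:
        return {"error": "Parameters day, month and year must be integers"}, 400
    # Assumed bounds for illustration; the project's real range may differ.
    if not (1 <= day <= 31 and 1 <= month <= 12 and 1900 <= year <= 2100):
        return {"error": "Parameters out of reasonable range"}, 400
    return None, 200


print(validate_date_params({"day": "1", "month": "1", "year": "2020"}))
print(validate_date_params({"day": "99", "month": "99", "year": "3000"}))
```

The checks run in the same order as the documented errors: presence first, then integer parsing, then range.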
Notes and suggestions:
- There is no enforced maximum limit in the current implementation. For production use you may want to cap limit (for example 500 or 1000) to avoid large responses or memory spikes.
- List responses use the pagination wrapper { "total": N, "offset": X, "limit": Y, "results": [...] }, so clients can compare total with offset and limit to decide whether more records remain.
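A cap on limit could be as small as the sketch below. The names MAX_LIMIT and effective_limit are hypothetical, 500 is one of the example ceilings mentioned above, and treating an absent limit as the cap is an assumed policy rather than current behavior.

```python
from typing import Optional

MAX_LIMIT = 500  # assumed ceiling; the note above suggests 500 or 1000


def effective_limit(requested: Optional[int], max_limit: int = MAX_LIMIT) -> int:
    """Clamp an already validated limit; None (unlimited) is replaced by the cap."""
    if requested is None:
        return max_limit
    return min(requested, max_limit)


print(effective_limit(None), effective_limit(10), effective_limit(5000))
```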
Notes:
- The exact fields and values depend on the parser and target site's HTML structure. When running the real scraper, values reflect what is parsed from ForexFactory for the given date.
- The examples above match the app behavior implemented in main.py and the test fixtures in tests/test_app.py.
This project exposes a tiny OpenAPI JSON and a Swagger UI page to help explore the endpoints:
- GET /openapi.json: returns a minimal OpenAPI 3 JSON describing the API (used by the UI).
- GET /swagger: serves a lightweight Swagger UI that loads /openapi.json (served from a CDN; no new Python dependencies required).
Examples:
- Fetch the spec directly:
curl -sS http://localhost:5000/openapi.json | jq .
- Open the interactive docs in your browser:
Visit: http://localhost:5000/swagger
Notes:
- The Swagger UI is loaded from a CDN (unpkg). If you need an offline or self-hosted UI, consider adding swagger-ui as a static asset or installing a Python package like flasgger.
- The OpenAPI spec is intentionally minimal and kept in src/app.py as OPENAPI_SPEC. You can expand it with schemas and richer response descriptions if you need stronger client generation.
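One way the spec might be expanded is with a schema for the pagination wrapper, sketched below as a plain Python dict. The schema name PaginatedEvents and the COMPONENTS variable are hypothetical, and how the fragment would be merged into OPENAPI_SPEC is left open because that structure is not shown here. The nullable keyword is the OpenAPI 3.0 form; a 3.1 spec would use a type list instead.

```python
import json

# Hypothetical schema for the { total, offset, limit, results } wrapper.
PAGINATED_EVENTS_SCHEMA = {
    "type": "object",
    "required": ["total", "offset", "limit", "results"],
    "properties": {
        "total": {"type": "integer", "minimum": 0},
        "offset": {"type": "integer", "minimum": 0},
        # limit is null when the client did not request one.
        "limit": {"type": "integer", "minimum": 0, "nullable": True},
        "results": {
            "type": "array",
            # Records are modelled loosely as string-valued objects.
            "items": {"type": "object", "additionalProperties": {"type": "string"}},
        },
    },
}

# A components section that could be merged into the spec dict.
COMPONENTS = {"components": {"schemas": {"PaginatedEvents": PAGINATED_EVENTS_SCHEMA}}}

print(json.dumps(COMPONENTS, indent=2))
```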
Run the test suite with pytest:
pytest -q
Unit tests are located in the tests/ folder. Network calls and external dependencies are isolated using monkeypatching to keep tests deterministic.
- The scraper depends on the target site's HTML structure. If ForexFactory changes its markup, the parsing code will need updating.
- requirements.txt pins versions that were used during development; consider updating or pinning further for deployments.
- Respect the target site's robots.txt and terms of service when scraping.
Contributions, bug reports, and feature requests are welcome. Please open an issue or a pull request.
This project is licensed under the MIT License — see the LICENSE file for details.