How does high renewable penetration drive electricity price volatility in Spain's grid?
As Spain increases its renewable energy capacity, the grid faces new challenges. Renewable generation (wind, solar, hydro) is inherently variable - clouds pass, wind dies down, reservoir levels fluctuate. When renewable output drops suddenly, dispatchable generation (gas, coal) must ramp up quickly, often at premium prices. This project quantifies that price risk using real data from Spain's electricity grid.
Note: This analysis uses the full renewable energy mix (hydro, wind, solar PV, solar thermal, and other renewables) - not just solar, wind and hydro. The data infrastructure already tracks all 19 generation sources individually, enabling future research into source-specific price impacts.
Analysis of 730 days of Spanish electricity data reveals:
| Metric | Value |
|---|---|
| Average renewable share | 59.7% |
| Renewable share range | 30.5% - 85.1% |
| Average daily price | 64.12 EUR/MWh |
| Price range | 0.40 - 146.52 EUR/MWh |
Pearson correlation: -0.64 (p < 0.001)
Higher renewable penetration is strongly associated with lower electricity prices. Each 1 percentage point increase in renewable share corresponds to a daily price roughly 2.28 EUR/MWh lower.
Left: Scatter plot showing the inverse relationship between renewable share and price. Right: Price distribution comparison - high renewable days (≥60%) average 45 EUR/MWh vs 83 EUR/MWh for low renewable days.
price = -2.28 × renewable_share + 200.52
R² = 0.41 (41% of price variance explained by renewable share alone)
The regression coefficient (-2.28) quantifies the price sensitivity: a 10 percentage point drop in renewable share corresponds to a price increase of about 23 EUR/MWh (10 × 2.28 = 22.8).
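As a quick check of the arithmetic above, here is a minimal sketch that evaluates the fitted line. The slope and intercept are the rounded values reported in this README; the function names are illustrative and are not taken from `analysis.py`.

```python
# Coefficients as reported in this README (rounded to two decimals).
SLOPE = -2.28       # EUR/MWh per percentage point of renewable share
INTERCEPT = 200.52  # EUR/MWh at a hypothetical 0% renewable share


def predicted_price(renewable_share_pct: float) -> float:
    """Fitted daily price (EUR/MWh) for a renewable share given in percent."""
    return SLOPE * renewable_share_pct + INTERCEPT


def price_change(share_change_pct_points: float) -> float:
    """Price change implied by a change in renewable share (percentage points)."""
    return SLOPE * share_change_pct_points


if __name__ == "__main__":
    # Fitted price at the average renewable share of 59.7%.
    print(round(predicted_price(59.7), 2))
    # Price change for a 10 percentage point drop in renewable share.
    print(round(price_change(-10.0), 2))
```

Evaluating the line at the average share (59.7%) gives about 64.4 EUR/MWh, close to the observed average daily price of 64.12 EUR/MWh; the small gap comes from rounding the coefficients.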
What happens when renewable generation suddenly drops? I simulated 5,000 scenarios where high-renewable days (≥60%) experience random drops averaging 20%:
| Risk Metric | Value |
|---|---|
| Mean price spike | +46.31 EUR/MWh |
| 95th percentile spike | +82.80 EUR/MWh |
| 99th percentile spike | +96.94 EUR/MWh |
| P(price > 100 EUR/MWh) | 39.0% |
| P(price > 150 EUR/MWh) | 7.8% |
The simulation shows that sudden renewable drops can cause significant price spikes, with ~39% of scenarios resulting in prices exceeding 100 EUR/MWh.
A 20% renewable drop → ~46 EUR/MWh price spike
95th percentile spike: 83 EUR/MWh
Coefficient: 2.28 EUR/MWh per 1% renewable drop
This project is containerized for three key reasons:
1. Reproducibility
- Anyone can replicate these exact results with a single command
- No "works on my machine" issues - the analysis environment is frozen
- Pinned dependencies ensure consistent behavior across time
2. Data Pipeline Isolation
- API calls, data processing, and analysis run in an isolated environment
- No risk of conflicting with existing Python installations
- Clean separation between the host system and analysis code
3. Deployment Ready
- Can be scheduled as a cron job or CI/CD pipeline
- Easy to deploy on cloud platforms (AWS ECS, GCP Cloud Run, etc.)
- Same container works locally and in production
# One command to reproduce the entire analysis
docker run -v $(pwd)/outputs:/outputs -v $(pwd)/data:/data renewables-risk-sim
# On Windows cmd, use:
docker run -v %cd%\outputs:/outputs -v %cd%\data:/data renewables-risk-sim

This project uses Spain's official electricity data from the REData API (Red Eléctrica de España):
| Data | Endpoint | Granularity |
|---|---|---|
| Electricity Prices | /es/datos/mercados/precios-mercados-tiempo-real | 15-min intervals |
| Generation Balance | /es/datos/balance/balance-electrico | Daily |
No API token required - REData provides open access to Spanish grid data.
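The sketch below shows one way a request URL for the two endpoints could be assembled. The endpoint paths come from the table above; the host name, the query parameter names (`start_date`, `end_date`, `time_trunc`) and the timestamp format are assumptions based on the public REData URL scheme and are not confirmed by this README, and `build_redata_url` is a hypothetical helper rather than a function from `data_fetch.py`.

```python
from urllib.parse import urlencode

# Assumed public host for REData; not stated in this README.
BASE_URL = "https://apidatos.ree.es"

# Endpoint paths as listed in the table above.
ENDPOINTS = {
    "prices": "/es/datos/mercados/precios-mercados-tiempo-real",
    "balance": "/es/datos/balance/balance-electrico",
}


def build_redata_url(kind: str, start_date: str, end_date: str, time_trunc: str) -> str:
    """Build a request URL for one endpoint and one date window (YYYY-MM-DD)."""
    params = {
        # Assumed parameter names and timestamp format.
        "start_date": f"{start_date}T00:00",
        "end_date": f"{end_date}T23:59",
        "time_trunc": time_trunc,
    }
    return f"{BASE_URL}{ENDPOINTS[kind]}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_redata_url("balance", "2024-01-01", "2024-01-31", "day"))
```

The function only builds the string and makes no network call, so it can be checked offline before any request is sent.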
- Prices: 15-minute spot market prices aggregated to daily averages (EUR/MWh)
- Renewable Share: (hydro + wind + solar PV + solar thermal + other renewables) / total demand × 100%
- Output: Merged dataset with 26 columns covering all generation sources
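The renewable share formula above can be sketched as a small function. The column names (`hydro`, `solar_pv`, `total_demand`, and so on) are hypothetical; the actual names in the 26-column dataset may differ.

```python
# Hypothetical column names for the renewable sources in the formula above.
RENEWABLE_COLUMNS = ("hydro", "wind", "solar_pv", "solar_thermal", "other_renewables")


def renewable_share(row: dict, demand_key: str = "total_demand") -> float:
    """Renewable share in percent for one day of generation data."""
    demand = row[demand_key]
    if demand <= 0:
        raise ValueError("total demand must be positive")
    # Missing sources count as zero generation for that day.
    renewable = sum(row.get(col, 0.0) for col in RENEWABLE_COLUMNS)
    return renewable / demand * 100.0


if __name__ == "__main__":
    # Illustrative numbers only, not real grid data.
    day = {
        "hydro": 100.0,
        "wind": 200.0,
        "solar_pv": 150.0,
        "solar_thermal": 20.0,
        "other_renewables": 30.0,
        "total_demand": 1000.0,
    }
    print(renewable_share(day))
```

Because the denominator is total demand rather than total generation, the share can in principle exceed 100% on days with net exports; the sketch does not cap it.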
docker build -t renewables-risk-sim .

docker run -v $(pwd)/outputs:/outputs -v $(pwd)/data:/data renewables-risk-sim

Custom date range:

docker run -v $(pwd)/outputs:/outputs -v $(pwd)/data:/data \
  renewables-risk-sim \
  all --start-date 2024-01-01 --end-date 2024-06-30

Fetch data only:

docker run -v $(pwd)/data:/data renewables-risk-sim \
  fetch --start-date 2024-01-01 --end-date 2024-12-31

Analyze existing data:

docker run -v $(pwd)/outputs:/outputs -v $(pwd)/data:/data renewables-risk-sim \
  analyze

Customize Monte Carlo parameters:

docker run -v $(pwd)/outputs:/outputs -v $(pwd)/data:/data renewables-risk-sim \
  analyze --threshold 50 --drop-mean 15 --drop-std 5

renewables-risk-sim/
├── main.py # CLI entry point with subcommands
├── data_fetch.py # REData API fetching and parsing
├── analysis.py # Statistical analysis and Monte Carlo
├── utils.py # Shared utilities
├── requirements.txt # Pinned Python dependencies
├── Dockerfile # Container definition
├── README.md # This file
├── data/ # Raw data output (CSV)
└── outputs/ # Analysis results (plots, reports)
usage: main.py {fetch,analyze,all} ...
Commands:
fetch Fetch data from REData API
analyze Run analysis on fetched data
all Run fetch and analyze in sequence
Subcommand options:
--data-dir Directory for data files (default: /data)
--output-dir Directory for output plots and reports (default: /outputs)
--start-date Start date YYYY-MM-DD (fetch, all)
--end-date End date YYYY-MM-DD (fetch, all)
--threshold High renewable threshold % for Monte Carlo (default: 60)
--drop-mean Mean renewable drop % for simulation (default: 20)
--drop-std Std dev of renewable drop % (default: 10)
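For readers who want to see how a CLI with this shape can be wired up, here is a minimal `argparse` sketch that mirrors the usage text above. It is an illustration under the assumption that options are shared through parent parsers; the real `main.py` may be organized differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with fetch, analyze and all subcommands."""
    parser = argparse.ArgumentParser(prog="main.py")
    subparsers = parser.add_subparsers(dest="command", required=True)

    # Options accepted by every subcommand.
    common = argparse.ArgumentParser(add_help=False)
    common.add_argument("--data-dir", default="/data")
    common.add_argument("--output-dir", default="/outputs")

    # Date window, used by fetch and all.
    dates = argparse.ArgumentParser(add_help=False)
    dates.add_argument("--start-date", help="YYYY-MM-DD")
    dates.add_argument("--end-date", help="YYYY-MM-DD")

    # Monte Carlo parameters, used by analyze and all.
    monte_carlo = argparse.ArgumentParser(add_help=False)
    monte_carlo.add_argument("--threshold", type=float, default=60.0)
    monte_carlo.add_argument("--drop-mean", type=float, default=20.0)
    monte_carlo.add_argument("--drop-std", type=float, default=10.0)

    subparsers.add_parser("fetch", parents=[common, dates])
    subparsers.add_parser("analyze", parents=[common, monte_carlo])
    subparsers.add_parser("all", parents=[common, dates, monte_carlo])
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["analyze", "--threshold", "50", "--drop-mean", "15"])
    print(args.command, args.threshold, args.drop_mean, args.drop_std)
```

Parent parsers keep the defaults in one place, so `analyze` and `all` cannot drift apart on the Monte Carlo settings.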
| File | Description |
|---|---|
| spain_renewables_prices.csv | Raw merged dataset (730 rows, 26 columns) |
| exploratory_analysis.png | Scatter plot and box plot |
| regression_diagnostics.png | Regression fit and residual analysis |
| monte_carlo_simulation.png | Price spike distribution |
| report.json | Complete statistical results |
- Fetch 15-minute interval prices from REData API
- Fetch daily generation balance (19 sources + storage)
- Aggregate prices to daily averages
- Calculate renewable and non-renewable shares
- Merge datasets by date
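The aggregation and merge steps above can be sketched with the standard library alone. The record layout (timestamp string plus price) and the field names are assumptions for illustration; `data_fetch.py` may use pandas or a different structure.

```python
from collections import defaultdict
from statistics import fmean


def daily_average_prices(records):
    """Average (timestamp, price) pairs by calendar day.

    `records` is an iterable of (ISO-8601 timestamp string, price in EUR/MWh).
    """
    by_day = defaultdict(list)
    for timestamp, price in records:
        # The "YYYY-MM-DD" prefix identifies the day (assumes local-time stamps).
        by_day[timestamp[:10]].append(price)
    return {day: fmean(prices) for day, prices in by_day.items()}


def merge_by_date(daily_prices, daily_balance):
    """Inner-join two date-keyed dicts, keeping only days present in both."""
    merged = []
    for day in sorted(daily_prices.keys() & daily_balance.keys()):
        merged.append({"date": day, "price": daily_prices[day], **daily_balance[day]})
    return merged


if __name__ == "__main__":
    # Illustrative values only, not real market data.
    prices = daily_average_prices([
        ("2024-01-01T00:00", 40.0),
        ("2024-01-01T00:15", 60.0),
        ("2024-01-02T00:00", 80.0),
    ])
    balance = {"2024-01-01": {"renewable_share": 62.5}}
    print(merge_by_date(prices, balance))
```

An inner join drops days that are missing from either source, which avoids rows with a price but no generation data (or the reverse) reaching the regression.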
- Scatter plot: renewable share vs. price
- Pearson correlation coefficient
- Price volatility comparison: high renewable days (≥60%) vs. low
- Model: price ~ renewable_share
- Extract coefficient for price impact per percentage point change
- Diagnostic plots: residuals vs fitted, residual distribution
- Filter high renewable days (≥60% share)
- Simulate random drops using truncated normal distribution (mean=20%, std=10%, lower bound=0%)
- Calculate: new_price = base_price + (drop_pct × -coefficient)
- 5,000 simulation runs
- Output: mean spike, 95th percentile, probability of extreme prices
Note on distribution choice: We use a truncated normal (bounded at 0) rather than a standard normal to model renewable drops. This avoids artificial spikes at zero caused by clipping negative samples, producing a more realistic distribution of price impacts.
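The simulation described above can be sketched as follows. The truncated normal is drawn by rejection sampling (redrawing any negative sample), which is one simple way to get that distribution with the standard library. The random choice of base day, the seed, the nearest-rank percentile and the sample base prices are assumptions for illustration, not details confirmed by `analysis.py`.

```python
import math
import random
from statistics import fmean


def sample_truncated_normal(rng, mean, std, lower=0.0):
    """Draw from a normal distribution conditioned on the value being >= lower."""
    while True:
        value = rng.gauss(mean, std)
        if value >= lower:
            return value


def simulate_spikes(base_prices, coefficient=-2.28, drop_mean=20.0,
                    drop_std=10.0, runs=5000, seed=42):
    """Return (spikes, new_prices) for `runs` simulated renewable drops."""
    rng = random.Random(seed)
    spikes, new_prices = [], []
    for _ in range(runs):
        base = rng.choice(base_prices)  # assumed: pick a high-renewable day at random
        drop = sample_truncated_normal(rng, drop_mean, drop_std)
        spike = drop * -coefficient     # new_price = base_price + (drop_pct × -coefficient)
        spikes.append(spike)
        new_prices.append(base + spike)
    return spikes, new_prices


def percentile(values, pct):
    """Nearest-rank percentile for pct in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


if __name__ == "__main__":
    # Illustrative base prices for high-renewable days, not real data.
    spikes, new_prices = simulate_spikes([38.0, 45.0, 52.0])
    print("mean spike:", round(fmean(spikes), 2))
    print("95th percentile spike:", round(percentile(spikes, 95), 2))
    print("P(price > 100):", sum(p > 100 for p in new_prices) / len(new_prices))
```

Truncating at zero shifts the mean drop slightly above the nominal 20 (to roughly 20.6 for a standard deviation of 10), which is consistent with the reported mean spike of about 46 EUR/MWh being a little above 20 × 2.28 = 45.6.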
Renewable (7 sources): Hydro, Wind, Solar PV, Solar Thermal, Hydro-wind, Other renewable, Renewable waste
Non-Renewable (8 sources): Nuclear, Combined Cycle (gas), Coal, Diesel, Gas Turbine, Steam Turbine, Cogeneration, Non-renewable waste
Storage (4 sources): Pumped hydro (generation/consumption), Battery (charge/discharge)
Docker is optional. You can run the project directly with Python 3.11+.
git clone https://github.com/yourusername/renewables-risk-sim.git
cd renewables-risk-sim
pip install -r requirements.txt

Create output directories:
- Linux/macOS: mkdir data outputs
- Windows PowerShell: mkdir data, outputs
| Action | Command |
|---|---|
| Full pipeline | python main.py all --data-dir ./data --output-dir ./outputs --start-date 2024-01-01 --end-date 2024-12-31 |
| Fetch only | python main.py fetch --data-dir ./data --start-date 2024-01-01 --end-date 2024-12-31 |
| Analyze only | python main.py analyze --data-dir ./data --output-dir ./outputs |
| Custom params | python main.py analyze --data-dir ./data --output-dir ./outputs --threshold 50 --drop-mean 15 |
Results are saved to ./outputs/ (PNG plots + report.json).
This project lays the groundwork for deeper analysis. Potential extensions include:
Source-Specific Analysis
- The dataset already tracks all 19 generation sources individually
- Future work could analyze how drops in specific sources (e.g., wind alone, solar alone) affect prices differently
- Hydro variability (seasonal, drought conditions) could be studied separately from wind/solar intermittency
Cloud Deployment
- Deploy the container to AWS (ECS/Fargate) or GCP Cloud Run
- Schedule automated daily/weekly data refreshes
- Set up alerts for high-risk price conditions
Enhanced Modeling
- Multivariate regression including weather data, demand forecasts
- Time-series models (ARIMA, Prophet) for price forecasting
- Machine learning approaches for non-linear relationships
MIT


