
Commit c506437

[WIP]
1 parent ebd09b8 commit c506437

12 files changed, +232 -138 lines changed

docs/Project.toml (+1)

@@ -9,5 +9,6 @@ JSON = "682c06a0-de6a-54ab-a142-c8b1cf79cde6"
 JuMP = "4076af6c-e467-56ae-b986-b466b2749572"
 LiveServer = "16fef848-5104-11e9-1b77-fb7a48bbb589"
 OrderedCollections = "bac558e1-5e72-5ebc-8fee-abe8a469f55d"
+TulipaClustering = "314fac8b-c762-4aa3-9d12-851379729163"
 TulipaEnergyModel = "5d7bd171-d18e-45a5-9111-f1f11ac5d04d"
 TulipaIO = "7b3808b7-0819-42d4-885c-978ba173db11"

docs/src/50-schemas.md (+94 -7)

@@ -1,16 +1,19 @@
 # [Data pipeline/workflow](@id data)
 
 ---
-TODO:
 
-- diagrams
-- Replace
+## [TODO](@id TODO)
+
+- [ ] diagrams
+- [ ] Replace
 > To create these tables we currently use CSV files that follow this same schema and then convert them into tables using TulipaIO, as shown in the basic example of the [Tutorials](@ref basic-example) section.
-- Review below
+- [ ] Review below
+- [ ] Link to OBZ
 
 ---
 
 In this section we will take a look into the data in more details, focusing on what you need to go from your raw data all the way to the results.
+We also have a tutorial going over the [full workflow](#TODO), focusing on the code parts.
 
 ```@contents
 Pages = ["50-schemas.md"]
@@ -23,7 +26,7 @@ Here is a brief look at how we imagine a normal usage of the Tulipa model:
 
 ![Tulipa Workflow. Textual explanation below.](./figs/tulipa-workflow.jpg)
 
-Brief explanation (more details in [TODO](#TODO)):
+Workflow explanation:
 
 - **External source**: The first thing that you need, and hopefully have, is data. Currently, Tulipa does not provide any public data sources, so we expected that you will _load_ all required data.
 - **Create connection**: Tulipa uses a DuckDB database to store the input data, the representation of variables, constraints, and other internal tables, as well as the output. This database is informed through the `connection` argument in various parts of the API. Most notably, [`run_scenario`](@ref) and [`EnergyProblem`](@ref) receive the `connection` as main argument to create the model (and various internal tables).
@@ -69,8 +72,92 @@ To know the defaults, check the table [Schemas](@ref schemas) below.
 
 ### Example of using `populate_with_defaults!`
 
-```@example
-using TulipaEnergyModel, TulipaIO, DuckDB
+Below we have the minimum amount of data (essentially, nothing) that is necessary to start Tulipa.
+
+```@example minimum_data
+using TulipaEnergyModel, TulipaIO, DuckDB, DataFrames
+
+data = Dict(
+    # Basic asset data
+    "input_asset" => DataFrame(
+        :asset => ["some_producer", "some_consumer"],
+        :type => ["producer", "consumer"],
+    ),
+    "input_asset_both" => DataFrame(
+        :asset => ["some_producer", "some_consumer"],
+        :commission_year => [2030, 2030],
+        :milestone_year => [2030, 2030],
+    ),
+    "input_asset_commission" => DataFrame(
+        :asset => ["some_producer", "some_consumer"],
+        :commission_year => [2030, 2030],
+    ),
+    "input_asset_milestone" => DataFrame(
+        :asset => ["some_producer", "some_consumer"],
+        :milestone_year => [2030, 2030],
+    ),
+
+    # Basic flow data
+    "input_flow" => DataFrame(:from_asset => ["some_producer"], :to_asset => ["some_consumer"]),
+    "input_flow_both" => DataFrame(
+        :from_asset => ["some_producer"],
+        :to_asset => ["some_consumer"],
+        :commission_year => [2030],
+        :milestone_year => [2030],
+    ),
+    "input_flow_commission" => DataFrame(
+        :from_asset => ["some_producer"],
+        :to_asset => ["some_consumer"],
+        :commission_year => [2030],
+    ),
+    "input_flow_milestone" => DataFrame(
+        :from_asset => ["some_producer"],
+        :to_asset => ["some_consumer"],
+        :milestone_year => [2030],
+    ),
+
+    # Basic time information
+    "input_year_data" => DataFrame(:year => [2030]),
+    "cluster_rep_periods_data" => DataFrame(:year => [2030, 2030], :rep_period => [1, 2]),
+    "input_timeframe_data" => DataFrame(:year => 2030, :period => 1:365),
+    "cluster_rep_periods_mapping" =>
+        DataFrame(:year => 2030, :period => 1:365, :rep_period => mod1.(1:365, 2)),
+)
+```
+
+And here we load this data into a DuckDB connection.
+
+```@example minimum_data
+connection = DBInterface.connect(DuckDB.DB)
+
+# Loading the minimum data in the connection
+for (table_name::String, table::DataFrame) in data
+    DuckDB.register_data_frame(connection, table, table_name)
+end
+
+# Table `input_asset`:
+DuckDB.query(connection, "FROM input_asset") |> DataFrame
+```
+
+Now we run `populate_with_defaults!` to fill the remaining columns with default values:
+
+```@example minimum_data
+TulipaEnergyModel.populate_with_defaults!(connection)
+
+DuckDB.query(connection, "FROM input_asset") |> DataFrame
+```
+
+You can see the table has been modified to include many more columns.
+Even this problem, with no relevant information, can be solved:
+
+```@example minimum_data
+energy_problem = TulipaEnergyModel.run_scenario(
+    connection;
+    output_folder = mktempdir(),
+    show_log = false,
+)
+
+DuckDB.query(connection, "FROM var_flow LIMIT 5") |> DataFrame
 ```
 
 ## Namespaces
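
A quick way to see exactly which columns `populate_with_defaults!` filled in is DuckDB's `DESCRIBE` statement. This is a small illustrative sketch, assuming the `connection` built in the example above; the exact column set comes from the schemas defined in `src/input-schemas.json`:

```julia
using DuckDB, DataFrames

# Columns and types of `input_asset` after `populate_with_defaults!`;
# compare with the two-column table that was registered initially.
DuckDB.query(connection, "DESCRIBE input_asset") |> DataFrame
```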

src/data-preparation.jl (+5 -5)

@@ -178,10 +178,10 @@ function create_unrolled_partition_tables!(connection)
             rep_periods_data.year,
             rep_periods_data.rep_period,
             COALESCE(arpp.specification, 'uniform') AS specification,
-            COALESCE(arpp.partition, '1') AS partition,
+            COALESCE(arpp.partition::string, '1') AS partition,
             rep_periods_data.num_timesteps,
         FROM input_asset as asset
-        CROSS JOIN input_rep_periods_data as rep_periods_data
+        CROSS JOIN cluster_rep_periods_data as rep_periods_data
         LEFT JOIN input_assets_rep_periods_partitions as arpp
             ON asset.asset = arpp.asset
             AND rep_periods_data.year = arpp.year
@@ -199,12 +199,12 @@ function create_unrolled_partition_tables!(connection)
             rep_periods_data.year,
             rep_periods_data.rep_period,
             COALESCE(frpp.specification, 'uniform') AS specification,
-            COALESCE(frpp.partition, '1') AS partition,
+            COALESCE(frpp.partition::string, '1') AS partition,
             flow_commission.efficiency,
             flow_commission.flow_coefficient_in_capacity_constraint,
             rep_periods_data.num_timesteps,
         FROM input_flow as flow
-        CROSS JOIN input_rep_periods_data as rep_periods_data
+        CROSS JOIN cluster_rep_periods_data as rep_periods_data
         LEFT JOIN input_flow_commission as flow_commission
             ON flow.from_asset = flow_commission.from_asset
             AND flow.to_asset = flow_commission.to_asset
@@ -578,7 +578,7 @@ function create_highest_resolution_table!(connection)
             SELECT DISTINCT asset, year, rep_period, time_block_start
             FROM $merged_table
         ) AS merged
-        LEFT JOIN input_rep_periods_data as rep_periods_data
+        LEFT JOIN cluster_rep_periods_data as rep_periods_data
             ON merged.year = rep_periods_data.year
             AND merged.rep_period = rep_periods_data.rep_period
         ORDER BY merged.asset, merged.year, merged.rep_period, time_block_start
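
The `::string` cast added to the two `COALESCE` calls makes the fallback type explicit: the column is combined with the literal `'1'`, and casting it to text keeps the resulting `partition` value textual even when the stored column is numeric or NULL. Below is a minimal standalone sketch of that pattern, using hypothetical names (`partition_size`, `partition_value`) rather than the package's tables:

```julia
using DuckDB, DataFrames

con = DBInterface.connect(DuckDB.DB)

# `partition_size` stands in for a partition column stored as a number that may be NULL;
# casting to text before COALESCE keeps the result a string in both rows.
DuckDB.query(
    con,
    "SELECT COALESCE(partition_size::string, '1') AS partition_value
     FROM (VALUES (2), (NULL)) AS t(partition_size)",
) |> DataFrame
```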

src/data-validation.jl (+23 -26)

@@ -100,32 +100,29 @@ function _validate_no_duplicate_rows!(connection)
     # However, where to add this, and how to ensure it was added is not clear.
     duplicates = String[]
     for (table, primary_keys) in (
-        ("asset", (:asset,)),
-        ("asset_both", (:asset, :milestone_year, :commission_year)),
-        ("asset_commission", (:asset, :commission_year)),
-        ("asset_milestone", (:asset, :milestone_year)),
-        ("assets_profiles", (:asset, :commission_year, :profile_type)),
-        ("assets_rep_periods_partitions", (:asset, :year, :rep_period)),
-        ("assets_timeframe_partitions", (:asset, :year)),
-        ("assets_timeframe_profiles", (:asset, :commission_year, :profile_type)),
-        ("flow", (:from_asset, :to_asset)),
-        ("flow_both", (:from_asset, :to_asset, :milestone_year, :commission_year)),
-        ("flow_commission", (:from_asset, :to_asset, :commission_year)),
-        ("flow_milestone", (:from_asset, :to_asset, :milestone_year)),
-        ("flows_profiles", (:from_asset, :to_asset, :year, :profile_type)),
-        ("flows_rep_periods_partitions", (:from_asset, :to_asset, :year, :rep_period)),
-        ("group_asset", (:name, :milestone_year)),
-        ("profiles_rep_periods", (:profile_name, :year, :rep_period, :timestep)),
-        ("profiles_timeframe", (:profile_name, :year, :period)),
-        ("rep_periods_data", (:year, :rep_period)),
-        ("rep_periods_mapping", (:year, :period, :rep_period)),
-        ("timeframe_data", (:year, :period)),
-        ("year_data", (:year,)),
+        ("input_asset", (:asset,)),
+        ("input_asset_both", (:asset, :milestone_year, :commission_year)),
+        ("input_asset_commission", (:asset, :commission_year)),
+        ("input_asset_milestone", (:asset, :milestone_year)),
+        ("input_assets_profiles", (:asset, :commission_year, :profile_type)),
+        ("input_assets_rep_periods_partitions", (:asset, :year, :rep_period)),
+        ("input_assets_timeframe_partitions", (:asset, :year)),
+        ("input_assets_timeframe_profiles", (:asset, :commission_year, :profile_type)),
+        ("input_flow", (:from_asset, :to_asset)),
+        ("input_flow_both", (:from_asset, :to_asset, :milestone_year, :commission_year)),
+        ("input_flow_commission", (:from_asset, :to_asset, :commission_year)),
+        ("input_flow_milestone", (:from_asset, :to_asset, :milestone_year)),
+        ("input_flows_profiles", (:from_asset, :to_asset, :year, :profile_type)),
+        ("input_flows_rep_periods_partitions", (:from_asset, :to_asset, :year, :rep_period)),
+        ("input_group_asset", (:name, :milestone_year)),
+        ("cluster_profiles_rep_periods", (:profile_name, :year, :rep_period, :timestep)),
+        ("input_profiles_timeframe", (:profile_name, :year, :period)),
+        ("cluster_rep_periods_data", (:year, :rep_period)),
+        ("cluster_rep_periods_mapping", (:year, :period, :rep_period)),
+        ("input_timeframe_data", (:year, :period)),
+        ("input_year_data", (:year,)),
     )
-        append!(
-            duplicates,
-            _validate_no_duplicate_rows!(connection, "input_" * table, primary_keys),
-        )
+        append!(duplicates, _validate_no_duplicate_rows!(connection, table, primary_keys))
     end
 
     return duplicates
@@ -365,7 +362,7 @@ function _validate_use_binary_storage_method_has_investment_limit!(connection)
         )
         push!(
            error_messages,
-            "Incorrect investment_limit = $(row.investment_limit) for investable storage asset '$(row.asset)' with use_binary_storage_method = '$(row.use_binary_storage_method)' for year $(row.milestone_year). The investment_limit at year $(row.commission_year) should be greater than 0 in 'asset_commission'.",
+            "Incorrect investment_limit = $(row.investment_limit) for investable storage asset '$(row.asset)' with use_binary_storage_method = '$(row.use_binary_storage_method)' for year $(row.milestone_year). The investment_limit at year $(row.commission_year) should be greater than 0 in 'input_asset_commission'.",
         )
     end
 
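`_validate_no_duplicate_rows!` receives each `(table, primary_keys)` pair from the loop above. One common way to phrase such a check in DuckDB is a `GROUP BY` over the key columns with a `HAVING` filter; the function below is only an illustrative sketch of that idea, not the package's actual implementation:

```julia
using DuckDB, DataFrames

# Illustrative only: return the key combinations that appear more than once in `table`.
function duplicate_keys(connection, table::String, primary_keys)
    key_list = join(string.(primary_keys), ", ")
    sql = "SELECT $key_list, COUNT(*) AS n FROM $table GROUP BY $key_list HAVING COUNT(*) > 1"
    return DuckDB.query(connection, sql) |> DataFrame
end

# e.g. duplicate_keys(connection, "input_asset_both", (:asset, :milestone_year, :commission_year))
```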

src/input-schemas.json (+64 -64)

@@ -1,4 +1,68 @@
 {
+    "cluster_profiles_rep_periods": {
+        "profile_name": {
+            "description": "Profile name.",
+            "type": "VARCHAR"
+        },
+        "rep_period": {
+            "description": "Representative period number.",
+            "type": "INTEGER"
+        },
+        "timestep": {
+            "description": "Timestep number.",
+            "type": "INTEGER"
+        },
+        "value": {
+            "description": "Value of the profile.",
+            "type": "DOUBLE",
+            "unit_of_measure": "p.u."
+        },
+        "year": {
+            "description": "Milestone year.",
+            "type": "INTEGER"
+        }
+    },
+    "cluster_rep_periods_data": {
+        "num_timesteps": {
+            "default": 8760,
+            "description": "Number of timesteps",
+            "type": "INTEGER"
+        },
+        "rep_period": {
+            "description": "Representative period number.",
+            "type": "INTEGER",
+            "unit_of_measure": "number"
+        },
+        "resolution": {
+            "default": 1,
+            "description": "Duration of each timestep",
+            "type": "DOUBLE",
+            "unit_of_measure": "hours"
+        },
+        "year": {
+            "description": "Milestone year.",
+            "type": "INTEGER"
+        }
+    },
+    "cluster_rep_periods_mapping": {
+        "period": {
+            "description": "Period number.",
+            "type": "INTEGER"
+        },
+        "rep_period": {
+            "description": "Representative period number.",
+            "type": "INTEGER"
+        },
+        "weight": {
+            "default": 1.0,
+            "description": "Hours",
+            "type": "DOUBLE"
+        },
+        "year": {
+            "description": "Milestone year.",
+            "type": "INTEGER"
+        }
+    },
     "input_asset": {
         "asset": {
             "description": "Unique identifier with the name of the asset.",
@@ -707,29 +771,6 @@
             "type": "VARCHAR"
         }
     },
-    "input_profiles_rep_periods": {
-        "profile_name": {
-            "description": "Profile name.",
-            "type": "VARCHAR"
-        },
-        "rep_period": {
-            "description": "Representative period number.",
-            "type": "INTEGER"
-        },
-        "timestep": {
-            "description": "Timestep number.",
-            "type": "INTEGER"
-        },
-        "value": {
-            "description": "Value of the profile.",
-            "type": "DOUBLE",
-            "unit_of_measure": "p.u."
-        },
-        "year": {
-            "description": "Milestone year.",
-            "type": "INTEGER"
-        }
-    },
     "input_profiles_timeframe": {
         "period": {
             "description": "Period.",
@@ -749,47 +790,6 @@
             "type": "INTEGER"
         }
     },
-    "input_rep_periods_data": {
-        "num_timesteps": {
-            "default": 8760,
-            "description": "Number of timesteps",
-            "type": "INTEGER"
-        },
-        "rep_period": {
-            "description": "Representative period number.",
-            "type": "INTEGER",
-            "unit_of_measure": "number"
-        },
-        "resolution": {
-            "default": 1,
-            "description": "Duration of each timestep",
-            "type": "DOUBLE",
-            "unit_of_measure": "hours"
-        },
-        "year": {
-            "description": "Milestone year.",
-            "type": "INTEGER"
-        }
-    },
-    "input_rep_periods_mapping": {
-        "period": {
-            "description": "Period number.",
-            "type": "INTEGER"
-        },
-        "rep_period": {
-            "description": "Representative period number.",
-            "type": "INTEGER"
-        },
-        "weight": {
-            "default": 1.0,
-            "description": "Hours",
-            "type": "DOUBLE"
-        },
-        "year": {
-            "description": "Milestone year.",
-            "type": "INTEGER"
-        }
-    },
     "input_timeframe_data": {
         "num_timesteps": {
             "default": 8760,

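`src/input-schemas.json` maps each table name to its column definitions (type, description, and an optional default), so the renamed `cluster_*` entries can also be inspected programmatically. A small sketch, assuming the file is read from the repository root with JSON.jl (the same package listed in `docs/Project.toml` above):

```julia
using JSON

schemas = JSON.parsefile("src/input-schemas.json")

# Print the column names and declared types of one of the renamed cluster tables.
for (column, spec) in schemas["cluster_rep_periods_data"]
    println(column, " :: ", spec["type"])
end
```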