Skip to content

Conversation

@kalashsinghal
Copy link
Contributor

Dear @AnneSchoenauer @Tilmon,

This PR adds maps for all four countries (Netherlands, Austria, France, and Spain) other than Germany. Please have a look at the maps for all four countries on real data:

  • Austria
library(readr)
library(dplyr)
devtools::load_all(".")
#> ℹ Loading tiltPlot
options(width = 500)

transition_risk_product <- read_csv("transition_risk_profile_at_product_level_all_countries_wide_22_08_24.csv") 

tiltPlot:::map_region_risk(transition_risk_product,
                country_code = c("AT"),
                grouping_emission = "all",
                mode = "transition_risk_profile_best_case",
                scenario = "1.5C RPS",
                year = 2050,
                risk_category = "transition_risk_category")
#> Extracting data using giscoR package, please report issues on https://github.com/rOpenGov/giscoR/issues

Created on 2024-10-08 with reprex v2.0.2

  • France
library(readr)
library(dplyr)
devtools::load_all(".")
#> ℹ Loading tiltPlot
options(width = 500)

transition_risk_product <- read_csv("transition_risk_profile_at_product_level_all_countries_wide_22_08_24.csv") 

tiltPlot:::map_region_risk(transition_risk_product,
                country_code = c("FR"),
                grouping_emission = "all",
                mode = "transition_risk_profile_best_case",
                scenario = "1.5C RPS",
                year = 2050,
                risk_category = "transition_risk_category")
#> Extracting data using giscoR package, please report issues on https://github.com/rOpenGov/giscoR/issues

Created on 2024-10-08 with reprex v2.0.2

  • Spain
library(readr)
library(dplyr)
devtools::load_all(".")
#> ℹ Loading tiltPlot
options(width = 500)

transition_risk_product <- read_csv("transition_risk_profile_at_product_level_all_countries_wide_22_08_24.csv") 

tiltPlot:::map_region_risk(transition_risk_product,
                country_code = c("ES"),
                grouping_emission = "all",
                mode = "transition_risk_profile_best_case",
                scenario = "1.5C RPS",
                year = 2050,
                risk_category = "transition_risk_category")
#> Extracting data using giscoR package, please report issues on https://github.com/rOpenGov/giscoR/issues

Created on 2024-10-08 with reprex v2.0.2

  • Netherlands
library(readr)
library(dplyr)
devtools::load_all(".")
#> ℹ Loading tiltPlot
options(width = 500)

transition_risk_product <- read_csv("transition_risk_profile_at_product_level_all_countries_wide_22_08_24.csv") 

tiltPlot:::map_region_risk(transition_risk_product,
                country_code = c("NL"),
                grouping_emission = "all",
                mode = "transition_risk_profile_best_case",
                scenario = "1.5C RPS",
                year = 2050,
                risk_category = "transition_risk_category")
#> Extracting data using giscoR package, please report issues on https://github.com/rOpenGov/giscoR/issues

Created on 2024-10-08 with reprex v2.0.2


TODO

  • Link related issue/PR.
  • Describe the goal of the PR. Avoid details that are clear in the diff.
  • Mark the PR as draft.
  • Include a unit test.
  • Review your own PR in "Files changed".
  • Ensure the PR branch is updated.
  • Ensure the checks pass.
  • Change the status from draft to ready.
  • Polish the PR description to reflect what it ended up doing.
  • Polish the PR title as you'd like others to read it from the git log.
  • Assign a reviewer.
  • Document user-facing changes in the changelog.

@Tilmon
Copy link

Tilmon commented Oct 10, 2024

@kalashsinghal looks good, thanks! In addition to @AnneSchoenauer 's request on slack, I wonder if it's possible to show France without the islands in the Caribbean? Otherwise, the mainland part of France is so small...

@kalashsinghal
Copy link
Contributor Author

Hi @AnneSchoenauer @Tilmon ,

It took me a while to understand Linda's code completely on how the colouring works. Let's break down the whole procedure step by step:

1. Selection of sample data

library(readr)
library(dplyr)
library(tibble)
library(eurostat)
library(sf)
#> Linking to GEOS 3.13.0, GDAL 3.9.2, PROJ 9.5.0; sf_use_s2() is TRUE
library(ggplot2)
devtools::load_all(".")
#> ℹ Loading tiltPlot
options(width = 500)

example_best_case_worst_case_sample <- example_data_factory(
  # styler: off
  tribble(
    ~companies_id,  ~country, ~postcode, ~grouping_emission,  ~scenario,  ~year, ~transition_risk_category, ~transition_risk_profile_best_case,
    "comp_1", "austria",    "1020",              "all", "1.5C RPS", "2050",                     "low",                                1.0,
    "comp_1", "austria",    "1020",              "all", "1.5C RPS", "2050",                     "low",                                2.0,
    "comp_1", "austria",    "1020",              "all", "1.5C RPS", "2050",                  "medium",                                3.0,
    "comp_1", "austria",    "1020",              "all", "1.5C RPS", "2050",                  "medium",                                4.0,
    "comp_1", "austria",    "1020",              "all", "1.5C RPS", "2050",                    "high",                                5.0,
    "comp_1", "austria",    "1020",              "all", "1.5C RPS", "2050",                    "high",                                6.0
  )
  # styler: on
)
example <- example_best_case_worst_case_sample()
example
#> # A tibble: 6 × 8
#>   companies_id country postcode grouping_emission scenario year  transition_risk_category transition_risk_profile_best_case
#>   <chr>        <chr>   <chr>    <chr>             <chr>    <chr> <chr>                                                <dbl>
#> 1 comp_1       austria 1020     all               1.5C RPS 2050  low                                                      1
#> 2 comp_1       austria 1020     all               1.5C RPS 2050  low                                                      2
#> 3 comp_1       austria 1020     all               1.5C RPS 2050  medium                                                   3
#> 4 comp_1       austria 1020     all               1.5C RPS 2050  medium                                                   4
#> 5 comp_1       austria 1020     all               1.5C RPS 2050  high                                                     5
#> 6 comp_1       austria 1020     all               1.5C RPS 2050  high                                                     6

2. Create aggregated data from sample data

shp_0 <- get_eurostat_geospatial(
  resolution = 10,
  nuts_level = 3,
  year = 2016,
  crs = 3035
)
#> Extracting data using giscoR package, please report issues on https://github.com/rOpenGov/giscoR/issues

# filter for the specified country
shp_1 <- shp_0 |>
  filter(.data$CNTR_CODE == "AT") |>
  select(geo = "NUTS_ID", "geometry") |>
  arrange(.data$geo) |>
  st_as_sf() |>
  inner_join(nuts_all, by = "geo")

agg <- example |>
  left_join(shp_1, by = "postcode") |>
  st_as_sf() |>
  group_by(.data$postcode, .data$transition_risk_category) |>
  summarise(total_mode = sum(.data$transition_risk_profile_best_case)) |>
  group_by(.data$postcode) |>
  mutate(proportion = .data$total_mode / sum(.data$total_mode)) |>
  ungroup()
#> `summarise()` has grouped output by 'postcode'. You can override using the `.groups` argument.

agg
#> # A tibble: 3 × 5
#>   postcode transition_risk_category total_mode                                                                                     geometry proportion
#>   <chr>    <chr>                         <dbl>                                                                                <POLYGON [m]>      <dbl>
#> 1 1020     high                             11 ((4810026 2802881, 4803838 2803770, 4797684 2799650, 4788742 2798937, 4780682 2803584, 47...      0.524
#> 2 1020     low                               3 ((4810026 2802881, 4803838 2803770, 4797684 2799650, 4788742 2798937, 4780682 2803584, 47...      0.143
#> 3 1020     medium                            7 ((4810026 2802881, 4803838 2803770, 4797684 2799650, 4788742 2798937, 4780682 2803584, 47...      0.333

NOTE: The total_mode column is the sum of transition_risk_profile_best_case column values after grouping them by postcode and transition_risk_category. And, the column proportion is the ratio of total_mode/ sum of total_mode. The proportion will behave as the weightage of colours red (high risk), yellow (medium risk), and green (low risk). You can also check the risk colours using this snapshot:

image

3. Pivot aggregated data

aggregated_data <- agg |>
  pivot_wider(names_from = transition_risk_category , values_from = "proportion", values_fill = 0) |>
  filter(.data$total_mode != 0) |>
  group_by(.data$postcode) |>
  summarise(
    total_mode = add(.data$total_mode),
    geometry = first(.data$geometry),
    low = add(.data$low),
    medium = add(.data$medium),
    high = add(.data$high)
  ) |>
  mutate(color = pmap(list(.data$high, .data$medium, .data$low), custom_gradient_color))

aggregated_data
#> # A tibble: 1 × 7
#>   postcode total_mode                                                                                     geometry   low medium  high color    
#> * <chr>         <dbl>                                                                                <POLYGON [m]> <dbl>  <dbl> <dbl> <list>   
#> 1 1020             21 ((4810026 2802881, 4803838 2803770, 4797684 2799650, 4788742 2798937, 4780682 2803584, 47... 0.143  0.333 0.524 <chr [1]>

NOTE: The final colour for the postcode is derived using this formula:
0.143*green (low) + 0.333*yellow (medium) + 0.524*red (high)

4. Plot aggregated data

ggplot() +
  geom_sf(data = aggregated_data, mapping = aes(fill = .data$color)) +
  geom_sf(data = shp_1, fill = NA) +
  coord_sf() +
  theme_tiltplot()

Created on 2024-10-14 with reprex v2.0.2

@kalashsinghal
Copy link
Contributor Author

Here is another example which only has values for low transition risk category:

library(readr)
library(dplyr)
devtools::load_all(".")
#> ℹ Loading tiltPlot
options(width = 500)

example_best_case_worst_case_sample <- example_data_factory(
  # styler: off
  tribble(
    ~companies_id,  ~country, ~postcode, ~grouping_emission,  ~scenario,  ~year, ~transition_risk_category, ~transition_risk_profile_best_case,
    "comp_1", "austria",    "1020",              "all", "1.5C RPS", "2050",                     "low",                                1.0,
    "comp_1", "austria",    "1020",              "all", "1.5C RPS", "2050",                     "low",                                2.0,
    "comp_1", "austria",    "1020",              "all", "1.5C RPS", "2050",                     "medium",                                0.0,
    "comp_1", "austria",    "1020",              "all", "1.5C RPS", "2050",                     "high",                                0.0
  )
  # styler: on
)

example <- example_best_case_worst_case_sample()
example
#> # A tibble: 4 × 8
#>   companies_id country postcode grouping_emission scenario year  transition_risk_category transition_risk_profile_best_case
#>   <chr>        <chr>   <chr>    <chr>             <chr>    <chr> <chr>                                                <dbl>
#> 1 comp_1       austria 1020     all               1.5C RPS 2050  low                                                      1
#> 2 comp_1       austria 1020     all               1.5C RPS 2050  low                                                      2
#> 3 comp_1       austria 1020     all               1.5C RPS 2050  medium                                                   0
#> 4 comp_1       austria 1020     all               1.5C RPS 2050  high                                                     0

tiltPlot:::map_region_risk(example,
                           country_code = c("AT"),
                           grouping_emission = "all",
                           mode = "transition_risk_profile_best_case",
                           scenario = "1.5C RPS",
                           year = 2050,
                           risk_category = "transition_risk_category")
#> Extracting data using giscoR package, please report issues on https://github.com/rOpenGov/giscoR/issues

Created on 2024-10-14 with reprex v2.0.2

@Tilmon
Copy link

Tilmon commented Oct 14, 2024

Discussion from Sprint @kalashsinghal

For a first version, please create maps with the following methodological choices, for emission and transition risk indicator:

  • grouping: all
  • scenario: ipr
  • year: 2030

@Tilmon
Copy link

Tilmon commented Oct 14, 2024

@kalashsinghal regarding the France map: Please try for another two hours to remove the FR islands. But if it doesn't work, just skip it for now.

@Tilmon
Copy link

Tilmon commented Oct 14, 2024

@kalashsinghal please make sure empty postcodes are still shown as grey (or show them in any other way).

@Tilmon
Copy link

Tilmon commented Oct 14, 2024

@kalashsinghal this looks good! Thanks for the explanation.

@kalashsinghal
Copy link
Contributor Author

Hi @Tilmon @AnneSchoenauer,

I identified that that the colouring in maps is wrong, because only a single postcode is used to colour the whole NUTS region. This is happening because we have one-to-many mapping between NUTS-postcode mapper. If we choose to colour the maps based on NUTS code instead of postcode then most of the maps made from the real data will have orangish colour. To better understand the case for a single NUTS region, I have taken the NUTS code NL325 which is mapped to 8 postcodes. Please find attached the csv file which contains all the information of companies with NUTS code NL325

transition_risk_best_case_NUTS_NL325_netherlands_map.csv

This file is the product level data of transition risk indicator for a single NUTS code. Now, I will walk you through the whole process of map creation from the 8 unique postcodes of NL325 for transition risk best case. After code processing, we will get the first table like this:

image

And then to get colour for each postcode, we get the second colour table like this:

image

The colour #E59E47 present in the last row will be used to colour the whole NUTS region NL325, like this:

image

Please use this reprex to understand the logic behind the code processing. If you have questions then, please let me know!

@Tilmon
Copy link

Tilmon commented Oct 23, 2024

Hi @kalashsinghal thanks for the info:

  1. Regarding the dataset you shared: Indeed, the average of the best case values is around 0.58 so it's correct that the color is orange... To understand if it makes sense that all NUTS codes are organge, we need to understand the color coding and/or the distribution of the NUTS-level averages better. For that, can you provide us with the following?
  • A table that shows per NUTS code the transition risk best case, worst case and equal weight values? I.e. four columns in total? That way we may be able to better understand the differences between NUTS-level averages.
  • Also, it would be great to understand if the map is also completely orange if we use the equal weight estimates... Could you provide a country map with equal weight values instead of best case values?
  1. Question: In the table from the screenshot there is the column sum_transition_risk_best_case --> is this column used for the graph? Because I don't think it's needed, right?

cc' @AnneSchoenauer

@kalashsinghal
Copy link
Contributor Author

it would be great to understand if the map is also completely orange if we use the equal weight estimates... Could you provide a country map with equal weight values instead of best case values?

@Tilmon This is the graph of transition risk equal weight values for Germany (for group: 1.5C RPS_2030_all):

image

A table that shows per NUTS code the transition risk best case, worst case and equal weight values? I.e. four columns in total? That way we may be able to better understand the differences between NUTS-level averages.

@Tilmon Please have a look at the csv file with NUTS codes, transition risk best case, worst case and equal weight values for the German map! (for group: 1.5C RPS_2030_all)

transition_risk_NUTS_germany_map.csv

Question: In the table from the screenshot there is the column sum_transition_risk_best_case --> is this column used for the graph? Because I don't think it's needed, right?

@Tilmon Right! This column is not used for the graph.

cc' @AnneSchoenauer

@Tilmon
Copy link

Tilmon commented Oct 25, 2024

NUTS Dataset

@Tilmon Please have a look at the csv file with NUTS codes, transition risk best case, worst case and equal weight values for the German map! (for group: 1.5C RPS_2030_all)

transition_risk_NUTS_germany_map.csv

Hi @kalashsinghal , this table has multiple rows per NUTS code - is it still for each company? I would need a table that shows one value per NUTS code, i.e. the value that you use to create the colors in the map.

I want to see that to be able to understand the distribution of the values - I would expect that values are almost the same across NUTS codes given that the colors are all orange. But maybe that's wrong and values actually vary greatly but only the color codes are all the same.

So please share a file with results per NUTS code - thanks!

Total mode

And @kalashsinghal can you explain again why actually we use as basis for coloring the (total mode)/(sum of total mode)? E.g., why don't we simply use the average transition risk score per company? That's something that came to my mind after our discussion on Monday with Anne about the bar charts... I mean the map is supposed to show the distribution of the transition risk. Then wouldn't it be more useful to use the actual transition risk indicator on company-level? Just thinking out loud...

@kalashsinghal
Copy link
Contributor Author

So please share a file with results per NUTS code - thanks!

Hi @Tilmon,

Here is the file you need:
transition_risk_NUTS_germany_map_v2.csv

These NUTS code are only for country Germany and for group: 1.5C RPS_2030_all.

can you explain again why actually we use as basis for coloring the (total mode)/(sum of total mode)?

@Tilmon I am not aware of the reason because Linda worked on the map code. I guess the methodology is yet to approve from @AnneSchoenauer. @AnneSchoenauer Did you reviewed the methodology of German map created by Linda?

@kalashsinghal
Copy link
Contributor Author

kalashsinghal commented Oct 30, 2024

Hi @AnneSchoenauer @Tilmon,

UPDATE: To remove the issue of orangish colour on each NUTS code of the map, I have chosen the following colours for the risk categories:

image

Please have a look at the best_case country maps for emission and transition risk indicators after applying the filter IPR_2030_all based on this comment:

emission_profile_best_case_AT_all.pdf
emission_profile_best_case_DE_all.pdf
emission_profile_best_case_ES_all.pdf
emission_profile_best_case_FR_all.pdf
emission_profile_best_case_NL_all.pdf
transition_risk_best_case_AT_ipr_2030_all.pdf
transition_risk_best_case_DE_ipr_2030_all.pdf
transition_risk_best_case_ES_ipr_2030_all.pdf
transition_risk_best_case_FR_ipr_2030_all.pdf
transition_risk_best_case_NL_ipr_2030_all.pdf

If the results look satisfactory to you, then I am happy to create maps for worst_case and equal_weight with different combinations of tilt_subsector, weo, and 2050 filters.

Please let me know what you think about the first version!

@Tilmon
Copy link

Tilmon commented Oct 30, 2024

Hi @kalashsinghal , thanks! The TR Indicator maps look OK I think, but I find it counter intuitive that all the emission indicator maps are mostly red, even though it's best case... Do you have an explanation for that?

@kalashsinghal
Copy link
Contributor Author

kalashsinghal commented Oct 30, 2024

but I find it counter intuitive that all the emission indicator maps are mostly red, even though it's best case... Do you have an explanation for that?

Hi @Tilmon @AnneSchoenauer,

We use the sum of best case values to give weights to the colours. And, the high risk category has bigger best case values than low and medium categories. This is the reason why the weightage to the red colour is high, and hence the emission indicator maps are mostly red. For example:

image

The above image shows that the high risk category has higher value for emissions_profile_best_case than medium and low risk categories. This is the major reason why we mostly have more weightage for the red colour than the other two colours.

Does it makes sense?

FYI: As you guyz already know that profile ranking of high risk categories will be higher than the medium and low risk categories, it is very obvious that the best case values will be higher too for high risk categories.

@Tilmon
Copy link

Tilmon commented Oct 31, 2024

@kalashsinghal can you explain where the screenshot comes from? Which results does it show?

Also, just to confirm: My understanding is that for the country map, we group by NUTS code and the sum up the results per risk category and the divide the sums per risk category by the total sum across all risk categories. So in the end, we have per NUTS code one value for low, medium and high.
Is that correct? I'm not sure where you screenshot is in these steps.

Also can you explain this better? Not sure I know what you mean...
"FYI: As you guyz already know that profile ranking of high risk categories will be higher than the medium and low risk categories, it is very obvious that the best case values will be higher too for high risk categories."

cc' @AnneSchoenauer

@kalashsinghal
Copy link
Contributor Author

The TR Indicator maps look OK I think, but I find it counter intuitive that all the emission indicator maps are mostly red, even though it's best case... Do you have an explanation for that?

Hi @Tilmon,

After our discussion on the call, we decided to check the emission profile map on company-level data. Do your remember?

I replaced the column emission_profile_best_case (product-level) with emission_rank_avg_best_case (company-level) in the procedure described in this comment. As you can see from the map below, the company-level data also shows mostly red colour in the map. This means that company-level data is of no use to resolve the red colour issue. Should we discuss the next steps in tech sprint?

image

@Tilmon
Copy link

Tilmon commented Nov 18, 2024

Hi @kalashsinghal ,

please change the country mapping as discussed so that it calculates the colors by counting the products per category and then calculating the proportion as share of products in each category. That way, we avoid the bias of the ranking amounts.

@kalashsinghal
Copy link
Contributor Author

kalashsinghal commented Dec 20, 2024

please change the country mapping as discussed so that it calculates the colors by counting the products per category and then calculating the proportion as share of products in each category. That way, we avoid the bias of the ranking amounts.

Hi @Tilmon @AnneSchoenauer,

After changing the calculation by counting the products per category, I created the maps for emission profile and transition risk profile below. If these maps look ok to you then we can consider this change as final. What do you think?

Emission profile:

library(readr)
library(dplyr)
devtools::load_all(".")
#> ℹ Loading tiltPlot
options(width = 500)

transition_risk_product <- read_csv("transition_risk_profile_at_product_level_23_10_24.csv")

tiltPlot:::map_region_risk(transition_risk_product,
                           country_code = c("DE"),
                           grouping_emission = "all",
                           scenario = NULL,
                           year = NULL,
                           risk_category = "emission_category")
#> Extracting data using giscoR package, please report issues on https://github.com/rOpenGov/giscoR/issues

Created on 2024-12-20 with reprex v2.1.1

Transition risk profile:

library(readr)
library(dplyr)
devtools::load_all(".")
#> ℹ Loading tiltPlot
options(width = 500)

transition_risk_product <- read_csv("transition_risk_profile_at_product_level_23_10_24.csv")

tiltPlot:::map_region_risk(transition_risk_product,
                           country_code = c("DE"),
                           grouping_emission = "all",
                           scenario = "1.5C RPS",
                           year = 2050,
                           risk_category = "transition_risk_category")
#> Extracting data using giscoR package, please report issues on https://github.com/rOpenGov/giscoR/issues

Created on 2024-12-20 with reprex v2.1.1

@AnneSchoenauer
Copy link
Collaborator

Hi @kalashsinghal thanks a lot! Could you please create a litte reprex with which we can see the calculations behind the plot? Thanks :)

@kalashsinghal
Copy link
Contributor Author

Hi @AnneSchoenauer @Tilmon,

Here is the reprex which will show the calculations of new maps step by step:

1. Selection of sample data

library(readr)
library(dplyr)
library(tibble)
library(eurostat)
library(ggplot2)
devtools::load_all(".")
#> ℹ Loading tiltPlot
options(width = 500)

example_best_case_worst_case_sample <- example_data_factory(
  # styler: off
  tribble(
    ~companies_id,  ~country, ~postcode, ~grouping_emission,  ~scenario,  ~year, ~transition_risk_category, ~product,
         "comp_1", "austria",    "8332",              "all", "1.5C RPS", "2050",                     "low",      "a",
         "comp_1", "austria",    "8332",              "all", "1.5C RPS", "2050",                     "low",      "b",
         "comp_1", "austria",    "8332",              "all", "1.5C RPS", "2050",                  "medium",      "c",
         "comp_1", "austria",    "8332",              "all", "1.5C RPS", "2050",                    "high",      "d",
         "comp_2", "austria",    "7304",              "all", "1.5C RPS", "2050",                     "low",      "e",
         "comp_2", "austria",    "7304",              "all", "1.5C RPS", "2050",                  "medium",      "f",
         "comp_2", "austria",    "7304",              "all", "1.5C RPS", "2050",                    "high",      "g",
         "comp_2", "austria",    "7304",              "all", "1.5C RPS", "2050",                    "high",      "h",
         "comp_2", "austria",    "7304",              "all", "1.5C RPS", "2050",                    "high",      "i"
  )
  # styler: on
)

example <- example_best_case_worst_case_sample()
example
#> # A tibble: 9 × 8
#>   companies_id country postcode grouping_emission scenario year  transition_risk_category product
#>   <chr>        <chr>   <chr>    <chr>             <chr>    <chr> <chr>                    <chr>  
#> 1 comp_1       austria 8332     all               1.5C RPS 2050  low                      a      
#> 2 comp_1       austria 8332     all               1.5C RPS 2050  low                      b      
#> 3 comp_1       austria 8332     all               1.5C RPS 2050  medium                   c      
#> 4 comp_1       austria 8332     all               1.5C RPS 2050  high                     d      
#> 5 comp_2       austria 7304     all               1.5C RPS 2050  low                      e      
#> 6 comp_2       austria 7304     all               1.5C RPS 2050  medium                   f      
#> 7 comp_2       austria 7304     all               1.5C RPS 2050  high                     g      
#> 8 comp_2       austria 7304     all               1.5C RPS 2050  high                     h      
#> 9 comp_2       austria 7304     all               1.5C RPS 2050  high                     i

2. Create aggregated data from sample data

shp_0 <- eurostat::get_eurostat_geospatial(
  resolution = 10,
  nuts_level = "all",
  year = 2021,
  crs = 3035
) |>
  filter(!(geo %in% c("FRY10", "FRY20", "FRY30", "FRY40", "FRY50")))
#> Extracting data using giscoR package, please report issues on https://github.com/rOpenGov/giscoR/issues

# filter for the specified country
shp_1 <- shp_0 |>
  filter(.data$CNTR_CODE == "AT") |>
  select(geo = "NUTS_ID", "geometry") |>
  arrange(.data$geo) |>
  st_as_sf() |>
  inner_join(nuts_all, by = "geo")

agg <- example |>
  left_join(shp_1, by = "postcode") |>
  st_as_sf() |>
  filter(country == "austria") |>
  group_by(.data$geo, .data$transition_risk_category) |>
  summarise(total_mode = n_distinct(.data$product, na.rm = TRUE)) |>
  group_by(.data$geo) |>
  mutate(proportion = .data$total_mode / sum(.data$total_mode, na.rm = TRUE)) |>
  ungroup() 
#> `summarise()` has grouped output by 'geo'. You can override using the `.groups` argument.

agg
#> # A tibble: 6 × 5
#>   geo   transition_risk_category total_mode                                                                                     geometry proportion
#>   <chr> <chr>                         <int>                                                                                <POLYGON [m]>      <dbl>
#> 1 AT111 high                              3 ((4821839 2726166, 4809143 2719917, 4806751 2714375, 4803462 2713240, 4802031 2714153, 47...       0.6 
#> 2 AT111 low                               1 ((4821839 2726166, 4809143 2719917, 4806751 2714375, 4803462 2713240, 4802031 2714153, 47...       0.2 
#> 3 AT111 medium                            1 ((4821839 2726166, 4809143 2719917, 4806751 2714375, 4803462 2713240, 4802031 2714153, 47...       0.2 
#> 4 AT224 high                              1 ((4786359 2720266, 4779599 2717444, 4776596 2712017, 4783789 2691337, 4783747 2681535, 47...       0.25
#> 5 AT224 low                               2 ((4786359 2720266, 4779599 2717444, 4776596 2712017, 4783789 2691337, 4783747 2681535, 47...       0.5 
#> 6 AT224 medium                            1 ((4786359 2720266, 4779599 2717444, 4776596 2712017, 4783789 2691337, 4783747 2681535, 47...       0.25

NOTE: The total_mode column gives us the count of distinct products from product column values after grouping them by NUTS code (geo) and transition_risk_category. The column proportion is the ratio of total_mode/ sum of total_mode. The proportion will behave as the weightage for colours red (high risk), blue (medium risk), and green (low risk). You can also check the risk colours using this snapshot:

image

3. Pivot aggregated data

aggregated_data <- agg |>
  pivot_wider(names_from = transition_risk_category, values_from = "proportion", values_fill = 0) |>
  filter(.data$total_mode != 0) |>
  group_by(.data$geo) |>
  summarise(
    total_mode = add(.data$total_mode),
    geometry = first(.data$geometry),
    low = if ("low" %in% names(pick(everything()))) add(.data$low) else 0.0,
    medium = if ("medium" %in% names(pick(everything()))) add(.data$medium) else 0.0,
    high = if ("high" %in% names(pick(everything()))) add(.data$high) else 0.0
  ) |>
  mutate(color = pmap(list(.data$high, .data$medium, .data$low), custom_gradient_color)) 

aggregated_data
#> # A tibble: 2 × 7
#>   geo   total_mode                                                                                     geometry   low medium  high color    
#> * <chr>      <int>                                                                                <POLYGON [m]> <dbl>  <dbl> <dbl> <list>   
#> 1 AT111          4 ((4821839 2726166, 4809143 2719917, 4806751 2714375, 4803462 2713240, 4802031 2714153, 47...   0.2   0.2   0.6  <chr [1]>
#> 2 AT224          3 ((4786359 2720266, 4779599 2717444, 4776596 2712017, 4783789 2691337, 4783747 2681535, 47...   0.5   0.25  0.25 <chr [1]>

NOTE: The final colour for the NUTS codes are derived using this formula:
AT111: 0.2*green (low) + 0.2*blue (medium) + 0.6*red (high)
AT224: 0.5*green (low) + 0.25*blue (medium) + 0.25*red (high)

4. Plot aggregated data

ggplot() +
  geom_sf(data = aggregated_data, mapping = aes(fill = .data$color)) +
  geom_sf(data = shp_1, fill = NA) +
  coord_sf() +
  theme_tiltplot()

Created on 2025-01-02 with reprex v2.1.1

@AnneSchoenauer
Copy link
Collaborator

Dear @kalashsinghal this sounds perfect°! Well done :)

@kalashsinghal kalashsinghal merged commit d8cb60e into main Jan 3, 2025
6 checks passed
@kalashsinghal kalashsinghal deleted the 141_map_for_all_countries branch January 3, 2025 09:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Create map for all countries

4 participants