Build Supply Use Tables by lbm364dl · Pull Request #17 · eduaguilera/whep

lbm364dl · 2025-06-26T13:40:38Z

Putting everything together was tough, and this part of the model will surely need to be revisited later for improvements.
For now I grouped processes into three groups, each with its own logic for where its data comes from. Along the way I also did some indirect work:

- Clean many datasets and standardize naming (_code and _name columns)
- Always use code columns only, and have util functions to get the names back (add_item_cbs_name, add_area_name, add_item_prod_name and their counterparts)
- Create more functions for cleaned datasets coming from @eduaguilera (get_feed_intake, get_primary_production, get_primary_residues). They can be used by anyone who needs to work with those data.
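
The code-only convention can be pictured with a small sketch. It is written in Python with the standard library only, not in the package's R, and it only mimics the idea behind add_item_cbs_name, not its actual signature. The lookup table, its codes and the column name `value` are made-up illustrative assumptions.

```python
# Hypothetical lookup table: codes and names are illustrative only.
ITEM_CBS_NAMES = {1: "Wheat", 2: "Barley"}


def add_item_cbs_name(rows, code_column="item_cbs_code", name_column="item_cbs_name"):
    """Return copies of the rows with a name column looked up from the code column."""
    return [
        {**row, name_column: ITEM_CBS_NAMES.get(row[code_column])}
        for row in rows
    ]


# Data is stored and processed with codes only...
rows = [{"item_cbs_code": 1, "value": 10.0}, {"item_cbs_code": 2, "value": 4.0}]
# ...and names are attached only when a human-readable result is needed.
named = add_item_cbs_name(rows)
```

Keeping names out of the stored data means a renamed item only has to change in one lookup table.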

Lastly, I paste here an explanation, initially written for an external task, of all the steps followed to build the supply and use tables:

The final goal is to get an input-output matrix (a concept from economics, https://en.wikipedia.org/wiki/Input%E2%80%93output_model, a.k.a. technical coefficient matrix) for the commodity balance sheet items.
Why do we first need the Supply-Use Table (SUT) format? Strictly speaking we don't, but once we have it, implementing item allocation to get technical coefficients is easier.
Why do we need to define processes? The definition can be quite arbitrary, but the concept of a multi-output process is important because we have to allocate (share) its inputs among its outputs. NOTE: If we put two seemingly unrelated outputs in the same process, we could get bad results when allocating, but we have to deal with it (see processing items below).
The process name is not that important, since we lose it anyway when converting to the square input-output matrix. We will generate process 'names' on the fly.
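
To make the allocation idea concrete, here is a minimal Python sketch (standard library only, not the package's R code). It assumes mass-based allocation, which is only one possible choice and not necessarily the one this model will use. The function name `technical_coefficients` and all numbers are hypothetical.

```python
def technical_coefficients(use, supply):
    """Share each input among the outputs of one process, per unit of output.

    use: {input_item: quantity}, supply: {output_item: quantity}.
    Returns {(input_item, output_item): input needed per unit of output}.
    """
    total_supply = sum(supply.values())
    coefs = {}
    for out_item, out_qty in supply.items():
        # Assumption: each output receives inputs in proportion to its mass.
        share = out_qty / total_supply
        for in_item, in_qty in use.items():
            coefs[(in_item, out_item)] = in_qty * share / out_qty
    return coefs


# Illustrative numbers only: a crop process that uses its own item as seed
# and supplies the crop plus a straw residue.
coefs = technical_coefficients(
    use={"Wheat": 5.0},
    supply={"Wheat": 80.0, "Straw": 20.0},
)
```

Once every process has been reduced to such (input, output) pairs, the process itself no longer appears in the result, which is why its name can be generated on the fly.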

We will identify a process by a pair (proc_group, proc_cbs_code), where proc_cbs_code is the main item considered in the process (where that makes sense; see the cases below) and proc_group is (for now) one of three values describing the kind of process, which determines how its data is calculated. These are:

crop_production:
- Process ("crop_production", <code_of_main_crop_produced>)
- Use: Seed of main crop (source: commodity balance sheet use data, Edu's CBS.csv)
- Supply:
  - Actual produced crops (source: Edu's Primary_all.csv),
  - Byproduct residues: Straw, Firewood, other... (source: Edu's Crop_NPPr_NoFallow.csv containing crop residues)
husbandry:
- Process ("husbandry", <code_of_animal_involved>)
- Use: Feed intake, crops used as feed (source: Edu's Feed_intake.csv)
- Supply:
  - Actual live animals (in mass) (source: Edu's Primary_all.csv for livestock entries, i.e. those with LU units, then converted to tonnes by 1 LU = 0.65 t)
  - Their animal products (Milk, Offals, Meat, etc) (source: Edu's Primary_all.csv for entries with non-NA Live_anim column)
processing:
- Process ("processing", <code_of_item_for_processing_use>)
- Use: The item used for processing. Note: This is always a single item because of how input quantities are stored. This is not ideal but a good starting point. We get this data from Edu's Processing_coefs.csv. We don't really have well-defined processes, so more than one process could be 'grouped' under the same input, which might give bad results when allocating, but this is how I thought we could start and get something done.
- Supply: The multiple outputs for each input item. Again, keep in mind they might come from different processes, but for now we assume it is one process. This also comes from Processing_coefs.csv.

Since we will try to generate some process groupings automatically, we won't really need the process correspondence tables we were trying to fill (items_supply.csv, items_use.csv). We might have to use a similar one for the crop production processes, since there are some multi-output crops to treat separately.
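
As a rough illustration of the (proc_group, proc_cbs_code) layout described above, the following Python sketch (standard library only, not the package's R code) builds the rows of one husbandry process in long format. The row fields `kind` and `tonnes`, the function `husbandry_rows`, and all codes and quantities are hypothetical. It also assumes the live animal is supplied under the same code that identifies the process.

```python
from typing import NamedTuple

TONNES_PER_LU = 0.65  # conversion stated in the description: 1 LU = 0.65 t


class SutRow(NamedTuple):
    proc_group: str     # "crop_production", "husbandry" or "processing"
    proc_cbs_code: int  # main item of the process
    item_cbs_code: int  # item being used or supplied
    kind: str           # "use" or "supply" (hypothetical column name)
    tonnes: float


def husbandry_rows(animal_code, livestock_units, feed, products):
    """Build the use and supply rows of one ("husbandry", animal_code) process."""
    key = ("husbandry", animal_code)
    rows = [SutRow(*key, item, "use", qty) for item, qty in feed.items()]
    # Assumption: the live animal is supplied under the same code as the process.
    rows.append(SutRow(*key, animal_code, "supply", livestock_units * TONNES_PER_LU))
    rows += [SutRow(*key, item, "supply", qty) for item, qty in products.items()]
    return rows


# Made-up codes and quantities: 10 LU of an animal eating one feed item
# and supplying one animal product.
rows = husbandry_rows(animal_code=100, livestock_units=10.0,
                      feed={1: 30.0}, products={200: 12.0})
```

The other two groups would produce rows of the same shape, so a single table can hold all processes and be filtered by proc_group.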

This should cover most of the relations between items. There might be others that don't fit in here, and we should figure those out later on. Feedback is welcome.


eduaguilera

Very nice contribution. Overall, the only thing I miss is a guideline explaining the general purpose of all scripts, their inputs and outputs, and in what order to run them (or maybe I missed that?)

lbm364dl · 2025-06-30T16:10:28Z

> Very nice contribution. Overall, the only thing I miss is a guideline explaining the general purpose of all scripts, their inputs and outputs, and in what order to run them (or maybe I missed that?)

I would expect more general functions will be created later on when we have all the steps implemented. That being said, for a more detailed explanation I would create an R Markdown article.


eduaguilera and others added 30 commits May 22, 2025 12:45

update items_use.csv with own items and processes

fef4f2b

add devtools::load_all() to Rprfile to automatically add project functions at start

3dfea0d

Rebuild man

75d51b2

Load functions only if devtools package installed

9ddbb80

Add missing newline to pass linter

8f09bab

Merge branch 'main' into edu/add-feed-use

f5ee28b

Use commas

5410e4d

Fix multiproduct crops

94c2158

Fix use process namings for multicrops

acfd738

Add product residue supply processes

4487f02

Add product crop supply processes

c192a21

Rename item to item_cbs

4a3eda8

Rename add_item_ functions to add_item_cbs_

ac2c669

Create conversion functions for production codes

c032f9b

Add missing livestock cbs items

fe92684

Add get_production functions

00924be

Use correct crop to process column

4603bf7

Adapt crop residue supply to use of codes

168fa4c

Use codes only in get_processing_coefs

83d55a6

Add missing processed cbs items correspondence

5fc27dc

Update docs

32979a1

Use codes only in get_wide_cbs

49bde6c

Use codes only in get_bilateral_trade

2c903e6

Write _name explicitly in natural language name columns

0fbe2ec

Add missing trade cbs items correspondence

36758f7

Fix tests

c66f49b

Add missing used items in processes

799a1b4

Add more missing cbs correspondences

ec849a3

Add fallow to CBS items

375d31f

Add get_feed_intake

70c5cf8

lbm364dl added 10 commits June 24, 2025 09:51

Merge branch 'edu/add-feed-use' of github.com:eduaguilera/WHEP into edu/add-feed-use

8af80e3

Build husbandry and processing supply use tables

7601161

Complete and clean supply_use

c75391d

Add list of items to use for crop production data

e452557

Remove processes

a44a8a4

State husbandry items clearly

dfadb85

Add tests for supply use

85c3756

Supress global variable warnings

0af8e0a

Update and improve build_supply_use docs

dfddb66

Fix lint

3b1ea07

lbm364dl requested a review from eduaguilera June 26, 2025 13:58