Percentage estimation of ingredients #107

@laurenskz

Problem

The current estimation of ingredient percentages is a good step in the right direction, but there is a lot of room for improvement. For example, suppose a product lists sugar as its second ingredient and contains 14 g of total carbohydrates per 100 g. Since sugar is almost entirely carbohydrate, the second ingredient can make up at most about 14 g/100 g. Because ingredients are listed in descending order, this also caps every later ingredient at 14% and, in a two-ingredient product, means the first ingredient is at least 86%. This information is currently overlooked, and we make arbitrary assumptions about ratios instead.
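The bound in this example can be worked out mechanically. Below is a minimal sketch, assuming nutrient values are stored as per-100 g dictionaries; the function name `upper_bound_from_nutrients` and the sugar values are illustrative assumptions, not existing project code or database entries.

```python
def upper_bound_from_nutrients(ingredient_per_100g, product_per_100g):
    """Largest fraction of the product an ingredient can be, given product totals.

    Every ingredient contributes a non-negative amount of each nutrient, so
    fraction * ingredient_value <= product_value must hold for each nutrient.
    """
    bound = 1.0
    for nutrient, product_value in product_per_100g.items():
        ingredient_value = ingredient_per_100g.get(nutrient, 0.0)
        if ingredient_value > 0:
            bound = min(bound, product_value / ingredient_value)
    return bound


# Assumed illustrative values: sugar treated as 100 g carbohydrates per 100 g.
sugar = {"carbohydrates": 100.0}
product = {"carbohydrates": 14.0}

sugar_max = upper_bound_from_nutrients(sugar, product)
print(f"sugar is at most {sugar_max:.0%} of the product")
```

Running the same function over every ingredient and every labeled nutrient gives a cheap set of upper bounds before any optimization is attempted.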

Proposed solution

A published paper uses all available information about mandatory nutrients, combined with linear optimization techniques: https://www.sciencedirect.com/science/article/pii/S0889157522001260.

The general idea is as follows. We have n ingredients and a set of known nutrients (mandatory labeling), indexed by k.
For each ingredient_j (1 <= j <= n):
Fetch the nutrition info of ingredient_j (since it is a simple ingredient, there should be an entry in the db). Let nutrient_j_k be the value of nutrient k per 100g of ingredient j.
Then we declare:
quantity_j as a double with range (0,1)
if j > 1, add the constraint: quantity_j <= quantity_{j-1}
if the ingredient has a known percentage, add a constraint for that.
Now to the smart part:

Let total_nutrient_k be the value of nutrient k per 100g of the finished product. Then for each k that is known we add the constraints:

quantity_1 * nutrient_1_k + quantity_2 * nutrient_2_k + ... + quantity_n * nutrient_n_k > 0.99 * total_nutrient_k
quantity_1 * nutrient_1_k + quantity_2 * nutrient_2_k + ... + quantity_n * nutrient_n_k < 1.01 * total_nutrient_k

Add the following constraints:

sum_j quantity_j > 0.99
sum_j quantity_j < 1.01

Now we have a system of linear constraints that can easily be solved by an LP solver. The resulting quantities will sum to approximately one (within the 1% tolerance), and the solution should use all available information.
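To make the constraint set concrete, here is a small pure-Python checker that tests whether a candidate list of quantities satisfies every constraint listed above. It is a sketch under stated assumptions: the function name `satisfies_constraints`, the dictionary layout, and the nutrient values in the example are hypothetical, and known-percentage constraints are left out for brevity.

```python
def satisfies_constraints(quantities, nutrients, totals, tol=0.01):
    """Check a candidate solution against the constraints described above.

    quantities: fractions in label order (first listed ingredient first)
    nutrients:  nutrients[j][k] = value of nutrient k per 100 g of ingredient j
    totals:     totals[k] = value of nutrient k per 100 g of the product
    """
    # Each quantity lies strictly between 0 and 1.
    if any(not 0 < q < 1 for q in quantities):
        return False
    # Label order: no ingredient exceeds the one listed before it.
    if any(later > earlier for earlier, later in zip(quantities, quantities[1:])):
        return False
    # Quantities sum to one within the tolerance.
    if not (1 - tol) < sum(quantities) < (1 + tol):
        return False
    # The weighted nutrient mix matches each labeled total within the tolerance.
    for k, total in totals.items():
        mix = sum(q * n.get(k, 0.0) for q, n in zip(quantities, nutrients))
        if not (1 - tol) * total < mix < (1 + tol) * total:
            return False
    return True


# Assumed illustrative data: a low-carbohydrate base ingredient followed by sugar.
example_nutrients = [{"carbohydrates": 4.0}, {"carbohydrates": 100.0}]
example_totals = {"carbohydrates": 13.6}

print(satisfies_constraints([0.9, 0.1], example_nutrients, example_totals))
print(satisfies_constraints([0.5, 0.5], example_nutrients, example_totals))
```

A checker like this is also handy for unit-testing whatever solver ends up producing the quantities.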

Additional context

The authors note the following: A study with known ingredient compositions shows that estimates are within a 0.9% difference of products’ actual recipes. This would be a huge improvement.

Code pointers

This could very easily be implemented in Python using the OR-Tools library: https://developers.google.com/optimization/lp/lp_example#python_7 .

Number of products impacted

All products composed of multiple ingredients

Time per product

More accuracy for end user
