feat: Unit standardization / conversion #7121

Open
michael-genson wants to merge 33 commits into mealie-next from feat/standardize-units

Conversation

@michael-genson
Collaborator

What this PR does / why we need it:

Enables standardization of units, allowing for things such as automatic unit conversion. This PR specifically implements:

  • Two new unit fields: standard quantity & standard unit
  • A set of unit conversion tools leveraging these fields and the pint library (which is already used by our NLP library)
  • Automatic standardization of new/existing units (where possible)
  • Automatic merging of shopping list items for compatible units

To the user, the only useful new feature in this PR is the automatic merging of shopping list items with compatible units:
[image: 2026-02-21_16h30_44]

Going forward, unit standardization will enable several features, such as:

  • Automatic conversion of recipe ingredients on the fly (we can add an "Imperial" vs "Metric" toggle)
  • Unit parsing post-processing (e.g. define "pint" but always convert into "cup" or "liter")
  • Automatic nutrition calculation (not fully enabled by this PR, but is a required step)

How is the data stored?

At the data layer standard units are just strings. The understanding is that anything stored as a "standard unit" is something understood by pint, but we don't actually validate this (the unit conversion processing has error handling if we get bad data, so it doesn't matter). On the frontend we have an arbitrary hardcoded list:

  • fluid ounce
  • cup
  • ounce
  • pound
  • milliliter
  • liter
  • gram
  • kilogram

Pint supports hundreds of units so this is obviously not comprehensive, but I think this covers everything a Mealie user would need (includes a few different volume and mass units, in both imperial and metric systems). We could always add/remove options, too, since this list is not stored in the db.

How is this data used?

The new UnitConverter class handles conversions between units. It's a wrapper around pint and enables parsing units as strings or pint.Unit objects, as well as associated quantities. When we receive a Mealie unit we tell pint what this unit means:

uc.ureg.define(f"mealie-unit-1 = {unit_1.standard_quantity} * {unit_1.standard_unit}")
uc.ureg.define(f"mealie-unit-2 = {unit_2.standard_quantity} * {unit_2.standard_unit}")

...and now pint understands what mealie-unit-1 and mealie-unit-2 are. This is done at runtime so we don't have to manage any external definitions file.
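
As a rough illustration of what such a runtime definition amounts to, the sketch below resolves a Mealie unit to a base amount using a small hand-written factor table. It deliberately does not use pint: `BASE_FACTORS`, `MealieUnit`, and `to_base` are hypothetical names, and the factors are US customary values chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical stand-in for pint's registry: (dimension, factor to base unit).
# Volume base is milliliter, mass base is gram (US customary values).
BASE_FACTORS: dict[str, tuple[str, float]] = {
    "milliliter": ("volume", 1.0),
    "liter": ("volume", 1000.0),
    "fluid_ounce": ("volume", 29.5735295625),
    "cup": ("volume", 236.5882365),
    "gram": ("mass", 1.0),
    "kilogram": ("mass", 1000.0),
    "ounce": ("mass", 28.349523125),
    "pound": ("mass", 453.59237),
}


@dataclass
class MealieUnit:
    """Hypothetical minimal unit: a name plus its standard definition."""

    name: str
    standard_quantity: float
    standard_unit: str


def to_base(qty: float, unit: MealieUnit) -> tuple[str, float]:
    """Resolve `qty` of a Mealie unit to (dimension, amount in the base unit)."""
    dimension, factor = BASE_FACTORS[unit.standard_unit]
    return dimension, qty * unit.standard_quantity * factor


# "tablespoon = 1/2 * fluid_ounce" and "pint = 2 * cup", as in the auto-standardization
tablespoon = MealieUnit("tablespoon", 1 / 2, "fluid_ounce")
pint_unit = MealieUnit("pint", 2, "cup")

print(to_base(4, tablespoon))  # 4 tablespoons, expressed in milliliters
print(to_base(1, pint_unit))   # 1 pint, expressed in milliliters
```

Once both units resolve to the same dimension, converting or adding them is just arithmetic on the base amounts, which is the part pint handles in the PR.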

How is this data managed?

Since units, unlike foods, are pretty universally standardized in the context of recipes, I've implemented several automatic standardization methods:

  1. When a unit is created (either manually or via seeding) and standardization data is not provided, if it matches one of several hardcoded units (teaspoon, cup, liter, pound, kilogram, etc.) we automatically add the standard definition to it. This hardcoded mapping is locale-aware and leverages our seed data translations (e.g. "taza", which is "cup" in Spanish, will be standardized if the user has their language set to Spanish or seeds Spanish data).
  2. Using similar logic, all existing units will be standardized at migration time. Unlike the locale-aware logic, the migration has no idea what language your units are in, so we check all supported languages (which is kind of slow, but it's a migration so it doesn't matter).

So existing users don't have to do anything to standardize their data for the most part, unless they have unexpected units they want to standardize. Similarly, new users using seed data also don't have to do this manually.

For users with unexpected units, or users who want to tweak things for some reason, these fields are added to the data management page:
[image]

The second field is a dropdown of our arbitrary known units:
[image]

Which issue(s) this PR fixes:

Several discussions either ask for or otherwise depend on this:

Special notes for your reviewer:

I added some gross-looking special handling for ounce. Ounce is a weird one because recipes (at least here in the US) often use "ounce" for both mass and volume (e.g. "1 cup = 8 ounces"). This is technically incorrect because an ounce is a unit of mass; the volume unit is the "fluid ounce" (e.g. "1 cup = 8 fluid ounces"). The special handling looks at the two units it wants to combine, and if one is literally "ounce" and the other is a volume unit, it assumes we actually meant "fluid ounce".

Users can, of course, use "ounce" and "fluid ounce" correctly by defining both and diligently updating their recipes to use the correct unit... but I'm certainly not going to do that (and I'm sure most users agree). AFAIK there are no other cases where we want this kind of logic other than this exact use case.
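
The rule can be sketched without pint as follows. This is only an illustration of the decision logic: `DIMENSIONS` is a hypothetical lookup table (the PR reads the dimension from pint's `dimensionality` instead), and `resolve_ounce` here works on plain strings rather than `pint.Unit` objects.

```python
# Hypothetical dimension lookup; the PR asks pint for this instead.
DIMENSIONS = {
    "ounce": "mass",
    "pound": "mass",
    "gram": "mass",
    "kilogram": "mass",
    "fluid_ounce": "volume",
    "cup": "volume",
    "milliliter": "volume",
    "liter": "volume",
}


def resolve_ounce(unit_1: str, unit_2: str) -> tuple[str, str]:
    """Treat a bare "ounce" as "fluid_ounce" when the other unit is a volume."""
    if unit_1 == "ounce" and DIMENSIONS.get(unit_2) == "volume":
        return "fluid_ounce", unit_2
    if unit_2 == "ounce" and DIMENSIONS.get(unit_1) == "volume":
        return unit_1, "fluid_ounce"
    # Mass with mass, or no bare ounce involved: leave both untouched
    return unit_1, unit_2


print(resolve_ounce("ounce", "cup"))    # ('fluid_ounce', 'cup')
print(resolve_ounce("liter", "ounce"))  # ('liter', 'fluid_ounce')
print(resolve_ounce("ounce", "pound"))  # ('ounce', 'pound')
```

Note that two bare ounces are left as mass, so the substitution only kicks in when the pairing would otherwise be incompatible.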

Testing

Comprehensive backend tests. I also manually tested a few edge cases around the data management page.

Comment on lines +62 to +70
export type StandardizedUnitType
  = | "fluid_ounce"
    | "cup"
    | "ounce"
    | "pound"
    | "milliliter"
    | "liter"
    | "gram"
    | "kilogram";
Collaborator Author

Our arbitrary list of unit types. Pint understands all of these.
There's no difference in defining a unit as 1 * liter vs 1000 * milliliter other than user preference.

Comment on lines +42 to +82
def populate_standards() -> None:
    bind = op.get_bind()

    session = orm.Session(bind)

    # We aren't using most of the functionality of this class, so we pass dummy args
    repo = RepositoryUnit(None, None, None, None, group_id=None)  # type: ignore

    stmt = sa.select(IngredientUnitModel)
    units = session.execute(stmt).scalars().all()
    if not units:
        return

    # Manually build repo._standardized_unit_map with all locales
    repo._standardized_unit_map = {}
    for locale in LOCALE_CONFIG:
        locale_file = IngredientUnitsSeeder.get_file(locale)
        for unit_key, unit in IngredientUnitsSeeder.load_file(locale_file).items():
            for prop in ["name", "plural_name", "abbreviation"]:
                val = unit.get(prop)
                if val and isinstance(val, str):
                    repo._standardized_unit_map[val.strip().lower()] = unit_key

    for unit in units:
        unit_data = {
            "name": unit.name,
            "plural_name": unit.plural_name,
            "abbreviation": unit.abbreviation,
            "plural_abbreviation": unit.plural_abbreviation,
        }

        standardized_data = repo._add_standardized_unit(unit_data)
        std_q = standardized_data.get("standard_quantity")
        std_u = standardized_data.get("standard_unit")
        if std_q and std_u:
            logger.info(f"Found unit '{unit.name}', which is standardized as '{std_q} * {std_u}'")
            unit.standard_quantity = std_q
            unit.standard_unit = std_u

    session.commit()
    session.close()
Collaborator Author

Check all units against all languages and automatically standardize known ones. This assumes there isn't an identical string between languages with different definitions. I think this is a fair assumption (and users can manually fix if needed).
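
One way to check that assumption is to build the combined map while recording every string that resolves to more than one unit key. The sketch below is not part of the PR: `SEED_DATA` is a trimmed, hypothetical stand-in for the seed files, the `zz-ZZ` locale is invented to force a clash, and `find_collisions` is an illustrative helper.

```python
from collections import defaultdict

# Hypothetical, heavily trimmed seed data: locale -> unit key -> fields.
SEED_DATA = {
    "en-US": {
        "cup": {"name": "cup", "plural_name": "cups", "abbreviation": "c"},
        "pint": {"name": "pint", "plural_name": "pints", "abbreviation": "pt"},
    },
    "es-ES": {
        "cup": {"name": "taza", "plural_name": "tazas", "abbreviation": ""},
        "pint": {"name": "pinta", "plural_name": "pintas", "abbreviation": "pt"},
    },
    # Invented locale whose teaspoon abbreviation clashes with the English cup
    "zz-ZZ": {
        "teaspoon": {"name": "zzspoon", "plural_name": "zzspoons", "abbreviation": "C "},
    },
}


def find_collisions(seed_data: dict) -> dict[str, set[str]]:
    """Return normalized strings that map to more than one unit key."""
    seen: dict[str, set[str]] = defaultdict(set)
    for units in seed_data.values():
        for unit_key, unit in units.items():
            for prop in ("name", "plural_name", "abbreviation"):
                val = unit.get(prop)
                if val and isinstance(val, str):
                    # Same normalization as the migration: strip + lowercase
                    seen[val.strip().lower()].add(unit_key)
    return {text: keys for text, keys in seen.items() if len(keys) > 1}


print(find_collisions(SEED_DATA))
```

In the migration as written, a clash like this is resolved silently: whichever locale is processed last wins, since the map is a plain dict keyed on the normalized string.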

Comment on lines +16 to +38
@property
def standardized_unit_map(self) -> dict[str, str]:
    """A map of potential known units to its standardized name in our seed data"""

    if self._standardized_unit_map is None:
        from .seed.seeders import IngredientUnitsSeeder

        ctx = get_locale_context()
        if ctx:
            locale = ctx[1].key
        else:
            locale = None

        self._standardized_unit_map = {}
        locale_file = IngredientUnitsSeeder.get_file(locale=locale)
        for unit_key, unit in IngredientUnitsSeeder.load_file(locale_file).items():
            for prop in ["name", "plural_name", "abbreviation"]:
                val = unit.get(prop)
                if val and isinstance(val, str):
                    self._standardized_unit_map[val.strip().lower()] = unit_key

    return self._standardized_unit_map

Collaborator Author

For a given locale, check values against "name", "plural_name" and "abbreviation" from our seed data (our seed data doesn't define a plural abbreviation for some reason).

Comment on lines +61 to +105
match standardized_unit_key:
    case "teaspoon":
        data["standard_quantity"] = 1 / 6
        data["standard_unit"] = StandardizedUnitType.FLUID_OUNCE
    case "tablespoon":
        data["standard_quantity"] = 1 / 2
        data["standard_unit"] = StandardizedUnitType.FLUID_OUNCE
    case "cup":
        data["standard_quantity"] = 1
        data["standard_unit"] = StandardizedUnitType.CUP
    case "fluid-ounce":
        data["standard_quantity"] = 1
        data["standard_unit"] = StandardizedUnitType.FLUID_OUNCE
    case "pint":
        data["standard_quantity"] = 2
        data["standard_unit"] = StandardizedUnitType.CUP
    case "quart":
        data["standard_quantity"] = 4
        data["standard_unit"] = StandardizedUnitType.CUP
    case "gallon":
        data["standard_quantity"] = 16
        data["standard_unit"] = StandardizedUnitType.CUP
    case "milliliter":
        data["standard_quantity"] = 1
        data["standard_unit"] = StandardizedUnitType.MILLILITER
    case "liter":
        data["standard_quantity"] = 1
        data["standard_unit"] = StandardizedUnitType.LITER
    case "pound":
        data["standard_quantity"] = 1
        data["standard_unit"] = StandardizedUnitType.POUND
    case "ounce":
        data["standard_quantity"] = 1
        data["standard_unit"] = StandardizedUnitType.OUNCE
    case "gram":
        data["standard_quantity"] = 1
        data["standard_unit"] = StandardizedUnitType.GRAM
    case "kilogram":
        data["standard_quantity"] = 1
        data["standard_unit"] = StandardizedUnitType.KILOGRAM
    case "milligram":
        data["standard_quantity"] = 1 / 1000
        data["standard_unit"] = StandardizedUnitType.GRAM
    case _:
        continue
Collaborator Author

Big hardcoded standardization logic. These are ultimately the units we automatically standardize.

Comment on lines +158 to +167
@model_validator(mode="after")
def validate_standardization_fields(self):
    # If one is set, the other must be set.
    # If quantity is <= 0, it's considered not set.
    if not self.standard_unit:
        self.standard_quantity = self.standard_unit = None
    elif not ((self.standard_quantity or 0) > 0):
        self.standard_quantity = self.standard_unit = None

    return self
Collaborator Author

A standard quantity without a unit is useless, and vice versa, so we drop partial definitions.

Comment on lines 57 to 97
@@ -69,7 +81,20 @@ def merge_items(
         Attributes of the `to_item` take priority over the `from_item`, except extras with overlapping keys
         """
 
-        to_item.quantity += from_item.quantity
+        to_item_unit = to_item.unit or self.data_matcher.units_by_id.get(to_item.unit_id)
+        from_item_unit = from_item.unit or self.data_matcher.units_by_id.get(from_item.unit_id)
+        if to_item_unit and to_item_unit.standard_unit and from_item_unit and from_item_unit.standard_unit:
+            merged_qty, merged_unit = merge_quantity_and_unit(
+                from_item.quantity or 0, from_item_unit, to_item.quantity or 0, to_item_unit
+            )
+            to_item.quantity = merged_qty
+            to_item.unit_id = merged_unit.id
+            to_item.unit = merged_unit
+
+        else:
+            # No conversion needed, just sum the quantities
+            to_item.quantity += from_item.quantity
+
Collaborator Author

The only user-facing part of this PR aside from data management: merging shopping list items.

Comment on lines +1 to +146
from typing import TYPE_CHECKING, Literal, overload

from pint import Quantity, Unit, UnitRegistry

if TYPE_CHECKING:
    from mealie.schema.recipe.recipe_ingredient import CreateIngredientUnit


class UnitNotFound(Exception):
    """Raised when trying to access a unit not found in the unit registry."""

    def __init__(self, message: str = "Unit not found in unit registry"):
        self.message = message
        super().__init__(self.message)

    def __str__(self):
        return f"{self.message}"


class UnitConverter:
    def __init__(self):
        self.ureg = UnitRegistry()

    def _resolve_ounce(self, unit_1: Unit, unit_2: Unit) -> tuple[Unit, Unit]:
        """
        Often times "ounce" is used in place of "fluid ounce" in recipes.
        When trying to convert/combine ounces with a volume, we can assume it should have been a fluid ounce.
        This function will convert ounces to fluid ounces if the other unit is a volume.
        """

        OUNCE = self.ureg("ounce")
        FL_OUNCE = self.ureg("fluid_ounce")
        VOLUME = "[length] ** 3"

        if unit_1 == OUNCE and unit_2.dimensionality == VOLUME:
            return FL_OUNCE, unit_2
        if unit_2 == OUNCE and unit_1.dimensionality == VOLUME:
            return unit_1, FL_OUNCE

        return unit_1, unit_2

    @overload
    def parse(self, unit: str | Unit, strict: Literal[False] = False) -> str | Unit: ...

    @overload
    def parse(self, unit: str | Unit, strict: Literal[True]) -> Unit: ...

    def parse(self, unit: str | Unit, strict: bool = False) -> str | Unit:
        """
        Parse a string unit into a pint.Unit.

        If strict is False (default), returns a pint.Unit if it exists, otherwise returns the original string.
        If strict is True, raises UnitNotFound instead of returning a string.
        If the input is already a parsed pint.Unit, returns it as-is.
        """
        if isinstance(unit, Unit):
            return unit

        try:
            return self.ureg(unit).units
        except Exception as e:
            if strict:
                raise UnitNotFound(f"Unit '{unit}' not found in unit registry") from e
            return unit

    def can_convert(self, unit: str | Unit, to_unit: str | Unit) -> bool:
        """Whether or not a given unit can be converted into another unit."""

        unit = self.parse(unit)
        to_unit = self.parse(to_unit)

        if not (isinstance(unit, Unit) and isinstance(to_unit, Unit)):
            return False

        unit, to_unit = self._resolve_ounce(unit, to_unit)
        return unit.is_compatible_with(to_unit)

    def convert(self, quantity: float, unit: str | Unit, to_unit: str | Unit) -> tuple[float, Unit]:
        """
        Convert a quantity and a unit into another unit.

        Returns tuple[quantity, unit]
        """

        unit = self.parse(unit, strict=True)
        to_unit = self.parse(to_unit, strict=True)
        unit, to_unit = self._resolve_ounce(unit, to_unit)

        qty = quantity * unit
        converted = qty.to(to_unit)
        return float(converted.magnitude), converted.units

    def merge(self, quantity_1: float, unit_1: str | Unit, quantity_2: float, unit_2: str | Unit) -> tuple[float, Unit]:
        """Merge two quantities together"""

        unit_1 = self.parse(unit_1, strict=True)
        unit_2 = self.parse(unit_2, strict=True)
        unit_1, unit_2 = self._resolve_ounce(unit_1, unit_2)

        q1 = quantity_1 * unit_1
        q2 = quantity_2 * unit_2

        out: Quantity = q1 + q2
        return float(out.magnitude), out.units


def merge_quantity_and_unit[T: CreateIngredientUnit](
    qty_1: float, unit_1: T, qty_2: float, unit_2: T
) -> tuple[float, T]:
    """
    Merge a quantity and unit.

    Returns tuple[quantity, unit]
    """

    if not (unit_1.standard_quantity and unit_1.standard_unit and unit_2.standard_quantity and unit_2.standard_unit):
        raise ValueError("Both units must contain standardized unit data")

    PINT_UNIT_1_TXT = "_mealie_unit_1"
    PINT_UNIT_2_TXT = "_mealie_unit_2"

    uc = UnitConverter()

    # pre-process units to account for ounce -> fluid_ounce conversion
    unit_1_standard = uc.parse(unit_1.standard_unit, strict=True)
    unit_2_standard = uc.parse(unit_2.standard_unit, strict=True)
    unit_1_standard, unit_2_standard = uc._resolve_ounce(unit_1_standard, unit_2_standard)

    # create custom unit definitions so pint can handle them natively
    uc.ureg.define(f"{PINT_UNIT_1_TXT} = {unit_1.standard_quantity} * {unit_1_standard}")
    uc.ureg.define(f"{PINT_UNIT_2_TXT} = {unit_2.standard_quantity} * {unit_2_standard}")

    pint_unit_1 = uc.parse(PINT_UNIT_1_TXT)
    pint_unit_2 = uc.parse(PINT_UNIT_2_TXT)

    merged_q, merged_u = uc.merge(qty_1, pint_unit_1, qty_2, pint_unit_2)

    # Convert to the bigger unit if quantity >= 1, else the smaller unit
    merged_q, merged_u = uc.convert(merged_q, merged_u, max(pint_unit_1, pint_unit_2))
    if abs(merged_q) < 1:
        merged_q, merged_u = uc.convert(merged_q, merged_u, min(pint_unit_1, pint_unit_2))

    if str(merged_u) == PINT_UNIT_1_TXT:
        return merged_q, unit_1
    else:
        return merged_q, unit_2
Collaborator Author

The meat of the PR, which handles all the actual unit conversions
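
The "bigger unit unless the result drops below 1" rule at the end of `merge_quantity_and_unit` is easy to miss, so here is a worked sketch of just that step. It avoids pint entirely: each unit is represented only by its size in a shared base unit (milliliters here), and `merge_in_preferred_unit`, `CUP_ML`, and `TBSP_ML` are hypothetical names for this example.

```python
def merge_in_preferred_unit(
    qty_1: float, factor_1: float, qty_2: float, factor_2: float
) -> tuple[float, int]:
    """
    Add two quantities and pick the unit to report the sum in.

    `factor_n` is the size of unit n in a shared base unit.
    Returns (merged quantity, index of the chosen unit: 1 or 2).
    """
    factors = {1: factor_1, 2: factor_2}
    total_base = qty_1 * factor_1 + qty_2 * factor_2

    # Prefer the bigger unit (unit 1 on a tie)
    big, small = (1, 2) if factor_1 >= factor_2 else (2, 1)
    merged = total_base / factors[big]

    # Fall back to the smaller unit when the result would be a fraction below 1
    if abs(merged) < 1:
        return total_base / factors[small], small
    return merged, big


CUP_ML = 236.5882365             # 1 US cup
TBSP_ML = 0.5 * 29.5735295625    # 1 tablespoon = 1/2 US fluid ounce

print(merge_in_preferred_unit(1, CUP_ML, 8, TBSP_ML))     # reported in cups
print(merge_in_preferred_unit(0.25, CUP_ML, 2, TBSP_ML))  # reported in tablespoons
```

With these inputs, 1 cup plus 8 tablespoons comes out as about 1.5 cups, while 1/4 cup plus 2 tablespoons is reported as about 6 tablespoons rather than 0.375 cups.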

            return unit_1, FL_OUNCE

        return unit_1, unit_2

Collaborator Author

The magic "ounce -> fluid ounce" converter (more info in the PR body)

@michael-genson
Collaborator Author

Switched to the new auto form layout. It doesn't look as fancy, but we can pretty it up in another PR (will need to overhaul the auto form to allow side-by-side fields).
[image]

@Kuchenpirat
Collaborator

Kuchenpirat commented Feb 24, 2026

will need to overhaul the auto form to allow side-by-side fields

YES, we absolutely do. Also, it's wasting huge amounts of vertical space because it adds space for potential hints, labels, etc. that are usually hidden.

@michael-genson
Collaborator Author

Yeah something as simple as putting it in a grid and specifying the cols (defaulting to 12) would probably work quite well. Definitely something I can take up in the future
