ingredient_spellcheck HTML encodes original text incorrectly #1683

@hangy

What

It looks like Robotoff HTML-encodes the original text in the JSON response (or possibly before sending it to the spellcheck API in the first place). HTML encoding should only be done in the Hunger Games frontend.

Steps to reproduce the behavior

  1. Go to https://hunger.openfoodfacts.org/ingredient-spellcheck and randomly land on an insight for a product whose text may contain HTML entities, e.g. 4025500283148
  2. See that the &quot; in the ingredients list is incorrectly encoded
  3. Check the JSON response https://robotoff.openfoodfacts.org/api/v1/insights?lc=de&insight_types=ingredient_spellcheck&annotated=false and see that " is already encoded as &quot; in the JSON:

     {
       "id": "5dccae13-c442-4f29-8a77-7f85a18d3dca",
       "barcode": "4025500283148",
       "type": "ingredient_spellcheck",
       "data": {
         "lang": "de",
         "original": "Buttermilch, Zucker, Wasser, 3% Heidelbeersaft&quot; aus Heidelbeersaftkonzentrat, 1,8% Airbende Pflanzenkonzentrate (Karotte, Aronia), Bananenpüree, Stärke, Zitronensaftkonzentrat, natürliches Aroma (enthält Milch), Laktase. 4,8 % Fruchtanteil im Endprodukt",
         "correction": "Buttermilch, Zucker, Wasser, 3% Heidelbeersaft\" aus Heidelbeersaftkonzentrat, 1,8% Apfelpflanzenkonzentrate (Karotte, Aronia), Bananenpüree, Stärke, Zitronensaftkonzentrat, natürliches Aroma (enthält Milch), Laktase. 4,8 % Fruchtanteil im Endprodukt",
         "lang_confidence": 0.88093853
       },
       "timestamp": "2025-06-11T09:48:32.684779",
       "completed_at": null,
       "annotation": null,
       "annotated_result": null,
       "n_votes": 0,
       "username": null,
       "countries": ["en:germany"],
       "brands": ["xx:muller"],
       "process_after": null,
       "value_tag": "de",
       "value": null,
       "source_image": null,
       "automatic_processing": false,
       "server_type": "off",
       "unique_scans_n": 1,
       "reserved_barcode": false,
       "predictor": "fine-tuned-mistral-7b",
       "predictor_version": "llm-v1-20241227172352",
       "campaign": [],
       "confidence": null,
       "bounding_box": null,
       "lc": ["de"]
     }
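
To find other affected insights, a rough check like the following might help. It is only a sketch: `has_html_entities` and `find_encoded_insights` are hypothetical helper names, the sample records are made up for illustration, and it assumes each insight is a dict with an `id` and a `data.original` field as in the response above. The check is a heuristic, so text that legitimately contains something entity-like would also be flagged.

```python
import html


def has_html_entities(text: str) -> bool:
    """Return True if decoding HTML entities would change the text."""
    return html.unescape(text) != text


def find_encoded_insights(insights: list[dict]) -> list[str]:
    """Collect ids of insights whose data.original looks entity-encoded."""
    return [
        insight["id"]
        for insight in insights
        if has_html_entities(insight.get("data", {}).get("original", ""))
    ]


# Made-up sample records (assumption): one encoded, one raw.
sample = [
    {"id": "a", "data": {"original": "3% Heidelbeersaft&quot; aus Heidelbeersaftkonzentrat"}},
    {"id": "b", "data": {"original": '3% Heidelbeersaft" aus Heidelbeersaftkonzentrat'}},
]

print(find_encoded_insights(sample))
```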

Expected behavior

The product text should only be encoded by the website, making parts of the suggested correction unnecessary.
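
To illustrate why encoding should happen in exactly one place, here is a small Python sketch using the standard `html` module. It is not Robotoff or Hunger Games code; `render_text` is a hypothetical stand-in for whatever the frontend does when it displays text, and the sample string is shortened from the product above.

```python
import html


def render_text(text: str) -> str:
    """Hypothetical stand-in for the frontend escaping text for display."""
    return html.escape(text)


raw = 'Heidelbeersaft" aus'          # what the API would ideally return
already_encoded = html.escape(raw)   # what the API appears to return today

# Escaping once at render time gives the intended markup.
print(render_text(raw))

# Escaping text that is already encoded produces a visible entity.
print(render_text(already_encoded))

# Decoding the stored value before it reaches the renderer restores the raw text.
print(html.unescape(already_encoded) == raw)
```

If the API returned the raw text, the spellcheck model would also see a plain quotation mark and would have no reason to "correct" the entity.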

Screenshots

Image

Platform (Desktop, Mobile, Hunger Games)

  • OS: Windows 10.0.22621.5624, Firefox 140.0.4
  • Platform: Hunger Games
