feat: new openfoodfacts configuration #301

Open
alexgarel wants to merge 13 commits into main from feat-new-openfoodfacts-config
Conversation

@alexgarel
Member

  • index all the fields we want to be able to use
  • update the API call that fetches a document to use the latest version

Comment thread app/openfoodfacts.py
Comment thread app/openfoodfacts.py Outdated
Comment thread app/openfoodfacts.py Outdated
Comment thread app/openfoodfacts.py
for nova_group, markers in nova_groups_markers.items()
]

def transform_images(self, document: JSONType):
Contributor

A docstring summarizing the transformations would be nice.

Contributor

Maybe we should change the schema in Product Opener instead of here, and use a more descriptive format (e.g. as you did, using hash keys to say what each value is, instead of arrays whose meaning depends on the position in the array...)
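
To make the suggestion concrete, here is a minimal sketch of what a more descriptive format could look like. It assumes `nova_groups_markers` maps each NOVA group to a list of positional `[type, value]` pairs; the function name `describe_nova_markers` and the output keys `nova_group`, `markers`, `type` and `value` are hypothetical, not the schema used in this PR or in Product Opener.

```python
from typing import Any

JSONType = dict[str, Any]


def describe_nova_markers(
    nova_groups_markers: dict[str, list[list[str]]],
) -> list[JSONType]:
    """Turn positional [type, value] pairs into self-describing dicts.

    The input shape and the output key names are assumptions made
    for illustration only.
    """
    return [
        {
            "nova_group": nova_group,
            "markers": [
                {"type": marker_type, "value": marker_value}
                for marker_type, marker_value in markers
            ],
        }
        for nova_group, markers in nova_groups_markers.items()
    ]


# Hypothetical sample data
example = {"4": [["additives", "en:e330"], ["ingredients", "en:glucose"]]}
result = describe_nova_markers(example)
```

With named keys, a reader no longer needs to know that position 0 is the marker type and position 1 is its value.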

Member Author

@stephanegigandet we want to deploy quickly.

I think transforming nova_groups_markers in the API is a good idea, and we can drop this code after that.

But for images, I think using the image id as hash key is OK for the API (and easy to manipulate in JS). It's just that for ES we want a list of dicts, which has more to do with the way ES works. So it's OK to transform the data in this case.
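
As a rough illustration of that transformation, the sketch below flattens an id-keyed mapping into a list of dicts, so that field names stay fixed instead of growing with each new image id. The function name `images_to_list`, the `id` key and the sample fields are assumptions for the example, not the actual body of `transform_images`.

```python
from typing import Any


def images_to_list(images: dict[str, dict[str, Any]]) -> list[dict[str, Any]]:
    """Flatten {image_id: data} into [{"id": image_id, **data}, ...].

    The "id" key name is an assumption made for illustration.
    """
    return [{"id": image_id, **data} for image_id, data in images.items()]


# Hypothetical sample data
images = {
    "1": {"uploader": "alice"},
    "front_en": {"imgid": "1", "rev": "4"},
}
es_images = images_to_list(images)
```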

And this leaves a question for you, @raphael0202 and @stephanegigandet: should we transform the data back to the hashmap style when we send back results?

Contributor

> And this leaves a question for you, @raphael0202 and @stephanegigandet: should we transform the data back to the hashmap style when we send back results?

I think we should stay close to the OFF API, so I would prefer transforming it back to the format returned by Product Opener in v3.4.
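
Transforming back could be as small as the following sketch. It assumes each indexed image is a dict carrying its identifier under an `id` key; that key name and the function name `images_to_mapping` are hypothetical, and the exact v3.4 response shape is not reproduced here.

```python
from typing import Any


def images_to_mapping(es_images: list[dict[str, Any]]) -> dict[str, dict[str, Any]]:
    """Rebuild {image_id: data} from [{"id": image_id, **data}, ...].

    The "id" key name is an assumption made for illustration.
    """
    return {
        image["id"]: {key: value for key, value in image.items() if key != "id"}
        for image in es_images
    }


# Hypothetical sample data, as it might come back from the index
es_images = [
    {"id": "1", "uploader": "alice"},
    {"id": "front_en", "imgid": "1", "rev": "4"},
]
api_images = images_to_mapping(es_images)
```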

Comment thread data/config/openfoodfacts.yml
Comment thread data/config/openfoodfacts.yml
Comment thread data/config/openfoodfacts.yml Outdated
Comment thread data/config/openfoodfacts.yml Outdated
Comment thread data/config/openfoodfacts.yml Outdated
Comment thread data/config/openfoodfacts.yml
Comment thread data/config/openfoodfacts.yml
Comment thread data/config/openfoodfacts.yml Outdated
Comment thread data/config/openfoodfacts.yml Outdated
Contributor

raphael0202 left a comment

A few remarks, but otherwise looks good to me!

github-project-automation bot moved this from Backlog (ready for dev) to In Progress in 🔎 Search-a-licious on Jun 20, 2025
alexgarel changed the title from "feat: wip on new openfoodfacts configuration" to "feat: new openfoodfacts configuration" on Jun 20, 2025
Comment thread data/config/openfoodfacts.yml
Comment thread data/config/openfoodfacts.yml Outdated
Comment thread data/config/openfoodfacts.yml
Comment thread data/config/openfoodfacts.yml
to be able to handle them correctly (with files ES does not start
github-actions bot added the 🐳 docker and Taxonomies labels on Nov 6, 2025
alexgarel marked this pull request as ready for review on November 6, 2025 14:16
@alexgarel
Member Author

alexgarel commented Dec 3, 2025

For information, here is the current state of this PR:
We wanted to handle taxonomy synonyms using the ES synonym filter, but this leads to something like a mapping explosion: ES takes a very long time to start (>40 min).
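
For readers unfamiliar with the approach, here is a minimal sketch of how taxonomy synonyms could be turned into inline rules for an ES synonym token filter. The filter name `taxonomy_synonyms`, the analyzer name `taxonomy_search`, the choice of `synonym_graph` and the sample taxonomy are all assumptions for illustration; this is not the configuration from this PR. It only shows that the settings carry one rule per taxonomy entry, so they grow with the size of the taxonomies.

```python
from typing import Any


def build_synonym_settings(taxonomy: dict[str, list[str]]) -> dict[str, Any]:
    """Build index analysis settings with one synonym rule per entry.

    `taxonomy` maps a canonical term to its synonyms (hypothetical shape).
    Rules use the Solr-style "a, b => c" syntax.
    """
    rules = [
        f"{', '.join(synonyms)} => {canonical}"
        for canonical, synonyms in taxonomy.items()
        if synonyms  # skip entries without synonyms
    ]
    return {
        "analysis": {
            "filter": {
                # hypothetical filter name
                "taxonomy_synonyms": {"type": "synonym_graph", "synonyms": rules},
            },
            "analyzer": {
                # hypothetical analyzer name, intended for search time
                "taxonomy_search": {
                    "tokenizer": "standard",
                    "filter": ["lowercase", "taxonomy_synonyms"],
                },
            },
        }
    }


# Hypothetical sample data
settings = build_synonym_settings({"banana": ["bananas", "plantain"], "salt": []})
```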

Projects

Status: In progress

Development

Successfully merging this pull request may close these issues.

Re-configure Open Food Facts indexation

4 participants