This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
@AGENTS.md
Chef script builds node tree → uploadchannel() validates → processes all files (download, convert, extract metadata) → diffs against Studio → uploads missing files → uploads channel structure → optionally publishes.
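The "diffs against Studio → uploads missing files" step can be pictured as a set difference. This is a minimal sketch, not ricecooker's actual code: the function name `files_to_upload` and the idea that files are compared by filename are assumptions made for illustration.

```python
def files_to_upload(local_files, files_on_studio):
    """Return the local filenames that Studio does not already have.

    Hypothetical helper: both arguments are iterables of filenames.
    The order of the local list is preserved in the result.
    """
    existing = set(files_on_studio)
    return [name for name in local_files if name not in existing]


# Only files missing from Studio are scheduled for upload.
pending = files_to_upload(["a.mp4", "b.pdf", "c.zip"], ["b.pdf"])
```

Here `pending` holds `a.mp4` and `c.zip`; `b.pdf` is skipped because Studio already has it.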
The pipeline is the best-architected part of the codebase. Use it as the reference model for new code.
Three stages with ordered handlers: transfer (download) → convert (compress/transform) → extract_metadata. Handlers implement should_handle() / execute(). Transfer uses FirstHandlerOnly mode; convert and metadata use AllHandlers mode.
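The difference between the two dispatch modes can be sketched as follows. Only should_handle() and execute() come from the description above. The classes and functions here (`Handler`, `SuffixHandler`, `run_first_handler_only`, `run_all_handlers`) are hypothetical stand-ins, not ricecooker's implementation.

```python
from abc import ABC, abstractmethod


class Handler(ABC):
    """Stand-in for a pipeline handler (hypothetical, not ricecooker's class)."""

    @abstractmethod
    def should_handle(self, path):
        """Return True if this handler applies to the given path."""

    @abstractmethod
    def execute(self, path):
        """Process the path and return a result."""


class SuffixHandler(Handler):
    """Accepts paths ending in a suffix and tags the result with a label."""

    def __init__(self, suffix, label):
        self.suffix = suffix
        self.label = label

    def should_handle(self, path):
        return path.endswith(self.suffix)

    def execute(self, path):
        return f"{self.label}:{path}"


def run_first_handler_only(handlers, path):
    """FirstHandlerOnly-style dispatch: stop at the first handler that accepts."""
    for handler in handlers:
        if handler.should_handle(path):
            return [handler.execute(path)]
    return []


def run_all_handlers(handlers, path):
    """AllHandlers-style dispatch: run every accepting handler, in order."""
    return [h.execute(path) for h in handlers if h.should_handle(path)]


handlers = [
    SuffixHandler(".pdf", "pdf"),
    SuffixHandler(".zip", "archive"),
    SuffixHandler("", "fallback"),  # empty suffix matches every path
]
```

With this ordering, `run_first_handler_only(handlers, "lesson.zip")` runs only the archive handler, while `run_all_handlers(handlers, "lesson.zip")` runs the archive handler and then the fallback. Handler order therefore matters most in the first mode, where an earlier match hides later ones.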
le_utils provides constants defining the Kolibri content model: content_kinds, file_formats, format_presets, licenses, languages, and label taxonomies. It increasingly includes validation schemas that ricecooker should conform to.
- The pipeline code is the reference architecture — follow its OOP patterns when writing new code.
- Older code (especially ricecooker/classes/) has less test coverage. Take extra care when modifying it.
- PRs target main on learningequality/ricecooker. CI tests Python 3.9–3.13 on Linux, macOS, and Windows.
Adding a new file type requires exactly two changes to the pipeline:
- A conversion handler in convert.py (subclass ArchiveProcessingBaseHandler for zip-based formats), registered in ConversionStageHandler.DEFAULT_CHILDREN
- A metadata extractor in extract_metadata.py (subclass MetadataExtractor, add extension-to-preset mapping in PRESETS_FROM_EXTENSIONS), registered in ExtractMetadataStageHandler.DEFAULT_CHILDREN
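The shape of those two changes might look roughly like the sketch below. The four base and stage classes are empty stand-ins defined here so the example runs on its own; the real ones live in the pipeline modules and have different internals. The `.examplezip` extension, the `example_zip_preset` value, the `EXTENSIONS` attribute, the choice to register classes rather than instances, and the placement of PRESETS_FROM_EXTENSIONS on the extractor class are all assumptions made for illustration. Check the existing handlers for the actual conventions.

```python
# --- Stand-ins so the sketch is self-contained (NOT the real ricecooker classes) ---
class ArchiveProcessingBaseHandler:
    EXTENSIONS = set()  # hypothetical attribute

    def should_handle(self, path):
        return any(path.endswith(ext) for ext in self.EXTENSIONS)


class MetadataExtractor:
    PRESETS_FROM_EXTENSIONS = {}  # assumed to be a class attribute here

    def should_handle(self, path):
        return any(path.endswith(ext) for ext in self.PRESETS_FROM_EXTENSIONS)


class ConversionStageHandler:
    DEFAULT_CHILDREN = []


class ExtractMetadataStageHandler:
    DEFAULT_CHILDREN = []


# --- Change 1: a conversion handler for the new zip-based format ---
class ExampleZipConversionHandler(ArchiveProcessingBaseHandler):
    EXTENSIONS = {".examplezip"}  # hypothetical extension

    def execute(self, path):
        # A real handler would validate or repack the archive here.
        return path


ConversionStageHandler.DEFAULT_CHILDREN.append(ExampleZipConversionHandler)


# --- Change 2: a metadata extractor with the extension-to-preset mapping ---
class ExampleZipMetadataExtractor(MetadataExtractor):
    PRESETS_FROM_EXTENSIONS = {".examplezip": "example_zip_preset"}  # hypothetical

    def execute(self, path):
        # Look up the preset for the first matching extension.
        ext = next(e for e in self.PRESETS_FROM_EXTENSIONS if path.endswith(e))
        return {"preset": self.PRESETS_FROM_EXTENSIONS[ext]}


ExtractMetadataStageHandler.DEFAULT_CHILDREN.append(ExampleZipMetadataExtractor)
```

Nothing under classes/ is touched in this sketch: the new format is supported entirely through the two registered handlers.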
The pipeline automatically infers content kind and preset from file extensions. Since the pipeline refactor, most dedicated File subclasses (e.g., HTMLZipFile, DocumentFile) and Node subclasses (e.g., HTML5AppNode, DocumentNode) are essentially backwards-compatibility APIs — in most cases the same effect is achieved by using ContentNode with a uri parameter. Do not create new File or Node subclasses, or modify existing ones to add presets, when adding new file type support. Only modify classes/ modules if explicitly asked or if runtime context beyond what the pipeline provides is needed (e.g., subtitles, exercise questions).
Each handler should implement only the validation logic specified in its requirements. Do not copy validation logic from other handlers (e.g., HTML body parsing) unless the requirements call for it.
Pipeline tests go in tests/pipeline/ (e.g. test_convert.py, test_extract_metadata.py). Add new tests to existing files in the appropriate directory rather than creating new top-level test files.