
Switch from python_packages to python_requirements #157

Open
@bfirsh

Description

The design is outlined in this comment:

  • support the requirements.txt format via python_requirements in cog.yaml. This exists today but is currently undocumented.
  • continue to support python_packages in cog.yaml, but remove documentation for it.
  • leave the door open for other dependency file formats later (Pipfile.lock, etc.), but only support requirements.txt for now
  • update python_requirements parsing behavior:
    • if an unqualified version of PyTorch or TensorFlow is specified (like 1.11.0), do magic to find the right variant (torch==1.11.0+cpu, torch==1.11.0+cu113), and log it so the user knows this magic is happening.
    • if a fixed version of PyTorch or TensorFlow is specified by the user with extra qualifiers, like torch==1.11.0+cu113 or torch==1.11.0+cpu, don't do any magic: install that exact version, and tell the user we're not doing any magic.
    • if a version range (package>=0.2,<0.3) is specified in requirements.txt for any package, let pip handle the version resolution, but log a warning recommending that versions in requirements.txt be pinned for the sake of reproducibility.
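The three cases above could be sketched roughly as follows. This is illustrative only, not Cog's actual implementation: the regexes are simplified, and `TORCH_CUDA_SUFFIX` is a placeholder for whatever variant Cog would resolve from the target base image.

```python
import re

# Placeholder: the real logic would pick +cpu or +cuXXX based on the
# base image being built, not a hard-coded constant.
TORCH_CUDA_SUFFIX = "+cu113"

def resolve_requirement(line: str) -> tuple[str, str]:
    """Return (rewritten_line, log_message) for one requirements.txt line."""
    # Case 1: unqualified pinned torch/tensorflow version, e.g. torch==1.11.0
    pinned = re.match(r"^(torch|tensorflow)==([\d.]+)$", line)
    # Case 2: version with an explicit qualifier, e.g. torch==1.11.0+cpu
    qualified = re.match(r"^(torch|tensorflow)==[\d.]+\+\w+$", line)
    # Case 3: a version range for any package, e.g. pillow>=0.2,<0.3
    ranged = re.search(r"[<>~]=?", line)

    if pinned:
        name, version = pinned.groups()
        rewritten = f"{name}=={version}{TORCH_CUDA_SUFFIX}"
        return rewritten, f"Rewriting {line} to {rewritten} to match the base image"
    if qualified:
        return line, f"Installing {line} exactly as specified"
    if ranged:
        return line, "Warning: pin versions in requirements.txt for reproducibility"
    return line, ""
```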

Todo

Design process

We should decide between python_requirements and python_packages and make whichever we choose work properly.

There is some background here that is not written down anywhere and needs to be. tl;dr: python_requirements doesn't work properly, and python_packages is not ideal in various ways.

Potential designs

  1. Leave things as they are and remove python_requirements. The main downside of this is that Python requirements are either in a non-standard location or duplicated.
  2. Switch to python_requirements, and use a simple parser to determine torch and tensorflow versions. The main downside of this is that it is not clear to the user that Cog is rewriting versions.
  3. Switch to python_requirements, but split out torch and tensorflow as separate top-level options to make it clear that some special sauce happens behind the scenes. If torch/tensorflow is detected in python_requirements, Cog would ignore them, warn the user, or do something else sensible.
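For illustration, design 3 might look something like this in cog.yaml. The top-level torch key is hypothetical, not an existing option; python_version and python_requirements are existing build keys.

```yaml
build:
  python_version: "3.10"
  # Hypothetical option from design 3: Cog resolves the right
  # variant (+cpu, +cu113, ...) behind the scenes.
  torch: "1.11.0"
  # Everything else comes from the user's standard requirements file.
  python_requirements: requirements.txt
```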

User data

  • Streamlit has learned that users normally already have some way of defining their dependencies (pip, Poetry, Conda, etc.) and are unwilling to switch to a different method. So, they read dependencies from wherever the user has already defined them.
  • Python dependencies should automatically install when you open a GitHub Codespace. They don't do that if they're inside python_packages, but they do if they're in a plain requirements.txt.

(Consider this a wiki and please edit! Edited by @bfirsh, ...)
