Discussion: extending base HPI/overlays/overrides #102

Open
@karlicoss

Related issues: #12, #46; but I think worth a separate discussion.

From my experience, it's pretty hard to predict how other people want to use their data:

  • you might miss some attributes they care about
  • some people want to be more paranoid or more defensive (e.g. timezone handling/None safety/etc)
  • they might want to do some extra filtering
  • they might want to merge in extra data sources or suppress existing

The list is endless! So it would be nice if it was possible to easily override small bits of HPI modules/mix in other data, etc.

The main goals are:

  • low effort: ideally it should be a matter of a few lines of code to override something.
  • good interop: e.g. ability to keep up with upstream, use modules coming from separate repositories, etc.
  • ideally mypy friendly. This kind of means 'not too dynamic and magical', which is ultimately a good thing even if you don't care about mypy.

Once again, I see Emacs as a good role model. Everything is really decentralized, you have some core library, you have certain patterns that everyone follows... but apart from that the modules are mostly independent.
Many people still use 'monolith' base configurations (e.g. Doom/Spacemacs), because it's kinda convenient, as long as you have a maintainer. Arguably this is what this repository is at the moment, although it's obviously not as popular as Emacs distributions.

Emacs fits these goals well:

  • low effort: the simplest way to configure something is to override a variable in your config (thanks to dynamic scope, it 'just works').
    You can even literally override whole functions as a means of quickly getting the behaviour you want.
  • good interop: yes, unless the developer broke some APIs, usually you can safely update the upstream module.

How to achieve this within HPI:

For combining independent modules together (say, something like my/youtube.py and my/vimeo.py coming from different repositories), the easiest is to use:

  • symlinks (at least if you have just a few files/directories to mixin)
  • namespace packages (more on them later)
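
To make the namespace package option concrete, here is a minimal sketch of two independent 'repositories' contributing modules to the same package. It uses a hypothetical package name (hpi_demo_mix) instead of my so it can't clash with a real HPI install, and it prepends to sys.path as a stand-in for pip install -e; the watched functions are made up for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# 'hpi_demo_mix' is a hypothetical stand-in for the 'my' package.
PKG = "hpi_demo_mix"


def make_repo(root: Path, module: str, body: str) -> Path:
    """Create <root>/<PKG>/<module>.py with no __init__.py (namespace package)."""
    pkg_dir = root / PKG
    pkg_dir.mkdir(parents=True)
    (pkg_dir / f"{module}.py").write_text(body)
    return root


with tempfile.TemporaryDirectory() as tmp:
    # Two independent 'repositories', each contributing one module.
    repo_a = make_repo(Path(tmp) / "repo_a", "youtube",
                       "def watched():\n    return ['video-1']\n")
    repo_b = make_repo(Path(tmp) / "repo_b", "vimeo",
                       "def watched():\n    return ['clip-1']\n")

    # Stand-in for installing both repositories in editable mode.
    sys.path[:0] = [str(repo_a), str(repo_b)]
    importlib.invalidate_caches()

    youtube = importlib.import_module(f"{PKG}.youtube")
    vimeo = importlib.import_module(f"{PKG}.vimeo")
    package = importlib.import_module(PKG)

    # Both modules live under the same package name.
    combined = youtube.watched() + vimeo.watched()
    # The package path lists one 'portion' per repository, in sys.path order.
    portions = [Path(p).parent.name for p in package.__path__]

    print(combined)
    print(portions)
```

The important bit is the missing __init__.py: with it, the first repository found would claim the whole package and the second one would be invisible.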

Now, the tricky case is when you want to partially override something.
The first option is: fork & apply your modifications on top. For example: https://github.com/seanbreckenridge/HPI

  • effort: very straightforward
  • interop: merging with upstream is a bit manual, but if you use atomic commits & interactive rebase/cherry pick, it should be manageable
  • at least not any more magical than the original repository

Not sure if there is much to discuss here, so straight to the second and a more flexible option.

Once again, we rely on namespace packages! I'll explain with a couple of examples, since it's easier.

  • example: mixing in a data source

    The main idea is that you can override all.py (also some discussion here), and remove/add extra data sources.
    Since all.py is tiny, it's not a big problem to just copy/paste it and apply your changes.

    Some existing modules implemented with this approach in mind:

    (I still haven't settled on the naming. all and main as the entry point kind of both make sense)

  • example: my.calendar.holidays

    As you can guess, this module is responsible for flagging days as holidays, by exposing an is_holiday function.
    As a reasonable default, it just uses the user's country of residence and flags national holidays.
    However, you might also want to mix in your work vacation. This is harder to make uniform for everyone, so it's a good candidate for a custom user override:

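    # 'my.orig' refers to the original package (see the symlink discussed below)
    import my.orig.my.calendar.holidays as M
    # re-export everything public from the original module unchanged
    from   my.orig.my.calendar.holidays import *

    is_holiday_orig = M.is_holiday
    def is_holiday(d: DateIsh) -> bool:
        # if it's a public holiday, definitely a holiday?
        if is_holiday_orig(d):
            return True
        # then check private data of days off work
        # (is_day_off_work is assumed to be defined elsewhere in the override)
        if is_day_off_work(d):
            return True
        return False
    # monkey patch, so code inside the original module picks up the override too
    M.is_holiday = is_holiday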

    Thanks to namespace packages, when I import my.calendar.holidays it will hit my override first, monkey patch the is_holiday function, and expose the rest intact due to import *.
    For example, hpi doctor my.calendar.holidays will run against the override, reusing the stats function or any other original functions.

    My personal HPI override has more example code, and I'll gradually move some stuff from this repository there as well
    (for example most things in my.body don't make much sense for other people).
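
For the first example (overriding all.py), a minimal sketch of how the overlay's copy ends up shadowing the upstream one while still reusing the upstream data source. Everything here is hypothetical: the package name hpi_demo_all stands in for my, source_a/source_b and items are made-up names, and putting the overlay first on sys.path stands in for the install order.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for the 'my' package; module names are illustrative.
PKG = "hpi_demo_all"

BASE_FILES = {
    "source_a.py": "def items():\n    yield 'a1'\n    yield 'a2'\n",
    # upstream all.py: tiny, it only merges the sources it knows about
    "all.py": "from .source_a import items as _a\n\n"
              "def items():\n    yield from _a()\n",
}

OVERLAY_FILES = {
    "source_b.py": "def items():\n    yield 'b1'\n",
    # copy of the upstream all.py with one extra data source mixed in
    "all.py": "from .source_a import items as _a\n"
              "from .source_b import items as _b\n\n"
              "def items():\n    yield from _a()\n    yield from _b()\n",
}


def write_repo(root: Path, files: dict) -> Path:
    """Write the files into <root>/<PKG>/ without creating an __init__.py."""
    pkg_dir = root / PKG
    pkg_dir.mkdir(parents=True)
    for name, body in files.items():
        (pkg_dir / name).write_text(body)
    return root


with tempfile.TemporaryDirectory() as tmp:
    base = write_repo(Path(tmp) / "base", BASE_FILES)
    overlay = write_repo(Path(tmp) / "overlay", OVERLAY_FILES)

    # The overlay has to come first, otherwise the upstream all.py wins.
    sys.path[:0] = [str(overlay), str(base)]
    importlib.invalidate_caches()

    all_module = importlib.import_module(f"{PKG}.all")
    merged = list(all_module.items())
    # Which repository did all.py actually come from?
    loaded_from = Path(all_module.__file__).parent.parent.name

    print(merged, loaded_from)
```

Note that the overlay's all.py imports source_a with a plain relative import: the module itself is not overridden, so it's found in the upstream portion of the package.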

Things I'm not sure about with this approach:

  • To import the 'original' module and monkey patch it, you need some alternative way of referencing it.
    • for now, I'm using a symlink (/code/hpi-overlay/src/my/orig -> /code/hpi/src/my)

      This is simple enough, but maintaining the symlink manually, referencing the 'original' package through my.orig .. meh.
      Also not sure what to do if there are multiple overrides, e.g. 'chain' (although this is probably a bit extreme).

    • it's probably possible to do something hacky and dynamic. E.g. take __path__, remove the first entry (which would be the 'override'), and then use importlib to import the 'original' module.

      The downside is that it's gonna be unfriendly to mypy (and generally a bit too magical?).

    • another option is to have some sort of dynamic 'hook', which is imported before anything else.

      In the hook code, you import the original module and monkey patch. Same downsides, a bit too dynamic and not mypy friendly, but possible.
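
A rough sketch of the 'hacky and dynamic' option, just to show it's feasible: skip the first __path__ entry and load the original file under an alias via importlib. import_original, the alias naming scheme and the hpi_demo_orig package are all hypothetical helpers for this demo, not something HPI provides, and it only handles single-file modules.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path
from types import ModuleType

# Hypothetical stand-in for the 'my' package.
PKG = "hpi_demo_orig"


def import_original(package: ModuleType, module_name: str) -> ModuleType:
    """Load <module_name>.py from the first package portion after the override."""
    for entry in list(package.__path__)[1:]:  # drop the first entry (the override)
        candidate = Path(entry) / f"{module_name}.py"
        if candidate.is_file():
            # Use an alias so the override stays registered under the real name.
            alias = f"{package.__name__}._orig_{module_name}"
            spec = importlib.util.spec_from_file_location(alias, candidate)
            assert spec is not None and spec.loader is not None
            module = importlib.util.module_from_spec(spec)
            spec.loader.exec_module(module)
            return module
    raise ImportError(f"no original {module_name!r} found behind the override")


def write_module(root: Path, body: str) -> Path:
    """Create <root>/<PKG>/holidays.py without an __init__.py."""
    pkg_dir = root / PKG
    pkg_dir.mkdir(parents=True)
    (pkg_dir / "holidays.py").write_text(body)
    return root


with tempfile.TemporaryDirectory() as tmp:
    base = write_module(Path(tmp) / "base",
                        "def is_holiday(day):\n    return day == 'jan-01'\n")
    overlay = write_module(Path(tmp) / "overlay",
                           "def is_holiday(day):\n    return day in {'jan-01', 'jul-15'}\n")

    sys.path[:0] = [str(overlay), str(base)]  # the override comes first
    importlib.invalidate_caches()

    package = importlib.import_module(PKG)
    override = importlib.import_module(f"{PKG}.holidays")
    original = import_original(package, "holidays")

    override_answer = override.is_holiday("jul-15")
    original_answer = original.is_holiday("jul-15")
    print(override_answer, original_answer)
```

In a real override you'd call something like this at the top of the overriding module instead of importing through my.orig. As noted above, mypy has no way of knowing what the returned module contains.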

Caveats I know of:

  • packages can't contain __init__, otherwise the whole namespace package thing doesn't work

  • you need to be careful about the namespace package resolution order. It seems that the last installed package will be the last in the import order.

    • so you'd need to run pip install -e /path/to/override and then pip install -e /path/to/original (even if it's already installed).

    • another option is to reorder stuff in ~/.local/lib/python3.x/site-packages/easy-install.pth manually, but it's not very robust either (although at least it clearly shows the order)
      hpi doctor my.module displays some helpful info, but it's still easy to forget/mess it up by accident.

       $ hpi doctor my.calendar.holidays  
       ✅ import order: ['/code/hpi-overlay/src/my', '/code/hpi/my']
      
  • import * doesn't import functions whose names start with an underscore ('private').

    Possible to get around this dynamically, but would be nice to cooperate with mypy somehow..
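
A minimal sketch of the dynamic workaround for the last caveat: after the star import, copy over the single-underscore names by hand. The module hpi_demo_private and the helper copy_private_names are made up for this demo; in an override you would pass the original module and globals().

```python
import sys
from types import ModuleType

# Hypothetical stand-in for an upstream module: one public, one private function.
original = ModuleType("hpi_demo_private")
exec(
    "def stats():\n    return _count() * 2\n\n"
    "def _count():\n    return 21\n",
    original.__dict__,
)
sys.modules[original.__name__] = original


def copy_private_names(source: ModuleType, target: dict) -> list:
    """Copy single-underscore names that 'import *' skips; return what was copied."""
    copied = []
    for name, value in vars(source).items():
        # skip dunders (__name__, __builtins__, ...) and anything already defined
        if name.startswith("_") and not name.startswith("__") and name not in target:
            target[name] = value
            copied.append(name)
    return copied


namespace: dict = {}
exec("from hpi_demo_private import *", namespace)
# Only the public name made it through the star import.
public_only = sorted(n for n in namespace if not n.startswith("__"))

copied = copy_private_names(original, namespace)
print(public_only, copied)
```

This works at runtime, but the copied names are invisible to mypy, so it doesn't really solve the 'cooperate with mypy' part.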

Happy to hear suggestions and thoughts on that. Once there's been some discussion, I'll move this to doc/, perhaps.


TODOS:

  • also thought that it should be possible to reuse the configuration in ~/.config/my as the 'default' overlay. In fact, treating it like a proper namespace package (at the moment it's a bit of dynamic hackery) might make everything even cleaner and simpler.
  • find some good tutorial on monkey patching and link? Wouldn't want to duplicate the effort..
  • add some examples of motivation for overrides, just for documentation purposes
  • update docs here https://github.com/karlicoss/HPI/blob/master/doc/SETUP.org#addingmodifying-modules
