Skip to content

Add type hints to etl classes#1740

Draft
crusopaul wants to merge 11 commits intomove-coop:mainfrom
crusopaul:etl-typing
Draft

Add type hints to etl classes#1740
crusopaul wants to merge 11 commits intomove-coop:mainfrom
crusopaul:etl-typing

Conversation

@crusopaul
Copy link
Contributor

What is this change?

  • Adds type hints to parameters and returns for classes under parsons.etl.
  • Addresses Add type hints/typing #1266 for some core classes - such as ETL, ToFrom, and Table.

Considerations for discussion

  • ETL constructor now instantiates self.table so that a user / developer may create a subclass from ETL vs Table without defining __init__() to do so.

How to test the changes (if needed)

  • Since this PR only aims to add type hints as opposed to changes them, no testing should be necessary. That said, it would not cause harm to do live tests to confirm.

@bmos
Copy link
Contributor

bmos commented Feb 22, 2026

Unfortunately Self isn't available in typing in python 3.10, which we still support.
Options:

  • Change Self to "ETL"
  • Add from __future__ import annotations to the top of the file and change Self to ETL

@bmos
Copy link
Contributor

bmos commented Feb 22, 2026

Your syntax for Callable is very close, but it should be Callable[[Any], Any] (or just Callable since [Any, Any] doesn't really add anything).

The subscription syntax must always be used with exactly two values: the argument list and the return type. The argument list must be a list of types, a ParamSpec, Concatenate, or an ellipsis (...). The return type must be a single type.

@bmos
Copy link
Contributor

bmos commented Feb 22, 2026

Beyond the ETL constructor now instantiating self.table, there does appear to be one change to behavior here which is that fill_value in fill_column is now optional. Is there a reason for this change? Wouldn't calling fill_column without fill_value always be a mistake?

@crusopaul
Copy link
Contributor Author

crusopaul commented Feb 23, 2026

Beyond the ETL constructor now instantiating self.table, there does appear to be one change to behavior here which is that fill_value in fill_column is now optional. Is there a reason for this change? Wouldn't calling fill_column without fill_value always be a mistake?

Ty for checking! I'm still planning to hit other files, but haven't gotten there yet.

For this one, I wasn't 100% on the types that the underlying petl functions accepts for its value from their docs. However, the logic in fillna_column seems to indicate that it can store None. For that reason, I am thinking it is appropriate to insert None as values, and that fillna_column should also support None values. This makes more sense with the fromdb and todb methods that petl provides - these would imply that you may need to store null values as None.

I am still unsure if petl tables like python class instances as values vs primitive types. I would suspect no... Not sure of the best way to indicate that in type hints.

But also you make a good point about that being a code break. This doesn't seem to be supported currently by Parsons, and that is not my intent with this PR. Will revert, even if correct. Good catch! I will certainly get carried away again lol.

@github-actions
Copy link

github-actions bot commented Feb 23, 2026

Coverage report

Click to see where and how coverage changed

FileStatementsMissingCoverageCoverage
(new stmts)
Lines missing
  parsons/etl
  etl.py 16, 954, 976
Project Total  

This report was generated by python-coverage-comment-action

@bmos
Copy link
Contributor

bmos commented Feb 23, 2026

Couple things you might find helpful:
ruff check --select ANN parsons/etl will tell you about some type annotation issues.
uvx mypy parsons/etl will run the mypy static type checker (which may be noisy since a lot of parsons and our dependencies aren't well-typed)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants