Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #1809 +/- ##
===========================================
- Coverage 92.51% 28.71% -63.81%
===========================================
Files 71 71
Lines 10279 10297 +18
===========================================
- Hits 9510 2957 -6553
- Misses 769 7340 +6571
OMG! 100% yes! 🥳
Indeed, thanks for starting this! Do you think the current pandas functions should be retained for a while after this is merged? Also, the PR description in the original message seems to need more detail.
I was just doing some comparisons of the two at the moment. However, I think the new one should just take its place.
Maybe @ColtAllen is interested in taking this over?
FBruzzesi left a comment
Hey @williambdean 👋🏼 I just found this PR 🔥 and left a couple of comments that might help 😇
if observation_period_end is None:
    observation_period_end = transactions[datetime_col].cast(nw.Datetime).max()
I am not sure if this would work/is supported, but you might try to do:
- if observation_period_end is None:
-     observation_period_end = transactions[datetime_col].cast(nw.Datetime).max()
+ if observation_period_end is None:
+     observation_period_end = pl.col("max").max()
to get the global max datetime value.
This might also help to avoid this requirement:
The LazyFrame-like libraries will require a provided observation_end_date. However, that can be found outside of the
) -> IntoFrameT:
    transactions = nw.from_native(transactions)

    date = nw.col(datetime_col).cast(nw.Datetime)
This is very tempting, but consider creating a new column between operations - I would be afraid that for pandas the casting happens multiple times instead of once.
customers = (
    nw.from_native(repeated_transactions)
    .group_by(customer_id_col)
For some time now, it should be possible to pass an expression so that you can avoid the renaming down in the pipeline, but it's definitely more of a personal preference 😇
- .group_by(customer_id_col)
+ .group_by(nw.col(customer_id_col).alias("customer_id"))
Yes - I wrote all the CLV agg utilities, and will reach out to @FBruzzesi as needed to drive this one home. We're still roadmapping what will be included in version 1.0, but I think this could be a great addition to it!
Description
Still a work in progress.
The LazyFrame-like libraries will require a provided observation_end_date. However, that can be found outside of the
Still building out the functionality for the:
Related Issue
Checklist
pre-commit.ci autofix to auto-fix.
📚 Documentation preview 📚: https://pymc-marketing--1809.org.readthedocs.build/en/1809/