[BUG] Parallelogram_olf is buggy for monthly grain #524

Open
@jbogaardt

Describe the bug
There is some special leap year handling that isn't generalized well. I think it is this line:
https://github.com/casact/chainladder-python/blob/master/chainladder/utils/utility_functions.py#L244-L245

This assumes that the time dimension is formatted as annual ("%Y"), but it can also be monthly (for example "2019-07").
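A minimal sketch of the failure mode and one possible way around it, using plain pandas. The helper name `is_leap_year` and the sample labels are assumptions for illustration only; this is not the code in `utility_functions.py` and not necessarily the fix the maintainers would choose. It takes the leading four-character year from each label before parsing, so annual and monthly labels go through the same path.

```python
import pandas as pd


def is_leap_year(labels):
    """Flag labels that fall in a leap year.

    Hypothetical helper, not part of chainladder. Only the leading
    four characters (the year) are parsed, so "2020" and "2020-02"
    are treated alike.
    """
    years = pd.Series(labels).astype(str).str[:4]
    return pd.to_datetime(years, format="%Y").dt.is_leap_year


# Parsing a monthly label with a strict "%Y" format fails,
# which matches the ValueError reported in this issue.
try:
    pd.to_datetime(pd.Series(["2019-07"]), format="%Y")
    strict_parse_failed = False
except ValueError:
    strict_parse_failed = True

annual = is_leap_year(["2019", "2020"])
monthly = is_leap_year(["2019-07", "2020-02"])
```

Slicing the year keeps the check independent of grain; an alternative would be to derive the year from a period or timestamp column if one is already available at that point in the function.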

To Reproduce
Steps to reproduce the behavior. Code should be self-contained and runnable against publicly available data. For example:

Python 3.12.2 | packaged by Anaconda, Inc. | (main, Feb 27 2024, 17:28:07) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import chainladder as cl
>>> import pandas as pd
>>> print(cl.__version__)
0.8.22
>>> print(pd.__version__)
2.1.4
>>> rate_history = pd.Series({
...     '2020-01-01': 0.05,
...     '2021-01-15': 0.03,
...     '2022-07-20': 0.01,
...     '2023-01-01': -0.02,})
>>> rate_history.index = pd.DatetimeIndex(rate_history.index)
>>>
>>> cl.parallelogram_olf(
...     values=rate_history.values, date=rate_history.index,
...     start_date = pd.Timestamp('2019-07-01 00:00:00'),
...     end_date=pd.Timestamp('2024-05-31 23:59:59.999999999'),
...     grain='M',
...     vertical_line=False,
...     approximation_grain='M'
...     )
C:\Users\jbogaard\Documents\bitbucket\chainladder-python\chainladder\utils\utility_functions.py:188: FutureWarning: Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use `ser.iloc[pos]`
  crl = cum_rate_changes[-1]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\jbogaard\Documents\bitbucket\chainladder-python\chainladder\utils\utility_functions.py", line 243, in parallelogram_olf
    combined["is_leap"] = pd.to_datetime(
                          ^^^^^^^^^^^^^^^
  File "C:\Users\jbogaard\AppData\Local\miniconda3\envs\cl_dev\Lib\site-packages\pandas\core\tools\datetimes.py", line 1112, in to_datetime
    values = convert_listlike(arg._values, format)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\jbogaard\AppData\Local\miniconda3\envs\cl_dev\Lib\site-packages\pandas\core\tools\datetimes.py", line 488, in _convert_listlike_datetimes
    return _array_strptime_with_fallback(arg, name, utc, format, exact, errors)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\jbogaard\AppData\Local\miniconda3\envs\cl_dev\Lib\site-packages\pandas\core\tools\datetimes.py", line 519, in _array_strptime_with_fallback
    result, timezones = array_strptime(arg, fmt, exact=exact, errors=errors, utc=utc)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "strptime.pyx", line 534, in pandas._libs.tslibs.strptime.array_strptime
  File "strptime.pyx", line 359, in pandas._libs.tslibs.strptime.array_strptime
ValueError: unconverted data remains when parsing with format "%Y": "-07", at position 0. You might want to try:
    - passing `format` if your strings have a consistent format;
    - passing `format='ISO8601'` if your strings are all ISO8601 but not necessarily in exactly the same format;
    - passing `format='mixed'`, and the format will be inferred for each element individually. You might want to use `dayfirst` alongside this.
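Separately from the ValueError, the FutureWarning in the output above comes from positional access with `[-1]` on a Series. A short sketch of the replacement that the warning itself suggests, using made-up values for `cum_rate_changes` (the real contents inside `parallelogram_olf` are not shown in this report):

```python
import pandas as pd

# Illustrative values only; the actual Series is built inside parallelogram_olf.
cum_rate_changes = pd.Series(
    [1.05, 1.0815],
    index=pd.to_datetime(["2020-01-01", "2021-01-15"]),
)

# Explicit positional access to the last element, which avoids the warning.
crl = cum_rate_changes.iloc[-1]
```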

Expected behavior
I expect this call to complete without raising an error when grain='M'.
