Describe the bug
The leap-year handling in `parallelogram_olf` does not generalize beyond annual grain. I think the problem is at these lines:
https://github.com/casact/chainladder-python/blob/master/chainladder/utils/utility_functions.py#L244-L245
This code assumes that the time dimension is formatted as annual (`%Y`), but it can also be monthly.
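The parsing failure can be isolated from chainladder entirely. The snippet below is a minimal sketch, assuming the period labels reach `pd.to_datetime` as strings such as `"2019-07"`; the `annual` and `monthly` series are hypothetical stand-ins, not the library's internal data.

```python
import pandas as pd

# Hypothetical period labels: annual labels match "%Y", monthly labels do not.
annual = pd.Series(["2019", "2020"])
monthly = pd.Series(["2019-07", "2020-02"])

# Annual labels parse cleanly, so the leap-year flag can be derived.
annual_is_leap = pd.to_datetime(annual, format="%Y").dt.is_leap_year

# Monthly labels leave "-07" unconsumed after "%Y", which raises ValueError.
try:
    pd.to_datetime(monthly, format="%Y")
    monthly_parsed = True
except ValueError as exc:
    monthly_parsed = False
    print(f"ValueError: {exc}")
```

If this assumption about the label format holds, any grain finer than annual would hit the same error.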
To Reproduce
The following interactive session reproduces the error with chainladder 0.8.22 and pandas 2.1.4:
Python 3.12.2 | packaged by Anaconda, Inc. | (main, Feb 27 2024, 17:28:07) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import chainladder as cl
>>> import pandas as pd
>>> print(cl.__version__)
0.8.22
>>> print(pd.__version__)
2.1.4
>>> rate_history = pd.Series({
... '2020-01-01': 0.05,
... '2021-01-15': 0.03,
... '2022-07-20': 0.01,
... '2023-01-01': -0.02,})
>>> rate_history.index = pd.DatetimeIndex(rate_history.index)
>>>
>>> cl.parallelogram_olf(
... values=rate_history.values, date=rate_history.index,
... start_date = pd.Timestamp('2019-07-01 00:00:00'),
... end_date=pd.Timestamp('2024-05-31 23:59:59.999999999'),
... grain='M',
... vertical_line=False,
... approximation_grain='M'
... )
C:\Users\jbogaard\Documents\bitbucket\chainladder-python\chainladder\utils\utility_functions.py:188: FutureWarning: Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use `ser.iloc[pos]`
  crl = cum_rate_changes[-1]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\jbogaard\Documents\bitbucket\chainladder-python\chainladder\utils\utility_functions.py", line 243, in parallelogram_olf
    combined["is_leap"] = pd.to_datetime(
                          ^^^^^^^^^^^^^^^
  File "C:\Users\jbogaard\AppData\Local\miniconda3\envs\cl_dev\Lib\site-packages\pandas\core\tools\datetimes.py", line 1112, in to_datetime
    values = convert_listlike(arg._values, format)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\jbogaard\AppData\Local\miniconda3\envs\cl_dev\Lib\site-packages\pandas\core\tools\datetimes.py", line 488, in _convert_listlike_datetimes
    return _array_strptime_with_fallback(arg, name, utc, format, exact, errors)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\jbogaard\AppData\Local\miniconda3\envs\cl_dev\Lib\site-packages\pandas\core\tools\datetimes.py", line 519, in _array_strptime_with_fallback
    result, timezones = array_strptime(arg, fmt, exact=exact, errors=errors, utc=utc)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "strptime.pyx", line 534, in pandas._libs.tslibs.strptime.array_strptime
  File "strptime.pyx", line 359, in pandas._libs.tslibs.strptime.array_strptime
ValueError: unconverted data remains when parsing with format "%Y": "-07", at position 0. You might want to try:
- passing `format` if your strings have a consistent format;
- passing `format='ISO8601'` if your strings are all ISO8601 but not necessarily in exactly the same format;
- passing `format='mixed'`, and the format will be inferred for each element individually. You might want to use `dayfirst` alongside this.
Expected behavior
I expect this call to complete without raising an error when `grain='M'`.
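One possible direction, sketched here under assumptions and not taken from the chainladder code base, is to derive the leap-year flag from a `PeriodIndex` so that the same code path handles annual and monthly labels. The helper name `is_leap_for_periods` and its `freq` argument are hypothetical.

```python
import pandas as pd

def is_leap_for_periods(labels, freq):
    """Return a boolean Series flagging labels that fall in a leap year.

    `labels` are period strings such as "2020" or "2020-02";
    `freq` is the matching pandas period frequency ("Y" or "M").
    """
    periods = pd.PeriodIndex(labels, freq=freq)
    return pd.Series(periods.is_leap_year, index=list(labels))

# Works for both grains without a hard-coded "%Y" format.
monthly_flags = is_leap_for_periods(["2019-07", "2020-02"], freq="M")
annual_flags = is_leap_for_periods(["2019", "2020"], freq="Y")
```

Whether this fits the surrounding logic in `parallelogram_olf` would need to be checked against the actual implementation.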