ENH: Improved error message and raise new error for small-string NaN edge case in HDFStore.append #60829

Merged

Conversation

JakeTT404
Contributor

Updated pytables.py to improve the error messages raised on datatype mismatches in HDFStore.append. Also added a ValueError for when the NaN representation cannot fit into the column. Modified the tests covering the improved error message and added a new test for a string column with length < 3, where the default nan_rep 'nan' is too big to fit.

@JakeTT404 JakeTT404 changed the title Improved error message and raise new error for small-string NaN edge case in HDFStore.append ENH: Improved error message and raise new error for small-string NaN edge case in HDFStore.append Feb 2, 2025
@mroeschke mroeschke added Error Reporting Incorrect or improved errors from pandas IO HDF5 read_hdf, HDFStore labels Feb 3, 2025
…Raise ValueError when nan_rep too large for pytable column. Add and modify applicable test code.
@JakeTT404 JakeTT404 force-pushed the Wrong-error-message-in-HDFStore.append branch from 04a8169 to 500ab5a Compare February 3, 2025 21:05
@JakeTT404
Contributor Author

I've removed the comments mentioned and reverted the error type. I've replaced the test with a single test function covering the following three cases:

  • Error on string column too small
  • Error on modified nan_rep too big
  • Success on small modified nan_rep in small string column
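
For readers following along, the three cases above can be sketched in plain Python. This is a minimal illustration only, assuming the check boils down to comparing the encoded length of nan_rep against the column width; `check_nan_rep_fits` and `fits` are hypothetical names, and neither the logic nor the message text is taken from the merged pytables.py change.

```python
def check_nan_rep_fits(nan_rep: str, itemsize: int, encoding: str = "utf-8") -> None:
    """Raise ValueError if the encoded nan_rep needs more bytes than the column holds.

    Hypothetical helper for illustration; not the code merged into pytables.py.
    """
    needed = len(nan_rep.encode(encoding))
    if needed > itemsize:
        raise ValueError(
            f"nan_rep {nan_rep!r} needs {needed} bytes, but the string column "
            f"only holds {itemsize}"
        )


def fits(nan_rep: str, itemsize: int) -> bool:
    """Return True when nan_rep fits in a column of the given width."""
    try:
        check_nan_rep_fits(nan_rep, itemsize)
    except ValueError:
        return False
    return True


# Case 1: default 'nan' in a string column of width 2 is rejected.
print(fits("nan", 2))
# Case 2: a longer, user-supplied nan_rep is rejected as well.
print(fits("missing", 5))
# Case 3: a short, user-supplied nan_rep fits in the small column.
print(fits("-", 2))
```

The sketch measures the encoded byte length rather than the character count, on the assumption that fixed-width string columns are sized in bytes.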

There may be some odd git history from when I updated the main branch, owing to my unfamiliarity with the GitHub contribution workflow and git as a whole.

@mroeschke mroeschke added this to the 3.0 milestone Feb 5, 2025
@mroeschke mroeschke merged commit 57340ec into pandas-dev:main Feb 5, 2025
38 of 42 checks passed
@mroeschke
Member

Thanks @JakeTT404

@JakeTT404 JakeTT404 deleted the Wrong-error-message-in-HDFStore.append branch February 5, 2025 20:34
@rhshadrach
Member

rhshadrach commented Feb 9, 2025

This impacted #60795 - #60795 (comment)

Is this okay to backport to 2.3?

@mroeschke
Member

Yup sure OK with me

rhshadrach pushed a commit to rhshadrach/pandas that referenced this pull request Feb 10, 2025
…edge case in HDFStore.append (pandas-dev#60829)

* Add clearer error messages for datatype mismatch in HDFStore.append. Raise ValueError when nan_rep too large for pytable column. Add and modify applicable test code.

* Fix missed tests and correct mistake in error message.

* Remove excess comments. Reverse error type change to avoid api changes. Move nan_rep tests into separate function.

(cherry picked from commit 57340ec)
@rhshadrach
Member

Backport PR: #60907

mroeschke pushed a commit that referenced this pull request Feb 10, 2025
#60907)

ENH: Improved error message and raise new error for small-string NaN edge case in HDFStore.append (#60829)

* Add clearer error messages for datatype mismatch in HDFStore.append. Raise ValueError when nan_rep too large for pytable column. Add and modify applicable test code.

* Fix missed tests and correct mistake in error message.

* Remove excess comments. Reverse error type change to avoid api changes. Move nan_rep tests into separate function.

(cherry picked from commit 57340ec)

Co-authored-by: Jake Thomas Trevallion <[email protected]>
mroeschke pushed a commit that referenced this pull request Feb 16, 2025
…0916)

* ENH: Improved error message and raise new error for small-string NaN edge case in HDFStore.append (#60829)

* Add clearer error messages for datatype mismatch in HDFStore.append. Raise ValueError when nan_rep too large for pytable column. Add and modify applicable test code.

* Fix missed tests and correct mistake in error message.

* Remove excess comments. Reverse error type change to avoid api changes. Move nan_rep tests into separate function.

(cherry picked from commit 57340ec)

* TST(string dtype): Resolve xfails in pytables (#60795)

(cherry picked from commit 4511251)

* Adjust test

---------

Co-authored-by: Jake Thomas Trevallion <[email protected]>
mroeschke pushed a commit that referenced this pull request Feb 20, 2025
…ading with condition (#60967)

* ENH: Improved error message and raise new error for small-string NaN edge case in HDFStore.append (#60829)

* Add clearer error messages for datatype mismatch in HDFStore.append. Raise ValueError when nan_rep too large for pytable column. Add and modify applicable test code.

* Fix missed tests and correct mistake in error message.

* Remove excess comments. Reverse error type change to avoid api changes. Move nan_rep tests into separate function.

(cherry picked from commit 57340ec)

* TST(string dtype): Resolve xfails in pytables (#60795)

(cherry picked from commit 4511251)

* BUG(string dtype): Resolve pytables xfail when reading with condition (#60943)

(cherry picked from commit 0ec5f26)

---------

Co-authored-by: Jake Thomas Trevallion <[email protected]>
mroeschke pushed a commit that referenced this pull request Feb 20, 2025
* ENH: Improved error message and raise new error for small-string NaN edge case in HDFStore.append (#60829)

* Add clearer error messages for datatype mismatch in HDFStore.append. Raise ValueError when nan_rep too large for pytable column. Add and modify applicable test code.

* Fix missed tests and correct mistake in error message.

* Remove excess comments. Reverse error type change to avoid api changes. Move nan_rep tests into separate function.

(cherry picked from commit 57340ec)

* TST(string dtype): Resolve xfails in pytables (#60795)

(cherry picked from commit 4511251)

* BUG(string dtype): Resolve pytables xfail when reading with condition (#60943)

(cherry picked from commit 0ec5f26)

* Backport PR #60940: ENH: Add dtype argument to str.decode

---------

Co-authored-by: Jake Thomas Trevallion <[email protected]>
mroeschke pushed a commit that referenced this pull request Feb 22, 2025
…cked strings (#60984)

* ENH: Improved error message and raise new error for small-string NaN edge case in HDFStore.append (#60829)

* Add clearer error messages for datatype mismatch in HDFStore.append. Raise ValueError when nan_rep too large for pytable column. Add and modify applicable test code.

* Fix missed tests and correct mistake in error message.

* Remove excess comments. Reverse error type change to avoid api changes. Move nan_rep tests into separate function.

(cherry picked from commit 57340ec)

* TST(string dtype): Resolve xfails in pytables (#60795)

(cherry picked from commit 4511251)

* BUG(string dtype): Resolve pytables xfail when reading with condition (#60943)

(cherry picked from commit 0ec5f26)

* Backport PR #60940: ENH: Add dtype argument to str.decode

* Backport PR #60938: ENH(string dtype): Implement cumsum for Python-backed strings

---------

Co-authored-by: Jake Thomas Trevallion <[email protected]>
Labels
Error Reporting Incorrect or improved errors from pandas IO HDF5 read_hdf, HDFStore
3 participants