If all fields in input.csv are not quoted, anyascii fails when attempting to convert ints to ASCII #52

@mjfos2r

I just came across this error when running pprl create. It returned the following:

[INFO] pprl.pprl: TOTAL RECORDS: [redacted]
[ERROR] pprl.commands.create: Unhandled error during 'create_CLKs' execution.
Traceback (most recent call last):
  File "/Users/mf019/bioinformatics/pprl/src/pprl/commands/create.py", line 44, in run_create
    rc = pprl.create_CLKs(args)
  File "/Users/mf019/bioinformatics/pprl/src/pprl/pprl.py", line 55, in create_CLKs
    rc = _create_CLKs(**configuration)
  File "/Users/mf019/bioinformatics/pprl/src/pprl/pprl.py", line 125, in _create_CLKs
    .map(anyascii)
     ~~~^^^^^^^^^^
  File "/Users/mf019/bioinformatics/pprl/.venv/lib/python3.13/site-packages/pandas/core/frame.py", line 10495, in map
    return self.apply(infer).__finalize__(self, "map")
           ~~~~~~~~~~^^^^^^^
  File "/Users/mf019/bioinformatics/pprl/.venv/lib/python3.13/site-packages/pandas/core/frame.py", line 10401, in apply
    return op.apply().__finalize__(self, method="apply")
           ~~~~~~~~^^
  File "/Users/mf019/bioinformatics/pprl/.venv/lib/python3.13/site-packages/pandas/core/apply.py", line 916, in apply
    return self.apply_standard()
           ~~~~~~~~~~~~~~~~~~~^^
  File "/Users/mf019/bioinformatics/pprl/.venv/lib/python3.13/site-packages/pandas/core/apply.py", line 1063, in apply_standard
    results, res_index = self.apply_series_generator()
                         ~~~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/Users/mf019/bioinformatics/pprl/.venv/lib/python3.13/site-packages/pandas/core/apply.py", line 1081, in apply_series_generator
    results[i] = self.func(v, *self.args, **self.kwargs)
                 ~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/mf019/bioinformatics/pprl/.venv/lib/python3.13/site-packages/pandas/core/frame.py", line 10493, in infer
    return x._map_values(func, na_action=na_action)
           ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/mf019/bioinformatics/pprl/.venv/lib/python3.13/site-packages/pandas/core/base.py", line 925, in _map_values
    return algorithms.map_array(arr, mapper, na_action=na_action, convert=convert)
           ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/mf019/bioinformatics/pprl/.venv/lib/python3.13/site-packages/pandas/core/algorithms.py", line 1743, in map_array
    return lib.map_infer(values, mapper, convert=convert)
           ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "pandas/_libs/lib.pyx", line 2999, in pandas._libs.lib.map_infer
  File "/Users/mf019/bioinformatics/pprl/.venv/lib/python3.13/site-packages/anyascii/__init__.py", line 32, in anyascii
    for char in string:
                ^^^^^^
TypeError: 'int' object is not iterable
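
The last frame shows anyascii iterating over its argument character by character, so any non-string cell (here, presumably the int64 row_id column) raises. Below is a minimal sketch of that failure mode and one possible guard. Note that `to_ascii` is a hypothetical stdlib stand-in for anyascii, not its real implementation, and `safe_ascii` is an illustrative helper that is not part of pprl.

```python
import unicodedata


def to_ascii(string):
    # Hypothetical stand-in for anyascii: like the frame in the traceback,
    # it iterates over its argument character by character.
    out = []
    for char in string:
        decomposed = unicodedata.normalize("NFKD", char)
        out.append(decomposed.encode("ascii", "ignore").decode("ascii"))
    return "".join(out)


def safe_ascii(value):
    # Illustrative guard (an assumption, not pprl code): coerce non-string
    # cells to str before transliterating.
    return to_ascii(value if isinstance(value, str) else str(value))


try:
    to_ascii(12345)  # an int cell, as in an int64 row_id column
except TypeError as exc:
    error_message = str(exc)  # same TypeError as in the traceback

ascii_name = safe_ascii("José")  # string cells are transliterated
ascii_id = safe_ascii(12345)     # int cells are converted to text first
```

One caveat with a guard like this: missing values stored as float NaN would be turned into the literal string "nan", so they may need separate handling.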

I went back and re-exported my data with the following:
final_list.astype(str).to_csv("[redacted]", sep=',', index=False, quoting=csv.QUOTE_ALL)

and then re-ran pprl create, which executed without errors.

Here are the columns and dtypes that generated the error above:

source    object
row_id     int64
first     object
last      object
city      object
state     object
zip       object
dob       object
dtype: object

And here are the columns and dtypes after enforcing quoting=csv.QUOTE_ALL:

source    object
row_id    object
first     object
last      object
city      object
state     object
zip       object
dob       object
dtype: object
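
The only difference between the two listings is row_id (int64 versus object). How pprl reads input.csv is not shown here, so the following is just a sketch, assuming the file is loaded with pandas.read_csv: it shows how an all-digit column is inferred as an integer dtype by default, and how passing dtype=str keeps it as text. The sample data is hypothetical; only the column names come from the listings above.

```python
import io

import pandas as pd

# Hypothetical sample data; the column names mirror the listings above.
csv_text = "source,row_id,first,last\nsiteA,1,Ana,Lopez\nsiteA,2,Bo,Chen\n"

# Default type inference: the all-digit row_id column gets an integer dtype.
inferred = pd.read_csv(io.StringIO(csv_text))

# Forcing every column to text keeps the row_id values as strings.
as_text = pd.read_csv(io.StringIO(csv_text), dtype=str)

print(inferred.dtypes)
print(as_text.dtypes)
```

If this assumption holds, reading with dtype=str (or calling .astype(str) on the frame before .map(anyascii)) might avoid the need to re-export the CSV.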

This is unexpected behavior, and I want to document it here in case any other sites experience the same issue.
