Open
- partridge version: 0.11.0 (but also happens on 1.1.1)
- Python version: 3.8
- Operating System: Win 10
Description
I tried to change the types of the _id columns (e.g. route_id) in some tables from dtype object to numeric, to lower the memory usage. I did that by adding a converter to the default config.
The load itself ran without errors, but the DataFrames came back empty. I looked into it a bit, and I think the cause is that the read_file method prunes the table before converting its types, so an object column (in the current table) gets compared against a numeric column (from the dependency table, which has already been converted).
I'm not sure what the right solution would be; maybe cast both columns to object before the comparison.
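To illustrate the suspected mismatch outside of partridge, here is a minimal pandas-only sketch. The `routes` and `trips` frames are made-up stand-ins, and using `isin` as the pruning filter is an assumption for illustration, not a claim about how partridge does it internally. It shows that string IDs never match integer IDs, and that casting both sides to a common type restores the matches.

```python
import pandas as pd

# Stand-in for a dependency table whose route_id was already converted to numeric.
routes = pd.DataFrame({"route_id": pd.to_numeric(pd.Series(["1", "2", "3"]))})

# Stand-in for the table being pruned, still holding raw strings (dtype object).
trips = pd.DataFrame(
    {
        "trip_id": ["a", "b", "c"],
        "route_id": pd.Series(["1", "2", "3"], dtype=object),
    }
)

# Mixed dtypes: the string "1" is not equal to the integer 1, so nothing matches.
mismatched = trips[trips["route_id"].isin(routes["route_id"])]

# Casting both sides to str before comparing makes the IDs comparable again.
matched = trips[
    trips["route_id"].astype(str).isin(routes["route_id"].astype(str))
]

print(len(mismatched), len(matched))
```

If pruning works along these lines, an empty result for every table that depends on a converted column is exactly what you would expect.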
What I Did
In[1]: import partridge as ptg
In[2]: conf = ptg.config.default_config()
In[3]: ptg.load_feed(path, config=conf).routes
Out[3]:
route_id agency_id route_short_name ... route_desc route_type route_color
0 1 25 1 ... 67001-1-# 3 NaN
1 2 25 1 ... 67001-2-# 3 NaN
2 3 25 2 ... 56002-1-# 3 NaN
[3 rows x 7 columns]
In[4]: import pandas as pd
In[5]: conf.nodes['trips.txt']['converters']['route_id'] = pd.to_numeric
In[6]: conf.nodes['routes.txt']['converters']['route_id'] = pd.to_numeric
In[7]: ptg.load_feed(path, config=conf).routes
Out[7]:
Empty DataFrame
Columns: [route_id, agency_id, route_short_name, route_long_name, route_desc, route_type, route_color]
Index: []
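A possible workaround until this is resolved, sketched under the assumption that the tables load correctly with the default config and come back as ordinary pandas DataFrames: convert the ID columns after loading instead of through a converter. `downcast_ids` is a hypothetical helper, not part of partridge, and the `routes` frame below is made-up sample data standing in for a loaded table.

```python
import pandas as pd


def downcast_ids(df, columns):
    """Return a copy of df with the given columns converted to compact integers.

    Hypothetical helper: only works when every value is a numeric string;
    pd.to_numeric raises ValueError otherwise.
    """
    out = df.copy()
    for col in columns:
        out[col] = pd.to_numeric(out[col], downcast="integer")
    return out


# Made-up stand-in for a routes table loaded with the default config.
routes = pd.DataFrame(
    {
        "route_id": pd.Series(["1", "2", "3"], dtype=object),
        "route_type": [3, 3, 3],
    }
)

compact = downcast_ids(routes, ["route_id"])
print(compact.dtypes)
```

The same conversion has to be applied to every table that shares the column (e.g. route_id in both routes and trips), otherwise later joins between them hit the same dtype mismatch.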