Labels
enhancement: Impact - something should be added to or changed about Parsons that isn't causing a current breakage
Description
petl has some support for generators. If we took advantage of this when Parsons creates tables, we could often avoid memory problems, since the rows would not all need to be held in a list up front.
Here's what that could look like.
class Table(ETL, ToFrom):
    """
    Create a Parsons Table. Accepts one of the following:
    - A list of lists, with list[0] holding field names, and the other lists holding data
    - A list of dicts
    - A petl table

    `Args:`
        lst: list
            See above for accepted list formats
        source: str
            The original data source from which the data was pulled (optional)
        name: str
            The name of the table (optional)
    """

    def __init__(
        self,
        lst: Union[list, tuple, petl.util.base.Table, _EmptyDefault] = _EMPTYDEFAULT,
    ):
        self.table = None

        # Normally we would use None as the default argument here
        # Instead of using None, we use a sentinel
        # This allows us to maintain the existing behavior
        # This is allowed: Table()
        # This should fail: Table(None)
        if lst is _EMPTYDEFAULT:
            self.table = petl.fromdicts([])

        elif isinstance(lst, petl.util.base.Table):
            # Create from a petl table
            self.table = lst

        else:
            try:
                iterable_data = iter(lst)
            except TypeError:
                raise ValueError(
                    f"Could not initialize table from input type. "
                    f"Got {type(lst)}, expected list, tuple, or petl Table"
                ) from None

            try:
                peek = next(iterable_data)
            except StopIteration:
                self.table = petl.fromdicts([])
            else:
                # petl can handle generators but does an explicit
                # inspect.generator check instead of duck typing, so we have to make
                # sure that this is a generator
                iterable_data = (each for each in itertools.chain([peek], iterable_data))

                row_type = type(peek)

                # Check for list of dicts
                if row_type is dict:
                    self.table = petl.fromdicts(iterable_data)

                # Check for list of lists
                elif row_type in [list, tuple]:
                    # the wrap method does not support generators (or,
                    # more precisely, only allows us to read a table
                    # created from a generator once)
                    self.table = petl.wrap(list(iterable_data))

        if not self.is_valid_table():
            raise ValueError("Could not create Table")

        # Count how many times someone is indexing directly into this
        # table, so we can warn against inefficient usage.
        self._index_count = 0
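To make the peek-and-rechain step easier to look at in isolation, here is a minimal standalone sketch using only the standard library. `peek_and_rewrap` is a hypothetical helper name, not something that exists in Parsons or petl. It shows two things the proposal relies on: an `itertools.chain` object is not itself a generator, so it has to be wrapped in a generator expression to pass an `inspect.isgenerator` check, and the resulting generator can only be consumed once.

```python
import inspect
import itertools
from typing import Any, Iterable, Iterator, Optional, Tuple


def peek_and_rewrap(data: Iterable[Any]) -> Tuple[Optional[type], Iterator[Any]]:
    """Return (type of the first row, generator over all rows).

    Hypothetical helper for illustration; not part of Parsons or petl.
    """
    iterator = iter(data)  # raises TypeError if data is not iterable
    try:
        first = next(iterator)
    except StopIteration:
        # Empty input: there is no row type, and the generator yields nothing
        return None, (row for row in ())
    # Put the consumed row back in front of the rest. chain() alone is not a
    # generator, so wrap it in a generator expression.
    rows = (row for row in itertools.chain([first], iterator))
    return type(first), rows


# Usage: rows are produced lazily, and the first row is not lost by peeking
row_type, rows = peek_and_rewrap({"id": i} for i in range(3))
print(row_type is dict)            # the row type decides which petl loader to use
print(inspect.isgenerator(rows))   # the check the code comment above refers to
print(list(rows))                  # first pass yields every row
print(list(rows))                  # second pass is empty: single use only
```

The single-use behaviour in the last line is the reason the list-of-lists branch above still materializes the rows with `list(...)` before calling `petl.wrap`.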