Commit c0c54ba

Use dicts instead of OrderedDicts for headers (#133)
1 parent 7051595 commit c0c54ba

7 files changed: +50 -45 lines changed


CHANGELOG.md

Lines changed: 4 additions & 1 deletion
@@ -1,6 +1,9 @@
 # TFS-Pandas Changelog

-## IN PROGRESS - 3.9.0
+## Version 3.8.2
+
+- Changed:
+  - The headers of a `TfsDataFrame` are now stored as a `dict` and no longer an `OrderedDict`. This is transparent to the user.

 - Fixed:
   - Removed a workaround function which is no longer necessary due to the higher minimum `pandas` version.
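
The changelog entry above calls the switch transparent to the user. A minimal sketch of why that is plausible, using only the standard library: the header keys and values below are made up for illustration and do not come from the package. Plain dicts have kept insertion order since Python 3.7 and compare equal to an `OrderedDict` holding the same items. The caveats are user code that checks for `OrderedDict` explicitly or relies on `OrderedDict`-only methods.

```python
from collections import OrderedDict

# Hypothetical header contents, chosen only for this illustration.
headers = {"TITLE": "example", "DPP": 0.0, "Q1": 0.269}
legacy = OrderedDict(headers)

# Plain dicts preserve insertion order (a language guarantee since Python 3.7).
same_order = list(headers) == list(legacy)

# A dict and an OrderedDict compare equal when they hold the same items.
same_items = headers == legacy

# An OrderedDict is a dict, but a plain dict is not an OrderedDict, so an
# explicit isinstance(..., OrderedDict) check in user code would now fail.
still_a_dict = isinstance(legacy, dict)
is_ordered_dict = isinstance(headers, OrderedDict)

# OrderedDict-only helpers such as move_to_end do not exist on a plain dict.
has_move_to_end = hasattr(headers, "move_to_end")

print(same_order, same_items, still_a_dict, is_ordered_dict, has_move_to_end)
```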

README.md

Lines changed: 7 additions & 3 deletions
@@ -4,31 +4,34 @@
 [![Code Climate coverage](https://img.shields.io/codeclimate/coverage/pylhc/tfs.svg?style=popout)](https://codeclimate.com/github/pylhc/tfs)
 [![Code Climate maintainability (percentage)](https://img.shields.io/codeclimate/maintainability-percentage/pylhc/tfs.svg?style=popout)](https://codeclimate.com/github/pylhc/tfs)
 <!-- [![GitHub last commit](https://img.shields.io/github/last-commit/pylhc/tfs.svg?style=popout)](https://github.com/pylhc/tfs/) -->
-[![PyPI Version](https://img.shields.io/pypi/v/tfs-pandas?label=PyPI&logo=pypi)](https://pypi.org/project/tfs-pandas/)
 [![GitHub release](https://img.shields.io/github/v/release/pylhc/tfs?logo=github)](https://github.com/pylhc/tfs/)
+[![PyPI Version](https://img.shields.io/pypi/v/tfs-pandas?label=PyPI&logo=pypi)](https://pypi.org/project/tfs-pandas/)
 [![Conda-forge Version](https://img.shields.io/conda/vn/conda-forge/tfs-pandas?color=orange&logo=anaconda)](https://anaconda.org/conda-forge/tfs-pandas)
 [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.5070986.svg)](https://doi.org/10.5281/zenodo.5070986)

-This package provides reading and writing functionality for [**Table Format System (TFS)** files](http://mad.web.cern.ch/mad/madx.old/Introduction/tfs.html).
-Files are read into a `TfsDataFrame`, a class built on top of the famous `pandas.DataFrame`, which in addition to the normal behavior attaches an `OrderedDict` of headers to the `DataFrame`.
+This package provides reading and writing functionality for [**Table Format System (TFS)** files](http://mad.web.cern.ch/mad/madx.old/Introduction/tfs.html).
+Files are read into a `TfsDataFrame`, a class built on top of the famous `pandas.DataFrame`, which in addition to the normal behavior attaches a dictionary of headers to the `DataFrame`.

 See the [API documentation](https://pylhc.github.io/tfs/) for details.

 ## Installing

 Installation is easily done via `pip`:
+
 ```bash
 python -m pip install tfs-pandas
 ```

 One can also install in a `conda`/`mamba` environment via the `conda-forge` channel with:
+
 ```bash
 conda install -c conda-forge tfs-pandas
 ```

 ## Example Usage

 The package is imported as `tfs`, and exports top-level functions for reading and writing:
+
 ```python
 import tfs

@@ -50,6 +53,7 @@ tfs.write("path_to_output.tfs", data_frame, save_index="index_column")
 ```

 Reading and writing compressed files is also supported, and done automatically based on the provided file extension:
+
 ```python
 import tfs

tests/test_frame.py

Lines changed: 17 additions & 18 deletions
@@ -1,5 +1,4 @@
 import pathlib
-from collections import OrderedDict
 from functools import partial, reduce

 import pandas as pd
@@ -21,8 +20,8 @@ def test_validate_raises_on_wrong_unique_behavior(self):

     @pytest.mark.parametrize("how", ["invalid", "not_left", "not_right"])
     def test_merge_headers_raises_on_invalid_how_key(self, how):
-        headers_left = OrderedDict()
-        headers_right = OrderedDict()
+        headers_left = {}
+        headers_right = {}

         with pytest.raises(ValueError, match="Invalid 'how' argument"):
             merge_headers(headers_left, headers_right, how=how)
@@ -49,7 +48,7 @@ def test_correct_merging(self, _tfs_file_x_pathlib, _tfs_file_y_pathlib, how_hea
         result = dframe_x.merge(dframe_y, how_headers=how_headers, how=how, on=on)

         assert isinstance(result, TfsDataFrame)
-        assert isinstance(result.headers, OrderedDict)
+        assert isinstance(result.headers, dict)
         assert_dict_equal(result.headers, merge_headers(dframe_x.headers, dframe_y.headers, how=how_headers))
         assert_frame_equal(result, pd.DataFrame(dframe_x).merge(pd.DataFrame(dframe_y), how=how, on=on))

@@ -64,10 +63,10 @@ def test_merging_accepts_pandas_dataframe(
         result = dframe_x.merge(dframe_y, how_headers=how_headers, how=how, on=on)

         assert isinstance(result, TfsDataFrame)
-        assert isinstance(result.headers, OrderedDict)
+        assert isinstance(result.headers, dict)

-        # using empty OrderedDict here as it's what dframe_y is getting when converted in the call
-        assert_dict_equal(result.headers, merge_headers(dframe_x.headers, OrderedDict(), how=how_headers))
+        # using empty dict here as it's what dframe_y is getting when converted in the call
+        assert_dict_equal(result.headers, merge_headers(dframe_x.headers, headers_right={}, how=how_headers))
         assert_frame_equal(result, pd.DataFrame(dframe_x).merge(pd.DataFrame(dframe_y), how=how, on=on))


@@ -78,7 +77,7 @@ def test_headers_merging_left(self, _tfs_file_x_pathlib, _tfs_file_y_pathlib, ho
         headers_right = tfs.read(_tfs_file_y_pathlib).headers
         result = merge_headers(headers_left, headers_right, how=how)

-        assert isinstance(result, OrderedDict)
+        assert isinstance(result, dict)
         assert len(result) >= len(headers_left)  # no key disappeared
         assert len(result) >= len(headers_right)  # no key disappeared
         for key in result:  # check that we prioritized headers_left's contents
@@ -91,7 +90,7 @@ def test_headers_merging_right(self, _tfs_file_x_pathlib, _tfs_file_y_pathlib, h
         headers_right = tfs.read(_tfs_file_y_pathlib).headers
         result = merge_headers(headers_left, headers_right, how=how)

-        assert isinstance(result, OrderedDict)
+        assert isinstance(result, dict)
         assert len(result) >= len(headers_left)  # no key disappeared
         assert len(result) >= len(headers_right)  # no key disappeared
         for key in result:  # check that we prioritized headers_right's contents
@@ -103,17 +102,17 @@ def test_headers_merging_none_returns_empty_dict(self, _tfs_file_x_pathlib, _tfs
         headers_left = tfs.read(_tfs_file_x_pathlib).headers
         headers_right = tfs.read(_tfs_file_y_pathlib).headers
         result = merge_headers(headers_left, headers_right, how=how)
-        assert result == OrderedDict()  # giving None returns empty headers
+        assert result == {}  # giving None returns empty headers

     def test_providing_new_headers_overrides_merging(self, _tfs_file_x_pathlib, _tfs_file_y_pathlib):
         dframe_x = tfs.read(_tfs_file_x_pathlib)
         dframe_y = tfs.read(_tfs_file_y_pathlib)

-        assert dframe_x.merge(right=dframe_y, new_headers={}).headers == OrderedDict()
-        assert dframe_y.merge(right=dframe_x, new_headers={}).headers == OrderedDict()
+        assert dframe_x.merge(right=dframe_y, new_headers={}).headers == {}
+        assert dframe_y.merge(right=dframe_x, new_headers={}).headers == {}

-        assert tfs.concat([dframe_x, dframe_y], new_headers={}).headers == OrderedDict()
-        assert tfs.concat([dframe_y, dframe_x], new_headers={}).headers == OrderedDict()
+        assert tfs.concat([dframe_x, dframe_y], new_headers={}).headers == {}
+        assert tfs.concat([dframe_y, dframe_x], new_headers={}).headers == {}


 class TestPrinting:
@@ -157,7 +156,7 @@ def test_correct_concatenating(self, _tfs_file_x_pathlib, _tfs_file_y_pathlib, h
         merger = partial(merge_headers, how=how_headers)
         all_headers = (tfsdframe.headers for tfsdframe in objs)
         assert isinstance(result, TfsDataFrame)
-        assert isinstance(result.headers, OrderedDict)
+        assert isinstance(result.headers, dict)
         assert_dict_equal(result.headers, reduce(merger, all_headers))
         assert_frame_equal(result, pd.concat(objs, axis=axis, join=join))

@@ -175,10 +174,10 @@ def test_concatenating_accepts_pandas_dataframes(
         merger = partial(merge_headers, how=how_headers)
         # all_headers = (tfsdframe.headers for tfsdframe in objs)
         assert isinstance(result, TfsDataFrame)
-        assert isinstance(result.headers, OrderedDict)
+        assert isinstance(result.headers, dict)

-        all_headers = [  # empty OrderedDicts here as it's what objects are getting when converted in the call
-            dframe.headers if isinstance(dframe, TfsDataFrame) else OrderedDict() for dframe in objs
+        all_headers = [  # empty dicts here as it's what objects are getting when converted in the call
+            dframe.headers if isinstance(dframe, TfsDataFrame) else {} for dframe in objs
         ]
         assert_dict_equal(result.headers, reduce(merger, all_headers))
         assert_frame_equal(result, pd.concat(objs, axis=axis, join=join))

tfs/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -11,7 +11,7 @@
 __title__ = "tfs-pandas"
 __description__ = "Read and write tfs files."
 __url__ = "https://github.com/pylhc/tfs"
-__version__ = "3.8.1"
+__version__ = "3.8.2"
 __author__ = "pylhc"
 __author_email__ = "[email protected]"
 __license__ = "MIT"

tfs/frame.py

Lines changed: 16 additions & 15 deletions
@@ -9,7 +9,6 @@
 from __future__ import annotations

 import logging
-from collections import OrderedDict
 from contextlib import suppress
 from functools import partial, reduce
 from typing import TYPE_CHECKING, ClassVar
@@ -147,23 +146,25 @@ def merge(
         return TfsDataFrame(data=dframe, headers=new_headers)


-def merge_headers(headers_left: dict, headers_right: dict, how: str) -> OrderedDict:
+def merge_headers(headers_left: dict, headers_right: dict, how: str) -> dict:
     """
     Merge headers of two ``TfsDataFrames`` together.

     Args:
-        headers_left (dict): Headers of caller (left) ``TfsDataFrame`` when calling ``.append``, ``.join`` or
-            ``.merge``. Headers of the left (preceeding) ``TfsDataFrame`` when calling ``tfs.frame.concat``.
-        headers_right (dict): Headers of other (right) ``TfsDataFrame`` when calling ``.append``, ``.join``
-            or ``.merge``. Headers of the left (preceeding) ``TfsDataFrame`` when calling
-            ``tfs.frame.concat``.
-        how (str): Type of merge to be performed, either **left** or **right**. If **left*, prioritize keys
-            from **headers_left** in case of duplicate keys. If **right**, prioritize keys from
-            **headers_right** in case of duplicate keys. Case insensitive. If ``None`` is given,
-            an empty dictionary will be returned.
+        headers_left (dict): Headers of caller (left) ``TfsDataFrame`` when calling
+            ``.append``, ``.join`` or ``.merge``. Headers of the left (preceding)
+            ``TfsDataFrame`` when calling ``tfs.frame.concat``.
+        headers_right (dict): Headers of other (right) ``TfsDataFrame`` when calling
+            ``.append``, ``.join`` or ``.merge``. Headers of the right (following)
+            ``TfsDataFrame`` when calling ``tfs.frame.concat``.
+        how (str): Type of merge to be performed, either **left** or **right**. If
+            **left**, prioritize keys from **headers_left** in case of duplicate keys.
+            If **right**, prioritize keys from **headers_right** in case of duplicate
+            keys. Case-insensitive. If ``None`` is given, an empty dictionary will be
+            returned.

     Returns:
-        A new ``OrderedDict`` as the merge of the two provided dictionaries.
+        A new dictionary as the merge of the two provided dictionaries.
     """
     accepted_merges: set[str] = {"left", "right", "none"}
     if str(how).lower() not in accepted_merges:  # handles being given None
@@ -172,14 +173,14 @@ def merge_headers(headers_left: dict, headers_right: dict, how: str) -> OrderedD

     LOGGER.debug(f"Merging headers with method '{how}'")
     if str(how).lower() == "left":  # we prioritize the contents of headers_left
-        result = headers_right.copy()
+        result: dict = headers_right.copy()
         result.update(headers_left)
     elif str(how).lower() == "right":  # we prioritize the contents of headers_right
-        result = headers_left.copy()
+        result: dict = headers_left.copy()
         result.update(headers_right)
     else:  # we were given None, result will be an empty dict
         result = {}
-    return OrderedDict(result)  # so that the TfsDataFrame still has an OrderedDict as header
+    return result


 def concat(
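
The merge rule described in the `merge_headers` docstring can be tried outside the package with a standalone sketch. The function name `merge_headers_sketch`, the sample header values, and the exact error message are assumptions for illustration (the tests above only show that the message starts with "Invalid 'how' argument"); the copy-then-update logic mirrors the hunk above.

```python
from typing import Optional


def merge_headers_sketch(headers_left: dict, headers_right: dict, how: Optional[str]) -> dict:
    """Standalone approximation of the merge rule shown in the diff."""
    mode = str(how).lower()  # str(None).lower() == "none", so None is handled too
    if mode not in {"left", "right", "none"}:
        # Message text is an assumption; only its prefix appears in the tests.
        raise ValueError(f"Invalid 'how' argument: {how!r}")

    if mode == "left":  # keys from headers_left win on duplicates
        result = headers_right.copy()
        result.update(headers_left)
    elif mode == "right":  # keys from headers_right win on duplicates
        result = headers_left.copy()
        result.update(headers_right)
    else:  # None (or "none") yields empty headers
        result = {}
    return result


# Hypothetical headers with one duplicate key ("TITLE").
left = {"TITLE": "left run", "Q1": 0.31}
right = {"TITLE": "right run", "Q2": 0.32}

merged_left = merge_headers_sketch(left, right, how="left")
merged_right = merge_headers_sketch(left, right, how="RIGHT")  # case-insensitive
merged_none = merge_headers_sketch(left, right, how=None)

print(merged_left["TITLE"], merged_right["TITLE"], merged_none)
```

Because each branch works on a copy, neither input dictionary is modified, and no key from either side is dropped unless `how` is `None`.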

tfs/reader.py

Lines changed: 4 additions & 5 deletions
@@ -10,7 +10,6 @@
 import logging
 import pathlib
 import shlex
-from collections import OrderedDict
 from dataclasses import dataclass

 import numpy as np
@@ -168,7 +167,7 @@ def read_tfs(
     return tfs_data_frame


-def read_headers(tfs_file_path: pathlib.Path | str) -> OrderedDict:
+def read_headers(tfs_file_path: pathlib.Path | str) -> dict:
     """
     Parses the top of the **tfs_file_path** and returns the headers.

@@ -178,7 +177,7 @@ def read_headers(tfs_file_path: pathlib.Path | str) -> OrderedDict:
             a Path object.

     Returns:
-        An ``OrderedDict`` with the headers read from the file.
+        A dictionary with the headers read from the file.


     Examples:
@@ -207,7 +206,7 @@ def read_headers(tfs_file_path: pathlib.Path | str) -> OrderedDict:
 class _TfsMetaData:
     """A dataclass to encapsulate the metadata read from a TFS file."""

-    headers: OrderedDict
+    headers: dict
     non_data_lines: int
     column_names: np.ndarray
     column_types: np.ndarray
@@ -234,7 +233,7 @@ def _read_metadata(tfs_file_path: pathlib.Path | str) -> _TfsMetaData:
     """
     LOGGER.debug("Reading headers and metadata from file")
     tfs_file_path = pathlib.Path(tfs_file_path)
-    headers = OrderedDict()
+    headers = {}
     column_names = column_types = None

     # Read the headers, chunk by chunk (line by line) with pandas.read_csv as a

tfs/writer.py

Lines changed: 1 addition & 2 deletions
@@ -9,7 +9,6 @@

 import logging
 import pathlib
-from collections import OrderedDict

 import numpy as np
 import pandas as pd
@@ -112,7 +111,7 @@ def write_tfs(
     try:
         headers_dict = data_frame.headers
     except AttributeError:
-        headers_dict = OrderedDict()
+        headers_dict = {}

     data_frame = data_frame.convert_dtypes(convert_integer=False)

