Hypothesis strategies for Awkward Arrays.
Hypothesis is a property-based testing library. Its strategies are Python functions that generate test data designed to make tests fail in pytest or other testing frameworks. Once a test fails, Hypothesis searches for the simplest sample that causes the same error. Because Hypothesis explores edge cases automatically, you do not need to come up with test data manually.
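As a minimal sketch of these ideas, unrelated to Awkward Array, the example below uses only Hypothesis's built-in strategies. The test name and the predicate passed to `find` are illustrative choices, not part of this package.

```python
from hypothesis import find, given, strategies as st


@given(xs=st.lists(st.integers()))
def test_reverse_twice_is_identity(xs: list[int]) -> None:
    # Property: reversing a list twice gives back the original list.
    assert list(reversed(list(reversed(xs)))) == xs


# `find` searches for an example that satisfies the predicate and then
# shrinks it, which mirrors how Hypothesis simplifies a failing sample.
smallest = find(st.lists(st.integers()), lambda xs: sum(xs) >= 10)
```

Calling `test_reverse_twice_is_identity()` directly, or running it under pytest, makes Hypothesis draw many lists, including empty ones and ones with very large integers.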
Hypothesis itself includes strategies for NumPy and pandas data types. Xarray provides strategies for its data structures. The Apache Arrow codebase has strategies for PyArrow, which are not officially documented in its API reference.
I am putting together Hypothesis strategies that I developed for Awkward Array in this package. This is very early work in progress and still experimental. The APIs may change over time.
You can install the package from PyPI using pip:

```bash
pip install hypothesis-awkward
```

This also installs Hypothesis and Awkward Array as dependencies unless they are already installed.
The strategy `from_numpy` generates Awkward Arrays that are converted from NumPy
arrays. (Internally, it first generates NumPy arrays that can be converted to
Awkward Arrays, then converts them with `ak.from_numpy`.)
The test below converts the generated Awkward Array back to a NumPy array with
`to_numpy` and asserts that the list representations of both arrays are equal.
```python
from hypothesis import given

import awkward as ak
import hypothesis_awkward.strategies as st_ak


@given(ak_array=st_ak.from_numpy(allow_structured=False))
def test_array(ak_array: ak.Array) -> None:
    np_array = ak_array.to_numpy()
    assert ak_array.to_list() == np_array.tolist()
```

So far, I have written strategies based on the first two sections of the Awkward Array User Guide: "How to convert to/from NumPy" and "How to convert to/from Python objects".
These strategies are related to the section "How to convert to/from NumPy".
| Strategy | Data type |
|---|---|
| `from_numpy` | Awkward Arrays created from NumPy arrays |
| `numpy_arrays` | NumPy arrays that can be converted to Awkward Array |
| `numpy_dtypes` | NumPy dtypes (simple or array) supported by Awkward Array |
| `supported_dtypes` | NumPy dtypes (simple only) supported by Awkward Array |
| `supported_dtype_names` | Names of NumPy dtypes (simple only) supported by Awkward Array |
These strategies are related to the section "How to convert to/from Python objects".
| Strategy | Data type |
|---|---|
| `from_list` | Awkward Arrays created from Python lists |
| `lists` | Nested Python lists for which Awkward Arrays can be created |
| `items_from_dtype` | Python built-in type values for a given NumPy dtype |
| `builtin_safe_dtypes` | NumPy dtypes with corresponding Python built-in types |
| `builtin_safe_dtype_names` | Names of NumPy dtypes with corresponding Python built-in types |
The strategies that I have developed so far generate only samples with certain types of layouts. It is probably possible to build strategies that generate fully general Awkward Arrays using the array builder and direct constructors. Such strategies would help find edge cases in tools that use Awkward Array, and even in Awkward Array itself.