v0.2.0
This is an interesting development checkpoint. We restructured the code to make room for additional datasets, clarified the cache semantics, implemented fetching additional (large) data files from GitHub (specifically from the v0.1.0 release assets), and sketched out an API that lets a researcher load pandas DataFrames directly, which should be useful for exploration and analysis.
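To illustrate the caching idea sketched above (fetch a large data file from a GitHub release once, then serve it from a local cache), here is a minimal, hypothetical Python sketch. The function name `cached_fetch` and the cache layout are illustrative assumptions, not the library's actual API:

```python
"""Hedged sketch of release-asset caching; `cached_fetch` is hypothetical."""
from pathlib import Path
from urllib.request import urlretrieve


def cached_fetch(url: str, cache_dir: Path) -> Path:
    """Download `url` into `cache_dir` unless it is already cached.

    The cache key is simply the final path component of the URL,
    mirroring how a release-asset name maps to a local file.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    local = cache_dir / url.rsplit("/", 1)[-1]
    if not local.exists():  # cache miss: fetch from GitHub
        urlretrieve(url, local)
    return local
```

On a cache hit the function returns immediately without touching the network, which is what makes repeated DataFrame loads cheap after the first download.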
What's Changed
- feat(data): use GitHub releases as interim cache by @bassosimone in #49
- refactor(library): cache and pipeline are now packages by @bassosimone in #50
- refactor(pipeline): separate cache and pipeline code by @bassosimone in #51
- refactor(PipelineCacheEntry)!: return `Path` not `Path | None` by @bassosimone in #52
- refactor(pipeline)!: factor BigQuery-to-Parquet code by @bassosimone in #53
- refactor(pipeline): create pipeline.py with IQBPipeline by @bassosimone in #54
- feat: measure the code coverage with codecov by @bassosimone in #56
- doc(README.md): add badges by @bassosimone in #57
- feat(library/pipeline): add iqb_parquet_read by @bassosimone in #58
- feat(pipeline): introduce the dataset concept by @bassosimone in #59
- refactor(pipeline)!: use the dataset name concept by @bassosimone in #60
- feat(cache)!: use dataset granularity by @bassosimone in #61
- refactor(cache)!: explicitly mlab-scope funcs/data by @bassosimone in #62
- refactor(cache): separate mlab and generic code by @bassosimone in #63
- refactor(cache): move m-lab `get_data` to mlab.py by @bassosimone in #64
- refactor(cache)!: re-introduce CacheEntry by @bassosimone in #65
- feat(cache)!: implement IQBCache.get_iqb_data by @bassosimone in #66
- refactor(cache): move IQBCache to cache.py by @bassosimone in #67
- fix(queries): ensure we include the AS name by @bassosimone in #68
- doc(analysis): document the pandas based API by @bassosimone in #69
Full Changelog: v0.1.0...v0.2.0