Commit 19d18b8

Added the intro section
1 parent 2449e0b commit 19d18b8

1 file changed

+134
-0
lines changed

0-Intro.ipynb

Lines changed: 134 additions & 0 deletions
@@ -0,0 +1,134 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "deletable": true,
+    "editable": true
+   },
+   "source": [
+    "# HDF5 and pandas"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "deletable": true,
+    "editable": true
+   },
+   "source": [
+    "HDF5 is both a data container and a library designed to store and retrieve large amounts of data conveniently. It is used extensively in science, engineering, finance and many other fields. Two major Python packages wrap it:\n",
+    "\n",
+    "1. h5py\n",
+    "2. PyTables\n",
+    "\n",
+    "pandas also uses one of them (PyTables) to store and retrieve dataframes efficiently.\n",
+    "\n",
+    "In this tutorial you will learn how to create and read HDF5 datasets using both h5py and PyTables. We will also introduce the concept of data chunking and how it can be used to compress data efficiently, followed by a gentle introduction to the querying capabilities of PyTables. Finally, we will see how HDF5 and pandas can interact, not only to serialize pandas dataframes, but also to query them efficiently directly on disk (i.e. with no need to load the data into memory)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "deletable": true,
+    "editable": true
+   },
+   "source": [
+    "## Caveats for following the tutorial\n",
+    "\n",
+    "1. These notebooks have been created and tested mainly on Jupyter notebook 4.4 and Python 3.6. Python 2.7 should work equally well, except for a few particularities that are seldom used.\n",
+    "\n",
+    "2. You can follow the tutorial by re-running the [provided notebooks](https://github.com/FrancescAlted/PyData-BCN/releases). If you have problems with the Wi-Fi, there are USB pen drives available.\n",
+    "\n",
+    "3. **If** you cannot reproduce the expected results on your own laptop, do not worry too much; my advice is to concentrate on the tutor's explanations and ask whenever something is not clear enough."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "deletable": true,
+    "editable": true
+   },
+   "source": [
+    "## Requisites\n",
+    "\n",
+    "* Jupyter notebook\n",
+    "* numpy\n",
+    "* h5py\n",
+    "* tables (pytables)\n",
+    "* pandas\n",
+    "* matplotlib\n",
+    "* cartopy\n",
+    "\n",
+    "These should all be available in Anaconda or on PyPI. The only exception could be `cartopy`, which might not exist in the default conda channel; in that case, try installing it from the `conda-forge` channel instead:\n",
+    "\n",
+    "```\n",
+    "$ conda install -c conda-forge cartopy\n",
+    "```"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "deletable": true,
+    "editable": true
+   },
+   "source": [
+    "## Contents"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "deletable": true,
+    "editable": true
+   },
+   "source": [
+    "\n",
+    "1. [Basic Datatypes](1-Basic-Datatypes.ipynb)\n",
+    "\n",
+    "1. [Chunking](2-Chunking.ipynb)\n",
+    "\n",
+    "1. [Using Compression](3-Using-Compression.ipynb)\n",
+    "\n",
+    "1. [Structuring Datasets](4-Structuring-Datasets.ipynb)\n",
+    "\n",
+    "1. [Querying with PyTables](5-Querying-With-PyTables.ipynb)\n",
+    "\n",
+    "1. [Integration with pandas](6-Integration-With-Pandas.ipynb)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "collapsed": true,
+    "deletable": true,
+    "editable": true
+   },
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.6.1"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+}
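
The Requisites cell in the notebook above lists the packages the tutorial depends on. A quick way to check them before the session might look like the sketch below. It is not part of the commit: the helper name `missing_packages` and the `REQUIRED` list are hypothetical, and the import names are assumptions (PyTables is imported as `tables`, and the classic Jupyter notebook package as `notebook`). The check only asks the import system whether each package can be found; it does not import anything or verify versions.

```python
import importlib.util

# Import names assumed for the packages in the Requisites list.
# PyTables installs under the import name `tables`.
REQUIRED = ["notebook", "numpy", "h5py", "tables", "pandas", "matplotlib", "cartopy"]


def missing_packages(names):
    """Return the top-level names in `names` that the import system cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All tutorial requisites appear to be installed.")
```

If `cartopy` shows up as missing, the `conda-forge` command from the Requisites cell is the suggested fix.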
