
Commit 56eeab4

Otto Wagner committed:

Refactor install docs [dc install, notebooks, ui] and ingestion docs with John's revisions.
1 parent d65dfd1 commit 56eeab4

4 files changed: +125, -98 lines changed


docs/datacube_install.md

Lines changed: 15 additions & 10 deletions
@@ -119,18 +119,17 @@ pip install netcdf4
 ```
 
 Please note that the installed gdal version should be as close to your system gdal version as possible.
+At the time of this writing, the `gdalinfo` command below outputs 1.11.3, which means that version 1.11.2 is the closest version that satisfies our requirements.
 We try to install a non-existent version (99999999999) to have pip print all available versions.
 
 ```
 gdalinfo --version
 pip install gdal==99999999999
 ```
 
-At the time this is being written, the above command outputs 1.11.3, which means that version 1.11.2 is the closest version that satisfies our requirements.
-
 Now that all requirements have been satisfied, run the setup.py script in the agdc-v2 directory:
 
-**It has come to our attention that the setup.py script fails the first time it is run due to some NetCDF/Cython issues. Run the script a second time to install if this occurs.**
+**It has come to our attention that the setup.py script can fail the first time it is run due to some NetCDF/Cython issues. Run the script a second time to install if this occurs.**
 ```
 cd ~/Datacube/agdc-v2
 python setup.py develop
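
A minimal sketch of the version-matching step above, assuming the versions quoted in the hunk (`gdalinfo` reporting 1.11.3, with 1.11.2 the closest release in pip's listing); this is an editor's illustration, not a command from the guide:

```
# Match the pip gdal release to the system GDAL reported by gdalinfo.
import subprocess

print(subprocess.check_output(['gdalinfo', '--version']).decode())
# e.g. "GDAL 1.11.3, released 2015/09/16"

# From the versions pip printed for the bogus 99999999999 request,
# 1.11.2 is the closest release not newer than the system GDAL:
subprocess.check_call(['pip', 'install', 'gdal==1.11.2'])
```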
@@ -156,7 +155,7 @@ Open this file in your editor of choice and find the line that starts with 'time
 timezone = 'UTC'
 ```
 
-This will ensure that all of the datetime fields in the database are stored in UTC. Next, open the pg_hba.conf file found at:
+This will ensure that all of the datetime fields in the database are stored in UTC. Next, open the `pg_hba.conf` file found at:
 
 ```
 /etc/postgresql/9.5/main/pg_hba.conf
@@ -184,7 +183,7 @@ sudo service postgresql restart
 
 Data Cube Configuration file
 ---------------
-The Data Cube requires a configuration file that points to the correct database and provides credentials. The file's contents looks like below should be named '.datacube.conf':
+The Data Cube requires a configuration file that points to the correct database and provides credentials. The contents of the `.datacube.conf` file should appear as follows:
 
 ```
 [datacube]
@@ -208,9 +207,9 @@ gedit ~/Datacube/data_cube_ui/config/.datacube.conf
 cp ~/Datacube/data_cube_ui/config/.datacube.conf ~/.datacube.conf
 ```
 
-This will move the required .datacube.conf file to the home directory. The user's home directory is the default location for the configuration file and will be used for all command line based Data Cube operations. The next step is to create the database specified in the configuration file.
+This will copy the required `.datacube.conf` file to the home directory. The user's home directory is the default location for the configuration file and will be used for all command-line-based Data Cube operations. The next step is to create the database specified in the configuration file.
 
-To create the database use the following:
+To create the database, run the following commands:
 
 ```
 sudo -u postgres createuser --superuser dc_user
@@ -244,9 +243,15 @@ Done.
 
 If you have PGAdmin3 installed, you can view the default schemas and relationships by connecting to the database named 'datacube' and viewing the tables, views, and indexes in the schema 'agdc'.
 
+Alternatively, you can do the same from the command line. First log in with the command `psql -U dc_user datacube`.
+To view schemas, run `\dn` at the psql prompt.
+View the full documentation of the `psql` command [here](https://www.postgresql.org/docs/9.5/static/app-psql.html).
+
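
The same check can be made from Python; a minimal sketch, assuming `~/.datacube.conf` is in place and `datacube system init` has completed:

```
# Confirm that the Data Cube API can reach the 'datacube' database.
import datacube

dc = datacube.Datacube(app='connection-check')
print(dc.list_products())  # empty until products are added, but proves the connection
```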
 <a name="next_steps"></a> Next Steps
 ========
-Now that the Data Cube system is installed and initialized, the next step is to ingest some sample data. Our focus is on ARD (Analysis Ready Data) - the best introduction to the ingestion/indexing process is to use a single Landsat 7 or Landsat 8 SR product. Download a sample dataset from [Earth Explorer](https://earthexplorer.usgs.gov/) and proceed to the next document in this series, [The ingestion process](ingestion.md). Please ensure that the dataset you download is an SR product - the L\*.tar.gz should contain .tif files with the file pattern `L**_sr_band*.tif` This will correspond to datasets labeled "Collection 1 Higher-Level".
+Now that the Data Cube system is installed and initialized, the next step is to ingest some sample data. Our focus is on ARD (Analysis Ready Data) - the best introduction to the ingestion/indexing process is to use a single Landsat 7 or Landsat 8 SR product.
+There is a sample ingestion file provided in [the ingestion documentation](ingestion.md) in the "Prerequisites" section.
+More generally, download a sample dataset from [Earth Explorer](https://earthexplorer.usgs.gov/) and proceed to the next document in this series, [the ingestion process](ingestion.md). Please ensure that the dataset you download is an SR product - the L\*.tar.gz should contain .tif files with the file pattern `L**_sr_band*.tif`. This will correspond to datasets labeled "Collection 1 Higher-Level".
 
 
 <a name="faqs"></a> Common problems/FAQs
@@ -281,7 +286,7 @@ Q:
 >Can the Data Cube be accessed from R/C++/IDL/etc.?
 
 A:
->This is not currently directly supported, the Data Cube is a Python based API. The base technology managing data access PostgreSQL, so theoretically the functionality can be ported to any language that can interact with the database. An additional option is just shelling out from those languages, accessing data using the Python API, then passing the result back to the other program/language.
+>This is not currently directly supported. The Data Cube is a Python-based API. The technology managing data access is PostgreSQL, so theoretically the functionality can be ported to any language that can interact with the database. An additional option is just shelling out from those languages, accessing data using the Python API, then passing the result back to the other program/language.
 
 ---
 
@@ -297,7 +302,7 @@ Q:
 >I want to store more metadata that isn't mentioned in the documentation. Is this possible?
 
 A:
->This entire process is completely customizable. Users can configure exactly what metadata they want to capture for each dataset - we use the default for simplicities sake.
+>This entire process is completely customizable. Users can configure exactly what metadata they want to capture for each dataset - we use the default for simplicity's sake.
 
 ---
 
docs/ingestion.md

Lines changed: 2 additions & 1 deletion
@@ -41,7 +41,7 @@ To index and ingest data into the Data Cube, the following prerequisites must be
 
 Note that the ingestion file hyperlinked above by "our AWS site" can be downloaded with the command:<br>
 ```
-wget http://ec2-52-201-154-0.compute-1.amazonaws.com/datacube/data/LE071950542015121201T1-SC20170427222707.tar.gz
+wget -P /datacube/original_data http://ec2-52-201-154-0.compute-1.amazonaws.com/datacube/data/LE071950542015121201T1-SC20170427222707.tar.gz
 ```
 
 If you have not yet completed our Data Cube Installation Guide, please do so before continuing.
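
The same fetch-and-unpack step can also be done from Python; a sketch only (the guide uses wget), assuming `/datacube/original_data` exists and is writable, with an illustrative destination filename:

```
# Download the sample scene and uncompress it where later indexing steps expect it.
import tarfile
import urllib.request

url = ('http://ec2-52-201-154-0.compute-1.amazonaws.com/datacube/data/'
       'LE071950542015121201T1-SC20170427222707.tar.gz')
path, _ = urllib.request.urlretrieve(url, '/datacube/original_data/scene.tar.gz')
with tarfile.open(path) as archive:
    archive.extractall('/datacube/original_data')
```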
@@ -720,6 +720,7 @@ Q:
 
 A:
 > If your dataset is already in an optimized format and you don't desire any projection or resampling changes, then you can simply index the data and then begin to use the Data Cube.
+You will have to specify the CRS when loading indexed data, since the ingestion process - which informs the Data Cube about the metadata - has not occurred.
 
 ---

docs/notebook_install.md

Lines changed: 33 additions & 26 deletions
@@ -23,14 +23,18 @@ Jupyter notebooks are extremely useful as a learning tool and as an introductory
 
 To run our Jupyter notebook examples, the following prerequisites must be complete:
 
-* The full Data Cube Installation Guide must have been followed and completed. This includes:
-  * You have a local user that is used to run the Data Cube commands/applications
-  * You have a database user that is used to connect to your 'datacube' database
-  * The Data Cube is installed and you have successfully run 'datacube system init'
-  * All code is checked out and you have a virtual environment in the correct directories: `~/Datacube/{data_cube_ui, data_cube_notebooks, datacube_env, agdc-v2}`
-* The full Ingestion guide must have been followed and completed. This includes:
-  * A sample Landsat 7 scene was downloaded and uncompressed in your `/datacube/original_data` directory
-  * The ingestion process was completed for that sample Landsat 7 scene
+The full Data Cube Installation Guide must have been followed and completed before proceeding. This includes:
+* You have a local user that is used to run the Data Cube commands/applications
+* You have a database user that is used to connect to your 'datacube' database
+* The Data Cube is installed and you have successfully run 'datacube system init'
+* All code is checked out and you have a virtual environment in the correct directories: `~/Datacube/{data_cube_ui, data_cube_notebooks, datacube_env, agdc-v2}`
+
+If these requirements are not met, please see the associated documentation.
+
+You can view the notebooks without ingesting any data, but to be able to run notebooks with the sample ingested data,
+the ingestion guide must have been followed and completed. The steps include:
+* A sample Landsat 7 scene was downloaded and uncompressed in your `/datacube/original_data` directory
+* The ingestion process was completed for that sample Landsat 7 scene
 
 <a name="installation_process"></a> Installation Process
 ========
@@ -44,36 +48,32 @@ source ~/Datacube/datacube_env/bin/activate
 Now install the following Python packages:
 
 ```
-pip install jupyter
-pip install matplotlib
-pip install scipy
-pip install sklearn
-pip install lcmap-pyccd
-pip install folium
+pip install jupyter matplotlib scipy sklearn lcmap-pyccd folium
 ```
 
 <a name="configuration"></a> Configuration
 ========
 
-The first step is to generate a notebook configuration file. Run the following commands:
-
+The first step is to generate a notebook configuration file.
+Ensure that you're in the virtual environment. If not, activate with `source ~/Datacube/datacube_env/bin/activate`.
+Then run the following commands:
 ```
-#ensure that you're in the virtual environment. If not, activate with 'source ~/Datacube/datacube_env/bin/activate'
 cd ~/Datacube/data_cube_notebooks
 jupyter notebook --generate-config
 jupyter nbextension enable --py --sys-prefix widgetsnbextension
 ```
 
-Jupyter will create a configuration file in `~/.jupyter/jupyter_notebook_config.py`. Now set the password and edit the server details:
+Jupyter will create a configuration file in `~/.jupyter/jupyter_notebook_config.py`.
+Now set the password and edit the server details. Remember this password for future reference.
 
 ```
-#enter a password - remember this for future reference.
 jupyter notebook password
-
-gedit ~/.jupyter/jupyter_notebook_config.py
 ```
 
-Edit the generated configuration file to include relevant details - You'll need to find the relevant entries in the file:
+Now edit the Jupyter notebook configuration file `~/.jupyter/jupyter_notebook_config.py` with your favorite text editor.
+
+Edit the generated configuration file to include relevant details.
+You'll need to set the relevant entries in the file:
 
 ```
 c.NotebookApp.ip = '*'
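
For reference, a sketch of the server entries this step refers to; the values shown are assumptions (8888 is simply Jupyter's default port), not settings mandated by the guide:

```
# In ~/.jupyter/jupyter_notebook_config.py
c.NotebookApp.ip = '*'              # listen on all network interfaces
c.NotebookApp.port = 8888           # referred to as {jupyter_port_num} below
c.NotebookApp.open_browser = False  # headless servers have no browser to open
```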
@@ -90,16 +90,19 @@ cd ~/Datacube/data_cube_notebooks
 jupyter notebook
 ```
 
-Open a web browser and go to localhost:8888 if you're on the server, or use 'ifconfig' to list your ip address and go to {ip}:8888. You should be greeted with a password field - enter the password from the previous step.
+Open a web browser and navigate to the notebook URL. If you are running your browser from the same machine that is
+hosting the notebooks, you can use `localhost:{jupyter_port_num}` as the URL, where `jupyter_port_num` is the port number set for `c.NotebookApp.port` in the configuration file.
+If you are connecting from another machine, you will need to enter the public IP address of the server in the URL (which can be determined by running the `ifconfig` command on the server) in place of `localhost`.
+You should be greeted with a password field. Enter the password from the previous step.
 
 <a name="using_notebooks"></a> Using the Notebooks
 ========
 
 Now that your notebook server is running and the Data Cube is set up, you can run any of our examples.
 
-Open the notebook titled 'Data_Cube_API_Demo' and run through all of the cells using either the button on the toolbar or CTRL+Enter.
+Open the notebook titled 'Data_Cube_Test' and run through all of the cells using either the "Run" button on the toolbar or `Shift+Enter`.
 
-You'll see that a connection to the Data Cube is established, some metadata is listed, and some data is loaded and plotted. Further down the page, you'll see that we are also demonstrating our API that includes getting acquisition dates, scene metadata, and data.
+You'll see that a connection to the Data Cube is established, some metadata is queried, and some data is loaded and plotted.
 
 <a name="next_steps"></a> Next Steps
 ========
@@ -114,6 +117,10 @@ Q:
 >I’m having trouble connecting to my notebook server from another computer.
 
 A:
-> There can be a variety of problems that can cause this issue. Check your notebook configuration file, your network settings, and your firewall settings.
+> There can be a variety of problems that can cause this issue.<br><br>
+First check the IP and port number in your notebook configuration file.
+Be sure you are connecting to `localhost:<port>` if your browser is running on the same
+machine as the Jupyter server, and `<IP>:<port>` otherwise.
+Also check that your firewall is not blocking the port that it is running on.
 
 ---