Up/strem tutorial (#1762)

oruebel · web-flow · commit 8eb9a3626587 · 2023-08-17T17:18:16.000-07:00
* Update text in streaming tutorial
* Fix bad indentation warning for plot_file.py tutorial
* Fix broken references to basic_trials section to point to time_intervals instead
* Fix sphinx lexer warning in mock.rst due to python code block containing non-python output
* Updated Changelog
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -6,6 +6,10 @@
 - Add `TimeSeries.get_timestamps()`. @bendichter [#1741](https://github.com/NeurodataWithoutBorders/pynwb/pull/1741)
 - Add `TimeSeries.get_data_in_units()`. @bendichter [#1745](https://github.com/NeurodataWithoutBorders/pynwb/pull/1745)
 
+### Documentation and tutorial enhancements
+- Updated streaming tutorial to ensure code is run on tests and clarify text. @bendichter [#1760](https://github.com/NeurodataWithoutBorders/pynwb/pull/1760) @oruebel [#1762](https://github.com/NeurodataWithoutBorders/pynwb/pull/1762)
+- Fixed minor documentation build warnings and broken links to `basic_trials` tutorial  @oruebel [#1762](https://github.com/NeurodataWithoutBorders/pynwb/pull/1762)
+
 ## PyNWB 2.4.1 (July 26, 2023)
 - Stop running validation tests as part of integration tests. They cause issues in CI and can be run separately. @rly [#1740](https://github.com/NeurodataWithoutBorders/pynwb/pull/1740)
 
diff --git a/docs/gallery/advanced_io/streaming.py b/docs/gallery/advanced_io/streaming.py
@@ -92,22 +92,10 @@
 #
 # Streaming Method 2: ROS3
 # ------------------------
-# ROS3 is one of the supported methods for reading data from a remote store. ROS3 stands for "read only S3" and is a
-# driver created by the HDF5 Group that allows HDF5 to read HDF5 files stored remotely in s3 buckets. Using this method
-# requires that your HDF5 library is installed with the ROS3 driver enabled. This is not the default configuration,
-# so you will need to make sure you install the right version of ``h5py`` that has this advanced configuration enabled.
-# You can install HDF5 with the ROS3 driver from `conda-forge <https://conda-forge.org/>`_ using ``conda``. You may
-# first need to uninstall a currently installed version of ``h5py``.
-#
-# .. code-block:: bash
-#
-#    pip uninstall h5py
-#    conda install -c conda-forge "h5py>=3.2"
-#
-# Now instantiate a :py:class:`~pynwb.NWBHDF5IO` object with the S3 URL and specify the driver as "ros3". This
-# will download metadata about the file from the S3 bucket to memory. The values of datasets are accessed lazily,
-# just like when reading an NWB file stored locally. So, slicing into a dataset will require additional time to
-# download the sliced data (and only the sliced data) to memory.
+# ROS3 stands for "read only S3" and is a driver created by the HDF5 Group that allows HDF5 to read HDF5 files stored
+# remotely in s3 buckets. Using this method requires that your HDF5 library is installed with the ROS3 driver enabled.
+# With ROS3 support enabled in h5py, we can instantiate a :py:class:`~pynwb.NWBHDF5IO` object with the S3 URL and
+# specify the driver as "ros3".
 
 from pynwb import NWBHDF5IO
 
@@ -116,18 +104,35 @@
     print(nwbfile)
     print(nwbfile.acquisition['lick_times'].time_series['lick_left_times'].data[:])
 
+##################################
+# This will download metadata about the file from the S3 bucket to memory. The values of datasets are accessed lazily,
+# just like when reading an NWB file stored locally. So, slicing into a dataset will require additional time to
+# download the sliced data (and only the sliced data) to memory.
+#
+# .. note::
+#
+#    Pre-built h5py packages on PyPI do not include this S3 support. If you want this feature, you could use packages
+#    from conda-forge, or build h5py from source against an HDF5 build with S3 support. You can install HDF5 with
+#    the ROS3 driver from `conda-forge <https://conda-forge.org/>`_ using ``conda``. You may
+#    first need to uninstall a currently installed version of ``h5py``.
+#
+#    .. code-block:: bash
+#
+#        pip uninstall h5py
+#        conda install -c conda-forge "h5py>=3.2"
+
+
 ##################################################
 # Which streaming method to choose?
 # ---------------------------------
 #
-# fsspec has many advantages over ros3:
-#
-# 1. fsspec is easier to install
-# 2. fsspec supports caching, which will dramatically speed up repeated requests for the
-#    same region of data
-# 3. fsspec automatically retries when s3 fails to return.
-# 4. fsspec works with other storage backends and
-# 5. fsspec works with other types of files.
-# 6. In our hands, fsspec is faster out-of-the-box.
+# From a user perspective, once opened, the :py:class:`~pynwb.file.NWBFile` works the same with
+# both fsspec and ros3.  However, in general, we currently recommend using fsspec for streaming
+# NWB files because it is more performant and reliable than ros3. In particular fsspec:
 #
-# For these reasons, we would recommend use fsspec for most Python users.
+# 1. supports caching, which will dramatically speed up repeated requests for the
+#    same region of data,
+# 2. automatically retries when s3 fails to return, which helps avoid errors when accessing data due to
+#     intermittent errors in connections with S3,
+# 3. works also with other storage backends (e.g., GoogleDrive or Dropbox, not just S3) and file formats, and
+# 4. in our experience appears to provide faster out-of-the-box performance than the ros3 driver.
diff --git a/docs/gallery/domain/plot_behavior.py b/docs/gallery/domain/plot_behavior.py
@@ -100,7 +100,7 @@
 # .. note::
 #    :py:class:`~pynwb.behavior.SpatialSeries` data should be stored as one continuous stream,
 #    as it is acquired, not by trial as is often reshaped for analysis.
-#    Data can be trial-aligned on-the-fly using the trials table. See the :ref:`basic_trials` tutorial
+#    Data can be trial-aligned on-the-fly using the trials table. See the :ref:`time_intervals` tutorial
 #    for further information.
 #
 # For position data ``reference_frame`` indicates the zero-position, e.g.
diff --git a/docs/gallery/general/plot_file.py b/docs/gallery/general/plot_file.py
@@ -122,10 +122,11 @@
 NWB organizes data into different groups depending on the type of data. Groups can be thought of
 as folders within the file. Here are some of the groups within an :py:class:`~pynwb.file.NWBFile` and the types of
 data they are intended to store:
- * **acquisition**: raw, acquired data that should never change
- * **processing**: processed data, typically the results of preprocessing algorithms and could change
- * **analysis**: results of data analysis
- * **stimuli**: stimuli used in the experiment (e.g., images, videos, light pulses)
+
+* **acquisition**: raw, acquired data that should never change
+* **processing**: processed data, typically the results of preprocessing algorithms and could change
+* **analysis**: results of data analysis
+* **stimuli**: stimuli used in the experiment (e.g., images, videos, light pulses)
 
 The following examples will reference variables that may not be defined within the block they are used in. For
 clarity, we define them here:
diff --git a/docs/gallery/general/plot_read_basics.py b/docs/gallery/general/plot_read_basics.py
@@ -246,7 +246,7 @@
 # and additional metadata.
 #
 # .. seealso::
-#     You can learn more about trials in the :ref:`basic_trials` tutorial section.
+#     You can learn more about trials in the :ref:`time_intervals` tutorial.
 #
 # Similarly to :py:class:`~pynwb.misc.Units`, we can view trials as a :py:class:`pandas.DataFrame`.
 
diff --git a/docs/source/testing/mock.rst b/docs/source/testing/mock.rst
@@ -42,9 +42,9 @@ If you want to create objects and automatically add them to an :py:class:`~pynwb
 
 Now this NWBFile contains an :py:class:`~pynwb.ophys.RoiResponseSeries` and all the upstream classes:
 
-.. code-block:: python
+.. code-block::
 
-    print(nwbfile)
+    >>> print(nwbfile)
 
     root pynwb.file.NWBFile at 0x4335131760
     Fields:

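The reworded ROS3 section notes that pre-built h5py packages on PyPI do not include the ros3 driver. A minimal sketch of how a reader might check their own environment before choosing a streaming method follows. It is not part of this commit; the helper names `ros3_available` and `choose_streaming_method` are hypothetical, and the check assumes that h5py lists the driver in `h5py.registered_drivers()` when ROS3 support is compiled in.

```python
def ros3_available():
    """Return True if the installed h5py build registers the ros3 driver."""
    try:
        import h5py
    except ImportError:
        # h5py is not installed at all, so ros3 cannot be used.
        return False
    # Assumption: a ROS3-enabled build lists "ros3" among its registered drivers.
    return "ros3" in h5py.registered_drivers()


def choose_streaming_method():
    """Hypothetical helper: prefer fsspec, as the tutorial recommends.

    Falls back to ros3 when fsspec is not importable but the driver is
    available, and returns None when neither option can be used.
    """
    try:
        import fsspec  # noqa: F401  (imported only to test availability)
        return "fsspec"
    except ImportError:
        pass
    return "ros3" if ros3_available() else None


print("ros3 driver available:", ros3_available())
print("suggested streaming method:", choose_streaming_method())
```

If the first line prints `False`, the conda-forge install shown in the tutorial's note is one way to get a ROS3-enabled h5py. The fsspec-first ordering only mirrors the recommendation in the updated "Which streaming method to choose?" text.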