Merged
6 changes: 3 additions & 3 deletions .github/workflows/run_tests.yml
@@ -5,15 +5,15 @@ on:
    types: [opened, synchronize, reopened, ready_for_review]
    branches:
      - main
-  push:
-    branches:
-      - main
    paths-ignore:
      - "*.md"
      - "*.codespellrc"
      - ".github/**"
      - "!.github/workflows/run_tests.yml"
      - "docs/**"
+  push:
+    branches:
+      - main
  workflow_dispatch:
    inputs:
      test_all_matlab_releases:
13 changes: 4 additions & 9 deletions docs/source/pages/concepts/file_read.rst
@@ -15,17 +15,12 @@ This command performs several important tasks behind the scenes:
2. **Automatically generates MATLAB classes** needed to work with the data
3. **Returns an NwbFile object** representing the entire file

-The returned `NwbFile` object is the primary access point for all the data in the file. In the :ref:`next section<matnwb-read-nwbfile-intro>`, we will examine the structure of this object in detail, covering how to explore it using standard MATLAB dot notation to access experimental metadata, raw recordings, processed data, and analysis results, as well as how to search for specific data types.
+The returned :class:`NwbFile` object is the primary access point for all the data in the file. In the :ref:`next section<matnwb-read-nwbfile-intro>`, we will examine the structure of this object in detail, covering how to explore it using standard MATLAB dot notation to access experimental metadata, raw recordings, processed data, and analysis results, as well as how to search for specific data types.

.. important::
-   **Lazy Loading:** MatNWB uses lazy reading to efficiently work with large datasets. When you access a dataset through the `NwbFile` object, MatNWB returns a :class:`types.untyped.DataStub` object instead of loading the entire dataset into memory. This allows you to:
-
-   - Work with files larger than available RAM
-   - Read only the portions of data you need
-   - Index into datasets using standard MATLAB array syntax
-   - Load the full dataset explicitly using the ``.load()`` method
-
-   For more details, see :ref:`DataStubs and DataPipes<matnwb-read-untyped-datastub-datapipe>`.
+   **Lazy Loading:** MatNWB uses lazy reading to efficiently handle large files. When you read an NWB file using :func:`nwbRead`, only the file structure and metadata are initially loaded into memory. This approach enables quick access to the file's contents and makes it possible to work with files larger than the system's available RAM.

+To learn how to load data from non-scalar or multidimensional datasets into memory, see :ref:`DataStubs and DataPipes<matnwb-read-untyped-datastub-datapipe>`.
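As a minimal sketch of this lazy-read workflow (the file name and the ``'ElectricalSeries'`` acquisition name are hypothetical, chosen only for illustration):

.. code-block:: matlab

   % Open the file; only structure and metadata are read at this point
   nwb = nwbRead('my_session.nwb');

   % Accessing a large dataset returns a DataStub, not the data itself;
   % nothing is pulled from disk until you index into it
   stub = nwb.acquisition.get('ElectricalSeries').data;
   firstSamples = stub(1:100, :);   % reads only this slice from disk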

.. note::
The :func:`nwbRead` function currently does not support reading NWB files stored in Zarr format.
@@ -31,7 +31,7 @@ You can check a file's schema version:
How MatNWB Generates Classes
----------------------------

-When you call ``nwbRead``, MatNWB performs several steps behind the scenes:
+When you call :func:`nwbRead`, MatNWB performs several steps behind the scenes:

1. **Reads the file's embedded schema** information
2. **Generates MATLAB classes** for neurodata types defined by the schema version used to create the file
36 changes: 33 additions & 3 deletions docs/source/pages/concepts/file_read/untyped.rst
@@ -7,7 +7,6 @@ Utility Types in MatNWB

Documentation for "untyped" types will be added soon


"Untyped" utility types are tools that provide flexibility by relaxing certain constraints imposed by the NWB schema. These types are commonly stored in the ``+types/+untyped/`` package directories of your MatNWB installation.

.. _matnwb-read-untyped-sets-anons:
@@ -36,13 +35,44 @@ The **Anon** type (``types.untyped.Anon``) can be understood as a Set type with
DataStubs and DataPipes
~~~~~~~~~~~~~~~~~~~~~~~

-**DataStubs** serves as a read-only link to your data. It allows for MATLAB-style indexing to retrieve the data stored on disk.
+When working with NWB files, datasets can be very large (gigabytes or more). Loading all of this data into memory at once would be impractical or impossible. MatNWB uses two types to handle on-disk data efficiently: **DataStubs** and **DataPipes**.

DataStubs (Read Only)
^^^^^^^^^^^^^^^^^^^^^

A **DataStub** (``types.untyped.DataStub``) represents a read-only reference to data stored in an NWB file. When you read an NWB file, non-scalar and multi-dimensional datasets are automatically represented as DataStubs rather than loaded into memory.

.. image:: https://github.com/NeurodataWithoutBorders/nwb-overview/blob/main/docs/source/img/matnwb_datastub.png?raw=true

Key characteristics:

- **Lazy loading**: Data remains on disk until you explicitly access it
- **Memory efficient**: Only the portions you request are loaded
- **MATLAB-style indexing**: Access data using familiar syntax like ``dataStub(1:100, :)``
- **Read-only**: Cannot be used to modify or write data

You'll encounter DataStubs whenever you read existing NWB files containing non-scalar or multi-dimensional datasets.
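For illustration, working with a DataStub might look like the following sketch (the ``'ElectricalSeries'`` acquisition name is an assumption for this example):

.. code-block:: matlab

   % Get a handle to an on-disk dataset; this returns a types.untyped.DataStub
   dataStub = nwb.acquisition.get('ElectricalSeries').data;

   sz = size(dataStub);             % query dimensions without loading data
   chunk = dataStub(1:100, :);      % load only rows 1-100 from disk
   everything = dataStub.load();    % explicitly load the full dataset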

DataPipes (Read and Write)
^^^^^^^^^^^^^^^^^^^^^^^^^^

A **DataPipe** (``types.untyped.DataPipe``) extends the concept of lazy data access to support **writing** as well as reading. While DataStubs are created automatically when reading files, you create DataPipes explicitly when writing data.

Key characteristics:

- **Bidirectional**: Supports both reading and writing operations
- **Incremental writing**: Stream data to disk in chunks rather than all at once
- **Compression support**: Apply HDF5 compression and chunking strategies
- **Write optimization**: Configure how data is stored on disk for better performance

DataPipes solve the problem of writing datasets that are too large to fit in memory, and are also useful when you want fine-grained control over how data is stored in the HDF5 file.
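As a sketch under stated assumptions (the argument values are illustrative, and ``firstBlock``/``nextBlock`` are hypothetical variables holding blocks of your data), creating a compressed, appendable DataPipe might look like:

.. code-block:: matlab

   % Chunked, compressed dataset that can grow along the second dimension
   pipe = types.untyped.DataPipe( ...
       'data', firstBlock, ...            % initial block of data
       'maxSize', [64 Inf], ...           % allow unbounded growth along axis 2
       'axis', 2, ...                     % dimension used when appending
       'chunkSize', [64 1000], ...
       'compressionLevel', 3);

   % After the file has been written with nwbExport, stream further blocks
   pipe.append(nextBlock);

See the Advanced Data Write Tutorial for authoritative usage of these parameters.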

.. seealso::

-   **DataPipes** are similar to DataStubs in that they allow you to load data from disk; however, they also provide a wide array of features that allow the user to write data to disk, either by streaming parts of data in at a time or by compressing the data before writing. The DataPipe is an advanced type and users looking to leverage DataPipe's capabilities to stream/iteratively write or compress data should read the :doc:`Advanced Data Write Tutorial </pages/tutorials/dataPipe>`
+   - For detailed guidance on creating and configuring DataPipes, see :doc:`Advanced Data Write Tutorial </pages/tutorials/dataPipe>`

.. todo::
   - For practical examples of reading data via DataStubs, see :doc:`How to Read Data </pages/how-to/read-on-demand>`

.. _matnwb-read-untyped-links-views:
