Pre 6.2 (#239)

PeterKraus · web-flow · commit 6859554c8cf2 · 2025-08-20T08:22:37.000+02:00
* minor version info changes

* add Graham to author list

* Minor docs changes.
diff --git a/docs/source/devdocs.rst b/docs/source/devdocs.rst
@@ -1,6 +1,10 @@
 Developer documentation
 -----------------------
 
+.. note::
+
+    If you want to use yadg in your project, you probably want to see :ref:`the API of the extractor mode <extractor api>` or process a `dataschema` using :func:`yadg.core.process_schema`.
+
 The project follows fairly standard developer practices. Every new feature should be associated with a test, and every PR requires linting and formatting using ``ruff``.
 
 Testing
@@ -23,4 +27,4 @@ Implementing new features
 - adding their schema into :class:`dgbowl_schemas.yadg.dataschema.DataSchema`
 - adding their implementation in a separate Python package under :mod:`yadg.extractors`
 
-Each extractor should be documented by adding a structured docstring at the top of the file. This documentation should describe the application and usage of the extractor, and refer to the Pydantic audotocs via :obj:`~dgbowl_schemas.yadg.dataschema` to discuss the features exposed via the parameters dictionary. If the filetype extracted is binary, a description of the file structure should be provided in the docstring. Every new filetype will have to be added into the :mod:`~dgbowl_schemas.yadg.dataschema.filetype` module as well.
+Each extractor should be documented by adding a structured docstring at the top of the file. This documentation should describe the application and usage of the extractor, and refer to the Pydantic autodocs via :obj:`~dgbowl_schemas.yadg.dataschema` to discuss the features exposed via the parameters dictionary. If the filetype extracted is binary, a description of the file structure should be provided in the docstring. Every new filetype will have to be added into the :mod:`~dgbowl_schemas.yadg.dataschema.filetype` module as well.
diff --git a/docs/source/index.rst b/docs/source/index.rst
@@ -18,6 +18,7 @@ Contributors
 - `Peter Kraus <https://github.com/PeterKraus>`_
 - `Nicolas Vetsch <https://github.com/vetschn>`_
 - `Carla Terboven <https://github.com/carla-terboven>`_
+- `Graham Kimbell <https://github.com/g-kimbell>`_
 
 Acknowledgements
 ````````````````
diff --git a/docs/source/usage.rst b/docs/source/usage.rst
@@ -40,6 +40,25 @@ The ``infile`` will be then parsed using **yadg** into a :class:`~xarray.DataTre
 
         yadg extract --locale=de_DE --encoding=utf-8 --timezone=Europe/Berlin filetype infile [outfile]
 
+.. _extractor api:
+
+API endpoint for `extractor` mode
+`````````````````````````````````
+If you want to use **yadg** in your own code, you should use the common extractors API available in the :mod:`yadg.extractors` module:
+
+.. autofunction:: yadg.extractors.extract
+    :no-index:
+
+.. autofunction:: yadg.extractors.extract_from_path
+    :no-index:
+
+.. autofunction:: yadg.extractors.extract_from_bytes
+    :no-index:
+
+.. warning::
+
+    Please do not use the :func:`extract` functions from each extractor (e.g. :func:`yadg.extractors.eclab.mpr.extract_from_path`) directly. Those are not part of the user-facing API and their function signatures may change between minor or point versions.
+
 
 Metadata-only extraction
 ````````````````````````
@@ -103,6 +122,13 @@ If you'd like to update a `dataschema` from a previous version of **yadg** to th
 
 This will update the `dataschema` specified in ``infile`` and save it to ``outfile``, if provided.
 
+API for processing `dataschema`
+```````````````````````````````
+
+
+.. autofunction:: yadg.core.process_schema
+    :no-index:
+
 
 .. _NetCDF: https://www.unidata.ucar.edu/software/netcdf/
 
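
The extractor API documented in ``usage.rst`` above might be called from user code roughly as follows. This is a minimal sketch, not a definitive recipe: the keyword names ``filetype``, ``path`` and ``timezone`` follow the docstring touched in this PR, while the helper name ``extract_to_netcdf`` and the file names are hypothetical. The import is deferred so that the snippet can be loaded without yadg installed.

```python
from pathlib import Path


def extract_to_netcdf(
    filetype: str,
    infile: str,
    outfile: str,
    timezone: str = "Europe/Berlin",
) -> None:
    """Hypothetical helper: extract one file and save the result as NetCDF."""
    # Deferred import: only the common API in yadg.extractors is used,
    # not the per-extractor functions that the docs warn against.
    from yadg.extractors import extract

    tree = extract(filetype=filetype, path=Path(infile), timezone=timezone)
    # The docstrings in this PR state that returned objects have to_netcdf().
    tree.to_netcdf(outfile)
```

A call could then look like ``extract_to_netcdf("eclab.mpr", "measurement.mpr", "measurement.nc")``, assuming such a file exists.
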
diff --git a/docs/source/version.6_2.rst b/docs/source/version.6_2.rst
@@ -1,19 +1,16 @@
-**yadg** version next
-`````````````````````
-..
-  .. image:: https://img.shields.io/static/v1?label=yadg&message=v6.1&color=blue&logo=github
-    :target: https://github.com/PeterKraus/yadg/tree/6.1
-  .. image:: https://img.shields.io/static/v1?label=yadg&message=v6.1&color=blue&logo=pypi
-    :target: https://pypi.org/project/yadg/6.1/
-  .. image:: https://img.shields.io/static/v1?label=release%20date&message=2025-06-03&color=red&logo=pypi
+**yadg** version 6.2
+````````````````````
 
+.. image:: https://img.shields.io/static/v1?label=yadg&message=v6.2&color=blue&logo=github
+  :target: https://github.com/PeterKraus/yadg/tree/6.2
+.. image:: https://img.shields.io/static/v1?label=yadg&message=v6.2&color=blue&logo=pypi
+  :target: https://pypi.org/project/yadg/6.2/
+.. image:: https://img.shields.io/static/v1?label=release%20date&message=2025-08-20&color=red&logo=pypi
 
-Developed in the `ConCat Lab <https://tu.berlin/en/concat>`_ at Technische Universität Berlin (Berlin, DE).
-
-New features in ``yadg-next`` are:
 
+Developed in the `ConCat Lab <https://tu.berlin/en/concat>`_ at Technische Universität Berlin (Berlin, DE).
 
-Breaking changes in ``yadg-next`` are:
+Breaking changes in ``yadg-6.2`` are:
 
   - Some column names in :mod:`yadg.extractors.eclab.mpr` files might have changed, as EC-Lab 11.62 has a new naming convention for derived quantities. In particular:
 
@@ -22,7 +19,7 @@ Breaking changes in ``yadg-next`` are:
     - ``P`` is now ``Pwe``,
     - ``R`` is now ``Rwe``.
 
-    This will also unfortunately affect processing older ``mpr`` files. Depending on which version of EC-Lab was used to convert the ``mpr`` file to the ``mpt`` file, the ``mpt`` file will contain the old (i.e. ``P`` or ``Energy charge``) or the new (i.e. ``Pwe`` or ``Energy we charge``) column names. For yadg internal consistency testing, we still attempt an exact match between ``mpr`` and ``mpt`` columns; if the ``mpr`` column is not present in the ``mpt`` file, we look for an equivalent column without the ``we`` annotation.
+    This will also unfortunately affect processing older ``mpr`` files. Depending on which version of EC-Lab was used to convert the ``mpr`` file to the ``mpt`` file, the ``mpt`` file will contain the old (e.g. ``P`` or ``Energy charge``) or the new (e.g. ``Pwe`` or ``Energy we charge``) column names. In the internal yadg test suite, we still attempt an exact match between ``mpr`` and ``mpt`` columns; if the ``mpr`` column is not present in the ``mpt`` file, we look for an equivalent column without the ``we`` annotation.
 
   - The ``control/V/mA`` column and the ``mode`` column in :mod:`~yadg.extractors.eclab.mpr` as well as :mod:`~yadg.extractors.eclab.mpt` files are now used to create the ``control_V`` (units ``V``) and ``control_I`` (units ``mA``) columns in both kinds of files:
 
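
The column lookup described in the ``mpr``/``mpt`` breaking-change note above (exact match first, then the name without the ``we`` annotation) can be sketched as below. This is an illustration under stated assumptions, not the code used in the yadg test suite: ``find_mpt_column`` and ``NEW_TO_OLD`` are hypothetical names, and the mapping contains only the renames spelled out in the release notes.

```python
from typing import Iterable, Optional

# New EC-Lab 11.62 names mapped to their older equivalents.
# Only the pairs named in the release notes are listed here.
NEW_TO_OLD = {
    "Pwe": "P",
    "Rwe": "R",
    "Energy we charge": "Energy charge",
}


def find_mpt_column(mpr_name: str, mpt_columns: Iterable[str]) -> Optional[str]:
    """Return the mpt column matching an mpr column, or None if there is none."""
    available = set(mpt_columns)
    # 1. Prefer an exact match between the mpr and mpt column names.
    if mpr_name in available:
        return mpr_name
    # 2. Fall back to the older name without the "we" annotation.
    old_name = NEW_TO_OLD.get(mpr_name)
    if old_name is not None and old_name in available:
        return old_name
    return None
```

With an ``mpt`` file converted by an older EC-Lab version, ``find_mpt_column("Pwe", ["time", "P"])`` would resolve to the old ``P`` column.
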
diff --git a/pyproject.toml b/pyproject.toml
@@ -6,9 +6,10 @@ build-backend = "setuptools.build_meta"
 dynamic = ["version"]
 name = "yadg"
 authors = [
-    {name = "Peter Kraus", email = "peter.kraus@tu-berlin.de"},
+    {name = "Peter Kraus"},
     {name = "Nicolas Vetsch"},
     {name = "Carla Terboven"},
+    {name = "Graham Kimbell"},
 ]
 maintainers = [
     {name = "Peter Kraus", email = "peter.kraus@tu-berlin.de"},
diff --git a/src/yadg/core.py b/src/yadg/core.py
@@ -11,10 +11,21 @@
 
 def process_schema(dataschema: DataSchema, strict_merge: bool = False) -> DataTree:
     """
-    The main processing function of yadg.
+    The main :class:`DataSchema` processing function of yadg.
 
-    Takes in a :class:`DataSchema` object and returns a single :class:`DataTree` created
-    from the :class:`DataSchema`.
+    Takes in a :class:`DataSchema` object, updates it to the latest version compatible
+    with the installed version of yadg, processes each `step`, and returns a single
+    :class:`DataTree` created from the :class:`DataSchema`.
+
+    Parameters
+    ----------
+
+    dataschema:
+        A :class:`DataSchema` object describing the extraction process.
+
+    strict_merge:
+        A :class:`bool` indicating whether metadata of the files processed in a single `step`
+        has to be identical. Defaults to ``False``, which means conflicts are dropped.
 
     """
     if strict_merge:
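
The two ``strict_merge`` behaviours described in the docstring above can be illustrated on plain dictionaries. This is a sketch of the semantics only, assuming that file metadata behaves like a flat ``dict``; ``merge_metadata`` is a hypothetical name and not the implementation in ``yadg.core``.

```python
from typing import Any, Dict, List


def merge_metadata(
    attrs_list: List[Dict[str, Any]], strict_merge: bool = False
) -> Dict[str, Any]:
    """Merge per-file metadata dictionaries from a single step."""
    if not attrs_list:
        return {}
    if strict_merge:
        # Strict mode: every file must carry identical metadata.
        first = attrs_list[0]
        for other in attrs_list[1:]:
            if other != first:
                raise ValueError("metadata differs between files")
        return dict(first)
    # Default mode: keep keys that agree, drop keys with conflicting values.
    merged: Dict[str, Any] = {}
    dropped = set()
    for attrs in attrs_list:
        for key, value in attrs.items():
            if key in dropped:
                continue
            if key in merged and merged[key] != value:
                del merged[key]
                dropped.add(key)
            else:
                merged[key] = value
    return merged
```

In the default mode, a key that differs between two files silently disappears from the merged result, whereas strict mode turns the same situation into an error.
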
diff --git a/src/yadg/extractors/__init__.py b/src/yadg/extractors/__init__.py
@@ -47,7 +47,7 @@ def extract(
     Extract data and metadata from a path using the supplied filetype.
 
     A wrapper around the :func:`extract_from_path` worker function, which creates a
-    default extractor object. Coerces any :class:`str` provided to :class:`Path`.
+    default extractor object. Coerces any :class:`str` provided as ``path`` to :class:`Path`.
 
     Parameters
     ----------
@@ -56,7 +56,7 @@ def extract(
         Specifies the filetype. Has to be a filetype supported by the dataschema.
 
     path:
-        A :class:`pathlib.Path` object pointing to the file to be extracted.
+        A :class:`Path` object pointing to the file to be extracted.
 
     timezone:
         A :class:`str` containing the TZ identifier, e.g. "Europe/Berlin".
@@ -93,6 +93,15 @@ def extract_from_path(
     the returned objects are flattened using json serialisation. The returned objects
     have a :func:`to_netcdf` as well as a :func:`to_dict` method, which can be used to
     write the returned object into a file.
+
+    Parameters
+    ----------
+
+    source:
+        A :class:`Path` object pointing to the file to be extracted.
+
+    extractor:
+        A :class:`FileType` object describing the extraction process.
     """
 
     m = importlib.import_module(f"yadg.extractors.{extractor.filetype}")
@@ -127,6 +136,15 @@ def extract_from_bytes(
     the returned objects are flattened using json serialisation. The returned objects
     have a :func:`to_netcdf` as well as a :func:`to_dict` method, which can be used to
     write the returned object into a file.
+
+    Parameters
+    ----------
+
+    source:
+        A :obj:`bytes` object containing the raw data to be extracted.
+
+    extractor:
+        A :class:`FileType` object describing the extraction process.
     """
 
     m = importlib.import_module(f"yadg.extractors.{extractor.filetype}")