Skip to content

Conversation

@dmdunla
Copy link
Collaborator

@dmdunla dmdunla commented Jul 10, 2025

This PR is aimed at creating a consistent documentation across the pyttb classes and algorithms.

To that end, here are some tasks to complete:

  • Evaluate the use of autosummary in the Sphinx documentation
  • Create consistent documentation across the classes
  • Create consistent documentation across the algorithms

📚 Documentation preview 📚: https://pyttb--444.org.readthedocs.build/en/444/

@dmdunla dmdunla linked an issue Jul 10, 2025 that may be closed by this pull request
@dmdunla
Copy link
Collaborator Author

dmdunla commented Jul 10, 2025

First attempt at using autosummary ...

Considerations:

  • I followed the pydata.sparse documentation approach to start
    • We can consider others, if desired. This was just one we discussed and is pretty straightforward.
  • For the Reference documentation, I opted to let autosummary create all *.rst stubs for all classes and methods
    • Current: When using class templates (docs/source/_templates/autosummary), there is no customization allowed across classes (e.g., no class name customization, like "Dense Tensor (:class:tensor)"
    • Option: Create class RST files (e.g., tensor.rst) manually for each class and then create the attribute and method stubs using autosummary. Some manual maintenance would then be needed.
  • There are currently a lot of warnings -- we would need to address the path issues.
  • Attributes that are defined in __slots__ are not documented, which creates inconsistency in the Attributes tables.
    • Option: We can define internal attributes in __slots__ (e.g., _data) and then create an associated property (e.g., data) with getter/setter that can be documented. This is one approach that is recommended online.

@tgkolda
Copy link
Collaborator

tgkolda commented Jul 10, 2025

I notice the side menus are missing. I'm also not sure where functions like tenones are documented. Maybe they aren't. They don't seem to be linked. I also don't seem to be able to click through on the functions in the tables to their detailed documentation. I guess it's all there, but it's hard to get to.

@dmdunla
Copy link
Collaborator Author

dmdunla commented Jul 10, 2025

It seems like something is different on RTD. Locally, the nav bar (toctree) is available. Just not online. I'll work to fix that.

@dmdunla
Copy link
Collaborator Author

dmdunla commented Jul 10, 2025

Getting closer on the docs with the nav bar (toctree) being populated now. But there are still warnings in doc generation and all of the method documentation is still appearing in the class documentation (as opposed to just the autosummaries).

@dmdunla
Copy link
Collaborator Author

dmdunla commented Jul 10, 2025

One other issue is that the autosummary tables for methods do not pick up the special members, so dunder methods will not appear in the table.

@dmdunla dmdunla requested review from ntjohnson1 and tgkolda July 10, 2025 22:45
@dmdunla
Copy link
Collaborator Author

dmdunla commented Jul 10, 2025

In Reference in the docs, just look at pyttb.tensor for now, as that is the main class I've been working on with the switch to autosummary.

@ntjohnson1
Copy link
Collaborator

My primary concern is on the level of upkeep. I'm fine on whatever aesthetics people agree on as long as there is some semi-automated way to enforce it.

For upkeep this looks great. If we wanted to go one level deeper (specify an initial page for each class so we can get the nested methods), that is where I had an issue with our file/class naming indirection issues.

I generated a tensor.rst to mirror https://github.com/pydata/sparse/blob/0.15.1/docs/generated/sparse.COO.rst and couldn't figure out how to convince autosummary to parse the class vs the module.

If we think the current level of nesting is fine then no concern from me.

@tgkolda
Copy link
Collaborator

tgkolda commented Jul 11, 2025

I'm not sure the advantage of autosummary. I see several disadvantages.

The main advantage I anticipated is that it would put each function on its own page, but it didn't do that.

The disadvantage is that you've now lost the ability to easily document the functions that are in the file but not in the class (like tenones).

It looks also like like the init help has been moved up to the class level. That shouldn't be necessary. We talked about just having the one-liner at the class level and then the rest at the function level. Why the change?

Additionally, the 'attributes' section was removed and replaced by 'parameters', but these are 2 different things. Now the attributes of data and shape have no description in the attribute table.

Given that there are only a handful of classes, I'm not sure the autosummary is worth the extra effort. This is assuming I'm understand what it's doing.

@dmdunla
Copy link
Collaborator Author

dmdunla commented Jul 11, 2025

Suggestion: Do no use autosummary in our docs.

I'd rather maintain control over what we want in our docs, and I think we can more easily accomplish this currently without the use of autosummary.

@ntjohnson1 I tried many combinations of the autosummary options, and I was unable to get exactly what I had wanted. I'm assuming some level of user minunderstanding/error on my part, but my assessment is that it is a challenge to support our current implementation nuances with autosummary.

@tgkolda I moved the all docs from class/__init__ to the class level, as accessing help on the class in the REPL and Jupyter notebooks pulls from the class docs. I checked several other packages and see that this is a common approach, with no docstrings for the __init__ methods. I think this is a better approach moving forward.

@tgkolda I agree about the Attributes documentation. autosummary gathered all attributes/properties into a single table (automiatically), which I like, but the lack of documentation of attributes that are defined as __slots__ is unfortunate. The workaround I suggested above is not satisfying, as it requires a lot more manual coding. I think we need to decide how we want to handle properties in the code (where should they be placed) and attributes/properties in the documentation (e.g., directly after the Parameters, after the Examples, etc.).

I plan to tag this branch as the end of the autosummary assessment and provide an updated version of our old approach that we can discuss next.

@dmdunla
Copy link
Collaborator Author

dmdunla commented Jul 11, 2025

As I move away from autosummary here, I suggest the following flow of class docs:

  1. All class docs are included in one place -- at the class level. No docstrings will be provided for __init__.
  2. The order of the elements of the class docs are as follows:
    • Short, single-line class description (required), followed by a blank line
    • Longer class description (optional)
    • Parameters
    • Examples
    • Note listing other ways to construct instances of the class
    • Note with link to tutorial (jupyter notebook)
    • Methods
  3. The nav bar (toctree) includes the following (as ordered in the source code):
    • Attributes/properties
    • All methods, including dunder methods

If we use the alphabetical ordering of methods in the toctree, attributes/properties are mixed in with methods and dunder methods appear first (which I do not like). I think that determining the flow of the toctree layout via the layout of the source code is a better option.

@tgkolda, @ntjohnson1 Please review and comment on the suggested layout

@tgkolda
Copy link
Collaborator

tgkolda commented Jul 11, 2025

Hey Danny, Great progress. I know that sphinx is painful. Here are problems I notice...

  • Documentation for tenones, etc. is still MIA
  • The main Reference page is funny. It has these arguments for the functions but they are inside parentheses and square brackets. I would vote for no arguments at all.
  • Attribute descriptions are still missing in the table for data and shape. Not sure why some appear and others don't.
  • Under Methods, init is showing.
  • Under Methods, none of the functions is clickable.
  • Under Methods, some functions have the same parentheses plus square brackets argument listing. I find it odd. I'm starting to infer that the brackets indicate optional??
  • In your item 2, it's missing attributes before methods
  • In your item 3, it's missing other methods defined in the same file (like tenones)
  • A more general question, why are ndims and nnz properties rather than functions?
  • I'm a bit confused that the autosummary is still there.

@dmdunla
Copy link
Collaborator Author

dmdunla commented Jul 11, 2025

@tgkolda Sorry, it may not have been clear in my previous message ... I have not yet changed anything from the autosummary version, which I plan to drop in the next commit. I plan to address your suggestions and others in the next commit, and I'll send a new review request when that lands on this branch.

@tgkolda As for property vs. method, we use properties to define class methods that return values and act like data members but are derived/computed values. So, for ndims and nnz methods, we decorate these as properties, since they act like data members (e.g., data or shape), which are real class variables. Moreover, we can avoid users trying to set these values (when needed) by decorating a method as a property but not defining the setter for that property. We could just define these as data members and manage their values in the methods that impact them, but then users would be able to change them directly (which we do not want).

@ntjohnson1
Copy link
Collaborator

Suggested layout seems fine to me.

Once we have a proof of concept that works probably worth documenting in the contributing docs, and see if there's an easy way to enforce it (or else checking just falls to you manually in review) but those are future steps

@dmdunla
Copy link
Collaborator Author

dmdunla commented Aug 31, 2025

@ntjohnson1 I am unable to get sphinx to build without warnings about references in nitpicky mode (-n). Are you getting this as well in your dev env?

I am following the doc build info from CONTRIBUTING.md, which was changed to the following recently in #432 :

sphinx-build ./docs/source ./docs/build -W -n --keep-going

However, the build on RTD does not use nitpicky mode (see recent build):

python -m sphinx -T -W --keep-going -b html -d _build/doctrees -D language=en . $READTHEDOCS_OUTPUT/html

Attached (link below) is what I get in nitpicky mode (also building from scratch using -E so you can see the entire build log). Note the WARNING lines after writing output, all stating some kind of reference that is missing. And there are other references just like these in other methods, so I cannot determine what is happening.

Any ideas?

pyttb-sphinx-error-1.8.4-dev.txt

@dmdunla
Copy link
Collaborator Author

dmdunla commented Jan 9, 2026

@ntjohnson1 I have completed the doc overhaul of tensor, sptensor, and ktensor. I'd like to land this PR at this point and then create individual issues for the remainder of the doc updates (other data classes, helper functions, tutorials, etc.). Having a single massive PR is overwhelming and hard to keep in sync with other changes. If you can review the completed data class documentation, that would be really helpful.

Copy link
Collaborator

@ntjohnson1 ntjohnson1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed everything but sptensor and ktensor. Should hopefully have a chance to review those later this weekend.

Copy link
Collaborator

@ntjohnson1 ntjohnson1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed ktensor still need to check sptensor.

Unfortunately codespell and typos both don't seem to inspect mutliline doc strings 😢
maybe a follow up to see if there's a nice to to check those

@dmdunla
Copy link
Collaborator Author

dmdunla commented Jan 10, 2026

Ok, thanks.

BTW, two follow-ups as you’re looking through these:

  1. I have not worked through the tutorials yet. But I can if you think these should land at the same altimeter as the data class docs.
  2. I am ready to jettison the Returns sections in the docstrings in favor of type hints. If there needs to be some info about what is returned, I’d rather put that info in the main doctoring block. But I see that I have not been consistent, especially with the tensor class. Unless I hear otherwise, I will remove the sprinkling of Returns that remain.

@ntjohnson1
Copy link
Collaborator

  • I think it's find to land this first.
  • Big fan of not repeating things (types vs docstrings types). Returns should detail additional useful context. If we think most of the returns aren't useful fine removing

Copy link
Collaborator

@ntjohnson1 ntjohnson1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty much all just NITs a little annoyed our typo tool isn't checking doc strings but that's definitely a follow up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Documentation overhaul

4 participants