Autogenerating Documentation #98

Open
@mfripp

Description

I would really like to have documentation for each module with a number of features:

  • compact but comprehensive list of components defined in that module
  • descriptions of the components where needed
  • concise representation of the rules used to define each component

We have a lot of this in the Supplemental Information file for the Switch 2.0 paper. But ideally these elements would be extracted automatically from the source code, to make sure we cover everything. In the near term, this would help with cross-checking that everything is covered in the Supplemental Information file, and possibly with adding some extra detail there (cross-referencing Python names for the LaTeX terms, listing the tables that parameters are defined in, etc.). In the longer term, this could help us create web-based documentation that uses Python terms rather than LaTeX, is more readable than our current source code, and allows cross-referencing terms between modules.

I've been playing around this week to see what might be possible along these lines. I'm now pretty sure we could do this by inserting our detailed comments throughout the main code in each module, using docstring format (triple-quotes). This text would be similar (often identical) to the comments currently written at the top of each module, but would be dispersed throughout the module instead. Once that is done, I think we (I) could automatically generate documentation pages for each module by following these steps:

  • parse the module with Python's ast module, then scan through all the standard callback functions in the module and...
  • add docstrings directly to a reStructuredText (rst) file (in order)
  • convert Constraint, Expression, Set, Param, Param.default and Var.bound rules into easier-to-read Python-style expressions and insert them into the rst file (in sequence with the docstrings)
  • insert function definitions into the rst file (in sequence with the docstrings)
  • move component definition expressions just above the rule functions they rely on and add "where Constraint_rule is given by:" to improve readability
  • accumulate lists of Params, Expressions, Vars and Sets and relevant descriptors (indexing set, domain, has default value, etc.)
  • accumulate extra information on Params from the load_aug calls in the load_inputs function (filename where params come from)
  • write summary tables of all components defined in the module at the top of the rst file.
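A minimal sketch of the first few steps, assuming the interleaved comments are bare triple-quoted strings inside a callback such as define_components. The sample module, its component names, and the module_to_rst helper are all hypothetical and only illustrate the idea; rule rewriting and the load_inputs scan are left out.

```python
import ast
import inspect

# Pyomo component types to pick up (an assumed, not exhaustive, list).
COMPONENT_TYPES = {"Set", "Param", "Var", "Expression", "Constraint"}

# Toy module source for illustration only; not the actual Switch code.
SAMPLE_SOURCE = '''
def define_components(mod):
    """
    commit_limit[g, t] is the upper bound on committed capacity.
    """
    mod.commit_limit = Expression(
        mod.GEN_TPS,
        rule=lambda m, g, t: m.capacity[g, t] * m.max_commit_fraction[g, t])
    """
    min_load_fraction[g] is the minimum load as a fraction of committed capacity.
    """
    mod.min_load_fraction = Param(mod.GENS, default=0.0)
'''


def is_docstring(stmt):
    """True for a bare string literal used as a statement."""
    return (isinstance(stmt, ast.Expr)
            and isinstance(stmt.value, ast.Constant)
            and isinstance(stmt.value.value, str))


def is_component_definition(stmt):
    """True for statements of the form `mod.name = Component(...)`."""
    return (isinstance(stmt, ast.Assign)
            and len(stmt.targets) == 1
            and isinstance(stmt.targets[0], ast.Attribute)
            and isinstance(stmt.value, ast.Call)
            and isinstance(stmt.value.func, ast.Name)
            and stmt.value.func.id in COMPONENT_TYPES)


def module_to_rst(source, callbacks=("define_components",)):
    """Return rst text: a component summary followed by docstrings and code, in order."""
    tree = ast.parse(source)
    body, components = [], []
    for node in tree.body:
        if not (isinstance(node, ast.FunctionDef) and node.name in callbacks):
            continue
        for stmt in node.body:
            if is_docstring(stmt):
                # Copy the docstring text straight into the rst body.
                body += [inspect.cleandoc(stmt.value.value), ""]
            elif is_component_definition(stmt):
                call = stmt.value
                index = ", ".join(ast.unparse(arg) for arg in call.args)
                has_default = any(kw.arg == "default" for kw in call.keywords)
                components.append(
                    (stmt.targets[0].attr, call.func.id, index, has_default))
                # Show the definition as a literal code block (Python 3.9+ for unparse).
                body += [".. code-block:: python", "",
                         "    " + ast.unparse(stmt), ""]
    summary = ["Components", "----------", ""]
    for name, kind, index, has_default in components:
        summary.append(
            f"* ``{name}``: {kind}, indexed by {index or 'none'}, "
            f"default={'yes' if has_default else 'no'}")
    summary.append("")
    return "\n".join(summary + body)


if __name__ == "__main__":
    print(module_to_rst(SAMPLE_SOURCE))
```

Because the module is only parsed and never imported, this approach would not need Pyomo or the input data to be available when the documentation is built.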

Once this is done, it's fairly easy to convert the rst file into HTML, TeX, PDF, etc.

At a later stage, we may be able to use standard translations to convert the Python component names (and eventually maybe even the sum()-type expressions) from this rst file into equivalent TeX terms (i.e., shorter variable names), and write additional TeX-oriented rst files. Then it might be pretty quick work to tweak that into good TeX code. We could also use a system that retains any translated code that has been manually tweaked until the corresponding Python code changes (this would probably be something for later in the year).
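As a rough illustration of the name-translation idea, here is a sketch that maps Python component names to shorter TeX symbols with a lookup table and turns index brackets into subscripts. The table entries, the to_tex helper, and the assumption that the model variable is called m are all hypothetical; sum()-type expressions are not handled.

```python
import re

# Hypothetical translation table: Python component name -> TeX symbol.
TEX_NAMES = {
    "capacity": "K",
    "max_commit_fraction": r"\bar{f}",
}


def to_tex(expr, names, model_var="m"):
    """Translate a simple Python-style expression into a TeX-style one."""
    # Drop the model prefix, e.g. "m.capacity" -> "capacity".
    expr = re.sub(r"\b" + re.escape(model_var) + r"\.", "", expr)
    # Replace whole identifiers only, so partial names are left alone.
    if names:
        pattern = re.compile(
            r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")
        expr = pattern.sub(lambda match: names[match.group(1)], expr)
    # Turn index brackets into subscripts: x[g, t] -> x_{g, t}.
    expr = re.sub(r"\[([^\]]*)\]", r"_{\1}", expr)
    # Use a centered dot for multiplication.
    return expr.replace(" * ", r" \cdot ")


if __name__ == "__main__":
    print(to_tex("m.capacity[g, t] * m.max_commit_fraction[g, t]", TEX_NAMES))
```

Names missing from the table pass through unchanged, so an untranslated Python name in the output would flag a term that still needs a TeX equivalent.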

To help you see how this could work, I have attached a zip file commit_autodoc.zip (but see new version in comment below) containing three example files:

  • a Python module (generators.core.commit.operate) which has had about 2/3 of the comments interleaved in the body
  • an rst file which I created manually from this, but which I think I could produce automatically after a day or two of work
  • an HTML file generated from the rst file (needs a better stylesheet, but you get the idea)

So my questions are:

  1. does this sound like a good idea?
    • pro: unified management of code and documentation ("literate programming"), and less mental effort to link the documentation (formerly at the top of the file) to the code
    • con: code is a little more cluttered (but this may be made up for by having easier-to-read documentation)
  2. Rodrigo has done amazing work to prepare the TeX documentation and Josiah did amazing work to write documentation at the top of all the module files. This would potentially unify those two threads of work. Is that helpful?
