
[PRA-153] docs: Add autogenerated metadata description#186

Merged
izmalk merged 9 commits into canonical:main from izmalk:docs-metadta-description
Feb 4, 2026

Conversation

Contributor

@izmalk izmalk commented Jan 24, 2026

Description

Add an autogenerated metadata description for every documentation page.

Checklist

  • I have added or updated any relevant documentation.

@izmalk izmalk self-assigned this Jan 24, 2026
Contributor

@deusebio deusebio left a comment

How is this auto-generated? Could you provide guidance on how you did this, so that we can do the same when creating new documentation?

Also, we generally prepend the Jira ticket to the PR title. I believe we forgot that when merging the previous docs PR, but could you update the title of this PR?

@izmalk izmalk changed the title docs: Add autogenerated metadata description [PRA-153] docs: Add autogenerated metadata description Jan 26, 2026
@izmalk izmalk added the "documentation" label (Improvements or additions to documentation) Jan 26, 2026
Contributor Author

izmalk commented Jan 26, 2026

I used agentic Copilot (company licence, Claude Sonnet 4.5 model) in VS Code with a long, pre-designed prompt (which works great for that specific task). It detects all the documentation source files (by file extension and location), creates a summary for each one, and formats it in a specific way (front matter syntax for MyST). It also checks that the summary roughly fits the expected length, that the docs can be built without errors, that the summary is present in the HTML header, etc.
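
For illustration only, the "summary is there in the HTML header" check could look roughly like the sketch below. It is not the prompt or tooling used in this PR. It assumes the built page exposes the summary as a `<meta name="description">` tag, and the names `MetaDescriptionParser`, `find_meta_description`, and `check_description` are hypothetical.

```python
from html.parser import HTMLParser


class MetaDescriptionParser(HTMLParser):
    """Collect the content of every <meta name="description"> tag."""

    def __init__(self):
        super().__init__()
        self.descriptions = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags (<meta ... />) also end up here by default.
        if tag != "meta":
            return
        attr_map = dict(attrs)
        if (attr_map.get("name") or "").lower() == "description":
            self.descriptions.append(attr_map.get("content") or "")


def find_meta_description(html_text):
    """Return the first meta description in the page, or None."""
    parser = MetaDescriptionParser()
    parser.feed(html_text)
    parser.close()
    return parser.descriptions[0] if parser.descriptions else None


def check_description(html_text, min_len=120, max_len=160):
    """Return "ok", "missing", or "length" for one built HTML page.

    The 120-160 bounds mirror the rough target mentioned in this thread;
    treating them as hard limits is an assumption of this sketch.
    """
    description = find_meta_description(html_text)
    if description is None:
        return "missing"
    if not (min_len <= len(description) <= max_len):
        return "length"
    return "ok"
```

Running `check_description` over every built HTML file would flag pages where the description is absent or far from the target length.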

I'm not sure this can be very useful for the generic documentation task, as this one is very simple:

  • Create a list of all documentation files with a simple bash command
  • Summarise the contents of each page in roughly 120-160 characters
  • Add the summary in a specific syntax at the very beginning of the relevant page
  • Test thoroughly
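
The mechanical parts of the steps above (everything except writing the summary itself) can be sketched as follows. This is a minimal sketch, not the workflow actually used in this PR. It assumes pages are `.md` files and that the description is stored under MyST's `myst: html_meta:` front matter key; the exact syntax used in this repository is not shown in the thread. The helper names `list_doc_pages`, `length_ok`, and `add_description` are hypothetical.

```python
from pathlib import Path

# Assumed front matter layout (MyST "html_meta"); adjust to the project's syntax.
FRONT_MATTER_TEMPLATE = (
    "---\n"
    "myst:\n"
    "  html_meta:\n"
    '    description: "{description}"\n'
    "---\n\n"
)


def list_doc_pages(docs_root, suffixes=(".md",)):
    """Return all documentation source files under docs_root, sorted."""
    root = Path(docs_root)
    return sorted(
        p for p in root.rglob("*") if p.is_file() and p.suffix in suffixes
    )


def length_ok(summary, min_len=120, max_len=160):
    """Check the summary against the rough 120-160 character target."""
    return min_len <= len(summary) <= max_len


def add_description(page_text, summary):
    """Prepend the description front matter to a page.

    Pages that already start with front matter are returned unchanged;
    merging into existing front matter is out of scope for this sketch.
    """
    if page_text.startswith("---"):
        return page_text
    # Escape characters that would break the double-quoted YAML string.
    escaped = summary.replace("\\", "\\\\").replace('"', '\\"')
    return FRONT_MATTER_TEMPLATE.format(description=escaped) + page_text
```

The summaries still have to come from somewhere (an LLM in this case) and be reviewed by a person; the helpers only cover listing files, checking length, and inserting the text.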

Creating a summary of existing content is something that LLMs are very good at. Even so, I still find some problems later during review, which I fix manually. The same approach can't be directly applied to create useful new documentation. Here it's just a lot of menial work done in a mostly automatic way. I can do a Tech Talk on this use case if you want.

Member

@theoctober19th theoctober19th left a comment

Nice work, Vladimir. The automation to generate this seems interesting -- and to be honest I found it quite efficient and precise in most cases. I have a few minor comments here and there.

Comment thread docs/explanation/configuration.md
Comment thread docs/how-to/apache-kyuubi/back-up-and-restore.md Outdated
Comment thread docs/how-to/deploy/kyuubi.md Outdated
Comment thread docs/how-to/manage-service-accounts/using-integration-hub.md Outdated
Comment thread docs/how-to/manage-service-accounts/using-spark-client-snap.md Outdated
Comment thread docs/how-to/self-signed-certificates.md Outdated
Comment thread docs/reference/releases/revision-2.md
Comment thread docs/tutorial/2-distributed-data-processing.md
Co-authored-by: Bikalpa Dhakal <theoctober19th@gmail.com>
Signed-off-by: Vladimir Izmalkov <48120135+izmalk@users.noreply.github.com>
Contributor Author

@izmalk izmalk left a comment

Comments for review suggestions.

Comment thread docs/explanation/configuration.md
Comment thread docs/how-to/apache-kyuubi/back-up-and-restore.md Outdated
Comment thread docs/how-to/deploy/kyuubi.md Outdated
Comment thread docs/how-to/manage-service-accounts/using-integration-hub.md Outdated
Comment thread docs/how-to/manage-service-accounts/using-spark-client-snap.md Outdated
Comment thread docs/tutorial/2-distributed-data-processing.md
Signed-off-by: Vladimir Izmalkov <48120135+izmalk@users.noreply.github.com>
Member

@theoctober19th theoctober19th left a comment

Thanks for the enhancement!

@deusebio
Contributor

> I used agentic Copilot (company licence, Claude Sonnet 4.5 model) in VS Code with a pre-designed long prompt (which works great for that specific task).

Oh I see. Yes, that's a very good use case.

> I can do a Tech Talk on this use case if you want.

I believe that would be nice, or you could just show it in a Demo session. Either way, I believe it is worth sharing :D

Comment thread docs/explanation/index.md
Comment thread docs/how-to/apache-kyuubi/index.md Outdated
Comment thread docs/how-to/deploy/kyuubi.md Outdated
Comment thread docs/how-to/deploy/spark.md Outdated
Comment thread docs/how-to/manage-service-accounts/using-integration-hub.md Outdated
Contributor

@deusebio deusebio left a comment

Looks good! I just have a few suggestions to revise some of the headings to make them more compliant with Diataxis (which LLMs do not seem to like :D). For instance, I would avoid "learn" in how-to guides, and I would also provide references to Diataxis in the main indices.

But these are mostly non-blocking minor improvements, so it is up to you whether you feel they are sensible to follow up on.

izmalk and others added 4 commits January 30, 2026 10:18
Co-authored-by: deusebio <edeusebio85@gmail.com>
Signed-off-by: Vladimir Izmalkov <48120135+izmalk@users.noreply.github.com>
@izmalk izmalk merged commit 6276026 into canonical:main Feb 4, 2026
3 checks passed

3 participants