[New Feature] Incremental builds — only rebuild what changed #26

@veeso

Summary

Currently, blogatto.build() deletes the entire output directory and rebuilds everything from scratch on every invocation. For small blogs this is fine, but as a site grows (many posts, heavy templates, asset copying), full rebuilds become wasteful. An incremental build mode would skip unchanged posts and only regenerate what's needed.

Motivation

  • The dev server (blogatto/dev) triggers a full rebuild on every file change — incremental builds would make the feedback loop significantly faster.
  • Blogs with hundreds of posts and large static assets pay a linear cost on every save, even when only one post changed.
  • Feeds, sitemaps, and static pages that depend on the post list only need regeneration when a post is added, removed, or its metadata changes — not when a typo is fixed in the body of an unrelated post.

Proposed Design

The idea is to introduce a manifest file (.blogatto_manifest.json) in the output directory that records the state of every source file (path + content hash) from the last successful build. On the next build, blogatto compares the current source files against the manifest to determine what changed.

Manifest structure

{
  "version": 1,
  "src_hash": "sha256:...",
  "posts": {
    "blog/hello-world/index.md": { "hash": "sha256:..." },
    "blog/hello-world/index-it.md": { "hash": "sha256:..." },
    "blog/hello-world/cover.jpg": { "hash": "sha256:..." }
  },
  "static": {
    "static/style.css": { "hash": "sha256:..." }
  }
}
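
As a rough sketch of how this structure could be modelled and serialized with gleam_json: the `Manifest` type, its field names, and both function names below are hypothetical, chosen only to mirror the JSON example, and are not part of blogatto.

```gleam
import gleam/dict.{type Dict}
import gleam/json.{type Json}
import gleam/list

/// Hypothetical manifest type mirroring the JSON example.
/// Both dictionaries map a source path to its content hash.
pub type Manifest {
  Manifest(
    version: Int,
    src_hash: String,
    post_files: Dict(String, String),
    static_files: Dict(String, String),
  )
}

/// Encode a path -> hash mapping as { "<path>": { "hash": "<hash>" } }.
fn encode_entries(entries: Dict(String, String)) -> Json {
  entries
  |> dict.to_list
  |> list.map(fn(entry) {
    let #(path, hash) = entry
    #(path, json.object([#("hash", json.string(hash))]))
  })
  |> json.object
}

/// Encode the whole manifest using the key names from the JSON example.
pub fn encode_manifest(manifest: Manifest) -> Json {
  json.object([
    #("version", json.int(manifest.version)),
    #("src_hash", json.string(manifest.src_hash)),
    #("posts", encode_entries(manifest.post_files)),
    #("static", encode_entries(manifest.static_files)),
  ])
}
```

`json.to_string(encode_manifest(manifest))` then yields the text to write to disk. Key order inside `posts` and `static` follows `dict.to_list`, which is unordered, so the file should not be compared byte-for-byte between builds.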

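The src_hash field is a combined hash of all files under src/. If it changes, all posts and static pages are invalidated. This is coarse but necessary: templates and view functions are ordinary Gleam code, so a change under src/ cannot be attributed to specific pages. The example shows only hashes; as noted under Key challenges, the manifest must also store enough post metadata to reconstruct unchanged posts.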
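
One possible way to compute such a combined hash is sketched here. It assumes the gleam_crypto package (a wrapper over Erlang's crypto module on the Erlang target); `hash_bytes` and `combined_hash` are hypothetical names. The functions take file contents that have already been read, so the sketch stays independent of any file-system library.

```gleam
import gleam/bit_array
import gleam/crypto
import gleam/list
import gleam/string

/// Hex-encoded SHA-256 digest with the "sha256:" prefix used in the manifest.
pub fn hash_bytes(data: BitArray) -> String {
  let digest = crypto.hash(crypto.Sha256, data)
  "sha256:" <> string.lowercase(bit_array.base16_encode(digest))
}

/// Combine #(path, contents) pairs into a single hash.
/// Paths are sorted first so directory traversal order does not
/// affect the result.
pub fn combined_hash(files: List(#(String, BitArray))) -> String {
  files
  |> list.sort(fn(a, b) { string.compare(a.0, b.0) })
  |> list.map(fn(file) {
    let #(path, contents) = file
    // Mix the path in with the content digest so that renaming a file
    // changes the combined hash even if its contents are identical.
    bit_array.concat([
      bit_array.from_string(path),
      <<0>>,
      crypto.hash(crypto.Sha256, contents),
    ])
  })
  |> bit_array.concat
  |> hash_bytes
}
```

Hashing each file's digest rather than concatenating raw contents keeps the intermediate buffer small for large source trees.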

Build behaviour with incremental mode

  1. Clean step is skipped — the output directory is preserved.
  2. Manifest is loaded from {output_dir}/.blogatto_manifest.json. If missing or corrupt, fall back to a full rebuild.
  3. Compute src_hash — hash all files under src/. If it differs from the manifest, mark all posts and static pages as dirty.
  4. For each post directory, compare file content hashes against the manifest:
    • Unchanged (and src_hash unchanged) — skip parsing and rendering; load cached Post(msg) metadata from the manifest (title, slug, date, description, excerpt, url, language, extras) so it's available for static pages and feeds. The contents field (rendered Lustre elements) is not cached — it's only needed for the post's own HTML page, which isn't being regenerated.
    • Changed — re-parse, re-render, re-write HTML and copy assets.
    • New (not in manifest) — full parse and build.
    • Deleted (in manifest but missing on disk) — remove corresponding output files.
  5. Static pages — regenerated only if src_hash changed or any post was added, removed, or had metadata changes. If nothing changed, static pages are skipped entirely.
  6. Feeds and sitemap — regenerated only if any post was added, removed, or had metadata changes. If only a post body changed (same title/date/description/slug), feeds can be skipped.
  7. Static assets (static_dir) — use the same hash comparison; only copy changed or new files, remove deleted ones.
  8. Write updated manifest after successful build.
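
The comparison in steps 4 and 7 can be sketched as a pure function over two path-to-hash dictionaries. This is a simplified illustration: it classifies individual files rather than whole post directories, and the `Change` type and function names are hypothetical.

```gleam
import gleam/dict.{type Dict}
import gleam/list

/// The four states described in step 4.
pub type Change {
  Unchanged
  Changed
  New
  Deleted
}

/// Classify every path by comparing the hashes recorded in the manifest
/// (`previous`) with the hashes of the files currently on disk (`current`).
pub fn diff(
  previous: Dict(String, String),
  current: Dict(String, String),
) -> Dict(String, Change) {
  let present =
    dict.map_values(current, fn(path, hash) {
      case dict.get(previous, path) {
        Ok(old) if old == hash -> Unchanged
        Ok(_) -> Changed
        Error(Nil) -> New
      }
    })
  // Paths recorded in the manifest but no longer on disk.
  dict.fold(previous, present, fn(acc, path, _hash) {
    case dict.has_key(current, path) {
      True -> acc
      False -> dict.insert(acc, path, Deleted)
    }
  })
}

/// True if any path was added or removed. Steps 5 and 6 would also
/// need to check for metadata changes, which requires parsing the post.
pub fn any_added_or_removed(changes: Dict(String, Change)) -> Bool {
  changes
  |> dict.values
  |> list.any(fn(change) { change == New || change == Deleted })
}
```

When src_hash differs from the manifest, the result of `diff` would be ignored and every post treated as changed, as step 3 describes.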

To force a full rebuild when incremental mode is enabled, users can simply delete the output directory (or call a clean utility that does so).

API

A single new config option:

/// Enable incremental builds. Default: False.
/// When enabled, only changed files are rebuilt. A manifest file
/// is stored in the output directory to track file state.
pub fn incremental(config: Config(msg), enabled: Bool) -> Config(msg)

Users opt in explicitly. The default remains full rebuilds for predictability.

To force a full rebuild, users delete the output directory or use blogatto.clean(config):

/// Delete the output directory, forcing a full rebuild on the next call to `build()`.
pub fn clean(config: Config(msg)) -> Result(Nil, BlogattoError)

Key challenges

  1. Post metadata caching — Unchanged posts still need their Post(msg) available (minus contents) for static page views and feeds. The manifest must store enough metadata to reconstruct a "skeleton" Post(msg).
  2. Template / view changes — If the user changes their blog template function or static page views, all posts and pages should be re-rendered even though the markdown didn't change. Detecting this requires hashing all files under src/. This is coarse but correct — there's no other way to know whether a Gleam source change affects rendering.
  3. Config changes — If site_url, route_prefix, or route_builder change, all posts need re-routing. The manifest could store a hash of relevant config fields to detect this.
  4. contents field type: Post(msg) contains contents: List(Element(msg)), which is a Lustre virtual DOM and not trivially serializable. Cached posts would have contents: [] and only posts being actively rendered would have their contents populated.
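
Challenges 1 and 4 could be approached roughly as follows. Everything in this sketch is an assumption: `Element` is a stand-in for Lustre's type, `Post` is a simplified guess built from the field names listed in the build steps (the real field types in blogatto may differ), and `CachedPost`, `to_cached`, `to_skeleton`, and `metadata_changed` are hypothetical names.

```gleam
import gleam/dict.{type Dict}
import gleam/option.{type Option}

/// Stand-in for Lustre's Element(msg), only so the sketch is self-contained.
pub type Element(msg) {
  Text(String)
}

/// Simplified guess at Post(msg); field types are assumptions.
pub type Post(msg) {
  Post(
    title: String,
    slug: String,
    date: String,
    description: String,
    excerpt: String,
    url: String,
    language: Option(String),
    extras: Dict(String, String),
    contents: List(Element(msg)),
  )
}

/// The serializable subset that would be stored in the manifest.
pub type CachedPost {
  CachedPost(
    title: String,
    slug: String,
    date: String,
    description: String,
    excerpt: String,
    url: String,
    language: Option(String),
    extras: Dict(String, String),
  )
}

/// Drop the rendered contents before writing to the manifest.
pub fn to_cached(post: Post(msg)) -> CachedPost {
  CachedPost(
    title: post.title,
    slug: post.slug,
    date: post.date,
    description: post.description,
    excerpt: post.excerpt,
    url: post.url,
    language: post.language,
    extras: post.extras,
  )
}

/// Rebuild a "skeleton" post for feeds and static pages; contents is empty.
pub fn to_skeleton(cached: CachedPost) -> Post(msg) {
  Post(
    title: cached.title,
    slug: cached.slug,
    date: cached.date,
    description: cached.description,
    excerpt: cached.excerpt,
    url: cached.url,
    language: cached.language,
    extras: cached.extras,
    contents: [],
  )
}

/// Structural equality is enough to detect a metadata change.
pub fn metadata_changed(old: CachedPost, new: CachedPost) -> Bool {
  old != new
}
```

Keeping `CachedPost` free of the `msg` parameter means it can be encoded to JSON without touching Lustre types, and `metadata_changed` gives the feed and sitemap steps a cheap test for whether regeneration is needed.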

Implementation plan

  1. Define manifest types and JSON serialization (via gleam_json)
  2. Add incremental field to Config(msg) with builder function
  3. Add clean() public function to blogatto
  4. Implement file hashing (sha256 via Erlang :crypto) and src/ combined hash
  5. Refactor blogatto.build() to branch on incremental mode
  6. Implement incremental blog builder (diff against manifest, skip unchanged, reconstruct skeleton posts)
  7. Implement incremental static asset copying (diff, copy new/changed, remove deleted)
  8. Implement conditional static page regeneration (skip if no src or post changes)
  9. Implement conditional feed/sitemap regeneration (skip if no post metadata changes)
  10. Implement manifest writing after successful build
  11. Add tests for incremental rebuild scenarios (unchanged, changed, new, deleted posts; src changes; static asset changes)
  12. Document in README

Out of scope

  • Partial template re-rendering (only re-render changed sections within a page)
  • Dependency graph tracking (e.g., "post A includes post B's excerpt")
  • Watch-mode integration (the dev server would simply call the same build(), which is now incremental-aware)

Labels: enhancement (New feature or request)