feat: A native pipeline for content #16149
🦋 Changeset detected
Latest commit: 1f15ab6
The changes in this PR will be included in the next version bump. This PR includes changesets to release 428 packages.
📊 Dependency Size Changes
Warning: This PR adds 5.8 MB of new dependencies, which exceeds the threshold of 100 kB.
Total size change: 5.8 MB
Merging this PR will not alter performance
a9ca5d7 to bcce353
| @@ -0,0 +1,86 @@ | |||
| import yaml from 'js-yaml'; | |||
I didn't want to change this (it's not a new file) in case doing it differently would ripple through a million files. Still, some Markdown processors (Sätteri, for one) can return both the frontmatter and the content in a single, very fast call, so extracting the frontmatter separately like this is inefficient.
| const firstPos = 'position' in first ? first.position : undefined; | ||
| const lastPos = 'position' in last ? last.position : undefined; |
This seems unrelated at first glance, but our Vite plugin for .html imports uses rehype, so the global augmentations that Sätteri and rehype make to the hast module ripple through to here.
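For context, a minimal sketch of why the `in` check becomes necessary. It assumes the augmented node union gains a member that declares no `position` field. `Position`, `ElementNode`, `CustomNode`, and `getPosition` are hypothetical names for illustration, not the real hast types.

```typescript
// Hypothetical stand-ins for the real hast types.
interface Position {
  start: { offset: number };
  end: { offset: number };
}

interface ElementNode {
  type: 'element';
  position?: Position;
}

// Assumption: a node type contributed by module augmentation that has no `position`.
interface CustomNode {
  type: 'custom';
}

type AnyNode = ElementNode | CustomNode;

// Accessing `node.position` directly would not type-check, because CustomNode
// lacks the property. The `in` check narrows the union to members that declare it.
function getPosition(node: AnyNode): Position | undefined {
  return 'position' in node ? node.position : undefined;
}
```

The runtime behavior is unchanged for nodes that do carry a position; the check only matters for the type checker and for nodes without the field.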
| markdown: { | ||
| syntaxHighlight: false, | ||
| // Keep straight quotes so assertions can match `import.meta.env` output literally | ||
| processor: satteri({ features: { smartPunctuation: false } }), |
Sätteri's smartypants implementation is better than the popular remark plugin. That made this test fail, because it expected the quotes to stay straight. See https://satteri.bruits.org/docs/divergences/#smart-punctuation-pairing-across-nodes
| // Coerce element/null to boolean before asserting — assert.equal on a raw | ||
| // linkedom Element vs null pulls util.inspect through the DOM's circular | ||
| // refs and can take down the test process on failure. | ||
| assert.equal( | ||
| selectRemarkExample(document) !== null, | ||
| true, | ||
| 'MDX remark plugins not applied.', | ||
| ); |
This is what I wasted a day on while chasing a memory leak: if the previous form of this test failed, it would instantly take 25 GB of RAM and crash everything. This form avoids the problem.
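A minimal, self-contained sketch of the same pattern, for anyone who wants to reuse it in other tests. `FakeElement`, `makeCircularElement`, and `assertFound` are hypothetical names, and the circular object only stands in for a linkedom Element; this is not the actual test code.

```typescript
import * as assert from 'node:assert';

// Hypothetical stand-in for a linkedom Element: parent and child reference each other.
interface FakeElement {
  tagName: string;
  parent: FakeElement | null;
  children: FakeElement[];
}

function makeCircularElement(): FakeElement {
  const parent: FakeElement = { tagName: 'div', parent: null, children: [] };
  const child: FakeElement = { tagName: 'p', parent, children: [] };
  parent.children.push(child);
  return child;
}

// Reduce the lookup result to a boolean first, so the values carried by a failing
// assertion are plain booleans rather than a node with circular references.
function assertFound(element: FakeElement | null, message: string): void {
  assert.equal(element !== null, true, message);
}

// Passes silently: the element exists, so `true` is compared with `true`.
assertFound(makeCircularElement(), 'element should be found');
```

On failure, the error only holds `false` and `true` as its actual and expected values, so nothing has to walk the DOM graph to report it.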
| while ((match = regex.exec(html)) !== null) { | ||
| const matchKey = ${rawUrl} + '_' + occurrenceCounter; | ||
| const imageProps = JSON.parse(match[1].replace(/&#x22;/g, '"').replace(/&#x27;/g, "'")); | ||
| const imageProps = JSON.parse(match[1].replace(/&(?:#x22|quot);/g, '"').replace(/&(?:#x27|apos);/g, "'")); |
Same thing regarding named vs. numeric entities.
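A small sketch of what the widened regexes buy us. The two patterns are the ones from the diff; `decodeQuoteEntities` and the sample payloads are hypothetical and only there to show that either entity form decodes to the same JSON.

```typescript
// Decode both the numeric and the named forms of the double- and single-quote entities.
function decodeQuoteEntities(value: string): string {
  return value.replace(/&(?:#x22|quot);/g, '"').replace(/&(?:#x27|apos);/g, "'");
}

// The same JSON payload, escaped once with numeric entities and once with named ones.
const numericForm = '{&#x22;src&#x22;:&#x22;./a.png&#x22;}';
const namedForm = '{&quot;src&quot;:&quot;./a.png&quot;}';

// Both forms parse to an object whose `src` is './a.png'.
const fromNumeric = JSON.parse(decodeQuoteEntities(numericForm)) as { src: string };
const fromNamed = JSON.parse(decodeQuoteEntities(namedForm)) as { src: string };
```

Matching only one entity form would leave the other form undecoded, and `JSON.parse` would then throw on the raw `&quot;` text.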
| @@ -0,0 +1,47 @@ | |||
| // TODO: This is a workaround around a missing API in Sätteri. The visitor architecture naturally does not provide | |||
This will be gone before the actual release. It's not a bug or anything, I just haven't decided on an API shape for this use case yet.
| @@ -0,0 +1,158 @@ | |||
| import type * as hast from 'hast'; | |||
I moved all the processor-agnostic Markdown stuff into this package because it was the common point of import between everything (and it cannot live in Astro itself because of recursive dependencies). I don't think it's a problem. We could also have some sort of dedicated @astrojs/internal-markdown package for this, but I didn't see the point.
| The top-level `markdown.remarkPlugins`, `markdown.rehypePlugins`, and `markdown.remarkRehype` options are deprecated. They'll continue to work when `@astrojs/markdown-remark` is installed for now, but this will be removed in the next major. | ||
| The top-level `markdown.gfm` and `markdown.smartypants` options are also deprecated. Move them onto your processor instead — `satteri({ features: { gfm: false, smartPunctuation: false } })`, or `unified({ gfm: false, smartypants: false })`: |
In theory, those two options could be fully removed and we could ask users to install the remark plugins manually, but I figured that, for a better migration story right now, we can just support them in the processor itself.
This isn't relevant for Sätteri because it supports both natively.
| export default defineConfig({ | ||
| markdown: { processor: satteri() }, | ||
| integrations: [mdx({ processor: unified({ remarkPlugins: [/* ... */] }) })], |
The only thing that's a little weird here: if you wanted to use unified for both, would that mean you'd do:
import { defineConfig, satteri } from 'astro/config';
import mdx from '@astrojs/mdx';
import { unified } from '@astrojs/markdown-remark';
export default defineConfig({
  markdown: { processor: unified({ remarkPlugins: [/* ... */] }) },
  integrations: [mdx({ processor: unified({ remarkPlugins: [/* ... */] }) })],
});
Or would you keep the processor in a variable and reference it in both places? Do they need to share state?
No, it works the same as the current options. MDX inherits the options from markdown by default: https://docs.astro.build/en/guides/integrations-guide/mdx/#extendmarkdownconfig
It's a weird behavior of MDX in my opinion, but at least it's in line with what we currently have. They don't need to share state; the things that could be shared are always module-level (e.g. Shiki).
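To make that concrete, here is a sketch of the unified-for-both setup. It assumes `processor` is inherited through `extendMarkdownConfig` the same way the existing markdown options are, which has not been checked against the final API. The import paths are taken from the snippet in the question above.

```typescript
import { defineConfig } from 'astro/config';
import mdx from '@astrojs/mdx';
import { unified } from '@astrojs/markdown-remark';

export default defineConfig({
  // Configure the unified processor once, at the markdown level.
  markdown: { processor: unified({ remarkPlugins: [/* ... */] }) },
  // Assumption: with the default extendMarkdownConfig behavior, MDX picks up
  // the markdown options, so no `processor` is passed to mdx() here.
  integrations: [mdx()],
});
```

Passing a `processor` to `mdx()` would then only be needed when MDX should diverge from the Markdown setup, as in the changeset example that pairs `satteri()` for Markdown with `unified()` for MDX.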
Changes
Implements withastro/roadmap#1364
Testing
Added tests, and updated tests that depend on remark behavior to use the unified pipeline.
Docs
withastro/docs#13919