Skip to content

Latest commit

 

History

History
 
 

README.md

stream-markdown-parser

NPM version 中文版 NPM downloads Bundle size License

Pure markdown parser and renderer utilities with streaming support - framework agnostic.

This package contains the core markdown parsing logic extracted from markstream-vue, making it usable in any JavaScript/TypeScript project without Vue dependencies.

Features

  • 🚀 Pure JavaScript - No framework dependencies
  • 📦 Lightweight - Minimal bundle size
  • 🔧 Extensible - Plugin-based architecture
  • 🎯 Type-safe - Full TypeScript support
  • Fast - Optimized for performance
  • 🌊 Streaming-friendly - Progressive parsing support

ℹ️ We now build on top of markdown-it-ts, a TypeScript-first distribution of markdown-it. The API stays the same, but we only rely on its parsing pipeline and ship richer typings for tokens and hooks.

Documentation

The full usage guide lives alongside the markstream-vue docs:

This README highlights the parser-specific APIs; visit the docs for end-to-end integration tutorials (VitePress, workers, Tailwind, troubleshooting, etc.).

Installation

pnpm add stream-markdown-parser
# or
npm install stream-markdown-parser
# or
yarn add stream-markdown-parser

Quick API (TL;DR)

  • getMarkdown(msgId?, options?) — create a markdown-it-ts instance with built-in plugins (task lists, sub/sup, math helpers, etc.). Also accepts plugin, apply, and i18n options.
  • registerMarkdownPlugin(plugin) / clearRegisteredMarkdownPlugins() — add or remove global plugins that run for every getMarkdown() call (useful for feature flags or tests).
  • parseMarkdownToStructure(markdown, md, parseOptions) — convert Markdown into the streaming-friendly AST consumed by markstream-vue and other renderers.
  • processTokens(tokens) / parseInlineTokens(children, content) — low-level helpers if you want to bypass the built-in AST pipeline.
  • applyMath, applyContainers, normalizeStandaloneBackslashT, findMatchingClose, etc. — utilities for custom parsing or linting workflows.

Usage

Streaming-friendly pipeline

Markdown string
   ↓ getMarkdown() → markdown-it-ts instance with plugins
parseMarkdownToStructure() → AST (ParsedNode[])
   ↓ feed into your renderer (markstream-vue, custom UI, workers)

Reuse the same md instance when parsing multiple documents—plugin setup is the heaviest step. When integrating with markstream-vue, pass the AST to <MarkdownRender :nodes="nodes" /> or supply raw content and hand it the same parser options.

Incremental / streaming example

When consuming an AI or SSE stream you can keep appending to a buffer, parse with the same md instance, and send the AST to the UI (e.g., markstream-vue) on every chunk:

import { getMarkdown, parseMarkdownToStructure } from 'stream-markdown-parser'

const md = getMarkdown()
let buffer = ''

async function handleChunk(chunk: string) {
  buffer += chunk
  const nodes = parseMarkdownToStructure(buffer, md)
  postMessage({ type: 'markdown:update', nodes })
}

In the UI layer, render nodes with <MarkdownRender :nodes="nodes" /> to avoid re-parsing. See the docs usage guide for end-to-end wiring.

Basic example

import { getMarkdown, parseMarkdownToStructure } from 'stream-markdown-parser'

// Create a markdown-it-ts instance with default plugins
const md = getMarkdown()

// Parse markdown to our streaming-friendly AST structure
const nodes = parseMarkdownToStructure('# Hello World', md)
console.log(nodes)
// [{ type: 'heading', level: 1, children: [...] }]

// markdown-it-ts still exposes render() if you need HTML output,
// but this package now focuses on the token -> AST pipeline.
const html = md.render?.('# Hello World\n\nThis is **bold**.')

With Math Options

import { getMarkdown, setDefaultMathOptions } from 'stream-markdown-parser'

// Set global math options
setDefaultMathOptions({
  commands: ['infty', 'perp', 'alpha'],
  escapeExclamation: true
})

const md = getMarkdown()

With Custom i18n

import { getMarkdown } from 'stream-markdown-parser'

// Using translation map
const md = getMarkdown('editor-1', {
  i18n: {
    'common.copy': '复制',
  }
})

// Or using a translation function
const md = getMarkdown('editor-1', {
  i18n: (key: string) => translateFunction(key)
})

With Plugins

import customPlugin from 'markdown-it-custom-plugin'
import { getMarkdown } from 'stream-markdown-parser'

const md = getMarkdown('editor-1', {
  plugin: [
    [customPlugin, { /* options */ }]
  ]
})

Advanced: Custom Rules

import { getMarkdown } from 'stream-markdown-parser'

const md = getMarkdown('editor-1', {
  apply: [
    (md) => {
      // Add custom inline rule
      md.inline.ruler.before('emphasis', 'custom', (state, silent) => {
        // Your custom logic
        return false
      })
    }
  ]
})

Extending globally

Need to add a plugin everywhere without touching each call site? Use the helper exports:

import {
  clearRegisteredMarkdownPlugins,
  registerMarkdownPlugin,
} from 'stream-markdown-parser'

registerMarkdownPlugin(myPlugin)

const md = getMarkdown()
// md now has `myPlugin` enabled in addition to anything passed via options

// For tests or teardown flows:
clearRegisteredMarkdownPlugins()
  • plugin option → array of md.use invocations scoped to a single getMarkdown call.
  • apply option → imperatively mutate the instance (md.inline.ruler.before(...)). Wrap in try/catch if you need to surface errors differently; the helper logs to console to preserve legacy behaviour.
  • registerMarkdownPlugin → global singleton registry (handy in SSR or worker contexts where you want feature flags to apply everywhere).

API

Main Functions

getMarkdown(msgId?, options?)

Creates a configured markdown-it-ts instance (API-compatible with markdown-it).

Parameters:

  • msgId (string, optional): Unique identifier for this instance. Default: editor-${Date.now()}
  • options (GetMarkdownOptions, optional): Configuration options

Options:

interface GetMarkdownOptions {
  // Array of markdown-it/markdown-it-ts plugins to use
  plugin?: Array<Plugin | [Plugin, any]>

  // Array of functions to mutate the md instance
  apply?: Array<(md: MarkdownIt) => void>

  // Translation function or translation map
  i18n?: ((key: string) => string) | Record<string, string>
}

parseMarkdownToStructure(content, md?, options?)

Parses markdown content into a structured node tree.

Parameters:

  • content (string): The markdown content to parse
  • md (MarkdownItCore, optional): A markdown-it-ts instance. If not provided, creates one using getMarkdown()
  • options (ParseOptions, optional): Parsing options with hooks

Returns: ParsedNode[] - An array of parsed nodes

processTokens(tokens)

Processes raw markdown-it tokens into a flat array.

parseInlineTokens(tokens, md)

Parses inline markdown-it-ts tokens.

Configuration Functions

setDefaultMathOptions(options)

Set global math rendering options.

Parameters:

  • options (MathOptions): Math configuration options
interface MathOptions {
  commands?: readonly string[] // LaTeX commands to escape
  escapeExclamation?: boolean // Escape standalone '!' (default: true)
}

Parse hooks (fine-grained transforms)

Both parseMarkdownToStructure() and <MarkdownRender :parse-options> accept the same hook signature:

interface ParseOptions {
  preTransformTokens?: (tokens: Token[]) => Token[]
  postTransformTokens?: (tokens: Token[]) => Token[]
  postTransformNodes?: (nodes: ParsedNode[]) => ParsedNode[]
}

Example — flag AI “thinking” blocks:

const parseOptions = {
  postTransformNodes(nodes) {
    return nodes.map(node =>
      node.type === 'html_block' && /<thinking>/.test(node.value)
        ? { ...node, meta: { type: 'thinking' } }
        : node,
    )
  },
}

Use the metadata in your renderer to show custom UI without mangling the original Markdown.

Utility Functions

isMathLike(content)

Heuristic function to detect if content looks like mathematical notation.

Parameters:

  • content (string): Content to check

Returns: boolean

findMatchingClose(src, startIdx, open, close)

Find the matching closing delimiter in a string, handling nested pairs.

Parameters:

  • src (string): Source string
  • startIdx (number): Start index to search from
  • open (string): Opening delimiter
  • close (string): Closing delimiter

Returns: number - Index of matching close, or -1 if not found

Tips & troubleshooting

  • Reuse parser instances: cache getMarkdown() results per worker/request to avoid re-registering plugins.
  • Server-side parsing: run parseMarkdownToStructure on the server, ship the AST to the client, and hydrate with markstream-vue for deterministic output.
  • Custom HTML widgets: pre-extract <MyWidget> blocks before parsing (replace with placeholders) and reinject them during rendering instead of mutating html_block nodes post-parse.
  • Styling: when piping nodes into markstream-vue, follow the docs CSS checklist so Tailwind/UnoCSS don’t override library styles.
  • Error handling: the apply hook swallows exceptions to maintain backwards compatibility. If you want strict mode, wrap your mutators before passing them in and rethrow/log as needed.

parseFenceToken(token)

Parse a code fence token into a CodeBlockNode.

Parameters:

  • token (MarkdownToken): markdown-it token

Returns: CodeBlockNode

normalizeStandaloneBackslashT(content, options?)

Normalize backslash-t sequences in math content.

Parameters:

  • content (string): Content to normalize
  • options (MathOptions, optional): Math options

Returns: string

Lower-level helpers

If you need full control over how tokens are transformed, you can import the primitive builders directly:

import type { MarkdownToken } from 'stream-markdown-parser'
import {

  parseInlineTokens,
  processTokens
} from 'stream-markdown-parser'

const tokens: MarkdownToken[] = md.parse(markdown, {})
const nodes = processTokens(tokens)
// or operate at inline granularity:
const inlineNodes = parseInlineTokens(tokens[0].children ?? [], tokens[0].content ?? '')

processTokens is what parseMarkdownToStructure uses internally, so you can remix the AST pipeline without reimplementing the Markdown-it loop.

Plugin Functions

applyMath(md, options?)

Apply math plugin to markdown-it instance.

Parameters:

  • md (MarkdownIt): markdown-it instance
  • options (MathOptions, optional): Math rendering options

applyContainers(md)

Apply container plugins to markdown-it instance.

Parameters:

  • md (MarkdownIt): markdown-it instance

Constants

KATEX_COMMANDS

Array of common KaTeX commands for escaping.

TEX_BRACE_COMMANDS

Array of TeX commands that use braces.

ESCAPED_TEX_BRACE_COMMANDS

Escaped version of TEX_BRACE_COMMANDS for regex use.

Types

All TypeScript types are exported:

import type {
  // Node types
  CodeBlockNode,
  GetMarkdownOptions,
  HeadingNode,
  ListItemNode,
  ListNode,
  MathOptions,
  ParagraphNode,
  ParsedNode,
  ParseOptions,
  // ... and more
} from 'stream-markdown-parser'

Node Types

The parser exports various node types representing different markdown elements:

  • TextNode, HeadingNode, ParagraphNode
  • ListNode, ListItemNode
  • CodeBlockNode, InlineCodeNode
  • LinkNode, ImageNode
  • BlockquoteNode, TableNode
  • MathBlockNode, MathInlineNode
  • And many more...

Default Plugins

This package comes with the following markdown-it plugins pre-configured:

  • markdown-it-sub - Subscript support (H~2~O)
  • markdown-it-sup - Superscript support (x^2^)
  • markdown-it-mark - Highlight/mark support (==highlighted==)
  • markdown-it-task-checkbox - Task list support (- [ ] Todo)
  • markdown-it-ins - Insert tag support (++inserted++)
  • markdown-it-footnote - Footnote support
  • markdown-it-container - Custom container support (::: warning, ::: tip, etc.)
  • Math support - LaTeX math rendering with $...$ and $$...$$

Framework Integration

While this package is framework-agnostic, it's designed to work seamlessly with:

  • Node.js - Server-side rendering
  • Vue 3 - Use with markstream-vue (or your own renderer)
  • React - Use parsed nodes for custom rendering
  • Vanilla JS - Direct HTML rendering
  • Any framework - Parse to AST and render as needed

Example — dedicated worker feeding markstream-vue

Offload parsing to a Web Worker while the UI renders via markstream-vue:

// worker.ts
import { getMarkdown, parseMarkdownToStructure } from 'stream-markdown-parser'

const md = getMarkdown()
let buffer = ''

globalThis.addEventListener('message', (event) => {
  if (event.data.type === 'chunk') {
    buffer += event.data.value
    const nodes = parseMarkdownToStructure(buffer, md)
    globalThis.postMessage({ type: 'update', nodes })
  }
})
// ui.ts
const worker = new Worker(new URL('./worker.ts', import.meta.url), { type: 'module' })
worker.addEventListener('message', (event) => {
  if (event.data.type === 'update')
    nodes.value = event.data.nodes
})
<MarkdownRender :nodes="nodes" />

This pattern keeps Markdown-it work off the main thread and lets you reuse the same AST in any framework.

Migration from markstream-vue (parser exports)

If you're currently importing parser helpers from markstream-vue, you can switch to the dedicated package:

// before
import { getMarkdown } from 'markstream-vue'

// after
import { getMarkdown } from 'stream-markdown-parser'

All APIs remain the same. See the migration guide for details.

Performance

  • Lightweight: ~65KB minified (13KB gzipped)
  • Fast: Optimized for real-time parsing
  • Tree-shakeable: Only import what you need
  • Few dependencies: markdown-it-ts + a small set of markdown-it plugins

Contributing

Issues and PRs welcome! Please read the contribution guidelines.

License

MIT © Simon He

Related