Commit a95bb97

feat: Added checks for broken links, images and MD linting (#33)
1 parent 4c34496 commit a95bb97

31 files changed: 2547 additions, 54 deletions

AGENTS.md

Lines changed: 2 additions & 0 deletions
@@ -16,6 +16,7 @@ src/ # Source code
1616
│ ├── model/ # Bedrock client and rate limiting
1717
│ ├── prompts/ # Prompt templates for various operations
1818
│ └── mcp/ # Model Context Protocol server
19+
├── check/ # Content validation (lint, links, images)
1920
├── content/ # Content processing
2021
│ ├── tree/ # Content tree data structures
2122
│ ├── providers/ # Content providers (FileSystem, Mock)
@@ -27,6 +28,7 @@ tests/ # Test files (mirrors src/ structure)
2728
├── ai/
2829
│ ├── model/
2930
│ └── prompts/
31+
├── check/
3032
├── content/
3133
│ ├── tree/
3234
│ ├── providers/

README.md

Lines changed: 106 additions & 14 deletions
@@ -9,13 +9,14 @@ CLI tools for maintaining Markdown content like documentation and tutorials.
99

1010
## Features
1111

12-
- **🤖 AI-Powered Content Review** - Automatically review and improve your Markdown content using Amazon Bedrock
13-
- **🌍 Multi-Language Translation** - Translate content between 8+ supported languages
14-
- **❓ Intelligent Q&A** - Ask questions about your content and get AI-powered answers
15-
- **📝 Style Guide Enforcement** - Maintain consistency with custom style guides
16-
- **⚡ Rate Limiting** - Built-in rate limiting for API calls
17-
- **🎯 Context-Aware Processing** - Smart content processing with configurable context strategies
18-
- **🔌 Model Context Protocol Server** - Integrate with tools like Cursor, Cline and Q Developer with the built-in MCP server
12+
- **Content Validation** - Check Markdown files for lint issues, broken links, and missing images
13+
- **AI-Powered Content Review** - Automatically review and improve your Markdown content using Amazon Bedrock
14+
- **Multi-Language Translation** - Translate content between 8+ supported languages
15+
- **Intelligent Q&A** - Ask questions about your content and get AI-powered answers
16+
- **Style Guide Enforcement** - Maintain consistency with custom style guides
17+
- **Rate Limiting** - Built-in rate limiting for API calls
18+
- **Context-Aware Processing** - Smart content processing with configurable context strategies
19+
- **Model Context Protocol Server** - Integrate with tools like Cursor, Cline and Q Developer with the built-in MCP server
1920

2021
## Installation
2122

@@ -96,6 +97,38 @@ Include image references in the tree output:
9697
toolkit-md map ./docs --images
9798
```
9899

100+
### Check Content
101+
102+
Run non-AI validation checks on Markdown files including linting, broken link detection, and missing image detection:
103+
104+
```bash
105+
toolkit-md check ./docs
106+
```
107+
108+
Skip external link validation for offline or faster checks:
109+
110+
```bash
111+
toolkit-md check ./docs --skip-external-links
112+
```
113+
114+
Ignore specific markdownlint rules:
115+
116+
```bash
117+
toolkit-md check ./docs --ignore-rule MD013 --ignore-rule MD033
118+
```
119+
120+
Only report errors (skip warnings):
121+
122+
```bash
123+
toolkit-md check ./docs --min-severity error
124+
```
125+
126+
Run only specific check categories:
127+
128+
```bash
129+
toolkit-md check ./docs --category lint --category link
130+
```
131+
99132
## Configuration
100133

101134
Toolkit for Markdown supports configuration through:
@@ -120,19 +153,26 @@ Toolkit for Markdown supports configuration through:
120153
| `ai.exemplars` | `--exemplar` | `TKMD_AI_EXEMPLAR_*` | Path to directory of content to use as an example to follow, can be specified multiple times | `[]` |
121154
| `ai.styleGuides` | `--style-guide` | `TKMD_AI_STYLE_GUIDE_*` | Path to style guide file, can be specified multiple times | `[]` |
122155
| `ai.includeImages` | `--include-images` | `TKMD_AI_INCLUDE_IMAGES` | Include images from markdown files in AI review | `false` |
123-
| `ai.imageBasePath` | `--image-base-path` | `TKMD_AI_IMAGE_BASE_PATH` | Base path for resolving absolute image paths | `contentDir` |
124156
| `ai.maxImages` | `--max-images` | `TKMD_AI_MAX_IMAGES` | Maximum number of images to include per file | `5` |
125157
| `ai.maxImageSize` | `--max-image-size` | `TKMD_AI_MAX_IMAGE_SIZE` | Maximum image file size in bytes | `3145728` (3MB) |
126158
| `ai.review.instructions` | `--instructions` | `TKMD_AI_REVIEW_INSTRUCTIONS` | Additional instructions for the model | `undefined` |
127159
| `ai.review.summaryFile` | `--summary-file` | `TKMD_AI_REVIEW_SUMMARY_PATH` | Write a summary of the review changes to the provided file path in Markdown format | `""` |
128160
| `ai.review.diffFile` | `--diff-file` | `TKMD_AI_REVIEW_DIFF_FILE` | Path to unified diff file for filtering review suggestions | `undefined` |
129161
| `ai.review.diffContext` | `--diff-context` | `TKMD_AI_REVIEW_DIFF_CONTEXT` | Number of context lines around changed lines to include (symmetric) | `3` |
162+
| `ai.review.runChecks` | `--review-check` | `TKMD_AI_REVIEW_CHECK` | Run content checks and include results in the review prompt | `true` |
130163
| `ai.translation.force` | `--force` | `TKMD_AI_FORCE_TRANSLATION` | Force translation even if source unchanged | `false` |
131164
| `ai.translation.check` | `--check` | `TKMD_AI_CHECK_TRANSLATION` | Only check if translation needed | `false` |
132165
| `ai.translation.directory` | `--translation-dir` | `TKMD_AI_TRANSLATION_DIRECTORY` | Directory where translated content is stored, if not specified defaults to source directory | `undefined` |
133166
| `ai.translation.skipFileSuffix` | `--skip-file-suffix` | `TKMD_AI_TRANSLATION_SKIP_FILE_SUFFIX` | Omit the language code suffix for translated files ('example.fr.md' becomes 'example.md') | `false` |
167+
| `check.minSeverity` | `--min-severity` | `TKMD_CHECK_MIN_SEVERITY` | Minimum severity level to report (error, warning) | `"warning"` |
168+
| `check.categories` | `--category` | `TKMD_CHECK_CATEGORY_*` | Check categories to run (lint, link, image), can be specified multiple times | `["lint", "link", "image"]` |
169+
| `check.links.timeout` | `--link-timeout` | `TKMD_CHECK_LINK_TIMEOUT` | Timeout in milliseconds for HTTP link and image checks | `5000` |
170+
| `check.links.skipExternal` | `--skip-external-links` | `TKMD_CHECK_SKIP_EXTERNAL_LINKS` | Skip validation of external HTTP/HTTPS links and images | `false` |
171+
| `check.lint.ignoreRules` | `--ignore-rule` | `TKMD_CHECK_LINT_IGNORE_RULE_*` | Markdownlint rule names or aliases to ignore, can be specified multiple times | `[]` |
172+
| `staticPrefix` | `--static-prefix` | `TKMD_STATIC_PREFIX` | URL prefix indicating a link points to a file in the static directory | `undefined` |
173+
| `staticDir` | `--static-dir` | `TKMD_STATIC_DIR` | Directory relative to the cwd where static assets are stored, used with staticPrefix | `undefined` |
134174

135-
**Note:** For array values (exemplars, styleGuides), the environment variable referenced above is treated as a prefix: `TKMD_AI_EXEMPLAR_FIRST`, `TKMD_AI_EXEMPLAR_SECOND`, etc.
175+
**Note:** For array values (exemplars, styleGuides, ignoreRules), the environment variable referenced above is treated as a prefix: `TKMD_AI_EXEMPLAR_FIRST`, `TKMD_AI_EXEMPLAR_SECOND`, etc.
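The prefix behaviour described in the note can be pictured with a small sketch. This is not the project's loader: `collectPrefixedValues` is a hypothetical helper, and sorting by variable name is an assumption made only so the output is deterministic.

```typescript
// Hypothetical helper: gathers every environment variable whose name starts
// with the given prefix into an array of values.
type Env = Record<string, string | undefined>;

function collectPrefixedValues(env: Env, prefix: string): string[] {
  return Object.keys(env)
    .filter((name) => name.startsWith(prefix))
    .sort() // assumption: order values by variable name
    .map((name) => env[name])
    .filter((value): value is string => value !== undefined && value !== "");
}

// Example environment using the prefix from the configuration table.
const exampleEnv: Env = {
  TKMD_CHECK_LINT_IGNORE_RULE_FIRST: "MD013",
  TKMD_CHECK_LINT_IGNORE_RULE_SECOND: "MD033",
  TKMD_CHECK_MIN_SEVERITY: "error", // not matched: different variable
};

const ignoreRules = collectPrefixedValues(
  exampleEnv,
  "TKMD_CHECK_LINT_IGNORE_RULE_",
);
console.log(ignoreRules);
```

In a real process the first argument would be `process.env`; the suffix after the prefix (`FIRST`, `SECOND`) only serves to keep the variable names unique.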
136176

137177
### Configuration File Format
138178

@@ -155,14 +195,29 @@ Create a `.toolkit-mdrc` file in JSON format:
155195
"exemplars": ["./examples/good-example1", "./examples/good-example2"],
156196
"styleGuides": ["./guides/style-guide.md", "./guides/aws-terminology.md"],
157197
"includeImages": true,
158-
"imageBasePath": "./assets",
159198
"maxImages": 5,
160199
"maxImageSize": 3145728,
161200
"translation": {
162201
"force": false,
163202
"check": false
203+
},
204+
"review": {
205+
"runChecks": true
164206
}
165-
}
207+
},
208+
"check": {
209+
"minSeverity": "warning",
210+
"categories": ["lint", "link", "image"],
211+
"links": {
212+
"timeout": 5000,
213+
"skipExternal": false
214+
},
215+
"lint": {
216+
"ignoreRules": ["MD013"]
217+
}
218+
},
219+
"staticPrefix": "/static/",
220+
"staticDir": "./static"
166221
}
167222
```
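As a rough illustration of how the `check` section of the file relates to the defaults in the configuration table, the sketch below fills in missing keys. The type names and `resolveCheckConfig` are hypothetical and chosen for this example; only the keys and default values come from the table.

```typescript
type Severity = "error" | "warning";
type Category = "lint" | "link" | "image";

// Fully resolved shape of the "check" section (names are illustrative).
interface CheckConfig {
  minSeverity: Severity;
  categories: Category[];
  links: { timeout: number; skipExternal: boolean };
  lint: { ignoreRules: string[] };
}

// Partial shape, as it might appear under "check" in .toolkit-mdrc.
interface CheckConfigInput {
  minSeverity?: Severity;
  categories?: Category[];
  links?: { timeout?: number; skipExternal?: boolean };
  lint?: { ignoreRules?: string[] };
}

// Defaults copied from the configuration table.
const CHECK_DEFAULTS: CheckConfig = {
  minSeverity: "warning",
  categories: ["lint", "link", "image"],
  links: { timeout: 5000, skipExternal: false },
  lint: { ignoreRules: [] },
};

// Hypothetical helper: any key the input omits falls back to its default.
function resolveCheckConfig(input: CheckConfigInput = {}): CheckConfig {
  return {
    minSeverity: input.minSeverity ?? CHECK_DEFAULTS.minSeverity,
    categories: input.categories ?? [...CHECK_DEFAULTS.categories],
    links: {
      timeout: input.links?.timeout ?? CHECK_DEFAULTS.links.timeout,
      skipExternal:
        input.links?.skipExternal ?? CHECK_DEFAULTS.links.skipExternal,
    },
    lint: {
      ignoreRules:
        input.lint?.ignoreRules ?? [...CHECK_DEFAULTS.lint.ignoreRules],
    },
  };
}

const resolved = resolveCheckConfig({ lint: { ignoreRules: ["MD013"] } });
console.log(resolved);
```

How the tool itself merges file, environment, and CLI values is not shown in this diff, so the precedence between those sources is left out of the sketch.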
168223

@@ -317,7 +372,7 @@ The review command can extract and include images referenced in markdown files f
317372
**Image Path Resolution:**
318373

319374
- **Relative paths** (e.g., `./images/diagram.png`, `../assets/photo.jpg`) are resolved from the markdown file's directory
320-
- **Absolute paths** (e.g., `/images/diagram.png`) are resolved from the `imageBasePath` configuration (defaults to `contentDir`, resolved from current working directory)
375+
- **Absolute paths** (e.g., `/images/diagram.png`) are resolved against the `staticDir` directory. If `staticPrefix` is configured and the path starts with it, the prefix is stripped before resolution
321376
- **Remote URLs** (e.g., `https://example.com/image.png`) are excluded from processing
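The three resolution rules above can be sketched as a single function. This is an illustration, not the `loadImage` implementation from the commit: `resolveImagePath` and `StaticOptions` are hypothetical names, and falling back to the current directory when `staticDir` is unset is an assumption.

```typescript
import { posix as path } from "node:path";

// Hypothetical option bag mirroring the staticPrefix/staticDir settings.
interface StaticOptions {
  staticPrefix?: string;
  staticDir?: string;
}

// Returns a file path to load, or null when the image should be skipped.
function resolveImagePath(
  imagePath: string,
  markdownDir: string,
  options: StaticOptions = {},
): string | null {
  // Remote URLs are excluded from processing.
  if (/^https?:\/\//i.test(imagePath)) {
    return null;
  }

  // Relative paths resolve from the Markdown file's directory.
  if (!imagePath.startsWith("/")) {
    return path.join(markdownDir, imagePath);
  }

  // Absolute paths: strip staticPrefix when it matches, then resolve
  // against staticDir (assumption: "." when staticDir is not set).
  const { staticPrefix, staticDir = "." } = options;
  let rest = imagePath;
  if (staticPrefix && rest.startsWith(staticPrefix)) {
    rest = rest.slice(staticPrefix.length);
  }
  return path.join(staticDir, rest);
}

console.log(
  resolveImagePath("/static/images/diagram.png", "docs/guide", {
    staticPrefix: "/static/",
    staticDir: "./static",
  }),
);
```

The sketch uses POSIX path joining so the result is the same on every platform; a real implementation would more likely use the platform-specific `node:path` functions.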
322377

323378
**Image Limits:**
@@ -329,7 +384,7 @@ The review command can extract and include images referenced in markdown files f
329384
**Example:**
330385

331386
```bash
332-
toolkit-md review ./docs --include-images --max-images 10 --image-base-path ./assets
387+
toolkit-md review ./docs --include-images --max-images 10 --static-dir ./assets
333388
```
334389

335390
Images that cannot be loaded (missing files, unsupported formats, or exceeding size limits) will generate warnings but won't stop the review process.
@@ -386,8 +441,8 @@ toolkit-md review --diff-file changes.diff --summary-file review-summary.md --di
386441
- `--instructions`
387442
- `--diff-file`
388443
- `--diff-context`
444+
- `--review-check`
389445
- `--include-images`
390-
- `--image-base-path`
391446
- `--max-images`
392447
- `--max-image-size`
393448

@@ -467,6 +522,42 @@ toolkit-md map ./docs --images
467522
- `--content-dir`
468523
- `--cwd`
469524

525+
### `check`
526+
527+
Validates Markdown content without AI by running linting checks (via markdownlint), verifying that local link targets exist, and confirming that referenced images are present. Remote links and images are validated with HTTP HEAD requests. This command requires no AWS credentials and is suitable for CI pipelines. Exits with code 1 if any errors are found.
528+
529+
**Example:**
530+
531+
```bash
532+
toolkit-md check ./docs
533+
```
534+
535+
**Skip external link and image validation:**
536+
537+
```bash
538+
toolkit-md check ./docs --skip-external-links
539+
```
540+
541+
**Ignore specific markdownlint rules:**
542+
543+
```bash
544+
toolkit-md check ./docs --ignore-rule MD013 --ignore-rule MD033
545+
```
546+
547+
**Options:**
548+
549+
- `--link-timeout`
550+
- `--skip-external-links`
551+
- `--ignore-rule`
552+
- `--min-severity`
553+
- `--category`
554+
- `--static-prefix`
555+
- `--static-dir`
556+
- `--language`
557+
- `--default-language`
558+
- `--content-dir`
559+
- `--cwd`
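The effect of `--min-severity` and `--category` on the reported issues, and the exit code rule, might look roughly like the sketch below. The `Issue` shape and both function names are assumptions for illustration; the diff does not show whether the exit code is computed before or after filtering, so the sketch applies it to whatever list it is given.

```typescript
type Severity = "error" | "warning";
type Category = "lint" | "link" | "image";

// Hypothetical issue shape; the real type lives in src/check/types.ts.
interface Issue {
  file: string;
  line: number;
  severity: Severity;
  category: Category;
  message: string;
}

// Higher rank means more severe.
const SEVERITY_RANK: Record<Severity, number> = { warning: 0, error: 1 };

// Keep only issues at or above minSeverity and in the selected categories.
function selectIssues(
  issues: Issue[],
  minSeverity: Severity,
  categories: Category[],
): Issue[] {
  return issues.filter(
    (issue) =>
      SEVERITY_RANK[issue.severity] >= SEVERITY_RANK[minSeverity] &&
      categories.includes(issue.category),
  );
}

// Exit code 1 when at least one error is present, otherwise 0.
function exitCodeFor(issues: Issue[]): number {
  return issues.some((issue) => issue.severity === "error") ? 1 : 0;
}

const sampleIssues: Issue[] = [
  { file: "a.md", line: 3, severity: "warning", category: "lint", message: "Line too long" },
  { file: "a.md", line: 9, severity: "error", category: "link", message: "Target not found" },
  { file: "b.md", line: 1, severity: "error", category: "image", message: "Image missing" },
];

const reported = selectIssues(sampleIssues, "error", ["lint", "link"]);
console.log(reported.length, exitCodeFor(reported));
```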
560+
470561
### `mcp`
471562

472563
Starts an MCP server that exposes tool features to MCP clients. See below for further information.
@@ -530,6 +621,7 @@ The following MCP tools are provided:
530621
| `content_best_practices` | Response contains style guide and exemplar content as configured for the specified project. If the `targetLanguage` is provided it will also load style guides for that language and provide them in the response. |
531622
| `content_review_guidance` | Response contains guidance for the model to systematically review Markdown content for a given project for general issues and best practices. |
532623
| `content_translation_guidance` | Response contains guidance for the model to translate Markdown for a given project to another language. It helps the model locate both source content as well as existing translated content to use for context. |
624+
| `run_checks` | Runs lint, link, and image checks on specified Markdown content files relative to the content directory. Supports filtering by severity and category. |
533625

534626
## Development
535627

package.json

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -27,7 +27,7 @@
2727
"debug": "tsx --inspect src/cli.ts",
2828
"test": "vitest run",
2929
"test:watch": "vitest",
30-
"license-check": "license-checker --production --onlyAllow 'MIT;Apache-2.0;BSD-2-Clause;BSD-3-Clause;ISC;0BSD' --summary"
30+
"license-check": "license-checker --production --onlyAllow 'MIT;Apache-2.0;BSD-2-Clause;BSD-3-Clause;ISC;0BSD;Python-2.0' --summary"
3131
},
3232
"keywords": [],
3333
"author": "",
@@ -40,6 +40,7 @@
4040
"globby": "^14.0.2",
4141
"gray-matter": "^4.0.3",
4242
"handlebars": "^4.7.8",
43+
"markdownlint": "^0.37.4",
4344
"ora": "^8.0.1",
4445
"remark": "^15.0.1",
4546
"remark-directive": "^4.0.0",

src/ai/prompts/reviewPrompt.ts

Lines changed: 23 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -16,6 +16,7 @@
1616

1717
import { dirname } from "node:path";
1818
import Handlebars from "handlebars";
19+
import type { CheckIssue } from "../../check/types.js";
1920
import type { ContentNode, ContentTree } from "../../content/index.js";
2021
import { loadImage } from "../../content/utils/markdownUtils.js";
2122
import type { Language } from "../../languages/index.js";
@@ -33,11 +34,28 @@ const template = `Your task is to review the content provided for file "{{file}}
3334
The images for the file to review have been included as attachments. Ensure that the descriptions of the images in the Markdown match the contents of each image.
3435
{{/if}}
3536
37+
{{#if checkIssues}}
38+
The following issues were detected by automated content checks:
39+
40+
{{#each checkIssues}}
41+
- Line {{this.line}}: [{{this.severity}}] {{this.message}} ({{this.category}}/{{this.rule}})
42+
{{/each}}
43+
44+
The above issues will highlight if any images or links in the content failed to resolve.
45+
{{/if}}
46+
3647
{{#if instructions}}
3748
In addition, you've been provided the following instructions:
3849
{{{instructions}}}
3950
{{/if}}
4051
52+
For any finding which cannot be reliably remediated, such as missing images or broken links, leave the offending Markdown but insert a comment above it like so:
53+
54+
<example_comment>
55+
<!-- TMD finding: This image '/pods1.png' could not be loaded -->
56+
[Some screenshot](/pods1.png)
57+
</example_comment>
58+
4159
Write the output as markdown in a similar style to the example content. Respond with the resulting file enclosed in <file></file> including the path to the file as an attribute.
4260
4361
ONLY respond with the content between the "<file></file>" tags.`;
@@ -49,11 +67,13 @@ export async function buildReviewPrompt(
4967
contextStrategy: ContextStrategy,
5068
styleGuides: string[],
5169
exemplars: Exemplar[],
52-
imageBasePath: string,
5370
instructions?: string,
5471
includeImages: boolean = false,
5572
maxImages: number = 5,
5673
maxImageSize: number = 3145728,
74+
staticPrefix?: string,
75+
staticDir?: string,
76+
checkIssues?: CheckIssue[],
5777
): Promise<Prompt> {
5878
const promptTemplate = Handlebars.compile(template);
5979

@@ -72,6 +92,7 @@ export async function buildReviewPrompt(
7292
file: currentNode.filePath,
7393
instructions,
7494
includeImages,
95+
checkIssues: checkIssues && checkIssues.length > 0 ? checkIssues : null,
7596
}),
7697
sampleOutput: currentNode.content || undefined,
7798
prefill: `<file path="${currentNode.filePath}">`,
@@ -94,7 +115,7 @@ export async function buildReviewPrompt(
94115
.filter((img) => !img.remote)
95116
.slice(0, maxImages)
96117
.map((img) =>
97-
loadImage(img.path, baseDir, imageBasePath, maxImageSize),
118+
loadImage(img.path, baseDir, maxImageSize, staticPrefix, staticDir),
98119
),
99120
);
100121
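To make the new `checkIssues` block of the template easier to picture, here is a sketch that produces the same text with a plain template literal instead of Handlebars. `PromptIssue` and `renderCheckIssues` are hypothetical names; the field list is inferred from the template placeholders, and unlike Handlebars `{{ }}` expressions this version does no HTML escaping.

```typescript
// Fields inferred from the template placeholders; the real CheckIssue type
// in src/check/types.ts may carry more.
interface PromptIssue {
  line: number;
  severity: string;
  message: string;
  category: string;
  rule: string;
}

// Mirrors the {{#if checkIssues}} section: empty input renders nothing.
function renderCheckIssues(issues: PromptIssue[] | undefined): string {
  if (!issues || issues.length === 0) {
    return "";
  }
  const bullets = issues.map(
    (issue) =>
      `- Line ${issue.line}: [${issue.severity}] ${issue.message} (${issue.category}/${issue.rule})`,
  );
  return [
    "The following issues were detected by automated content checks:",
    "",
    ...bullets,
    "",
    "The above issues will highlight if any images or links in the content failed to resolve.",
  ].join("\n");
}

const rendered = renderCheckIssues([
  {
    line: 12,
    severity: "error",
    message: "Image '/pods1.png' could not be loaded",
    category: "image",
    rule: "missing-image", // illustrative rule name, not taken from the commit
  },
]);
console.log(rendered);
```

Returning an empty string for an empty list corresponds to the `checkIssues.length > 0 ? checkIssues : null` guard in `buildReviewPrompt`, which keeps the section out of the prompt when there is nothing to report.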
