@@ -22,13 +22,13 @@ links to the cache file. This allows you to:
    separately

  Additionally, when checking is enabled (`CheckExternal: true`), this option also
- caches timeout errors (408), which were previously not cached.
+ caches timeout errors (as `StatusTimeout`), which were previously not cached.

  Example workflow:

  - Run htmltest with `CheckExternal: false` and `CacheAllExternal: true` during
    development
- - External links are saved to refcache with status code 0 (unchecked)
+ - External links are saved to refcache with `StatusUnchecked` (not checked)
  - Later, extract the list of unchecked links from the cache for validation or
    reporting
  - Optionally run htmltest with `CheckExternal: true` to validate cached links
@@ -46,24 +46,27 @@ Example workflow:

  ### Complete Behavior Matrix

- | CheckExternal | CacheAllExternal | RetryCachedErrors | Links Checked? | Errors Retried? | What Gets Cached   | Use Case                             |
- | ------------- | ---------------- | ----------------- | -------------- | --------------- | ------------------ | ------------------------------------ |
- | `true`        | `false`          | `true`            | ✓              | ✓               | 200, 404 (NOT 408) | **Default/Legacy**                   |
- | `true`        | `false`          | `false`           | ✓              | ✗               | 200, 404 (NOT 408) | Reuse errors, retry timeouts         |
- | `true`        | `true`           | `true`            | ✓              | ✓               | 200, 404, 408      | Cache timeouts, but retry            |
- | `true`        | `true`           | `false`           | ✓              | ✗               | 200, 404, 408      | **Fast re-runs** - cache & reuse all |
- | `false`       | `false`          | (N/A)             | ✗              | N/A             | Nothing            | **Default skip** - no cache          |
- | `false`       | `true`           | (N/A)             | ✗              | N/A             | 0 (unchecked)      | **Link discovery**                   |
+ | CheckExternal | CacheAllExternal | RetryCachedErrors | Links Checked? | Errors Retried? | What Gets Cached  | Use Case                             |
+ | ------------- | ---------------- | ----------------- | -------------- | --------------- | ----------------- | ------------------------------------ |
+ | `true`        | `false`          | `true`            | ✓              | ✓               | 200, 4XX          | **Default/Legacy**                   |
+ | `true`        | `false`          | `false`           | ✓              | ✗               | 200, 4XX          | Reuse errors, retry timeouts         |
+ | `true`        | `true`           | `true`            | ✓              | ✓               | 200, 4XX, TSC[^1] | Cache timeouts, but retry            |
+ | `true`        | `true`           | `false`           | ✓              | ✗               | 200, 404, TSC[^1] | **Fast re-runs** - cache & reuse all |
+ | `false`       | `false`          | (N/A)             | ✗              | N/A             | Nothing           | **Default skip** - no cache          |
+ | `false`       | `true`           | (N/A)             | ✗              | N/A             | unchecked links   | **Link discovery**                   |
+
+ [^1]:
+     TSC = Tool-specific status code used by htmltest. See
+     `@docs/tasks/design.md` for details.

  ### Key Insights

  - `CacheAllExternal` extends caching behavior in **both** modes:
-   - When `CheckExternal: true` → Also caches timeouts (408)
-   - When `CheckExternal: false` → Caches discovered links (0)
+   - When `CheckExternal: true` → Also caches timeouts (as `StatusTimeout`)
+   - When `CheckExternal: false` → Caches discovered links (as `StatusUnchecked`)
  - When `CheckExternal: false`, `RetryCachedErrors` has no effect (nothing to
    retry)
- - Status code `0` indicates "unchecked/unknown" status
- - Status code `408` indicates timeout error
+ - See `@docs/tasks/design.md` for status code details

  ## Implementation Steps (TDD Approach)

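Reviewer sketch: the behavior matrix in the hunk above can be read as a small decision function. The following Go snippet is a minimal illustration under stated assumptions, not htmltest's implementation. The `Options` and `Behavior` types and the `Describe` function are hypothetical names introduced here, and the `Cached` strings are informal summaries rather than actual cache contents.

```go
package main

import "fmt"

// Options holds only the three settings from the matrix. It is a stand-in
// for illustration, not htmltest's actual options type.
type Options struct {
	CheckExternal     bool
	CacheAllExternal  bool
	RetryCachedErrors bool
}

// Behavior is one row of the matrix in simplified form.
type Behavior struct {
	LinksChecked  bool
	ErrorsRetried bool   // always false when links are not checked (N/A in the matrix)
	Cached        string // informal summary of what is written to the cache
}

// Describe maps a combination of options to its matrix row.
func Describe(o Options) Behavior {
	if !o.CheckExternal {
		// RetryCachedErrors has no effect here: nothing was checked,
		// so there is nothing to retry.
		if o.CacheAllExternal {
			return Behavior{Cached: "unchecked links"}
		}
		return Behavior{Cached: "nothing"}
	}
	b := Behavior{
		LinksChecked:  true,
		ErrorsRetried: o.RetryCachedErrors,
		Cached:        "HTTP responses",
	}
	if o.CacheAllExternal {
		b.Cached = "HTTP responses and timeouts"
	}
	return b
}

func main() {
	// Print the four CheckExternal/CacheAllExternal combinations.
	for _, check := range []bool{true, false} {
		for _, cacheAll := range []bool{false, true} {
			o := Options{CheckExternal: check, CacheAllExternal: cacheAll}
			fmt.Printf("%+v -> %+v\n", o, Describe(o))
		}
	}
}
```

Writing the matrix this way makes the key insight explicit: `CacheAllExternal` changes what is cached in both branches, while `RetryCachedErrors` only matters when links are actually checked.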
@@ -73,14 +76,14 @@ Example workflow:

  Add test cases to verify:

- - **Discovery mode**: Links cached with status 0 when `CheckExternal: false` and
-   `CacheAllExternal: true`
+ - **Discovery mode**: Links cached with `StatusUnchecked` when
+   `CheckExternal: false` and `CacheAllExternal: true`
  - **Default behavior**: Links NOT cached when `CacheAllExternal: false`
  - **Ignored URLs**: Ignored URLs still not cached even with
    `CacheAllExternal: true`
  - **Query string handling**: Query string stripping applied correctly
- - **Timeout caching**: Timeouts cached as 408 when `CheckExternal: true` and
-   `CacheAllExternal: true`
+ - **Timeout caching**: Timeouts cached with `StatusTimeout` when
+   `CheckExternal: true` and `CacheAllExternal: true`

  ### 2. Add Configuration Option

@@ -102,7 +105,7 @@ Two modifications needed:
  - Modify `checkExternal()` function (starting at line 129)
  - When `!hT.opts.CheckExternal` is true:
    - Check if `hT.opts.CacheAllExternal` is enabled
-   - If so, cache discovered links with status 0
+   - If so, cache discovered links with `StatusUnchecked`
    - Apply URL processing (strip query string if configured)
    - Skip ignored URLs (respect `isURLIgnored()` check)

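Reviewer sketch: the discovery-mode steps in the hunk above could look roughly like the following. This is a self-contained illustration, not the actual `checkExternal()` code. The `options` struct, the `refCache` map, `isIgnored`, and `discover` are hypothetical stand-ins, ignore matching is simplified to a substring test, and the value of `statusUnchecked` is a placeholder (the real value is defined in the design document referenced by the plan).

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// statusUnchecked is a PLACEHOLDER value for illustration only.
const statusUnchecked = -1

// options is a stand-in for the relevant htmltest settings.
type options struct {
	CheckExternal    bool
	CacheAllExternal bool
	StripQueryString bool
	IgnoreURLs       []string // simplified: treated as plain substrings
}

// refCache is a stand-in for the reference cache: URL -> status code.
type refCache map[string]int

// isIgnored reports whether raw matches any ignore entry (substring match).
func isIgnored(o options, raw string) bool {
	for _, p := range o.IgnoreURLs {
		if strings.Contains(raw, p) {
			return true
		}
	}
	return false
}

// discover caches raw as unchecked when checking is disabled and
// CacheAllExternal is enabled. It reports whether the link was cached.
func discover(o options, cache refCache, raw string) bool {
	if o.CheckExternal || !o.CacheAllExternal {
		return false // not in discovery mode
	}
	if isIgnored(o, raw) {
		return false // ignored URLs are never cached
	}
	key := raw
	if o.StripQueryString {
		u, err := url.Parse(raw)
		if err != nil {
			return false // unparsable URL: skip rather than cache garbage
		}
		u.RawQuery = ""
		key = u.String()
	}
	cache[key] = statusUnchecked
	return true
}

func main() {
	o := options{CacheAllExternal: true, StripQueryString: true}
	cache := refCache{}
	discover(o, cache, "https://example.com/page?utm=1")
	fmt.Println(cache)
}
```

The order of checks follows the bullets in the plan: mode check first, then the ignore check, then URL processing, so ignored URLs never reach the cache regardless of query-string settings.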
@@ -111,17 +114,18 @@ Two modifications needed:
  - In timeout handling code (around line 201-210)
  - Change condition from `if !hT.opts.RetryCachedErrors` to
    `if hT.opts.CacheAllExternal`
- - This caches timeouts when CacheAllExternal is enabled
+ - Save with `StatusTimeout` instead of not caching

  ### 4. Update Documentation

  **File**: `README.md`

  - Add new row in the configuration options table (around line 145, after
    `CheckExternal`)
- - Format: `| \`CacheAllExternal\` | Cache all external links including timeouts
-   (when checking) and unchecked links (when not checking). Useful for fast
-   re-runs and link discovery. | \`false\` |`
+ - Description: Cache all external links including timeouts (when checking is
+   enabled) and unchecked links (when checking is disabled). Useful for fast
+   re-runs and link discovery.
+ - Default: `false`

  ## Key Design Decisions

@@ -131,10 +135,10 @@ Two modifications needed:
    naturally
  - **Clear semantics**: "CacheAll" clearly means "cache everything, not just
    successes"
- - **Status codes**:
-   - `0` = unchecked/unknown (discovery mode)
-   - `408` = timeout error (check mode)
-   - Other codes = actual HTTP responses
+ - **Status codes**: See `@docs/tasks/design.md` for details
+   - `StatusUnchecked` = unchecked/unknown (discovery mode)
+   - `StatusTimeout` = timeout error (check mode)
+   - Positive codes = actual HTTP responses
  - **Respects existing patterns**:
    - URL ignore patterns (`IgnoreURLs`)
    - Query string stripping (`StripQueryString`)
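
Reviewer sketch: the status-code decision and the timeout-caching condition from the plan can be illustrated together. The snippet below is a hedged example, not htmltest code. The numeric values of `statusUnchecked` and `statusTimeout` are placeholders chosen only so that they are not positive (the plan states that positive codes are actual HTTP responses), and `classify` and `recordTimeout` are hypothetical helper names.

```go
package main

import "fmt"

// PLACEHOLDER values for illustration; the real constants are defined
// in the design document referenced by the plan.
const (
	statusUnchecked = -1
	statusTimeout   = -2
)

// classify groups a cached status code the way the design decisions describe.
func classify(code int) string {
	switch {
	case code == statusUnchecked:
		return "unchecked"
	case code == statusTimeout:
		return "timeout"
	case code > 0:
		return "http"
	default:
		return "unknown"
	}
}

// recordTimeout stores a timeout in the cache only when CacheAllExternal is
// enabled, mirroring the changed condition in the plan. It reports whether
// the timeout was cached.
func recordTimeout(cacheAllExternal bool, cache map[string]int, link string) bool {
	if !cacheAllExternal {
		return false
	}
	cache[link] = statusTimeout
	return true
}

func main() {
	cache := map[string]int{"https://example.com/ok": 200}
	recordTimeout(true, cache, "https://example.com/slow")
	for link, code := range cache {
		fmt.Println(link, classify(code))
	}
}
```

Keeping the tool-specific codes outside the positive range means a consumer of the cache can separate real HTTP responses from htmltest's own markers with a single comparison.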