Commit 27ed360

Add wordlist templates and backend selection
1 parent dc05dc9 commit 27ed360

26 files changed

Lines changed: 757 additions & 162 deletions

CHANGELOG.md

Lines changed: 2 additions & 0 deletions
@@ -8,6 +8,8 @@
 - Added `dirsearch[mysql]`, `dirsearch[postgresql]`, and `dirsearch[db]` install extras.
 - Updated Docker, CI, PyInstaller, and Nuitka release builds for the 5.0.0 baseline.
 - Added an importable Python API with `FuzzerConfig`, `DirsearchFuzzer`, structured results, wordlists, and template wordlists.
+- Added template wordlist placeholders, curated templates, `--wordlist-status`, and generation limits.
+- Added the initial wordlist backend interface with `auto`, `python`, and reserved `native` selection.
 - Ability to use multiple output formats
 - MySQL and PostgreSQL report formats
 - Support variables in file path and SQL table name for saving results

README.md

Lines changed: 11 additions & 0 deletions
@@ -126,6 +126,9 @@ Wordlists (IMPORTANT)
 - To apply your extensions to wordlist entries that have extensions already, use **-O** | **--overwrite-extensions** (Note: some extensions are excluded from being overwritted such as *.log*, *.json*, *.xml*, ... or media extensions like *.jpg*, *.png*)
 - To use multiple wordlists, you can separate your wordlists with commas. Example: `wordlist1.txt,wordlist2.txt`.
 - Bundled wordlist categories live in `db/categories/` and can be selected with **--wordlist-categories**. Available: `extensions`, `conf`, `vcs`, `backups`, `db`, `logs`, `keys`, `web`, `common` (use `all` to include everything).
+- Wordlist generation uses **--wordlist-backend=auto** by default. **python** selects the built-in backend and **native** requires a native backend build.
+- Template wordlists live in `db/templates/` and support placeholders such as `%SUBJECT%`, `%CRUD_OP%`, `%AUTH_OP%`, `%ADMIN_OP%`, `%ENV%`, `%DATE%`, `%API_VERSION%`, `%CATEGORY:name%`, and `%EXT%`.
+- Use **--wordlist-status** to preview resolved wordlist files and generated entry count before scanning. Use **--wordlist-max-size** to cap generation.
 
 <details>
 <summary><strong>Wordlist Examples (click to expand)</strong></summary>
@@ -214,6 +217,14 @@ Options:
 Comma-separated wordlist category names (e.g.
 common,conf,web). Use 'all' to include all bundled
 categories
+--wordlist-backend=BACKEND
+Wordlist generation backend: auto, python, native
+(default: auto)
+--wordlist-status Show resolved wordlist files and generated entry
+count, then exit
+--wordlist-max-size=COUNT
+Maximum generated wordlist entries before aborting
+(default: 500000)
 -e EXTENSIONS, --extensions=EXTENSIONS
 Extension list separated by commas (e.g. php,asp)
 -f, --force-extensions
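
The README additions above describe placeholder expansion and a generation cap. A minimal Python sketch of that idea follows. It is not the code from this commit: the `expand` function, the `WordlistTooLarge` exception, and the placeholder value lists are hypothetical, and `%CATEGORY:name%` is deliberately not handled. Only the placeholder names and the 500000 default come from the diff.

```python
import itertools
import re

# Hypothetical value lists; the real ones ship with dirsearch and are not in this diff.
PLACEHOLDERS = {
    "CRUD_OP": ["create", "delete"],
    "SUBJECT": ["user", "order"],
}

# Matches simple tokens such as %SUBJECT%; %CATEGORY:name% is out of scope here.
TOKEN = re.compile(r"%([A-Z_]+)%")


class WordlistTooLarge(Exception):
    """Raised when expansion exceeds the configured cap (hypothetical name)."""


def expand(template, extensions, max_size=500000):
    """Expand one template line into concrete entries, in a stable order."""
    values = dict(PLACEHOLDERS, EXT=list(extensions))
    # Unique placeholder names in first-seen order.
    names = list(dict.fromkeys(TOKEN.findall(template)))
    unknown = [name for name in names if name not in values]
    if unknown:
        raise KeyError(f"unknown placeholders: {unknown}")

    entries = []
    for combo in itertools.product(*(values[name] for name in names)):
        entry = template
        for name, value in zip(names, combo):
            entry = entry.replace(f"%{name}%", value)
        entries.append(entry)
        if len(entries) > max_size:
            raise WordlistTooLarge(f"more than {max_size} entries generated")
    return entries


print(expand("%CRUD_OP%_%SUBJECT%.%EXT%", ["php"]))
# ['create_user.php', 'create_order.php', 'delete_user.php', 'delete_order.php']
```

Aborting as soon as the cap is crossed, rather than after full generation, keeps an accidental huge expansion from consuming memory first.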

TODO.md

Lines changed: 14 additions & 13 deletions
@@ -2,9 +2,9 @@

 ## Current Status

-Phase 1 is implemented for optional database dependencies. Phase 2 is implemented for the Python 3.14 / `5.0.0` release baseline. Phase 3 is implemented for the first importable Python API.
+Phase 1 is implemented for optional database dependencies. Phase 2 is implemented for the Python 3.14 / `5.0.0` release baseline. Phase 3 is implemented for the first importable Python API. Phase 4 is implemented for template wordlists and generation controls.

-This is a useful foundation for a future MCP server or REST API because import-time side effects are lower, base installs no longer require database drivers, and the first supported Python API surface is now available. The next work expands wordlist templates and generation controls.
+This is a useful foundation for a future MCP server or REST API because import-time side effects are lower, base installs no longer require database drivers, and the first supported Python API surface is now available. The next work adds the native wordlist implementation and benchmark gates.

 ## Completed

@@ -47,6 +47,13 @@ This is a useful foundation for a future MCP server or REST API because import-t
 - Added isolation coverage proving two configs do not leak state in one process.
 - Added packaged install smoke coverage for the public API imports.
 - Added optional DB and importable API coverage to `testing.py`.
+- Added shared template wordlist expansion for CLI and Python API callers.
+- Added named placeholders for subjects, CRUD/auth/admin ops, envs, separators, dates, DB/archive tokens, API versions, categories, and `%EXT%`.
+- Added curated template wordlists under `db/templates/`.
+- Added `--wordlist-status`.
+- Added `--wordlist-max-size` generation limits.
+- Added template wordlist regression coverage to `testing.py`.
+- Added the wordlist backend interface and `auto|python|native` selection.

 ## Verification Run

@@ -74,22 +81,16 @@ This is a useful foundation for a future MCP server or REST API because import-t
 - `docker build -t dirsearch:v5-importable-api .`
 - `docker run --rm --entrypoint python3 dirsearch:v5-importable-api testing.py`
 - `docker run --rm --entrypoint sh dirsearch:v5-importable-api -c "python3 -m pip install . && python3 tests/check_packaged_install.py"`
+- `/home/mauro/dirsearch/.venv/bin/python -m unittest tests.core.test_dictionary_templates tests.core.test_importable_api`
+- `/home/mauro/dirsearch/.venv/bin/python dirsearch.py --wordlist-status -w db/templates/crud.txt -e php`
+- `/home/mauro/dirsearch/.venv/bin/python dirsearch.py --wordlist-status -w db/templates/crud.txt -e php --wordlist-max-size 1`

 ## Next Phase

-### Phase 4: Template Wordlists
-
-- Add named placeholder expansion for `%SUBJECT%`, `%CRUD_OP%`, `%AUTH_OP%`, `%ADMIN_OP%`, `%ENV%`, `%SEP%`, date tokens, DB/archive tokens, `%API_VERSION%`, and `%CATEGORY:name%`.
-- Add curated template files under `db/templates/`.
-- Add `--wordlist-status`.
-- Add generation limits to prevent accidental huge expansions.
-
-### Later Phases
-
 ### Phase 5: Native Wordlist Backend

-- Add backend interface with Python backend first.
-- Add optional native backend with `auto|python|native` selection.
+- Add backend interface with Python backend first. (done)
+- Add optional native backend with `auto|python|native` selection. (selection done, native implementation pending)
 - Keep native output byte-for-byte compatible with Python output for ordering and deduplication.
 - Add opt-in large dictionary benchmark for generation time, iteration throughput, memory, and startup cost.

config.ini

Lines changed: 2 additions & 0 deletions
@@ -41,6 +41,8 @@ capital = False
 #suffixes = ~,.bak
 #wordlists = /path/to/wordlist1.txt,/path/to/wordlist2.txt
 #wordlist-categories = common,conf,web
+wordlist-backend = auto
+wordlist-max-size = 500000

 [request]
 http-method = get
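
As a rough illustration, the two new options can be read with the standard-library `configparser`. The section name `[dictionary]` is an assumption (the hunk does not show which section these keys belong to), and `load_wordlist_settings` is a hypothetical helper, not dirsearch's own config loader.

```python
import configparser

# Sample text mirroring the hunk above; the [dictionary] header is assumed.
SAMPLE = """
[dictionary]
#wordlist-categories = common,conf,web
wordlist-backend = auto
wordlist-max-size = 500000
"""


def load_wordlist_settings(text, section="dictionary"):
    """Return (backend, max_size), using the defaults documented in the README."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    backend = parser.get(section, "wordlist-backend", fallback="auto")
    max_size = parser.getint(section, "wordlist-max-size", fallback=500000)
    if backend not in ("auto", "python", "native"):
        raise ValueError(f"unknown wordlist backend: {backend!r}")
    return backend, max_size


print(load_wordlist_settings(SAMPLE))  # ('auto', 500000)
```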

db/templates/admin.txt

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+%ADMIN_OP%.%EXT%
+admin/%ADMIN_OP%
+%ADMIN_OP%/%SUBJECT%
+%ADMIN_OP%_%SUBJECT%.%EXT%
+%ADMIN_OP%-%SUBJECT%.%EXT%

db/templates/api.txt

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+api/%API_VERSION%/%SUBJECT%
+api/%API_VERSION%/%CRUD_OP%/%SUBJECT%
+api/%SUBJECT%/%CRUD_OP%
+%API_VERSION%/%SUBJECT%
+%SUBJECT%.json

db/templates/auth.txt

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+%AUTH_OP%.%EXT%
+auth/%AUTH_OP%
+account/%AUTH_OP%
+user/%AUTH_OP%
+oauth/%AUTH_OP%

db/templates/backups.txt

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+backup.%ARCHIVE%
+backup_%DATE%.%ARCHIVE%
+backup_%DATE_COMPACT%.%ARCHIVE%
+%ENV%_backup.%ARCHIVE%
+%SUBJECT%_backup.%ARCHIVE%

db/templates/crud.txt

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+%CRUD_OP%_%SUBJECT%.%EXT%
+%SUBJECT%_%CRUD_OP%.%EXT%
+%CRUD_OP%/%SUBJECT%.%EXT%
+%SUBJECT%/%CRUD_OP%.%EXT%
+api/%CRUD_OP%_%SUBJECT%

db/templates/db.txt

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+db.%ARCHIVE%
+database.%ARCHIVE%
+backup_%DB%.%ARCHIVE%
+%DB%_backup.%ARCHIVE%
+dump_%DATE_COMPACT%.%ARCHIVE%
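
An entry count like the one `--wordlist-status` reports can be estimated without generating anything: each template line contributes the product of the value-list sizes of its distinct placeholders. The sketch below applies that to the five `db/templates/db.txt` lines above. The sizes in `SIZES` are invented for illustration, `estimate_entries` is a hypothetical helper, and the result is an upper bound, since any deduplication the real tool performs would lower it.

```python
import math
import re

TOKEN = re.compile(r"%([A-Z_]+)%")

# Hypothetical number of values per placeholder; the real lists are not in this diff.
SIZES = {"ARCHIVE": 3, "DB": 2, "DATE_COMPACT": 2}

DB_TEMPLATE = [
    "db.%ARCHIVE%",
    "database.%ARCHIVE%",
    "backup_%DB%.%ARCHIVE%",
    "%DB%_backup.%ARCHIVE%",
    "dump_%DATE_COMPACT%.%ARCHIVE%",
]


def estimate_entries(lines, sizes):
    """Upper bound on generated entries, before any deduplication."""
    total = 0
    for line in lines:
        names = set(TOKEN.findall(line))  # a repeated placeholder counts once
        total += math.prod(sizes[name] for name in names)
    return total


print(estimate_entries(DB_TEMPLATE, SIZES))  # 3 + 3 + 6 + 6 + 6 = 24
```

Comparing this estimate against the `--wordlist-max-size` value would let a caller reject an oversized template before expansion starts.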
