Convert MediaWiki pages to GitHub Flavored Markdown (or other formats supported by Pandoc). The conversion reads an XML export from MediaWiki and writes each wiki page to its own Markdown file. Directory structures are preserved. The generated files can also include frontmatter for GitHub Pages.
You may also be interested in a forked version of the original codebase, available at https://github.com/outofcontrol/mediawiki-to-gfm.
This is a fork of Philip Ashlock's code at https://github.com/philipashlock/mediawiki-to-markdown. Thanks for sharing, Philip!
Differences from Philip's code (as of 2024-12-01):
- Replaced `each()` for PHP 8 support
- Made `mkdir()` recursive
- Added error handling for `pandoc` failures, saving the bad input data for debugging
- Added `Dockerfile`
Requirements:
- PHP
- Pandoc
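
For a local install, it can help to confirm that both requirements are on your `PATH` before going further. The following is a minimal sketch, not part of the project; `missing_tools` is a hypothetical helper name, and the check is unnecessary if you only use the container image.

```shell
#!/bin/sh
# Hypothetical helper (not part of this project): print every command
# from the argument list that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

missing=$(missing_tools php pandoc)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing prerequisites:" $missing >&2
else
    echo "php and pandoc found"
fi
```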
 
Export all your pages as a single XML file by following these steps: http://en.wikipedia.org/wiki/Help:Export
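
Before converting, a quick look at the export can tell you roughly how many files to expect. The sketch below assumes the usual MediaWiki export layout, with each `<page>` and `<title>` tag on its own line; `count_pages` and `list_titles` are hypothetical helper names, and the sample file stands in for your real export.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting a MediaWiki XML export.
# Assumption: each <page> and <title> tag sits on its own line.

# Count the pages in the export.
count_pages() {
    grep -c '<page>' "$1"
}

# Print one page title per line (XML entities such as &amp; are left as-is).
list_titles() {
    sed -n 's:.*<title>\(.*\)</title>.*:\1:p' "$1"
}

# Tiny stand-in for a real export file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<mediawiki>
  <page>
    <title>Home</title>
  </page>
  <page>
    <title>Projects/Road Map</title>
  </page>
</mediawiki>
EOF

count_pages "$sample"    # prints 2
list_titles "$sample"    # prints "Home" and "Projects/Road Map"
rm -f "$sample"
```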
You can run this code in a container (Podman / Docker), or install it locally and run from there.
For a local install, first install Pandoc (https://pandoc.org/installing.html), then fetch Composer and install the PHP dependencies:

    curl -sS https://getcomposer.org/installer | php
    php composer.phar install

To run a local install of the code, use:

    php convert.php --filename=<filename>.xml [extra-args]

To use the container image, try:

    [podman|docker] run --rm -v $PWD:/data mediawiki-to-markdown --filename=<filename>.xml [extra-args]

For the examples below, let's assume a Linux shell, and create an alias, with one of these commands:

    alias mwconvert='php convert.php'
    alias mwconvert='podman run --rm -v $PWD:/data mediawiki-to-markdown'
    alias mwconvert='docker run --rm -v $PWD:/data mediawiki-to-markdown'

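
In bash, aliases are not expanded inside non-interactive scripts unless `expand_aliases` is set, so a shell function can be a handier wrapper when the conversion is scripted. The sketch below is one way to write it; `MWCONVERT_RUNNER` is a hypothetical variable introduced here for illustration, not something the project defines, and the commands it runs are the same ones used by the aliases above.

```shell
#!/bin/sh
# Hypothetical wrapper: same commands as the aliases, but usable in scripts.
# MWCONVERT_RUNNER is an illustrative variable (php, podman, or docker);
# it defaults to a local PHP install.
mwconvert() {
    case "${MWCONVERT_RUNNER:-php}" in
        php)
            php convert.php "$@"
            ;;
        podman|docker)
            "$MWCONVERT_RUNNER" run --rm -v "$PWD:/data" mediawiki-to-markdown "$@"
            ;;
        *)
            echo "unknown runner: $MWCONVERT_RUNNER" >&2
            return 2
            ;;
    esac
}
```

With this in place, `MWCONVERT_RUNNER=docker mwconvert --filename=mediawiki.xml` behaves like the docker alias, and leaving the variable unset runs the local install.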
The only required parameter is `filename`, the name of the XML file you exported from MediaWiki, e.g.:

    mwconvert --filename=mediawiki.xml

You can also use `output` to specify an output folder, since each wiki page in the XML file will generate its own separate Markdown file.

    mwconvert --filename=mediawiki.xml --output=export

You can set `indexes` to true if you want pages with the same name as a directory to be renamed to index.md and placed into that directory.

    mwconvert --filename=mediawiki.xml --output=export --indexes=true

You can specify whether you want frontmatter included. This is automatically set to true when the output format is gfm.

    mwconvert --filename=mediawiki.xml --output=export --format=markdown_phpextra --frontmatter=true

You can specify different output formats with `format`. The default is gfm (GitHub Flavored Markdown).

    mwconvert --filename=mediawiki.xml --output=export --format=markdown_phpextra

Supported pandoc formats are:
- asciidoc
- beamer
- context
- docbook
- docx
- dokuwiki
- dzslides
- epub
- epub3
- fb2
- haddock
- html
- html5
- icml
- json
- latex
- man
- markdown
- gfm/markdown_github
- markdown_mmd
- markdown_phpextra
- markdown_strict
- mediawiki
- native
- odt
- opendocument
- opml
- org
- plain
- revealjs
- rst
- rtf
- s5
- slideous
- slidy
- texinfo
- textile
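
To make the `output` and `indexes` options more concrete, the sketch below previews which files a set of page titles might produce. It is an illustration only, not the converter's actual code. It assumes that spaces in titles become underscores, that a `/` in a title starts a subdirectory, that files get a `.md` extension, and that a page "has a directory" when another title begins with its name followed by `/`. `plan_paths` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only (not the project's code): preview output paths for page titles.
# Usage: plan_paths OUT_DIR INDEXES < titles   (one title per line)
# Assumptions: spaces become underscores, "/" starts a subdirectory, and
# with INDEXES=true a page whose name is also a directory becomes index.md.
plan_paths() {
    tr ' ' '_' | awk -v out="$1" -v indexes="$2" '
        { titles[NR] = $0 }
        END {
            for (i = 1; i <= NR; i++) {
                # A title is a directory if another title starts with "<title>/".
                is_dir = 0
                for (j = 1; j <= NR; j++)
                    if (index(titles[j], titles[i] "/") == 1) is_dir = 1
                if (indexes == "true" && is_dir)
                    print out "/" titles[i] "/index.md"
                else
                    print out "/" titles[i] ".md"
            }
        }'
}

printf '%s\n' 'Projects' 'Projects/Road Map' 'Home' | plan_paths export true
# export/Projects/index.md
# export/Projects/Road_Map.md
# export/Home.md
```

Under the same assumptions, passing `false` as the second argument would leave the first page at `export/Projects.md`, next to the `export/Projects/` directory.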