Docx Renderer Extension
flexmark-java Docx-Renderer extension
Renders the parsed Markdown AST to docx format using the docx4j library.
See the DocxConverterCommonMark Sample for code and Customizing Docx Rendering for an overview and information on customizing the styles.
The Pegdown version can be found in the DocxConverterPegdown Sample.
Set `EmojiExtension.USE_SHORTCUT_TYPE` to `EmojiShortcutType.GITHUB` or `EmojiShortcutType.ANY_GITHUB_PREFERRED`, which causes GitHub-provided images to be used.
Renders AST generated by flexmark-java parser. No special syntax is implemented by this extension.
- `.className` on paragraph elements will set the docx styleId to `className` if the style id is found. This allows using specific style ids to change formatting for paragraphs.
- Use `{style=""}` to set attributes on text or block elements. Only the following are processed:
  - `color` - text color
  - `background-color` - shade fill color, pattern always solid
  - `font-family` - not implemented
  - `font-size` - not implemented
  - `font-weight` - set/clear bold (if using numeric weights then >= 550 sets bold, less clears it)
  - `font-style` - set/clear italic
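The numeric `font-weight` rule can be illustrated with a small, self-contained sketch. The class and method names are hypothetical and this is not the extension's own code. The handling of the `bold` and `normal` keywords is an assumption, since the text above only specifies numeric weights.

```java
import java.util.Optional;

class FontWeightSketch {
    /**
     * Maps a CSS font-weight value to a bold flag.
     * Returns empty when the value is not recognized
     * (assumption: the bold setting is then left unchanged).
     */
    static Optional<Boolean> boldFor(String fontWeight) {
        String value = fontWeight.trim().toLowerCase();
        // Keyword handling is an assumption, not taken from the text above.
        if (value.equals("bold")) return Optional.of(true);
        if (value.equals("normal")) return Optional.of(false);
        try {
            // Documented rule: numeric weights >= 550 set bold, lower values clear it.
            return Optional.of(Integer.parseInt(value) >= 550);
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(boldFor("700")); // bold is set
        System.out.println(boldFor("400")); // bold is cleared
    }
}
```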
- artifact: `flexmark-docx-converter`
The following options are available:
Defined in `DocxRenderer` class:
- `STYLES_XML` default `getResourceString("/styles.xml")`, default styles section if missing in wordprocessing package
- `NUMBERING_XML` default `getResourceString("/numbering.xml")`, default numbering section if missing in wordprocessing package
- `RENDER_BODY_ONLY` default `false`, when rendering to string will only output the body of the document part. Used for tests.
- `MAX_IMAGE_WIDTH` default `0`, maximum image width; `0` means no maximum
- `DEFAULT_LINK_RESOLVER` default `true`, use default link resolver, which uses the `DOC_RELATIVE_URL` and `DOC_ROOT_URL` options
- `DOC_RELATIVE_URL` default `""`, the prefix to use for all relative URLs: ones not starting with a protocol or `/`
- `DOC_ROOT_URL` default `""`, the prefix to use for all absolute URLs: ones starting with `/`
- `LINEBREAK_ON_INLINE_HTML_BR` default `true`, convert inline HTML `<br>` to line break in the docx
- `TABLE_CAPTION_TO_PARAGRAPH` default `true`, convert table captions to paragraphs, styled with `TableCaption` style id
- `TABLE_CAPTION_BEFORE_TABLE` default `false`, insert caption before table
- `TOC_GENERATE` default `false`, whether to generate TOC, even if no TOC Markdown element is present in the file
- `TOC_INSTRUCTION` default `"TOC \\o \"1-3\" \\h \\z \\u "`, defines the instruction string used for the TOC element
- `NO_CHARACTER_STYLES` default `false`, when true will not set character style but explicitly set the run values from the style
- `CODE_HIGHLIGHT_SHADING` default `""`, when non-empty will use this color as a highlight, also overrides `NO_CHARACTER_STYLES` to true. See NOTE on Highlight Colors.
- `DOC_EMOJI_IMAGE_VERT_OFFSET` default `-0.10`, vertical offset of emoji image as a factor of line height at point of insertion. The final value is rounded to the nearest pt, so small changes of this value can cause jumps of 1 pt.
- `DOC_EMOJI_IMAGE_VERT_SIZE` default `1.05`, size of emoji image as a factor of line height at point of insertion.
- `LOCAL_HYPERLINK_MISSING_HIGHLIGHT` default `"red"`, when non-empty will highlight unresolved hyperlinks local to the document with this color. See NOTE on Highlight Colors.
- `LOCAL_HYPERLINK_MISSING_FORMAT` default `"Missing target id: #%s"`, when non-empty uses `String.format()` on the given string with the missing ref anchor as the argument to generate a tooltip for unresolved hyperlinks
- `LOCAL_HYPERLINK_SUFFIX` default `""`, appends this suffix to in-document hyperlink anchor references. Needed in some cases for post-processing.
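As a rough illustration of how the `DOC_RELATIVE_URL` and `DOC_ROOT_URL` prefixes and the `LOCAL_HYPERLINK_MISSING_FORMAT` tooltip are described above, here is a minimal standalone sketch. It does not use or reproduce the extension's link resolver; the class and method names are hypothetical, and plain string concatenation and leaving `#anchor` links untouched are assumptions.

```java
class LinkPrefixSketch {
    /** Applies the documented prefix rules to a link URL. */
    static String resolve(String url, String docRelativeUrl, String docRootUrl) {
        if (url.startsWith("/")) {
            // Absolute URL (starts with '/'): prefixed with DOC_ROOT_URL.
            return docRootUrl + url;
        }
        if (url.startsWith("#") || url.matches("^[A-Za-z][A-Za-z0-9+.-]*:.*")) {
            // In-document anchor (assumption) or URL with a protocol: left unchanged.
            return url;
        }
        // Relative URL: prefixed with DOC_RELATIVE_URL.
        return docRelativeUrl + url;
    }

    /** Builds the tooltip text for an unresolved in-document anchor. */
    static String missingTooltip(String format, String anchor) {
        // An empty format disables the tooltip, per the option description.
        return format.isEmpty() ? "" : String.format(format, anchor);
    }

    public static void main(String[] args) {
        System.out.println(resolve("images/a.png", "file:///docs/", "file:///site"));
        System.out.println(resolve("/img/a.png", "file:///docs/", "file:///site"));
        System.out.println(missingTooltip("Missing target id: #%s", "intro"));
    }
}
```

With the prefixes used in `main`, a relative `images/a.png` becomes `file:///docs/images/a.png`, while `/img/a.png` becomes `file:///site/img/a.png`.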
NOTE on Highlight Colors: Docx format requires a named color. Any color provided that does not match a named color will be converted to the closest named color.
When `CODE_HIGHLIGHT_SHADING` is set to `"shade"`, the closest named color to the `SourceText` shade fill color is used, if available.
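To make "closest named color" concrete, the sketch below picks the nearest WordprocessingML highlight color by squared distance in RGB space. The distance metric, the class name, and the RGB values assigned to each name are assumptions for illustration; the extension may match colors differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class NamedColorSketch {
    // Highlight color names with commonly used RGB values (assumed here).
    static final Map<String, Integer> NAMED = new LinkedHashMap<>();
    static {
        NAMED.put("black", 0x000000);
        NAMED.put("blue", 0x0000FF);
        NAMED.put("cyan", 0x00FFFF);
        NAMED.put("green", 0x00FF00);
        NAMED.put("magenta", 0xFF00FF);
        NAMED.put("red", 0xFF0000);
        NAMED.put("yellow", 0xFFFF00);
        NAMED.put("white", 0xFFFFFF);
        NAMED.put("darkBlue", 0x000080);
        NAMED.put("darkCyan", 0x008080);
        NAMED.put("darkGreen", 0x008000);
        NAMED.put("darkMagenta", 0x800080);
        NAMED.put("darkRed", 0x800000);
        NAMED.put("darkYellow", 0x808000);
        NAMED.put("darkGray", 0x808080);
        NAMED.put("lightGray", 0xC0C0C0);
    }

    /** Squared Euclidean distance between two 0xRRGGBB colors. */
    static long distance(int a, int b) {
        long dr = ((a >> 16) & 0xFF) - ((b >> 16) & 0xFF);
        long dg = ((a >> 8) & 0xFF) - ((b >> 8) & 0xFF);
        long db = (a & 0xFF) - (b & 0xFF);
        return dr * dr + dg * dg + db * db;
    }

    /** Returns the name of the named color nearest to the given 0xRRGGBB value. */
    static String closest(int rgb) {
        String best = null;
        long bestDistance = Long.MAX_VALUE;
        for (Map.Entry<String, Integer> entry : NAMED.entrySet()) {
            long d = distance(rgb, entry.getValue());
            if (d < bestDistance) {
                bestDistance = d;
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(closest(0xF0F0F0)); // a very light gray maps to "white"
    }
}
```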
Block element styles:
- `DEFAULT_STYLE` default `"Normal"`, style to use for the markdown element
- `LOOSE_PARAGRAPH_STYLE` default `"ParagraphTextBody"`, style to use for loose list type items
- `TIGHT_PARAGRAPH_STYLE` default `"BodyText"`, style to use for tight list type items
- `PREFORMATTED_TEXT_STYLE` default `"PreformattedText"`, style to use for fenced code and indented code
- `BLOCK_QUOTE_STYLE` default `"Quotations"`, style to use for block quotes
- `ASIDE_BLOCK_STYLE` default `"AsideBlock"`, style to use for aside blocks
- `HORIZONTAL_LINE_STYLE` default `"HorizontalLine"`, style to use for thematic breaks
- `TABLE_CAPTION` default `"TableCaption"`, style to use for table captions
- `TABLE_CONTENTS` default `"TableContents"`, style to use for table bodies
- `TABLE_HEADING` default `"TableHeading"`, style to use for table headings
- `FOOTNOTE_STYLE` default `"Footnote"`, style to use for footnote text
- `BULLET_LIST_STYLE` default `"BulletList"`, numbering list style to use for bullet list item paragraph
- `NUMBERED_LIST_STYLE` default `"NumberedList"`, numbering list style to use for numbered list item paragraph
Inline element styles:
- `BOLD_STYLE` default `"StrongEmphasis"`, style to use for the markdown element
- `ITALIC_STYLE` default `"Emphasis"`, style to use for the markdown element
- `STRIKE_THROUGH_STYLE` default `"Strikethrough"`, style to use for the markdown element
- `SUBSCRIPT_STYLE` default `"Subscript"`, style to use for the markdown element
- `SUPERSCRIPT_STYLE` default `"Superscript"`, style to use for the markdown element
- `INS_STYLE` default `"Underlined"`, style to use for the markdown element
- `INLINE_CODE_STYLE` default `"SourceText"`, style to use for the markdown element
- `HYPERLINK_STYLE` default `"Hyperlink"`, style to use for the markdown element
- `FOOTNOTE_ANCHOR_STYLE` default `"FootnoteReference"`, style to use for the markdown element
List Element Styles
Unordered lists use the numbering list style named `BulletList`, while ordered lists use `NumberedList`. If these are not present, then the default numbering style (id = 2) is used for unordered lists and the default numbering style (id = 3) is used for ordered lists.
The following are equivalent to Renderer properties of the same name. They are included in `DocxRenderer` for convenience.
For the `TOC_INSTRUCTION` string see Docx4j GettingStarted under the heading TOC Content Control.
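The default `TOC_INSTRUCTION` value is written as a Java string literal, so the actual field instruction is `TOC \o "1-3" \h \z \u `. The sketch below shows one way such a string could be built for a different heading range. The helper is hypothetical and not part of the extension; the switch descriptions in the comments follow the usual Word TOC field switches.

```java
class TocInstructionSketch {
    /** Builds a TOC field instruction covering heading levels from..to. */
    static String tocInstruction(int from, int to) {
        // \o "from-to": build the TOC from headings of these outline levels
        // \h: insert TOC entries as hyperlinks
        // \z: hide tab leaders and page numbers in Web layout view
        // \u: use the applied paragraph outline level
        return "TOC \\o \"" + from + "-" + to + "\" \\h \\z \\u ";
    }

    public static void main(String[] args) {
        // Same text as the documented default value of TOC_INSTRUCTION.
        System.out.println(tocInstruction(1, 3));
        // A narrower TOC limited to heading levels 1 and 2.
        System.out.println(tocInstruction(1, 2));
    }
}
```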
NOTE: Word does not handle inserted HTML very well. Any HTML not suppressed will be escaped, i.e. it will render into the document as text. The exception is the `<br>` tag, which, if enabled, will be rendered as a line break.
- `ESCAPE_HTML_BLOCKS` default value of `ESCAPE_HTML`, escape html blocks found in the document
- `ESCAPE_HTML_COMMENT_BLOCKS` default value of `ESCAPE_HTML_BLOCKS`, escape html comment blocks found in the document
- `ESCAPE_HTML` default `false`, escape all html found in the document
- `ESCAPE_INLINE_HTML_COMMENTS` default value of `ESCAPE_HTML_BLOCKS`, escape inline html comments found in the document
- `ESCAPE_INLINE_HTML` default value of `ESCAPE_HTML`, escape inline html found in the document
- `PERCENT_ENCODE_URLS` default `false`, percent encode urls
- `RECHECK_UNDEFINED_REFERENCES` default `false`, recheck the existence of references in `Parser.REFERENCES` for link and image refs marked undefined. Used when new references are added after parsing
- `SUPPRESS_HTML_BLOCKS` default value of `SUPPRESS_HTML`, suppress html output for html blocks
- `SUPPRESS_HTML_COMMENT_BLOCKS` default value of `SUPPRESS_HTML_BLOCKS`, suppress html output for html comment blocks
- `SUPPRESS_HTML` default `false`, suppress html output for all html
- `SUPPRESS_INLINE_HTML_COMMENTS` default value of `SUPPRESS_INLINE_HTML`, suppress html output for inline html comments
- `SUPPRESS_INLINE_HTML` default value of `SUPPRESS_HTML`, suppress html output for inline html
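Several of these options take their default from another option, so setting one general option cascades to the more specific ones unless they are set explicitly. The sketch below models that lookup for the `SUPPRESS_*` family with a plain map. It is only an illustration of the documented defaults, with hypothetical class and method names, and does not reflect how the library stores its options.

```java
import java.util.HashMap;
import java.util.Map;

class HtmlOptionDefaultsSketch {
    // option name -> option whose value is used when this one is not set explicitly
    static final Map<String, String> FALLBACK = new HashMap<>();
    static {
        FALLBACK.put("SUPPRESS_HTML_BLOCKS", "SUPPRESS_HTML");
        FALLBACK.put("SUPPRESS_HTML_COMMENT_BLOCKS", "SUPPRESS_HTML_BLOCKS");
        FALLBACK.put("SUPPRESS_INLINE_HTML", "SUPPRESS_HTML");
        FALLBACK.put("SUPPRESS_INLINE_HTML_COMMENTS", "SUPPRESS_INLINE_HTML");
    }

    /** Resolves the effective value of an option given the explicitly set values. */
    static boolean effective(String option, Map<String, Boolean> explicit) {
        String current = option;
        while (current != null) {
            Boolean value = explicit.get(current);
            if (value != null) {
                return value; // explicitly set at this level
            }
            current = FALLBACK.get(current); // walk up the default chain
        }
        return false; // SUPPRESS_HTML itself defaults to false
    }

    public static void main(String[] args) {
        Map<String, Boolean> explicit = new HashMap<>();
        explicit.put("SUPPRESS_HTML", true);
        explicit.put("SUPPRESS_INLINE_HTML", false);
        // Block comments inherit from SUPPRESS_HTML; inline comments from SUPPRESS_INLINE_HTML.
        System.out.println(effective("SUPPRESS_HTML_COMMENT_BLOCKS", explicit));
        System.out.println(effective("SUPPRESS_INLINE_HTML_COMMENTS", explicit));
    }
}
```

In this run, block-level comments end up suppressed through `SUPPRESS_HTML`, while inline comments are not, because `SUPPRESS_INLINE_HTML` was explicitly set to `false`.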