TikZ backend: Unicode sanitization in draw_text (U+2212, en/em dashes) #54

@yueswater

Sub-issue of #50.

Problem

Matplotlib formats negative axis labels with U+2212 (−), which pdfLaTeX
rejects:

! Package inputenc Error: Unicode char \u8:− not set up for use with LaTeX.

Every string flowing through draw_text must have U+2212 replaced with
the ASCII hyphen-minus (-). The same pass should normalize other Unicode
punctuation commonly produced by Matplotlib formatters: en/em dashes,
prime marks, and the non-breaking space.
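
To see which characters in a label would need handling, a small helper like the one below can list every non-ASCII character with its code point and Unicode name. This is only a diagnostic sketch using the standard library; `find_non_ascii` is a hypothetical name and is not part of the backend.

```python
import unicodedata


def find_non_ascii(label):
    """Return (index, code point, Unicode name) for each non-ASCII character."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(label)
        if ord(ch) > 0x7F
    ]


# A tick label for -5 written with the Unicode minus sign.
print(find_non_ascii("\u22125"))   # [(0, 'U+2212', 'MINUS SIGN')]
# A plain ASCII label has nothing to report.
print(find_non_ascii("-5"))        # []
```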

Proposed fix

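# Translation table applied to every string before it is written out.
# str.maketrans accepts multi-character replacement strings as dict values.
_UNICODE_FIXES = str.maketrans({
    '\u2212': '-',     # MINUS SIGN
    '\u2013': '--',    # EN DASH
    '\u2014': '---',   # EM DASH
    '\u2032': "'",     # PRIME (mapping assumed: apostrophe)
    '\u2033': "''",    # DOUBLE PRIME (mapping assumed: two apostrophes)
    '\u00a0': '~',     # NO-BREAK SPACE
})

def draw_text(self, gc, x, y, s, prop, angle, ismath=False, mtext=None):
    # Replace characters pdfLaTeX cannot handle before emitting the node.
    safe = s.translate(_UNICODE_FIXES)
    # Wrap math-mode text in $...$; plain text is emitted unchanged.
    body = f'${safe}$' if ismath else safe
    self._commands.append(
        rf'\node[anchor=base, rotate={angle:.1f}] at '
        rf'({x*self.scale:.4f},{y*self.scale:.4f}) {{{body}}};'
    )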
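
One caveat the proposal does not address: inside $...$, LaTeX typesets `--` as two minus signs rather than an en dash, so a single table may not suit both modes. A possible variant, sketched below, keeps a separate table for math-mode strings. The names `_TEXT_FIXES`, `_MATH_FIXES`, and `sanitize` are hypothetical, and collapsing dashes to a single `-` in math mode is an assumption, not something the issue specifies.

```python
# Table for plain text: dashes become the usual LaTeX ligatures.
_TEXT_FIXES = str.maketrans({
    '\u2212': '-',     # MINUS SIGN
    '\u2013': '--',    # EN DASH
    '\u2014': '---',   # EM DASH
    '\u00a0': '~',     # NO-BREAK SPACE
})

# Table for math mode (assumed behavior): every dash becomes a single
# hyphen-minus, which math mode typesets as a minus sign.
_MATH_FIXES = str.maketrans({
    '\u2212': '-',
    '\u2013': '-',
    '\u2014': '-',
    '\u00a0': '~',
})


def sanitize(s, ismath=False):
    """Return s with Unicode punctuation replaced for the given mode."""
    table = _MATH_FIXES if ismath else _TEXT_FIXES
    return s.translate(table)


print(sanitize('1\u20132'))               # 1--2
print(sanitize('1\u20132', ismath=True))  # 1-2
```

With this split, draw_text would call `sanitize(s, ismath)` in place of `s.translate(_UNICODE_FIXES)`. Whether math-mode dashes should be collapsed this way is a design decision for the backend.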

Acceptance criteria

  • A plot with negative axis tick labels compiles under pdfLaTeX
    without Unicode errors.
  • Math-mode text (ismath=True) is wrapped in $...$; plain text
    is not.
  • Regression test round-trips a figure with an x-axis ranging from
    −5 to 5 and asserts the emitted .tex contains no U+2212.
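
The last two criteria could be checked along the following lines. This is a sketch only: `StubRenderer` and `emit_tick_labels` are hypothetical stand-ins that copy the proposed draw_text so the example runs without Matplotlib. A real regression test would render an actual figure through the TikZ backend and inspect the emitted .tex file.

```python
_UNICODE_FIXES = str.maketrans({
    '\u2212': '-',     # MINUS SIGN
    '\u2013': '--',    # EN DASH
    '\u2014': '---',   # EM DASH
    '\u00a0': '~',     # NO-BREAK SPACE
})


class StubRenderer:
    """Hypothetical stand-in holding only what draw_text touches."""

    def __init__(self, scale=1.0):
        self.scale = scale
        self._commands = []

    def draw_text(self, gc, x, y, s, prop, angle, ismath=False, mtext=None):
        safe = s.translate(_UNICODE_FIXES)
        body = f'${safe}$' if ismath else safe
        self._commands.append(
            rf'\node[anchor=base, rotate={angle:.1f}] at '
            rf'({x*self.scale:.4f},{y*self.scale:.4f}) {{{body}}};'
        )


def emit_tick_labels(values):
    """Emit one node per value, mimicking tick labels with a Unicode minus."""
    renderer = StubRenderer()
    for i, v in enumerate(values):
        label = str(v).replace('-', '\u2212')
        renderer.draw_text(None, float(i), 0.0, label, None, 0.0)
    return '\n'.join(renderer._commands)


def test_no_unicode_minus():
    tex = emit_tick_labels(range(-5, 6))
    assert '\u2212' not in tex
    assert '{-5};' in tex


def test_math_wrapping():
    r = StubRenderer()
    r.draw_text(None, 0.0, 0.0, 'x', None, 0.0, ismath=True)
    r.draw_text(None, 0.0, 0.0, 'x', None, 0.0, ismath=False)
    assert r._commands[0].endswith('{$x$};')
    assert r._commands[1].endswith('{x};')


test_no_unicode_minus()
test_math_wrapping()
```

The first criterion, compiling under pdfLaTeX, needs a LaTeX installation and is not covered by this sketch.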
