Comprehensive Watermark Characters Reference

This document provides a detailed breakdown of special characters that can be used for text watermarking purposes. Each category includes the actual characters, their Unicode code points, and information on how to use them.

Clean text file with all the characters: characters.txt

Zero-Width Characters and Invisible Separators

These characters have no visible width and can be inserted between visible characters without affecting appearance.

Character	Name	Unicode	HTML Entity	Description
`​`	Zero Width Space	U+200B	`&#8203;`	Zero-width; marks a permitted line-break position
`‌`	Zero Width Non-Joiner	U+200C	`&#8204;`	Prevents adjacent characters from joining
`‍`	Zero Width Joiner	U+200D	`&#8205;`	Requests that adjacent characters join
`‎`	Left-to-Right Mark	U+200E	`&#8206;`	Invisible mark with strong left-to-right directionality
`‏`	Right-to-Left Mark	U+200F	`&#8207;`	Invisible mark with strong right-to-left directionality
`⁠`	Word Joiner	U+2060	`&#8288;`	Zero-width like ZWSP, but prevents a line break at its position
`⁡`	Function Application	U+2061	`&#8289;`	Mathematical notation, invisible
`⁢`	Invisible Times	U+2062	`&#8290;`	Mathematical notation, invisible
`⁣`	Invisible Separator	U+2063	`&#8291;`	Mathematical notation (invisible comma)
`⁤`	Invisible Plus	U+2064	`&#8292;`	Mathematical notation, invisible

Usage: These characters can be inserted between normal characters or words to create unique patterns. They do not render, but they remain in the text and can be found by inspecting its code points.
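
To illustrate what inspecting the code points can look like, here is a minimal Python sketch. The names `INVISIBLE`, `reveal_invisibles`, and `list_invisibles` are hypothetical helpers introduced for this example, and the set only covers the characters in the table above.

```python
import unicodedata

# Code points from the table above (assumption: extend this set as needed).
INVISIBLE = {
    "\u200b", "\u200c", "\u200d", "\u200e", "\u200f",
    "\u2060", "\u2061", "\u2062", "\u2063", "\u2064",
}

def reveal_invisibles(text: str) -> str:
    """Replace each invisible character with a visible <U+XXXX> tag."""
    return "".join(
        f"<U+{ord(ch):04X}>" if ch in INVISIBLE else ch
        for ch in text
    )

def list_invisibles(text: str) -> list[tuple[int, str]]:
    """Return (index, Unicode name) for each invisible character found."""
    return [
        (i, unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if ch in INVISIBLE
    ]

sample = "wa\u200bter\u200dmark"
print(reveal_invisibles(sample))  # wa<U+200B>ter<U+200D>mark
print(list_invisibles(sample))    # [(2, 'ZERO WIDTH SPACE'), (6, 'ZERO WIDTH JOINER')]
```

Making the characters visible this way is mainly useful for debugging: it shows where a pattern sits without changing the original string.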

Various Space Characters

Unlike regular spaces, these have different widths and behaviors but appear visually similar.

Name	Unicode	HTML Entity	Description
En Space	U+2002	`&#8194;`	Half an em (traditionally the width of letter 'N')
Em Space	U+2003	`&#8195;`	One em (traditionally the width of letter 'M')
Three-Per-Em Space	U+2004	`&#8196;`	1/3 of Em width
Four-Per-Em Space	U+2005	`&#8197;`	1/4 of Em width
Six-Per-Em Space	U+2006	`&#8198;`	1/6 of Em width
Figure Space	U+2007	`&#8199;`	Width of a digit
Punctuation Space	U+2008	`&#8200;`	Width of a period
Thin Space	U+2009	`&#8201;`	1/5 of Em width
Hair Space	U+200A	`&#8202;`	Thinner than thin space
Medium Mathematical Space	U+205F	`&#8287;`	4/18 of Em width
Narrow No-Break Space	U+202F	`&#8239;`	Non-breaking narrow space
Ideographic Space	U+3000	`&#12288;`	Width of ideographic character
No-Break Space	U+00A0	`&#160;`	Regular space that doesn't break
Ogham Space Mark	U+1680	`&#5760;`	Space used in Ogham script
Mongolian Vowel Separator	U+180E	`&#6158;`	Used in Mongolian script; classified as a format character rather than a space since Unicode 6.3

Usage: Replace normal spaces with these alternative spaces to create patterns. The width differences are often hard to see, but each space has its own code point, so the pattern is easy to detect programmatically.
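
As a rough sketch of space replacement, the following Python example encodes a bit string in successive spaces. The mapping (thin space for 1, regular space for 0) and the function names are assumptions made for this example, not a fixed convention.

```python
THIN = "\u2009"  # assumption: thin space encodes bit 1, regular space encodes bit 0

def embed_bits_in_spaces(text: str, bits: str) -> str:
    """Encode bits in successive spaces: '1' -> thin space, '0' -> regular space."""
    out = []
    i = 0
    for ch in text:
        if ch == " " and i < len(bits):
            out.append(THIN if bits[i] == "1" else " ")
            i += 1
        else:
            out.append(ch)
    if i < len(bits):
        raise ValueError("text has too few spaces for the payload")
    return "".join(out)

def extract_bits_from_spaces(text: str) -> str:
    """Read one bit per space; spaces past the payload read as trailing zeros."""
    return "".join("1" if ch == THIN else "0" for ch in text if ch in (" ", THIN))

marked = embed_bits_in_spaces("the quick brown fox jumps", "1011")
print(extract_bits_from_spaces(marked))  # 1011
```

The capacity is one bit per space, so longer payloads need longer text or more than two space variants.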

Combining Diacritical Marks

These characters combine with preceding characters and can be stacked.

Character Range	Unicode Range	Description
`̀ ́ ̂ ̃ ̄ ̅ ̆ ̇ ̈ ̉ ̊ ̋ ̌ ̍ ̎ ̏`	U+0300 - U+030F	Combining diacritical marks (accents)
`̐ ̑ ̒ ̓ ̔ ̕ ̖ ̗ ̘ ̙ ̚ ̛ ̜ ̝ ̞ ̟`	U+0310 - U+031F	More combining marks
`̠ ̡ ̢ ̣ ̤ ̥ ̦ ̧ ̨ ̩ ̪ ̫ ̬ ̭ ̮ ̯`	U+0320 - U+032F	More combining marks
`̰ ̱ ̲ ̳ ̴ ̵ ̶ ̷ ̸ ̹ ̺ ̻ ̼ ̽ ̾ ̿`	U+0330 - U+033F	More combining marks
`͂ ͅ ͆ ͇ ͈ ͉ ͊ ͋ ͌ ͍ ͎ ͏ ͐ ͑ ͒ ͓`	U+0342, U+0345 - U+0353	More combining marks
`͔ ͕ ͖ ͗ ͘ ͙ ͚ ͛ ͜ ͝ ͞ ͟ ͠ ͡ ͢ ͣ`	U+0354 - U+0363	More combining marks
`ͤ ͥ ͦ ͧ ͨ ͩ ͪ ͫ ͬ ͭ ͮ ͯ`	U+0364 - U+036F	More combining marks
`҈ ҉`	U+0488 - U+0489	Combining Cyrillic marks

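Usage: These marks attach to the preceding character and can be stacked in multiple layers. Most of them are visible: a̸ is 'a' followed by a combining solidus overlay, which draws a slash through the letter. Subtle marks such as a dot below are easy to overlook, and U+034F (Combining Grapheme Joiner) has no visible glyph at all.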
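
A minimal Python sketch for locating and removing combining marks is shown below. It relies on the general category reported by the standard `unicodedata` module; the function names are hypothetical.

```python
import unicodedata

def find_combining_marks(text: str) -> list[tuple[int, str]]:
    """Return (index, name) for every mark character (general category M*)."""
    return [
        (i, unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if unicodedata.category(ch).startswith("M")
    ]

def strip_combining_marks(text: str) -> str:
    """Decompose to NFD, then drop all mark characters."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(
        ch for ch in decomposed
        if not unicodedata.category(ch).startswith("M")
    )

marked = "a\u0338bc"  # 'a' followed by COMBINING LONG SOLIDUS OVERLAY
print(find_combining_marks(marked))   # [(1, 'COMBINING LONG SOLIDUS OVERLAY')]
print(strip_combining_marks(marked))  # abc
```

Note that stripping is lossy: it also removes legitimate accents, so "café" becomes "cafe". Use it for comparison or detection rather than for cleaning text that may contain real diacritics.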

Special Punctuation

Alternative versions of common punctuation marks.

Character(s)	Name	Unicode	Description
`• ‣ ․ ‥ … ‧`	Various Dots	U+2022, U+2023, U+2024, U+2025, U+2026, U+2027	Alternative bullet points and ellipses
`‹ › « »`	Angle Quotes	U+2039, U+203A, U+00AB, U+00BB	Alternative quotation marks
`‘ ’ ‚ ‛ “ ” „ ‟`	Quotation Marks	U+2018-U+201F	Various styles of quotation marks
`‐ ‑ ‒ – — ― ⁃`	Hyphens and Dashes	U+2010-U+2015, U+2043	Various lengths of dashes
`⁄`	Fraction Slash	U+2044	Different from regular slash
`⁎ ⁑ ⁂`	Unusual Asterisks	U+204E, U+2051, U+2042	Alternative asterisk-like symbols
`⁅ ⁆`	Square Bracket with Quill	U+2045, U+2046	Unusual brackets
`⁇ ⁈ ⁉`	Multiple Question/Exclamation	U+2047, U+2048, U+2049	Combined punctuation
`⁋ ⁌ ⁍`	Reversed Pilcrow and Black Bullets	U+204B, U+204C, U+204D	Reversed paragraph sign and leftwards/rightwards bullets
`⁏`	Reversed Semicolon	U+204F	Semicolon facing opposite direction
`⁓`	Swung Dash	U+2053	Wavy dash
`⁕`	Flower Punctuation Mark	U+2055	Flower-shaped punctuation
`⁗`	Quadruple Prime	U+2057	Four prime marks
`⁘ ⁙ ⁚ ⁛ ⁜ ⁝ ⁞`	Various Dot Punctuation	U+2058-U+205E	Various unusual dot arrangements

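Usage: Some of these are close look-alikes of standard punctuation, for example the hyphen U+2010 in place of the ASCII hyphen-minus, or the one dot leader U+2024 in place of a period. Others, such as the reversed semicolon, are visibly different and only suit places where readers will not look closely.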
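
The sketch below shows one way to find and undo punctuation substitution in Python. The mapping `ASCII_PUNCT` is a deliberately small, hypothetical table written for this example; a real one would cover more of the characters listed above.

```python
# Hypothetical, deliberately small mapping; extend it for real use.
ASCII_PUNCT = str.maketrans({
    "\u2018": "'", "\u2019": "'",   # single curly quotes
    "\u201c": '"', "\u201d": '"',   # double curly quotes
    "\u2010": "-", "\u2011": "-",   # hyphen, non-breaking hyphen
    "\u2013": "-", "\u2014": "-",   # en dash, em dash
    "\u2026": "...",                # horizontal ellipsis
})

def normalize_punctuation(text: str) -> str:
    """Map the listed Unicode punctuation back to ASCII equivalents."""
    return text.translate(ASCII_PUNCT)

def find_unusual_punctuation(text: str) -> list[tuple[int, str]]:
    """Return (index, code point) for each character covered by the mapping."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if ord(ch) in ASCII_PUNCT
    ]

sample = "\u201cwell\u2011known\u201d\u2026"
print(find_unusual_punctuation(sample))
print(normalize_punctuation(sample))
```

Many of these characters also appear in ordinary typeset text, so a hit is only a hint of a watermark, not proof.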

Special Symbols

Distinctive symbols that can be hidden in text.

Character(s)	Name	Unicode	Description
`† ‡ ※ ⁁ ⁊ ⁋ ⁏ ⁒`	Various Marks	U+2020, U+2021, U+203B, U+2041, U+204A, U+204B, U+204F, U+2052	Various special marks
`℠ ℡ ™ ℀ ℁ ℂ ℃ ℄ ℅ ℆`	Various Symbols	U+2120, U+2121, U+2122, U+2100-U+2106	Service marks, telephone, trademark, etc.
`⅓ ⅔ ⅕ ⅖ ⅗ ⅘ ⅙ ⅚`	Fractions	U+2153-U+215A	Various fraction symbols
`← ↑ → ↓ ↔ ↕ ↖ ↗ ↘ ↙`	Arrows	U+2190-U+2199	Various directional arrows
`∀ ∁ ∂ ∃ ∄ ∅ ∆ ∇ ∈ ∉ ∊ ∋`	Mathematical Symbols	U+2200-U+220B	Various mathematical symbols
`≈ ≉ ≠ ≡ ≢ ≣ ≤ ≥ ≦ ≧`	More Math Symbols	U+2248, U+2249, U+2260-U+2267	Comparison and equality symbols
`① ② ③ ④ ⑤ ⑥ ⑦ ⑧ ⑨ ⑩`	Circled Numbers	U+2460-U+2469	Numbers in circles
`⏎ ⏏ ⏐ ⏑ ⏒ ⏓ ⏔ ⏕ ⏖ ⏗`	Technical Symbols	U+23CE-U+23D7	Return and eject symbols, vertical line extension, metrical symbols

Usage: These symbols are visibly distinct, so they work as overt markers rather than invisible ones. Place them where they read as ordinary typography (footnote marks, list bullets, arrows) so that they do not draw attention.

Homoglyphs (Look-Alike Characters)

Characters from other alphabets that resemble Latin letters.

Character Set	Script	Description
`Α α Β β Ε ε Ζ ζ Η η Ι ι Κ κ Μ μ Ν ν Ο ο Ρ ρ Τ τ Χ χ`	Greek	Capitals look like Latin A, B, E, Z, H, I, K, M, N, O, P, T, X; several lowercase forms (e.g. β, ζ, μ) are visibly different
`А а В в Е е К к М м Н н О о Р р С с Т т У у Х х`	Cyrillic	Look similar to Latin A, B, E, K, M, H, O, P, C, T, Y, X
`ᴀ ʙ ᴄ ᴅ ᴇ ғ ɢ ʜ ɪ ᴊ ᴋ ʟ ᴍ ɴ ᴏ ᴘ ǫ ʀ s ᴛ ᴜ ᴠ ᴡ x ʏ ᴢ`	Small Caps	Smaller versions of capital letters

Usage: These can replace regular Latin characters while looking almost identical. For example, using Cyrillic 'о' instead of Latin 'o'.
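
One way to spot homoglyph substitution without a fixed look-up table is to flag words that mix scripts. The Python sketch below guesses the script from the first word of each character's Unicode name, which is an approximation chosen for this example (the standard library exposes no script property); `script_of` and `mixed_script_words` are hypothetical names.

```python
import unicodedata

def script_of(ch: str) -> str:
    """Rough script guess: the first word of the Unicode name (an approximation)."""
    name = unicodedata.name(ch, "")
    return name.split(" ")[0] if name else "UNKNOWN"

def mixed_script_words(text: str) -> list[tuple[str, set[str]]]:
    """Return the words whose letters come from more than one script."""
    flagged = []
    for word in text.split():
        scripts = {script_of(ch) for ch in word if ch.isalpha()}
        if len(scripts) > 1:
            flagged.append((word, scripts))
    return flagged

# "hello world" with Cyrillic 'о' (U+043E) in both words
text = "hell\u043e w\u043erld plain"
for word, scripts in mixed_script_words(text):
    print(repr(word), sorted(scripts))
```

This will not catch a word written entirely in look-alike letters, and the small-caps letters above have Unicode names that begin with LATIN, so they pass unnoticed. Treat it as a first filter.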

Variation Selectors

Characters that modify the appearance of preceding characters.

Character(s)	Unicode Range	Description
`︀ ︁ ︂ ︃ ︄ ︅ ︆ ︇ ︈ ︉ ︊ ︋ ︌ ︍ ︎ ️`	U+FE00-U+FE0F	Variation selectors 1-16

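Usage: A variation selector requests a specific glyph variant of the preceding character. For example, U+FE0E and U+FE0F select text or emoji presentation for characters that support both. After a character with no defined variation sequence, a selector is normally ignored by the renderer, so it stays invisible.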
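
Because there are sixteen selectors in this block, each one can carry four bits. The Python sketch below hides a byte string as pairs of selectors after the first character of a carrier string. The scheme and the names `hide_bytes` and `recover_bytes` are assumptions made for illustration; how renderers treat long runs of selectors varies, so test on the target platform.

```python
VS_BASE = 0xFE00  # VARIATION SELECTOR-1; the block runs to U+FE0F

def hide_bytes(carrier: str, payload: bytes) -> str:
    """Insert two variation selectors per payload byte after the first character."""
    selectors = "".join(
        chr(VS_BASE + (b >> 4)) + chr(VS_BASE + (b & 0x0F))
        for b in payload
    )
    return carrier[:1] + selectors + carrier[1:]

def recover_bytes(text: str) -> bytes:
    """Collect all selectors in order and rebuild bytes from nibble pairs."""
    nibbles = [
        ord(ch) - VS_BASE
        for ch in text
        if VS_BASE <= ord(ch) <= VS_BASE + 15
    ]
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))

marked = hide_bytes("Hello", b"id7")
print(len("Hello"), len(marked))  # 5 11
print(recover_bytes(marked))      # b'id7'
```

A nibble value of 15 maps to U+FE0F, which can switch some characters to emoji presentation, so a carrier that starts with a plain letter is the safer choice.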

Special Hyphen Characters

Different types of hyphens with special behaviors.

Character	Name	Unicode	HTML Entity	Description
`­`	Soft Hyphen	U+00AD	`&shy;`	Only visible when breaking a word at end of line
`‑`	Non-Breaking Hyphen	U+2011	`&#8209;`	Hyphen that doesn't allow line breaks

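Usage: The non-breaking hyphen can replace a normal hyphen. The soft hyphen is invisible unless a line breaks at its position, so it can be inserted inside words much like a zero-width character.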

Special Modifier Letters

Small letters used for phonetic notation or modification.

Character Range	Unicode Range	Description
`ʰ ʱ ʲ ʳ ʴ ʵ ʶ ʷ ʸ ʹ ʺ ʻ ʼ ʽ ʾ ʿ`	U+02B0-U+02BF	Modifier letters
`ˀ ˁ ˂ ˃ ˄ ˅ ˆ ˇ ˈ ˉ ˊ ˋ ˌ ˍ ˎ ˏ`	U+02C0-U+02CF	More modifier letters
`ː ˑ ˒ ˓ ˔ ˕ ˖ ˗ ˘ ˙ ˚ ˛ ˜ ˝ ˞ ˟`	U+02D0-U+02DF	Various modifiers and tone marks
`ˠ ˡ ˢ ˣ ˤ ˥ ˦ ˧ ˨ ˩ ˪ ˫`	U+02E0-U+02EB	More modifier letters and tone marks

Usage: These can be used as superscript-like characters or added to text in unexpected places.

Miscellaneous Technical Symbols

Various technical symbols that could be hidden in text.

Character Range	Unicode Range	Description
`⌐ ⌑ ⌒ ⌓ ⌔ ⌕ ⌖ ⌗ ⌘ ⌙ ⌚ ⌛ ⌜ ⌝ ⌞ ⌟`	U+2310-U+231F	Miscellaneous technical symbols
`⌠ ⌡ ⌢ ⌣ ⌤ ⌥ ⌦ ⌧ ⌨`	U+2320-U+2328	More technical symbols
`〈〉 ⦅ ⦆`	U+2329, U+232A, U+2985, U+2986	Various brackets

Usage: These can be used to replace certain characters or be hidden in text where they might not be noticed.

How to Use These Characters for Watermarking

Basic Pattern Watermarking:
- Insert zero-width characters between normal characters in a specific pattern
- Example: Inserting ZWJ after every third character
Space Replacement:
- Replace regular spaces with different Unicode spaces
- Example: Alternating between regular spaces and hair spaces
Homoglyph Substitution:
- Replace certain letters with identical-looking characters from other scripts
- Example: Replacing 'o' with Cyrillic 'о' (U+043E) in specific positions
Combining Mark Addition:
- Add subtle combining marks to certain characters
- Example: Adding a combining dot below (U+0323) to vowels (small but still visible)
Invisible Sequence Patterns:
- Add sequences of invisible characters at specific locations in text
- Example: Adding [ZWSP, ZWJ, ZWNJ] after periods
Punctuation Substitution:
- Replace standard punctuation with alternative Unicode versions
- Example: Using alternative quotes or dashes
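
As an illustration of the invisible-sequence approach, the Python sketch below encodes a short identifier as bits and stores it as zero-width characters after the first word. The bit mapping (ZWSP for 0, ZWNJ for 1), the placement, and the function names are all assumptions made for this example.

```python
ZERO, ONE = "\u200b", "\u200c"  # assumption: ZWSP encodes 0, ZWNJ encodes 1

def encode_payload(payload: str) -> str:
    """Turn the payload's UTF-8 bytes into a run of zero-width characters."""
    bits = "".join(f"{byte:08b}" for byte in payload.encode("utf-8"))
    return "".join(ONE if b == "1" else ZERO for b in bits)

def embed(text: str, payload: str) -> str:
    """Insert the encoded payload after the first word of the text."""
    head, sep, tail = text.partition(" ")
    return head + encode_payload(payload) + sep + tail

def extract(text: str) -> str:
    """Collect the zero-width bits and decode them back to a string."""
    bits = "".join("1" if ch == ONE else "0" for ch in text if ch in (ZERO, ONE))
    usable = len(bits) - len(bits) % 8  # ignore an incomplete trailing byte
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))
    return data.decode("utf-8", errors="replace")

marked = embed("Quarterly report draft", "u42")
print(extract(marked))  # u42
```

Each payload byte costs eight invisible characters, so short identifiers are more practical than long messages.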

How to Create These Characters in Code

JavaScript

// Using Unicode escape sequences
const zwsp = '\u200B';  // Zero Width Space
const zwj = '\u200D';   // Zero Width Joiner
const cyrillicA = '\u0430';  // Cyrillic 'а'

// Adding watermark with zero-width characters:
// insert `pattern` after every `interval`-th character of `text`
function addWatermark(text, pattern, interval = 5) {
    let result = '';
    let count = 0;
    // for...of iterates by code point, so surrogate pairs are not split
    for (const ch of text) {
        result += ch;
        count++;
        if (count % interval === 0) {
            result += pattern;
        }
    }
    return result;
}

// Example: Add a pattern of invisible characters after every 5th character
const watermarkedText = addWatermark("Hello world", "\u200B\u200D\u200C", 5);

Python

# Using Unicode escape sequences
zwsp = '\u200B'  # Zero Width Space
zwj = '\u200D'   # Zero Width Joiner
cyrillicA = '\u0430'  # Cyrillic 'а'

# Adding watermark with zero-width characters:
# insert `pattern` after every `interval`-th character of `text`
def add_watermark(text, pattern, interval=5):
    parts = []
    for i, char in enumerate(text, start=1):
        parts.append(char)
        if i % interval == 0:
            parts.append(pattern)
    return ''.join(parts)

# Example: Add a pattern of invisible characters after every 5th character
watermarked_text = add_watermark("Hello world", "\u200B\u200D\u200C", interval=5)

HTML/CSS

<!-- Using HTML entities -->
<p>This text contains a zero-width space &#8203; here.</p>
<p>This text uses a combining mark: a&#768;</p>

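<!-- Using CSS generated content to append invisible characters.
     Note: ::after content is not part of the document text, so it is
     usually not included when the text is copied. -->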
<style>
    .watermarked::after {
        content: '\200B\200D\200C';
        display: inline;
    }
</style>
<p class="watermarked">This text has an invisible watermark after it.</p>

Detecting Watermarks

JavaScript

// Detect zero-width characters
function detectInvisibleWatermarks(text) {
    const invisibleChars = new Set(['\u200B', '\u200C', '\u200D', '\u2060', '\u2061', '\u2062', '\u2063', '\u2064']);
    // Keep only the invisible characters, in their original order
    return [...text].filter((ch) => invisibleChars.has(ch)).join('');
}

// Detect homoglyphs
function detectHomoglyphs(text) {
    const cyrillicMap = {
        '\u0430': 'a', '\u0435': 'e', '\u043E': 'o', 
        '\u0440': 'p', '\u0441': 'c', '\u0445': 'x'
    };
    
    let found = [];
    for (let i = 0; i < text.length; i++) {
        if (text[i] in cyrillicMap) {
            found.push({pos: i, char: text[i], latinEquiv: cyrillicMap[text[i]]});
        }
    }
    
    return found;
}

Python

# Detect zero-width characters
def detect_invisible_watermarks(text):
    invisible_chars = {'\u200B', '\u200C', '\u200D', '\u2060', '\u2061', '\u2062', '\u2063', '\u2064'}
    # Keep only the invisible characters, in their original order
    return ''.join(char for char in text if char in invisible_chars)

# Detect homoglyphs
def detect_homoglyphs(text):
    cyrillic_map = {
        '\u0430': 'a', '\u0435': 'e', '\u043E': 'o', 
        '\u0440': 'p', '\u0441': 'c', '\u0445': 'x'
    }
    
    found = []
    for i, char in enumerate(text):
        if char in cyrillic_map:
            found.append({'pos': i, 'char': char, 'latin_equiv': cyrillic_map[char]})
    
    return found
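
Once a watermark is detected, the same information can be used to remove it. The following Python sketch is one possible approach, not a complete sanitizer: it drops every format character (general category Cf, which includes the zero-width characters and the soft hyphen) and maps every space separator (category Zs) to a regular space. The function name `sanitize` is hypothetical.

```python
import unicodedata

def sanitize(text: str) -> str:
    """Drop format characters (Cf) and map space separators (Zs) to U+0020."""
    out = []
    for ch in text:
        category = unicodedata.category(ch)
        if category == "Cf":
            continue  # zero-width characters, direction marks, soft hyphen
        out.append(" " if category == "Zs" else ch)
    return "".join(out)

print(sanitize("a\u200bb\u2009c\u00a0d\u00adx"))  # ab c dx
```

This is deliberately blunt. It also removes zero-width joiners inside emoji sequences and direction marks in right-to-left text, and it leaves homoglyphs, combining marks, and variation selectors untouched.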

Conclusion

Text watermarking with Unicode characters provides a wide range of techniques for invisibly marking text. The most effective watermarks typically use a combination of these techniques to create unique, detectable patterns while maintaining the visible appearance of the text.
