Description
Is your feature request related to a problem? Please describe.
We currently support ASCII smuggling via Unicode Tags but do not support more flexible, byte-level invisible encoding. We want to increase our ability to simulate modern LLM smuggling and data exfiltration scenarios.
Describe the solution you'd like
Add a new `sneaky_bits` encoding mode to `AsciiSmugglerConverter`:
- Uses only two invisible Unicode characters (U+2062 for 0, U+2064 for 1)
- Encodes any UTF-8 input at the bit level
- Supports decoding
- Keeps `unicode_tags` as the default
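
To make the proposed encoding concrete, here is a minimal standalone sketch of the bit-level mapping described in the list above. The function names `sneaky_bits_encode` and `sneaky_bits_decode` are hypothetical and are not part of the existing `AsciiSmugglerConverter` API. The most-significant-bit-first order and the choice to ignore non-carrier characters when decoding are assumptions for illustration.

```python
# Illustrative sketch only: names and bit order are assumptions, not the converter's API.
ZERO = "\u2062"  # INVISIBLE TIMES, stands for bit 0
ONE = "\u2064"   # INVISIBLE PLUS, stands for bit 1


def sneaky_bits_encode(text: str) -> str:
    """Encode text as invisible characters, one per bit of its UTF-8 bytes."""
    # Assumption: each byte is written most significant bit first.
    bits = "".join(f"{byte:08b}" for byte in text.encode("utf-8"))
    return "".join(ONE if bit == "1" else ZERO for bit in bits)


def sneaky_bits_decode(hidden: str) -> str:
    """Recover text from a string that may mix visible and invisible characters."""
    # Keep only the two carrier characters; everything else is ignored.
    bits = "".join("1" if ch == ONE else "0" for ch in hidden if ch in (ZERO, ONE))
    # Drop any trailing partial byte instead of raising.
    usable = len(bits) - len(bits) % 8
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))
    return data.decode("utf-8", errors="replace")
```

Because the payload is the UTF-8 byte stream, each input byte costs eight invisible characters, and non-ASCII input such as emoji round-trips without special handling.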
Describe alternatives you've considered, if relevant
We could explore Variant Selectors next, but let us start with Sneaky Bits.
Additional context
Based on Sneaky Bits. Enables advanced red teaming use cases like prompt injection, data leakage, and hidden triggers using only two invisible characters.
References
https://embracethered.com/blog/posts/2025/sneaky-bits-and-ascii-smuggler/
Note: Unless anyone thinks it's unnecessary or redundant, I’d like to go ahead and start implementing the converter. I actually already have a Sneaky Bits converter ready and just need to write the tests for it. :)