FEAT: add superscript converter #818
Merged: romanlutz merged 22 commits into Azure:main from paulinek13:feat/528/superscript_converter (Jun 14, 2025)
Commits:
e6d3f15 (paulinek13): feat: add `SuperscriptConverter` (simple implementation)
f2b99a4 (paulinek13): feat: add a simple test for the converter
53a1185 (paulinek13): implement `alternate` mode
2d5ec0b (paulinek13): tests: extract conversion logic to a helper function
a8d905b (paulinek13): test 'alternate' mode
aaa04d4 (paulinek13): refactor: move `get_n_random` to `utils.py`
9ca5938 (paulinek13): rename `get_n_random` -> `get_random_indices` and update its logic to…
e96ae01 (paulinek13): rename parameter `percentage` to `sample_ratio`
35499a3 (paulinek13): feat: add 'random' mode
0878a9a (paulinek13): add tests for 'random' mode
126e420 (paulinek13): fix for `get_random_indices`: when the ratio is 0, return an empty list
7d1e02b (paulinek13): new test case with more words and 20%
a8eafef (paulinek13): formatting
7e00d54 (paulinek13): mypy: fix errors
35eea3e (paulinek13): fix `UnicodeDecodeError` flake8 complaining about some superscript ch…
a39daf4 (paulinek13): Merge remote-tracking branch 'origin/main' into feat/528/superscript_…
f26bca2 (paulinek13): Merge remote-tracking branch 'origin/main' into feat/528/superscript_…
0a2336d (paulinek13): Reset pyrit\common\utils.py to match origin/main
ca6dfc4 (paulinek13): refactor: simplify SuperscriptConverter by extending WordLevelConverter
8bd294a (paulinek13): add SuperscriptConverter to api.rst
e64486b (paulinek13): update docstring
4619b11 (paulinek13): improve docstring formatting
@@ -0,0 +1,84 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.

from pyrit.prompt_converter.word_level_converter import WordLevelConverter


class SuperscriptConverter(WordLevelConverter):
    """
    Converts text to superscript.

    Note:
        This converter leaves characters that do not have a superscript equivalent unchanged.
    """

    _superscript_map = {
        "0": "\u2070",
        "1": "\u00b9",
        "2": "\u00b2",
        "3": "\u00b3",
        "4": "\u2074",
        "5": "\u2075",
        "6": "\u2076",
        "7": "\u2077",
        "8": "\u2078",
        "9": "\u2079",
        "a": "\u1d43",
        "b": "\u1d47",
        "c": "\u1d9c",
        "d": "\u1d48",
        "e": "\u1d49",
        "f": "\u1da0",
        "g": "\u1d4d",
        "h": "\u02b0",
        "i": "\u2071",
        "j": "\u02b2",
        "k": "\u1d4f",
        "l": "\u02e1",
        "m": "\u1d50",
        "n": "\u207f",
        "o": "\u1d52",
        "p": "\u1d56",
        "r": "\u02b3",
        "s": "\u02e2",
        "t": "\u1d57",
        "u": "\u1d58",
        "v": "\u1d5b",
        "w": "\u02b7",
        "x": "\u02e3",
        "y": "\u02b8",
        "z": "\u1dbb",
        "A": "\u1d2c",
        "B": "\u1d2d",
        "D": "\u1d30",
        "E": "\u1d31",
        "G": "\u1d33",
        "H": "\u1d34",
        "I": "\u1d35",
        "J": "\u1d36",
        "K": "\u1d37",
        "L": "\u1d38",
        "M": "\u1d39",
        "N": "\u1d3a",
        "O": "\u1d3c",
        "P": "\u1d3e",
        "R": "\u1d3f",
        "T": "\u1d40",
        "U": "\u1d41",
        "V": "\u2c7d",
        "W": "\u1d42",
        "+": "\u207a",
        "-": "\u207b",
        "=": "\u207c",
        "(": "\u207d",
        ")": "\u207e",
    }

    async def convert_word_async(self, word: str) -> str:
        result = []
        for char in word:
            if char in self._superscript_map:
                result.append(self._superscript_map[char])
            else:
                result.append(char)
        return "".join(result)
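Reviewer note: the per-character fallback in `convert_word_async` can be tried outside PyRIT. The following is only a minimal sketch, not the merged implementation: `SUPERSCRIPT_SUBSET` and `to_superscript` are hypothetical names, and the dictionary is a four-entry excerpt of `_superscript_map` rather than the full table.

```python
from typing import Mapping

# Hypothetical four-entry excerpt of `_superscript_map` from the diff above.
SUPERSCRIPT_SUBSET: Mapping[str, str] = {
    "2": "\u00b2",
    "n": "\u207f",
    "x": "\u02e3",
    "+": "\u207a",
}


def to_superscript(word: str, mapping: Mapping[str, str] = SUPERSCRIPT_SUBSET) -> str:
    """Map each character through `mapping`; unmapped characters pass through unchanged."""
    # mapping.get(char, char) expresses the same if/else fallback as the loop in the diff.
    return "".join(mapping.get(char, char) for char in word)


if __name__ == "__main__":
    print(to_superscript("x2+n"))  # every character has an entry in the excerpt
    print(to_superscript("Q"))     # no entry for "Q", so it is returned as-is
```

Using `dict.get` with the character itself as the default is behaviorally equivalent to the explicit membership check, so this is a stylistic alternative rather than a required change.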
@@ -0,0 +1,29 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.

import pytest

from pyrit.prompt_converter import ConverterResult, SuperscriptConverter


async def _check_conversion(converter, prompts, expected_outputs):
    for prompt, expected_output in zip(prompts, expected_outputs):
        result = await converter.convert_async(prompt=prompt, input_type="text")
        assert isinstance(result, ConverterResult)
        assert result.output_text == expected_output


@pytest.mark.asyncio
async def test_superscript_converter():
    default_converter = SuperscriptConverter()
    await _check_conversion(
        default_converter,
        ["Let's test this converter!", "Unsupported characters stay the same: qCFQSXYZ"],
        [
            "\u1d38\u1d49\u1d57'\u02e2 \u1d57\u1d49\u02e2\u1d57 \u1d57\u02b0\u2071\u02e2 "
            "\u1d9c\u1d52\u207f\u1d5b\u1d49\u02b3\u1d57\u1d49\u02b3!",
            "\u1d41\u207f\u02e2\u1d58\u1d56\u1d56\u1d52\u02b3\u1d57\u1d49\u1d48 "
            "\u1d9c\u02b0\u1d43\u02b3\u1d43\u1d9c\u1d57\u1d49\u02b3\u02e2 "
            "\u02e2\u1d57\u1d43\u02b8 \u1d57\u02b0\u1d49 \u02e2\u1d43\u1d50\u1d49: qCFQSXYZ",
        ],
    )
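Reviewer note: the second test prompt ends in `qCFQSXYZ`, which appears to be exactly the set of ASCII letters with no key in `_superscript_map`. The sketch below checks that claim under the assumption that the letter keys were transcribed correctly from the converter diff; `SUPPORTED_LETTERS` and `unsupported_letters` are hypothetical names used only for illustration.

```python
import string

# Letter keys of `_superscript_map`, transcribed by hand from the converter diff
# (assumption: no key was missed or added during transcription).
SUPPORTED_LETTERS = set("abcdefghijklmnoprstuvwxyz" "ABDEGHIJKLMNOPRTUVW")


def unsupported_letters() -> str:
    """Return the ASCII letters without a superscript mapping, lowercase first."""
    return "".join(ch for ch in string.ascii_letters if ch not in SUPPORTED_LETTERS)


if __name__ == "__main__":
    print(unsupported_letters())
```

If a later change adds mappings for any of these letters, the second expected output in the test would need to change with it.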