feat: Add Language Adaptation Transforms for Cross-Lingual AIRT Probing #285

rdheekonda · 2025-12-18T17:31:09Z

Add Language Adaptation Transforms for Cross-Lingual AIRT Probing

This PR introduces four language adaptation transforms (adapt_language, transliterate, code_switch, and dialectal_variation) to enable cross-lingual adversarial testing in AIRT workflows. They let red-teamers test model robustness across languages, dialects, and writing systems without relying on external translation APIs.

Key Changes:

Add language adaptation transforms for multilingual AIRT probing
Restore dataset injection logic for TAP/attack workflows to enable transform functionality
Update TAP example notebook with 6 different language-based attack demonstrations
Fix transform hook to properly pass data to decorated tasks

Problem:

In TAP and other attack workflows, transforms were receiving empty data because:

Attack candidates are baked into tasks as default arguments
Dataset is empty [{}] in attack scenarios
Transform hooks couldn't access the actual candidate to transform it
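
The failure mode listed above can be reproduced in a minimal sketch. Everything here is a hypothetical stand-in (the `Message` dataclass, `make_task`, `run_with_transform`, and the `upper` transform are illustrative names, not the SDK's actual hook or evaluation code); it only shows why a transform that operates on dataset rows never touches a candidate bound as a default argument.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Message:
    """Hypothetical stand-in for the SDK's Message type."""
    content: str


def make_task(candidate: Message) -> Callable[..., str]:
    # The candidate is baked in as a default argument, so no dataset row carries it.
    def task(message: Message = candidate) -> str:
        return message.content
    return task


def run_with_transform(
    task: Callable[..., str],
    dataset: list[dict[str, Any]],
    transform: Callable[[Message], Message],
) -> list[str]:
    outputs = []
    for row in dataset:
        # The transform only sees values that are present in the row.
        transformed = {key: transform(value) for key, value in row.items()}
        outputs.append(task(**transformed))
    return outputs


def upper(message: Message) -> Message:
    # Toy transform: uppercase the content so its effect is easy to spot.
    return Message(message.content.upper())


task = make_task(Message("hello"))

# Attack scenario: the dataset is [{}], so the transform has nothing to act on.
untransformed = run_with_transform(task, [{}], upper)

# With the candidate present in the row, the same transform is applied.
transformed = run_with_transform(task, [{"message": Message("hello")}], upper)

print(untransformed, transformed)
```

In this sketch the first run returns the original text unchanged, while the second run returns the uppercased text.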

Solution:
When attack mode is detected (empty dataset + Message candidate), the candidate is injected into the dataset so transforms can access and modify it before it reaches the target model.

Added:

dreadnode/transforms/language.py:

adapt_language(): LLM-powered language adaptation with style control
transliterate(): Phonetic script conversion with custom mapping support
code_switch(): Multilingual code-switching for mixed-language testing
dialectal_variation(): Regional dialect adaptation (AAVE, Singlish, etc.)
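
To make "phonetic script conversion with custom mapping support" concrete, here is a standalone sketch of the general technique. It is not the implementation in dreadnode/transforms/language.py: the function name `transliterate_sketch`, its signature, and the partial Latin-to-Cyrillic table are assumptions made for illustration. It uses greedy longest-match replacement, lowercases the input for simplicity, and passes unmapped characters through unchanged.

```python
from typing import Optional

# Partial, illustrative Latin -> Cyrillic table (an assumption, not the PR's table).
DEFAULT_MAPPING = {
    "sh": "ш", "ch": "ч", "zh": "ж",
    "a": "а", "b": "б", "v": "в", "g": "г", "d": "д", "e": "е",
    "z": "з", "i": "и", "k": "к", "l": "л", "m": "м", "n": "н",
    "o": "о", "p": "п", "r": "р", "s": "с", "t": "т", "u": "у", "f": "ф",
}


def transliterate_sketch(text: str, mapping: Optional[dict[str, str]] = None) -> str:
    """Greedy longest-match transliteration; custom entries override defaults."""
    table = {**DEFAULT_MAPPING, **(mapping or {})}
    # Try longer keys first so "sh" wins over "s"; skip empty keys to guarantee progress.
    keys = sorted((key for key in table if key), key=len, reverse=True)
    lowered = text.lower()
    out = []
    i = 0
    while i < len(lowered):
        for key in keys:
            if lowered.startswith(key, i):
                out.append(table[key])
                i += len(key)
                break
        else:
            # No mapping for this character: keep it as-is.
            out.append(lowered[i])
            i += 1
    return "".join(out)


print(transliterate_sketch("privet"))
print(transliterate_sketch("taxi", mapping={"x": "кс"}))
```

A custom mapping only needs to list the entries that differ from the defaults, as in the `"x"` example.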

examples/airt/tree_of_attacks_with_pruning_transforms.ipynb:

6 complete attack demonstrations using different language transforms
Attack 1: Basic character-level transform (baseline)
Attack 2: Spanish formal language adaptation
Attack 3: Swahili (low-resource language testing)
Attack 4: Spanglish code-switching
Attack 5: AAVE dialectal variation
Attack 6: Cyrillic transliteration

Changed:

dreadnode/optimization/study.py:

Restored dataset injection logic in _run_evaluation() for attack scenarios
Simplified condition: if dataset == [{}] and isinstance(trial.candidate, Message)
Added dataset_input_mapping=["message"] when passing to Eval
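
The injection condition described above can be sketched as a small helper. This is a simplified illustration under stated assumptions, not the code in `_run_evaluation()`: `Message` is a hypothetical stand-in dataclass, `prepare_dataset` is an invented helper name, and the returned mapping merely mirrors the `dataset_input_mapping=["message"]` value mentioned in the description.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Message:
    """Hypothetical stand-in for the SDK's Message type."""
    content: str


def prepare_dataset(
    dataset: list[dict[str, Any]],
    candidate: Any,
) -> tuple[list[dict[str, Any]], Optional[list[str]]]:
    """Return (dataset, input_mapping), injecting the candidate in attack mode."""
    # Attack mode: placeholder dataset plus a Message candidate.
    if dataset == [{}] and isinstance(candidate, Message):
        # Put the candidate into the row so transforms can see and modify it.
        return [{"message": candidate}], ["message"]
    # Any other case: leave the dataset untouched and use no special mapping.
    return dataset, None


injected, mapping = prepare_dataset([{}], Message("probe"))
untouched, no_mapping = prepare_dataset([{"prompt": "hi"}], Message("probe"))
print(injected, mapping)
print(untouched, no_mapping)
```

Keeping the check this narrow means regular evaluations with a real dataset are not affected by the attack-mode path.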

Generated Summary:

Refactored variable names in apply_transforms function for clarity and consistency.
Introduced a new file dreadnode/transforms/language.py which includes:
- adapt_language: A transform for adapting text to a target language with style adjustments.
- transliterate: Converts Latin script to various writing systems phonetically.
- code_switch: Mixes multiple languages in a single text.
- dialectal_variation: Adapts text to specific regional dialects or variations.
Updated dreadnode/optimization/study.py to enhance handling of empty datasets and introduce a check for message types in dataset_input_mapping.
Modified the example notebook to reflect new functionality:
- Updated introduction to emphasize language transformation features.
- Added various attacks demonstrating cross-lingual adaptation, code-switching, and dialectal variations.
Enhanced user prompts in the code examples for clarity on the objectives of each attack scenario.

These changes expand the capabilities of the Dreadnode SDK for multilingual and dialectal text transformations, improving its utility for diverse language applications and security testing scenarios.

This summary was generated with ❤️ by rigging

rdheekonda added 2 commits December 18, 2025 09:22

add language transforms

aa0b62c

fix precommit

69054db

dreadnode-renovate-bot bot added the area/examples Changes to example code and demonstrations label Dec 18, 2025

rdheekonda requested a review from monoxgas December 18, 2025 17:31

rdheekonda commented Dec 18, 2025

dreadnode/optimization/study.py (outdated, resolved)

exclude language module from precommit

c4eb967

dreadnode-renovate-bot bot added the area/python Changes to Python package configuration and dependencies label Dec 18, 2025

rdheekonda added 5 commits December 18, 2025 09:42

Merge branch 'main' into user/raja/eng-3777-add-translation-transform

0f5e7af

fix eval transforms missing candidates context in airt attacks

1412b41

removed task factory

2e913ca

add skip vali func in attack

a0fcd8a

change to task in study formatting

97a08f3
