ContextLeak: Auditing Leakage in Private In-Context Learning Methods

ContextLeak is an open-source implementation of the framework presented in the paper “ContextLeak: Auditing Leakage in Private In-Context Learning Methods.” The project provides researchers with an end-to-end toolkit to measure, visualise and mitigate worst-case privacy leakage in In-Context Learning (ICL) scenarios.

Overview

The project implements a framework for:

  • Generating and inserting canary statements into training data (see the sketch after this list)
  • Running LLM inference with various configurations
  • Evaluating model responses with privacy metrics
  • Supporting multiple datasets and model architectures
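
To make the auditing workflow concrete, below is a minimal sketch of a canary-based leakage audit in Python. It is illustrative only: query_llm and build_prompt are hypothetical stand-ins for the repository's inference and prompt-construction code, and the canary schema is assumed.

import random

def audit_round(examples, canary, query_llm, build_prompt):
    # One membership-inference round: flip a fair coin, optionally insert
    # the canary into the in-context examples, then query the model.
    member = random.random() < 0.5
    context = examples + [canary["statement"]] if member else examples
    response = query_llm(build_prompt(context, canary["probe_question"]))
    # The attacker guesses "member" if the canary's secret leaks into the output.
    guess = canary["secret"] in response
    return member, guess

def attack_success_rate(rounds):
    # 0.5 means no measurable leakage; 1.0 means the canary always leaks.
    return sum(member == guess for member, guess in rounds) / len(rounds)

In the standard auditing recipe, the empirical attack success rate over many such rounds is translated into a lower bound on the effective privacy leakage.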

Supported Datasets

  • SAMSum: Dialogue summarization dataset
  • DocVQA: Document Visual Question Answering dataset
  • Subjectivity: Text classification for objective vs. subjective sentences
  • Sarcasm: Sarcasm detection in news headlines

Supported Models

  • LLaMA 70B
  • Qwen 72B

Setup

  1. Install dependencies:
pip install -r requirements.txt
  2. Prepare your data:
  • Place your datasets in the appropriate directories under data/
  • For DocVQA: data/processed/docvqa_sampled
  • For SAMSum: data/processed/samsum_sampled_{num_train}_train
  • For Subjectivity: data/classification/subj_sampled_{num_train}_train
  • For Sarcasm: data/classification/sarcasm_sampled_{num_train}_train
  3. Prepare canary statements:
  • Place canary files in data/canaries/
  • Format: Pickle files containing canary statements (see the snippet after this list)
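
The exact object layout generate.py expects inside a canary pickle is not documented here; the following snippet assumes a plain list of canary strings and uses a hypothetical file name.

import pickle

# Hypothetical canary statements; replace with your own.
canaries = [
    "Alice Smith's passport number is X1234567.",
    "The door code for the Berlin office is 4092.",
]

with open("data/canaries/example_canaries.pkl", "wb") as f:
    pickle.dump(canaries, f)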

Usage

Basic Generation

python generate.py \
    --dataset_name [samsum|docvqa|subj|sarcasm] \
    --exampler <number_of_shots> \
    --ensemble <number_of_ensembles> \
    --llm [llama70b|qwen72b] \
    --canary_file <path_to_canary_file> \
    --output_dir <output_directory> \
    --num_train <number_of_training_samples> \
    --test_num <number_of_test_samples>

Auditing Mode

Add the --audit flag to enable auditing mode:

python generate.py \
    --dataset_name docvqa \
    --exampler 2 \
    --ensemble 10 \
    --llm qwen72b \
    --canary_file ./data/canaries/incorrect_statements_docvqa.pkl \
    --output_dir ./data/audit \
    --audit \
    --num_train 20 \
    --test_num 400

Zero-shot Mode

Add the --zero_shot flag for zero-shot evaluation:

python generate.py \
    --dataset_name docvqa \
    --exampler 0 \
    --ensemble 10 \
    --llm qwen72b \
    --canary_file ./data/canaries/incorrect_statements_docvqa.pkl \
    --output_dir ./data/audit \
    --audit \
    --zero_shot \
    --num_train 20 \
    --test_num 400

Output

The script generates two types of output files:

  1. Model Predictions:

    • Location: <output_dir>/<llm>_<shots>shot_<ensemble>ensemble_<canary>canary.jsonl
    • Format: JSONL file containing model predictions and prompts
  2. Canary Selection (in audit mode):

    • Location: <output_dir>/canary_selection_<llm>_<shots>shot_<ensemble>ensemble_1canary.pkl
    • Format: Pickle file containing canary selection information (see the reading example below)
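
Both files can be inspected as sketched below. The file names follow the patterns above for the audit example (qwen72b, 2 shots, 10 ensembles), and the field names inside each JSONL record are assumptions, not documented behavior.

import json
import pickle

# Assumed names, following the patterns above for the audit example.
with open("data/audit/qwen72b_2shot_10ensemble_1canary.jsonl") as f:
    predictions = [json.loads(line) for line in f]

with open("data/audit/canary_selection_qwen72b_2shot_10ensemble_1canary.pkl", "rb") as f:
    canary_selection = pickle.load(f)

print(f"{len(predictions)} prediction records loaded")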

Project Structure

.
├── data/
│   ├── canaries/           # Canary statement files
│   ├── processed/          # Processed datasets
│   ├── audit/              # Audit outputs (auditing and zero-shot modes)
│   └── classification/     # Classification datasets
├── output/                 # Output directory
│   ├── results/           # Evaluation results
│   └── private_output/    # Private predictions
├── generate.py            # Main generation script
└── utils/                 # Utility functions

Additional Folders

  1. audit/
    • Contains scripts used for running audits
  2. calc_utility/
    • Scripts for computing utility metrics (e.g., accuracy, ROUGE) across datasets
  3. dpicl/
    • Implementation of Differentially Private In-Context Learning (DP-ICL) mechanisms, such as Embedding Space Aggregation (ESA) and Report Noisy Max (RNM); see the sketch after this list
  4. experiments/
    • Scripts for experiments, such as worst-case and average-case auditing under system prompt defenses
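
For reference, here is a minimal sketch of Report Noisy Max aggregation for a classification task: each ensemble member votes for a label, Laplace noise is added to each vote count, and only the noisy argmax is released. The noise scale shown (2/ε) is a conservative illustrative choice and may differ from the calibration used in dpicl/.

from collections import Counter
import numpy as np

def report_noisy_max(votes, labels, epsilon):
    # votes: one predicted label per ensemble member.
    counts = Counter(votes)
    noisy = {
        label: counts.get(label, 0) + np.random.laplace(scale=2.0 / epsilon)
        for label in labels
    }
    # Release only the argmax; the noisy counts themselves stay private.
    return max(noisy, key=noisy.get)

# Example: 10 ensemble votes on the Subjectivity task.
print(report_noisy_max(
    ["subjective"] * 7 + ["objective"] * 3,
    labels=["subjective", "objective"],
    epsilon=1.0,
))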
