# A Data-Centric Approach to Pedestrian Attribute Recognition: Synthetic Augmentation via Prompt-driven Diffusion Models
Official code of "A Data-Centric Approach to Pedestrian Attribute Recognition: Synthetic Augmentation via Prompt-driven Diffusion Models" by Alejandro Alonso, Sawaiz A. Chaudhry, Juan C. San Miguel, Álvaro García Martín, Pablo Ayuso Albizu and Pablo Carballeira.
Link to Paper | Link to Supplementary Material
In this paper, we propose a data-centric approach to improve Pedestrian Attribute Recognition (PAR) through synthetic data augmentation guided by textual descriptions. Specifically, our approach comprises three main steps:
- First, we define a protocol to systematically identify weakly recognized attributes across multiple datasets.
- Second, we propose a prompt-driven pipeline that leverages diffusion models to generate synthetic pedestrian images while preserving the consistency of PAR datasets.
- Finally, we derive a strategy to seamlessly incorporate synthetic samples into training data, which considers prompt-based annotation rules and modifies the loss function.
This repository implements the approach as four practical steps:

- Step 1: Obtain Baseline Results and Select Target Attributes
- Step 2: Generate Synthetic Data via ComfyUI
- Step 3: Label Synthetic Data
- Step 4: Train and Evaluate with Augmented Data
## Step 1: Obtain Baseline Results and Select Target Attributes

This step uses the Rethinking-of-PAR baseline to establish initial performance and identify attributes for augmentation.
- Setup Rethinking-of-PAR (if not already done):
  - Our work builds upon the Rethinking-of-PAR framework. If you plan to reproduce the full training and evaluation pipeline, you'll need to clone their repository and follow their setup instructions.
  - Note: If you only intend to use our data generation scripts and not the training/evaluation parts, you do not need the Rethinking-of-PAR setup. However, the subsequent steps assume Rethinking-of-PAR is correctly set up and its directory structure is accessible.
- Obtain Baseline Results: Utilize the Rethinking-of-PAR codebase to evaluate its performance; these results will serve as the baseline.
- Identify Target Attributes for Augmentation: Based on the baseline results, select the weakly recognized attributes. Our criteria for selecting attributes are detailed in Sec. 3.1 of the paper; an illustrative selection sketch follows this list.
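As a rough illustration of this selection (not the exact criterion from Sec. 3.1), the snippet below picks the lowest-scoring attributes from per-attribute baseline scores. The metric, the value of `K`, and the score dictionary are all assumptions for illustration.

```python
# Illustrative sketch only: select the K attributes with the weakest baseline
# performance as augmentation targets. The per-attribute scores below are
# hypothetical placeholders for the results obtained with Rethinking-of-PAR.
baseline_scores = {
    "hs-BaldHead": 0.31,
    "lb-ShortSkirt": 0.28,
    "AgeLess16": 0.35,
    "ub-SuitUp": 0.42,
    "attach-PlasticBag": 0.33,
    "ub-TShirt": 0.78,
}

K = 5  # number of weakly recognized attributes to target
target_attributes = sorted(baseline_scores, key=baseline_scores.get)[:K]
print("Attributes selected for augmentation:", target_attributes)
```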
## Step 2: Generate Synthetic Data via ComfyUI

This step uses ComfyUI for text-to-image diffusion to generate synthetic pedestrian images.
- Setup ComfyUI:
  - Ensure you have a working installation of ComfyUI.
  - After installing ComfyUI, also install the additional dependencies listed in `requirements_generation.txt` in this repo.
  - Note: While we use ComfyUI, the general principles of prompt-driven generation should be adaptable to other diffusion model interfaces if you implement the necessary logic.
- Load Workflow and Wildcards in ComfyUI:
  - Launch ComfyUI.
  - Import our provided `generation_workflow.json`. This file defines the pipeline for prompt generation, image diffusion, and initial post-processing.
  - Ensure any custom nodes referenced in the workflow are installed in your ComfyUI setup. None of the nodes used in our workflow were created by us; they should all be publicly available. For direct references, please see our supplementary material.
  - To use our wildcard setup, place the wildcard files (e.g., those in the `data_augmentaton/wildcards/` directory) where ComfyUI can access them (typically its `ComfyUI/custom_nodes/ComfyUI-DynamicPrompts/wildcards/` directory).
  - Important: The files under `text_files/` in this repository are not directly for ComfyUI's wildcard system; they are used by our `add_synthetic_labels.py` script in Step 3. If you modify the wildcards for ComfyUI, make the corresponding changes in `text_files/` for consistent labeling (a small consistency-check sketch is given at the end of this step).
- Generate Images:
  - Configure the number of samples, noise levels, etc., in ComfyUI according to your needs.
  - For guidance on how many synthetic images to create (e.g., 3× or 5× per real sample), as well as on specific generation parameters and the generation process, refer to the supplementary material.
- Optional: Manual Verification and Cleaning:
  - In our experiments, we checked every image used for augmentation, so we highly recommend manually reviewing the generated images as well.
  - Verify that the intended attribute is clearly present in each synthetic image.
  - Discard images with significant generation artifacts (e.g., extra limbs, distorted faces, impossible poses).
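Because the ComfyUI wildcards and the `text_files/` mappings must stay in sync (see the Important note above), a quick consistency check can be useful. The sketch below is an assumption-laden illustration: the file names, the one-option-per-line wildcard format, and the comma-separated mapping layout are hypothetical and may differ from the actual files.

```python
# Hedged sketch: check that every option in a ComfyUI wildcard file has a
# corresponding entry in the text_files/ mapping used for labeling in Step 3.
# File names and formats are hypothetical; adapt them to the real files.
from pathlib import Path

wildcard_file = Path("data_augmentaton/wildcards/upper_body.txt")  # hypothetical wildcard file
mapping_file = Path("text_files/upper_body.txt")                   # hypothetical labeling map

wildcard_options = {
    line.strip()
    for line in wildcard_file.read_text(encoding="utf-8").splitlines()
    if line.strip()
}
mapped_options = {
    line.split(",")[0].strip()  # assume the prompt component is the first comma-separated field
    for line in mapping_file.read_text(encoding="utf-8").splitlines()
    if line.strip()
}

missing = wildcard_options - mapped_options
if missing:
    print("Wildcard options without a labeling entry:", sorted(missing))
else:
    print("Wildcards and text_files/ mappings look consistent.")
```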
## Step 3: Label Synthetic Data

This step involves assigning PAR labels to the generated synthetic images and merging them with an existing dataset.
- Run Labeling Script:
  - Set the parameters in the `add_synthetic_labels.py` script from this repository and execute it. This script uses the `.txt` files in the `text_files/` directory (which map prompt components to attribute presence) to assign extended labels (-1, 0, 1, 2, 3) to the synthetic images.
  - For a detailed explanation of these label values and their significance, please refer to Sec. 3.3 of our paper.
  - The script will merge your original dataset annotations (e.g., from a `.pkl` file) with the new synthetic images and their labels, outputting a new `.pkl` file for training; a rough sketch of this merge is shown after this list.
  - Note: The current `add_synthetic_labels.py` script is tailored to the RAP dataset structure. You may need to adapt it if you are using other PAR datasets with different annotation formats.
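For orientation, the sketch below shows the general shape of the merge that `add_synthetic_labels.py` performs. The pickle field names (`image_name`, `label`, `attr_name`), the file names, and the label values used are assumptions for illustration; refer to the script and Sec. 3.3 of the paper for the actual logic.

```python
# Hedged sketch of merging a RAP-style annotation pickle with synthetic samples.
# Field names, file names, and label values are illustrative assumptions;
# add_synthetic_labels.py implements the real labeling rules (Sec. 3.3).
import pickle
import numpy as np

with open("dataset.pkl", "rb") as f:            # original RAP-style annotation file
    dataset = pickle.load(f)

synthetic_names = ["synth_baldhead_0001.png", "synth_baldhead_0002.png"]  # hypothetical images
num_attrs = len(dataset["attr_name"])

# one extended label vector per synthetic image, using values from {-1, 0, 1, 2, 3}
synthetic_labels = np.full((len(synthetic_names), num_attrs), -1, dtype=int)
synthetic_labels[:, list(dataset["attr_name"]).index("hs-BaldHead")] = 1  # target attribute present

dataset["image_name"] = list(dataset["image_name"]) + synthetic_names
dataset["label"] = np.vstack([np.asarray(dataset["label"]), synthetic_labels])

with open("dataset_augmented.pkl", "wb") as f:  # new annotation file used in Step 4
    pickle.dump(dataset, f)
```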
## Step 4: Train and Evaluate with Augmented Data

This step uses the augmented dataset created in Step 3 to train and evaluate the Rethinking-of-PAR model; it assumes Rethinking-of-PAR is already set up (from Step 1).
- Update Baseline Codebase (Rethinking-of-PAR):
  - Configuration Files:
    - Copy the `default.py` from our repository into the Rethinking-of-PAR `configs/` subdirectory.
  - Loss Function:
    - Copy our augmented loss function file into the appropriate loss module directory within the Rethinking-of-PAR codebase (`Rethinking_of_PAR/losses/`).
  - Dataset Path in Config:
    - In your Rethinking-of-PAR training configuration file, update the dataset path to point to the new `.pkl` file generated in Step 3.
    - Ensure the configuration also references the augmented loss function you copied.
    - An example of a full training configuration (`example_augmented_config.yaml`) demonstrating the necessary changes is provided in this repository.
- Train and Evaluate:
  - Train the model using the updated configuration file and the augmented dataset.
  - Compare the results against the baseline performance (obtained in Step 1) to quantify the impact of the synthetic data augmentation; a small comparison sketch is given below.
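The sketch below is one way to tabulate the per-attribute change between the two runs. It assumes you have exported per-attribute scores from the baseline and the augmented training as JSON dictionaries, which is an assumption about your workflow rather than something our scripts produce.

```python
# Hedged sketch: compare per-attribute scores between the baseline run (Step 1)
# and the augmented run (Step 4). The JSON exports are assumed, not produced
# by our scripts; adapt the loading to however you store your results.
import json

with open("baseline_results.json") as f:    # hypothetical {attribute: score} export
    baseline = json.load(f)
with open("augmented_results.json") as f:   # hypothetical {attribute: score} export
    augmented = json.load(f)

print(f"{'attribute':<24}{'baseline':>10}{'augmented':>11}{'delta':>9}")
for attr in sorted(baseline, key=lambda a: augmented.get(a, baseline[a]) - baseline[a], reverse=True):
    aug = augmented.get(attr, float("nan"))
    print(f"{attr:<24}{baseline[attr]:>10.3f}{aug:>11.3f}{aug - baseline[attr]:>9.3f}")
```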
## Generated Synthetic Data

We provide our generated synthetic images to facilitate reproduction or quick testing. You can download them from the following links:
- hs-BaldHead: Download Link_BaldHead
- lb-ShortSkirt: Download Link_ShortSkirt
- AgeLess16: Download Link_Age16
- ub-SuitUp: Download Link_SuitUp
- attach-PlasticBag: Download Link_PlasticBag
Backup/alternative link: Download Link all data
## Citation

If you find this work useful in your research, please consider citing our paper:
@InProceedings{Alonso_2025_AVSS,
author = {Alejandro Alonso and Sawaiz A. Chaudhry and Juan C. SanMiguel and \'{A}lvaro Garc\'{i}a-Mart\'{i}n and Pablo Ayuso-Albizu and Pablo Carballeira},
title = {A Data-Centric Approach to Pedestrian Attribute Recognition: Synthetic Augmentation via Prompt-driven Diffusion Models},
booktitle = {Proceedings of the IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS)},
month = {August},
year = {2025},
pages = {1-6}
}