LexPref-PTBR

A Brazilian Portuguese Legal Preference Dataset and RLHF Fine-Tuning Pipeline

Overview

LexPref-PTBR is an end-to-end RLHF pipeline focused on Brazilian consumer law reasoning. It covers synthetic preference pair generation, reward model training, and Direct Preference Optimization (DPO) fine-tuning of a small open-source LLM, with experiment tracking via Weights & Biases (W&B).
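As a rough illustration of what a preference pair could look like, the sketch below stores one record with `prompt`, `chosen`, and `rejected` fields and serializes it to JSON Lines. The field names follow the convention commonly used for DPO-style preference data; the `PreferencePair` class, the `to_jsonl` helper, and the sample record are hypothetical and are not taken from this repository.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class PreferencePair:
    """One preference example: a prompt plus a preferred and a dispreferred answer."""
    prompt: str
    chosen: str
    rejected: str

    def validate(self) -> None:
        # All three fields must be non-empty, and the two answers must differ,
        # otherwise the pair carries no preference signal.
        for name in ("prompt", "chosen", "rejected"):
            if not getattr(self, name).strip():
                raise ValueError(f"empty field: {name}")
        if self.chosen.strip() == self.rejected.strip():
            raise ValueError("chosen and rejected are identical")


def to_jsonl(pairs):
    """Serialize validated pairs as JSON Lines, keeping PT-BR characters readable."""
    lines = []
    for pair in pairs:
        pair.validate()
        lines.append(json.dumps(asdict(pair), ensure_ascii=False))
    return "\n".join(lines)


# Illustrative record only (hypothetical, not from the project's dataset).
# Prompt: "I bought a product online. Can I withdraw from the purchase?"
example = PreferencePair(
    prompt="Comprei um produto pela internet. Posso desistir da compra?",
    chosen=(
        "Sim. O art. 49 do CDC garante o direito de arrependimento em até "
        "7 dias para compras feitas fora do estabelecimento comercial."
    ),
    rejected="Não, compras pela internet não podem ser canceladas.",
)
print(to_jsonl([example]))
```

Passing `ensure_ascii=False` keeps accented characters such as "ã" readable in the output file instead of escaping them, which makes manual review of PT-BR records easier.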

Motivation

Brazilian Portuguese is underrepresented in legal AI benchmarks. This project addresses that gap by combining domain-specific annotation expertise with a reproducible fine-tuning pipeline.

Pipeline Stages

Phase 1: Environment setup and library familiarization (datasets, transformers, peft, trl)
Phase 2: PT-BR legal preference dataset construction with simulated inter-annotator agreement (IAA)
Phase 3: Reward model training + DPO fine-tuning with W&B logging
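One way the Phase 2 IAA simulation could be approached is sketched below: two simulated annotators copy a gold preference label ("A" or "B") with independent error rates, and their agreement is measured with Cohen's kappa. The README does not say which agreement metric or noise model the project uses, so the metric choice, the `simulate_annotator` helper, and the flip probabilities are all assumptions made for illustration.

```python
import random
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement from each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (p_o - p_e) / (1.0 - p_e)


def simulate_annotator(gold, flip_prob, rng):
    """Copy the gold preference, flipping each label with probability flip_prob."""
    flipped = {"A": "B", "B": "A"}
    return [flipped[g] if rng.random() < flip_prob else g for g in gold]


# Hypothetical setup: 200 items, annotators with 10% and 15% error rates.
rng = random.Random(0)
gold = [rng.choice("AB") for _ in range(200)]
ann_1 = simulate_annotator(gold, 0.10, rng)
ann_2 = simulate_annotator(gold, 0.15, rng)
print(f"kappa = {cohen_kappa(ann_1, ann_2):.3f}")
```

Kappa corrects raw agreement for the agreement expected by chance, so it is a more conservative signal than percent agreement when one label dominates. With more than two annotators, a multi-rater statistic such as Fleiss' kappa or Krippendorff's alpha would be the natural replacement.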

Status

🔨 Active development: Phase 1 in progress

Author

Fabio De Pinho | LLM Training Data Specialist
