FabioLousJay/lexpref-ptbr

LexPref-PTBR

A Brazilian Portuguese Legal Preference Dataset and RLHF Fine-Tuning Pipeline

Overview

LexPref-PTBR is an end-to-end RLHF (reinforcement learning from human feedback) pipeline for Brazilian consumer law reasoning. It covers synthetic preference-pair generation, reward model training, and Direct Preference Optimization (DPO) fine-tuning of a small open-source LLM, with experiment tracking in Weights & Biases.
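To make the idea of a preference pair concrete, here is a minimal sketch of what one record could look like. The `prompt` / `chosen` / `rejected` field names follow the layout commonly used for DPO training data; the `PreferencePair` class, the `to_jsonl` helper, and the example text are hypothetical illustrations and are not taken from this repository.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class PreferencePair:
    """One preference record: a prompt, a preferred answer, a dispreferred answer."""
    prompt: str
    chosen: str
    rejected: str


def to_jsonl(pairs):
    """Serialize pairs as JSON Lines, keeping Portuguese accents human-readable."""
    return "\n".join(json.dumps(asdict(p), ensure_ascii=False) for p in pairs)


# Hypothetical record, written for illustration only (not real dataset content).
example = PreferencePair(
    prompt="O consumidor pode desistir de uma compra feita pela internet?",
    chosen=(
        "Sim. O art. 49 do CDC garante o direito de arrependimento em até 7 dias "
        "para compras feitas fora do estabelecimento comercial."
    ),
    rejected="Não, compras online não podem ser canceladas.",
)

print(to_jsonl([example]))
```

Storing one JSON object per line keeps the file easy to stream and to load with the `datasets` library later on.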

Motivation

Brazilian Portuguese is underrepresented in legal AI benchmarks. This project addresses that gap by combining domain-specific annotation expertise with a reproducible fine-tuning pipeline.

Pipeline Stages

  • Phase 1: Environment setup and library familiarization (datasets, transformers, peft, trl)
  • Phase 2: PT-BR legal preference dataset construction with inter-annotator agreement (IAA) simulation
  • Phase 3: Reward model training + DPO fine-tuning with W&B logging
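As a rough illustration of the Phase 2 agreement check, the sketch below simulates two annotators and scores their agreement. It assumes IAA means inter-annotator agreement and uses Cohen's kappa as the metric; the README names neither a metric nor an annotation scheme, so the binary `"A"`/`"B"` labels, the 10% error rate, and both function names are assumptions made for the example.

```python
import random
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and equally long")
    n = len(labels_a)
    # Fraction of items on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1.0 - expected)


def simulate_annotator(gold, error_rate, rng):
    """Copy the gold preferences, flipping each with probability error_rate."""
    flip = {"A": "B", "B": "A"}
    return [flip[g] if rng.random() < error_rate else g for g in gold]


rng = random.Random(0)  # fixed seed so the simulation is repeatable
gold = [rng.choice("AB") for _ in range(500)]  # which response is truly better
ann1 = simulate_annotator(gold, 0.10, rng)
ann2 = simulate_annotator(gold, 0.10, rng)
kappa = cohen_kappa(ann1, ann2)
print(f"simulated Cohen's kappa: {kappa:.3f}")
```

Raising the simulated error rate lowers kappa, which gives a quick way to see how noisy annotation would have to be before the preference labels stop being reliable training signal.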

Status

🔨 Active development — Phase 1 in progress

Author

Fabio De Pinho | LLM Training Data Specialist
