Unofficial solution notes and reference implementation for the TAAC x KDD Cup 2026 Tencent Advertising Algorithm Competition, Industrial Track.
This repository is a cleaned public artifact derived from our competition workspace. It contains code and technical notes, but does not include official data, checkpoints, private logs, platform paths, exact final submission recipes, or any non-public material.
| Item | Value |
|---|---|
| Track | Industrial Track |
| Team rank | 35/689 |
| Percentile | Top 5.1% |
| Best public AUC | 0.851365 |
| Exact final recipe | Withheld from the public repository |
| Task | Large-scale advertising pCVR prediction |
- Starts from a cleaned competition baseline and documents how the solution evolved.
- Sequence-based pCVR modeling for industrial advertising recommendation data.
- Sparse/dense feature tokenization with RankMixer-style non-sequence tokens.
- Time-aware sequence buckets and public-tail-oriented validation.
- Multi-task click/conversion objective for regularizing sparse conversion labels.
- Auxiliary validation windows and leaderboard-correlation analysis for model selection.
- Controlled experiments covering validation design, feature engineering, objectives, seeds, checkpoint selection, and final-sprint ablations.
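
The time-aware sequence buckets mentioned above can be illustrated with a minimal sketch. It assumes each behavior event carries a Unix timestamp and that recency is measured against the request time. The bucket edges, the padding id, and the function names are hypothetical choices for illustration, not the values or identifiers used in `src/`.

```python
import bisect

# Hypothetical bucket edges in seconds: 1 minute, 1 hour, 1 day, 7 days, 30 days.
# A per-domain variant would keep one such list per behavior domain.
BUCKET_EDGES = [60, 3600, 86400, 7 * 86400, 30 * 86400]
PAD_BUCKET = 0  # reserved id for padding or missing timestamps


def time_bucket(event_ts, request_ts, edges=BUCKET_EDGES):
    """Map the gap between a past event and the request to a bucket id >= 1."""
    if event_ts is None:
        return PAD_BUCKET
    # Clamp negative gaps (clock skew, events logged after the request) to zero.
    gap = max(0, request_ts - event_ts)
    # bisect_right counts how many edges are <= gap, so larger gaps get larger ids.
    return 1 + bisect.bisect_right(edges, gap)


def bucketize_sequence(event_timestamps, request_ts, edges=BUCKET_EDGES):
    """Convert a list of event timestamps into recency bucket ids."""
    return [time_bucket(ts, request_ts, edges) for ts in event_timestamps]
```

The resulting ids can index an embedding table that is added to each sequence token, so the encoder sees how stale an event is rather than only its position in the sequence.
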
| Stage | Main change |
|---|---|
| Early cleaned baseline | Fast sequence encoder on the initial HyFormer-style baseline |
| Time-bucket correction | Per-domain sequence recency treatment |
| Stronger baseline | More reliable temporal validation and auxiliary diagnostics |
| Fresh-tail family | Training/selection closer to public-adjacent tail windows |
| MTL family | Click/conversion multi-task regularization |
| Final selected family | Public-positive family selected by validation evidence and limited leaderboard checks |
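
As a rough illustration of the click/conversion multi-task idea in the MTL row, the sketch below combines a conversion loss with a down-weighted click loss. It is written in plain Python for readability. The function names and the `click_weight` default are assumptions, and the actual objective in this repository may be structured differently.

```python
import math


def bce_with_logits(logit, label):
    """Numerically stable binary cross-entropy for a single example."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def multitask_loss(click_logits, cvr_logits, clicks, conversions, click_weight=0.5):
    """Mean conversion loss plus a weighted mean click loss.

    click_weight is a hypothetical hyperparameter; the denser click labels
    act as an auxiliary signal for the sparse conversion task.
    """
    n = len(clicks)
    cvr_loss = sum(bce_with_logits(z, y) for z, y in zip(cvr_logits, conversions)) / n
    click_loss = sum(bce_with_logits(z, y) for z, y in zip(click_logits, clicks)) / n
    return cvr_loss + click_weight * click_loss
```

Setting `click_weight=0.0` recovers a single-task conversion objective, which makes the auxiliary term easy to ablate.
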
The central lesson was that public score improvements came more from validation alignment and objective calibration than from simply adding larger or more complex modules.
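
One way to quantify validation alignment is a rank correlation between local validation AUC and public leaderboard AUC across submitted runs. The sketch below is a generic Spearman implementation in plain Python, not the analysis code from this repository; the function names are illustrative.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(validation_scores, leaderboard_scores):
    """Spearman rank correlation between two equally long score lists."""
    return pearson(average_ranks(validation_scores), average_ranks(leaderboard_scores))
```

A validation window whose scores rank runs in nearly the same order as the leaderboard is a more trustworthy basis for model selection than one with a higher absolute AUC but weak rank agreement.
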

```text
.
├── src/          # Training, inference, dataset, model, trainer, EDA utilities
├── configs/      # Public reference configs for key milestones
├── scripts/      # Local example commands
├── docs/         # Clean technical report and validation notes
├── experiments/  # Sanitized experiment summary tables
└── examples/     # Small public placeholders; no official data included
```

This repository includes:
- A cleaned implementation of the competition model stack.
- Redacted reference configs for public study. Exact final run arguments are intentionally withheld while the competition/review context may still matter.
- Technical notes on temporal validation, model selection, and final-sprint lessons.
- A concise timeline and sanitized negative-result summary.

It does not include:
- Official train/test data.
- Checkpoints or model outputs.
- Private platform logs, copied leaderboard screenshots, user IDs, or workspace paths.
- Any credential, account, or platform-specific runtime state.
To run the code, place the official competition data under a local data/ directory or pass --data_dir /path/to/data.

```bash
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
bash scripts/train_reference_example.sh /path/to/official/data
```

The example command is intended as a readable template, not as the exact final submission recipe. Exact platform scores require the official environment, full dataset, competition evaluation service, and private run records.

- docs/01_competition_overview.md
- docs/02_solution_report.md
- docs/03_temporal_validation.md
- docs/04_experiment_summary.md
- docs/05_timeline.md
- docs/06_technical_report.md
- docs/07_chinese_retrospective.md

If you reference this repository, please cite it as an unofficial competition solution:
TencentUniRec-TAAC2026: Unofficial TAAC x KDD Cup 2026 Industrial Track solution notes and implementation.
Rank 35/689, Top 5.1%, Public AUC 0.851365. Exact final recipe withheld from the public repository.
This project is not an official Tencent, TAAC, or KDD Cup repository. All competition names belong to their respective organizers. The implementation is provided for educational and portfolio purposes. See NOTICE.md for source, licensing, and data-handling notes.