This repository contains the final version of the code and models submitted to the 2025 K-League Pass Prediction Challenge hosted on DACON. (Competition Link)
- Team Name: Placeholder
- Public rank: 13th
- Private rank: 5th
- Final rank: 4th
- Private score: 12.66811
- Team Members: knowin_kyeong (Team Leader), oyko, Doru, William Han
The Submission directory follows this directory structure (simplified):
Note: In a Colab environment, you have to upload '2025KLeaguePassPrediction/Submission' to your Google Drive.
Note 2: We removed the Submission/open_track1 data due to the potential risk of violating DACON's data-sharing policies. Please replace it with your own copy of the dataset downloaded from the DACON website.
2025KLeaguePassPrediction/Submission
│ final_submission.csv
│ requirements.txt
│
├─inference
│ inference_001_tf_w5.ipynb
│ inference_002_lstm_w3.ipynb
│ inference_003_tf_w20.ipynb
│ inference_004_lstm_w20.ipynb
│ inference_005_tf_w32.ipynb
│ inference_006_lstm_w32.ipynb
│ inference_007_cb.ipynb
│ inference_008_ensemble_inference_part.ipynb
│ submission_cb.csv
│ submission_lstm_w20.csv
│ submission_lstm_w3.csv
│ submission_lstm_w32.csv
│ submission_tf_w20.csv
│ submission_tf_w32.csv
│ submission_tf_w5.csv
│
├─open_track1 (Place your own copy of the dataset here)
│ │ data_description.xlsx
│ │ match_info.csv
│ │ sample_submission.csv
│ │ test.csv
│ │ train.csv
│ │
│ └─test
│ ├─153363
│ │ 153363_1.csv
│ │ ...
│ │ 153363_9.csv
│ ├─153364
│ │ ...
│ └─153392
│
└─train
best_fold_cb_0.pkl
best_fold_cb_1.pkl
best_fold_cb_2.pkl
best_fold_cb_3.pkl
best_fold_cb_4.pkl
best_fold_cb_5.pkl
best_fold_cb_6.pkl
best_fold_cb_7.pkl
best_fold_cb_8.pkl
best_fold_cb_9.pkl
best_fold_lstm_w200.pth
best_fold_lstm_w201.pth
best_fold_lstm_w202.pth
best_fold_lstm_w203.pth
best_fold_lstm_w204.pth
best_fold_lstm_w205.pth
best_fold_lstm_w206.pth
best_fold_lstm_w207.pth
best_fold_lstm_w208.pth
best_fold_lstm_w209.pth
best_fold_lstm_w30.pth
best_fold_lstm_w31.pth
best_fold_lstm_w32.pth
best_fold_lstm_w320.pth
best_fold_lstm_w321.pth
best_fold_lstm_w322.pth
best_fold_lstm_w323.pth
best_fold_lstm_w324.pth
best_fold_lstm_w325.pth
best_fold_lstm_w326.pth
best_fold_lstm_w327.pth
best_fold_lstm_w328.pth
best_fold_lstm_w329.pth
best_fold_lstm_w33.pth
best_fold_lstm_w34.pth
best_fold_lstm_w35.pth
best_fold_lstm_w36.pth
best_fold_lstm_w37.pth
best_fold_lstm_w38.pth
best_fold_lstm_w39.pth
best_fold_tf_w200.pth
best_fold_tf_w201.pth
best_fold_tf_w202.pth
best_fold_tf_w203.pth
best_fold_tf_w204.pth
best_fold_tf_w205.pth
best_fold_tf_w206.pth
best_fold_tf_w207.pth
best_fold_tf_w208.pth
best_fold_tf_w209.pth
best_fold_tf_w320.pth
best_fold_tf_w321.pth
best_fold_tf_w322.pth
best_fold_tf_w323.pth
best_fold_tf_w324.pth
best_fold_tf_w325.pth
best_fold_tf_w326.pth
best_fold_tf_w327.pth
best_fold_tf_w328.pth
best_fold_tf_w329.pth
best_fold_tf_w50.pth
best_fold_tf_w51.pth
best_fold_tf_w52.pth
best_fold_tf_w53.pth
best_fold_tf_w54.pth
best_fold_tf_w55.pth
best_fold_tf_w56.pth
best_fold_tf_w57.pth
best_fold_tf_w58.pth
best_fold_tf_w59.pth
ensemble_weights_7.pkl
label_encoder_cb_player.pkl
label_encoder_cb_result.pkl
label_encoder_cb_team.pkl
label_encoder_cb_type.pkl
label_encoder_lstm_w20.pkl
label_encoder_lstm_w3.pkl
label_encoder_lstm_w32.pkl
label_encoder_tf_w20.pkl
label_encoder_tf_w32.pkl
label_encoder_tf_w5.pkl
oof_catboost.npy
train_001_tf_w5.ipynb
train_002_lstm_w3.ipynb
train_003_tf_w20.ipynb
train_004_lstm_w20.ipynb
train_005_tf_w32.ipynb
train_006_lstm_w32.ipynb
train_007_cb.ipynb
train_008_ensemble_train_part.ipynb
Submission/final_submission.csv is the final submission file.
- All training-related notebooks are in the Submission/train directory.
- All inference-related notebooks are in the Submission/inference directory.
- Place the 2025KLeaguePassPrediction directory in your Colab environment (a minimal setup sketch follows this list).
- Set up the dataset in the Submission/open_track1 directory with your own copy downloaded from the DACON website.
- Run all training notebooks (train_001 ~ train_008) in the Submission/train directory to train the models and save them. Running train_008_ensemble_train_part.ipynb requires all previous models to be trained and saved.
- Run all inference notebooks (inference_001 ~ inference_008) in the Submission/inference directory to generate submission files. Running inference_008_ensemble_inference_part.ipynb requires the submission files from all previous inference notebooks to be generated and saved.
- Submit the final submission file, Submission/final_submission.csv, to DACON!
- All required Python packages are listed in Submission/requirements.txt.
- You need a GPU environment to run the training notebooks.
- We tested only on Google Colab Pro+ with a Tesla T4 GPU; the code may not work in other environments. Even after running pip with requirements.txt, some packages may be absent from the default Colab environment or installed with different versions, which can cause errors. You can install any missing packages manually.
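The snippet below is a minimal Colab setup sketch, not code from the notebooks; it assumes you uploaded 2025KLeaguePassPrediction/Submission directly under MyDrive, so adjust BASE_DIR if you placed it elsewhere.

```python
# Minimal Colab setup sketch (assumes the repository was uploaded directly
# under MyDrive; adjust BASE_DIR to match where you placed it).
from google.colab import drive

drive.mount('/content/drive')

BASE_DIR = '/content/drive/MyDrive/2025KLeaguePassPrediction/Submission'

# Install the pinned dependencies on top of the default Colab environment.
!pip install -r {BASE_DIR}/requirements.txt
```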
All model-related notebook filenames follow the pattern `{train or inference}_{number}_{model_name}_{window size, if needed}.ipynb`:
- GRU-based Model ({model_name} = lstm): replaces the LSTM in the baseline model with a GRU and changes some of the features.
- Transformer-based Model ({model_name} = tf or transformer): a model based on the Transformer architecture for sequence modeling of the time-series data.
- CatBoost-based Model ({model_name} = cb or catboost): a model using CatBoost, a gradient-boosting library that handles categorical features well.
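As a purely illustrative aid (not part of the submission code), a hypothetical regex like the one below captures this naming convention:

```python
# Hypothetical illustration of the notebook naming convention;
# this regex is not taken from the submission code.
import re

PATTERN = re.compile(r"^(train|inference)_(\d{3})_(tf|lstm|cb)(?:_w(\d+))?\.ipynb$")

for name in ["train_001_tf_w5.ipynb", "inference_007_cb.ipynb"]:
    stage, number, model, window = PATTERN.match(name).groups()
    print(stage, number, model, window)
# train 001 tf 5
# inference 007 cb None
```

The two ensemble notebooks (train_008 and inference_008) are the exception and do not carry a model name or window size.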
- All random seeds are fixed for reproducibility (a sketch of typical seed fixing follows this list).
- However, according to the CatBoost documentation, exact bit-level reproducibility may not be guaranteed when using GPU training.
- Tie-breaking in pandas sort operations may vary across environments because the default quicksort is not stable.
- Therefore, we expect that our results may vary slightly across environments, and we recommend re-running Submission/train/train_008_ensemble_train_part.ipynb in your own environment to obtain the best ensemble weights.
- We also provide our trained model weights in the Submission/train directory for convenience. If you want to reproduce our LB score results, run inference with our trained model weights first, before re-training the models.
- We observed that the final private LB score may vary by about ±0.005 depending on the environment.
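For reference, the snippet below is a minimal, illustrative sketch of the kind of seed fixing typically used in PyTorch pipelines; it describes the general approach and is not code copied from the notebooks.

```python
# Illustrative seed-fixing sketch (not taken verbatim from the notebooks).
import os
import random

import numpy as np
import torch


def set_seed(seed: int = 42) -> None:
    """Fix the common sources of randomness."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    os.environ["PYTHONHASHSEED"] = str(seed)
    # Deterministic cuDNN kernels trade some speed for reproducibility.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False


set_seed(42)
```

Even with such seeding, the GPU- and environment-specific caveats above still apply.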
If you have any questions, please create an issue in this repository or contact the team leader, knowin_kyeong, via email at juwon0718@snu.ac.kr.
We plan to re-upload the code (without the dataset) to a public repository after the final phase of the competition is over.
We made this repository public. (2026-01-25)