Comparing Fine-tuned BERT and Hybrid Classification Model for Log Classification
This project implements a log classification system that categorizes log messages and sends alerts when needed. Two approaches were compared. The first fine-tunes a BERT model for the task. The second is a hybrid system that combines three complementary methods, regular expressions (regex), Sentence Transformer embeddings with Logistic Regression, and a large language model (LLM), to handle varying levels of complexity in log patterns. The comparison focuses on the accuracy and latency of the two models, and on how flexibly and effectively each handles predictable, complex, and poorly labeled data patterns.
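The hybrid routing can be sketched as a cascade: try the cheap regex rules first, fall back to the embedding classifier, and call the LLM only when the classifier is not confident. This is a minimal sketch under assumptions, not the project's actual code: the regex patterns, the labels, the 0.5 confidence threshold, and the `fake_embed_classifier` and `fake_llm_classifier` stand-ins are all hypothetical. In a real system the two stand-ins would wrap a Sentence Transformer + Logistic Regression pipeline and an LLM call.

```python
import re
from typing import Callable, Optional, Tuple

# Hypothetical rules (pattern, label); not taken from the project.
REGEX_RULES = [
    (re.compile(r"User \w+ logged (in|out)", re.IGNORECASE), "User Action"),
    (re.compile(r"Backup (started|completed)", re.IGNORECASE), "System Notification"),
]


def classify_with_regex(message: str) -> Optional[str]:
    """Return the label of the first matching rule, or None if nothing matches."""
    for pattern, label in REGEX_RULES:
        if pattern.search(message):
            return label
    return None


def classify_hybrid(
    message: str,
    embed_classifier: Callable[[str], Tuple[str, float]],
    llm_classifier: Callable[[str], str],
    min_confidence: float = 0.5,  # assumed threshold, tune on validation data
) -> Tuple[str, str]:
    """Return (label, stage), where stage names the component that decided."""
    label = classify_with_regex(message)
    if label is not None:
        return label, "regex"
    label, confidence = embed_classifier(message)
    if confidence >= min_confidence:
        return label, "embedding"
    return llm_classifier(message), "llm"


# Stand-ins so the sketch runs without model downloads or API keys.
def fake_embed_classifier(message: str) -> Tuple[str, float]:
    if "timeout" in message.lower():
        return "Error", 0.9
    return "Unknown", 0.1


def fake_llm_classifier(message: str) -> str:
    return "Workflow Error"


for log in [
    "User alice logged in",
    "Connection timeout on node 3",
    "Escalation rule failed for ticket",
]:
    print(log, "->", classify_hybrid(log, fake_embed_classifier, fake_llm_classifier))
```

Returning the deciding stage alongside the label makes it easy to count how many messages reach the slower LLM stage, which matters for the latency comparison.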
- data : check data patterns (grouping/clustering) with the DBSCAN algorithm
- model : fine-tuned BERT vs hybrid system
- model evaluation comparison
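The evaluation step compares the two models on accuracy and latency. One way to measure both with the same harness is sketched below. It is an illustration only: the `evaluate` helper, the `keyword_baseline` classifier, and the two sample log lines are hypothetical and do not come from the project's data. Any classifier that maps a message to a label, whether the fine-tuned BERT model or the hybrid system, could be passed in the same way.

```python
import time
from typing import Callable, Dict, Iterable, Tuple


def evaluate(
    classify: Callable[[str], str],
    samples: Iterable[Tuple[str, str]],
) -> Dict[str, float]:
    """Compute accuracy and mean per-message latency for one classifier."""
    correct = 0
    total = 0
    elapsed = 0.0
    for message, expected in samples:
        start = time.perf_counter()
        predicted = classify(message)
        elapsed += time.perf_counter() - start
        correct += int(predicted == expected)
        total += 1
    if total == 0:
        raise ValueError("samples must not be empty")
    return {
        "accuracy": correct / total,
        "mean_latency_ms": 1000.0 * elapsed / total,
    }


# Hypothetical toy classifier and labelled samples, for illustration only.
def keyword_baseline(message: str) -> str:
    return "User Action" if "logged" in message else "Other"


toy_samples = [
    ("User alice logged in", "User Action"),
    ("Disk failure on /dev/sda", "Error"),
]

report = evaluate(keyword_baseline, toy_samples)
print(report)
```

Running `evaluate` once per model on the same held-out samples gives directly comparable numbers; latency figures depend on hardware and should be averaged over enough messages to be stable.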