Querybot:query forecast and execution borrow ideas like Feisu #1637
Description
Feature Request
Summary
Modeling user behavior and auto-tuning indexes and caches
Locality in user behavior has been observed in many related works. Automatic optimization based on the current system state has been developed in some industrial business systems. Feisu, published in 2017, showed a basic application of self-built indexing and caching based on modeling user actions. That is just a beginning.
Solution
Basically, the AI system in Feisu may contain the following components:
- Data collection
- Cleaning and labeling
- Model training and procedure
- Model shipping
- Online prediction
- Feedback flow
The whole process works as follows:
Data collection ----> Cleaning and labeling ----> Model training and procedure ----> Model shipping ----> Online prediction ----> Actions
| |
| <----------------- feedback ----------------------- |
Data collection: A set of metric subsystems that developers can embed anywhere to collect system state information, including workload, data distribution, scheduling, network traffic, etc. These are the raw inputs for the system brain.
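As a minimal sketch of the data collection step described above, the collector below buffers raw events that any component can record. The class names, metric names, and tags are hypothetical and are not taken from Feisu.

```python
import time
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class MetricEvent:
    """One raw observation of system state (hypothetical schema)."""
    name: str                      # e.g. "query.latency_ms"
    value: float
    tags: dict = field(default_factory=dict)
    timestamp: float = field(default_factory=time.time)


class MetricCollector:
    """In-memory sink that any component can write raw events to."""

    def __init__(self):
        self._events = defaultdict(list)

    def record(self, name, value, **tags):
        """Buffer one event; tags carry free-form context."""
        self._events[name].append(MetricEvent(name, float(value), tags))

    def drain(self):
        """Return all buffered events and clear the buffer."""
        events = [e for batch in self._events.values() for e in batch]
        self._events.clear()
        return events


collector = MetricCollector()
collector.record("query.latency_ms", 42, table="orders")
collector.record("network.bytes_out", 1024, node="worker-1")
raw_events = collector.drain()  # handed to the cleaning and labeling step
```

A real deployment would presumably flush to durable storage rather than hold events in memory; the in-memory buffer only keeps the sketch self-contained.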
Cleaning and labeling: This is the key step for refining useful features for training the system brain. The definition of the features depends on the output of the brain's models (here, it can be one model or a set of models).
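To illustrate how the label definition follows from what the model is asked to output, here is a hedged sketch. It assumes a hypothetical index-recommendation model that predicts whether a (user, column) pair will be accessed again; the record fields are made up for illustration.

```python
from collections import Counter


def clean(records):
    """Drop records that lack the fields the features need."""
    return [r for r in records if r.get("user") and r.get("column")]


def build_examples(history, next_window):
    """Turn cleaned access logs into (features, label) pairs.

    The label is 1 if the (user, column) pair is accessed again in the
    following window, which is what the assumed model would predict.
    """
    counts = Counter((r["user"], r["column"]) for r in history)
    future = {(r["user"], r["column"]) for r in next_window}
    return [
        ({"user": user, "column": column, "access_count": n},
         1 if (user, column) in future else 0)
        for (user, column), n in sorted(counts.items())
    ]


raw = [
    {"user": "alice", "column": "orders.date"},
    {"user": "alice", "column": "orders.date"},
    {"user": "bob", "column": None},          # malformed, dropped by clean()
    {"user": "bob", "column": "users.id"},
]
later = [{"user": "alice", "column": "orders.date"}]
examples = build_examples(clean(raw), clean(later))
```

If the brain used a different model, for example one predicting cache hit rate, the same raw records would be labeled differently, which is why feature and label definitions are tied to the model output.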
Model training and procedure: This step is offline model training. In Feisu, the system brain is trained offline and then shipped to the Decision subcomponent (in the action step) to execute online prediction. Previously, we did online model training, but the results were not very good: training and predicting online in real time caused drastic oscillation and rendered the model useless.
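The offline approach can be sketched as follows: the model is fitted on a fixed snapshot and exported as a frozen artifact, so the serving side never updates it in place. The frequency model and the JSON artifact format are stand-in assumptions, not the model Feisu uses.

```python
import json
from collections import Counter


def train_offline(snapshot, version=1):
    """Fit a trivial frequency model on a fixed snapshot of column accesses."""
    counts = Counter(snapshot)
    total = sum(counts.values())
    return {
        "version": version,
        "access_prob": {col: c / total for col, c in counts.items()},
    }


def export_model(model):
    """Serialize the trained model into a frozen artifact for shipping."""
    return json.dumps(model, sort_keys=True)


def load_model(artifact):
    """Load a shipped artifact on the serving side (read-only use)."""
    return json.loads(artifact)


snapshot = ["orders.date", "orders.date", "users.id", "orders.date"]
artifact = export_model(train_offline(snapshot))
serving_model = load_model(artifact)
```

Because the serving side only reads the artifact, predictions stay stable between releases, which avoids the oscillation seen with online training.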
Model shipping: The main challenge here is model consistency. Often, a set of models that work together in the system brain must be turned on at the same time.
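One possible way to keep a set of models consistent is to activate them as a single versioned bundle. The sketch below assumes a hypothetical in-process `ModelRegistry`; a distributed system would need a coordinated rollout on top of this idea.

```python
import threading


class ModelRegistry:
    """Holds the active bundle of models and swaps the whole bundle at once."""

    def __init__(self, bundle_version, models):
        self._lock = threading.Lock()
        self._active = (bundle_version, dict(models))

    def activate(self, bundle_version, models, required):
        """Switch to a new bundle only if every required model is present."""
        missing = set(required) - set(models)
        if missing:
            raise ValueError(f"incomplete bundle, missing: {sorted(missing)}")
        with self._lock:
            self._active = (bundle_version, dict(models))

    def snapshot(self):
        """Return a consistent (version, models) pair for one request."""
        with self._lock:
            return self._active


registry = ModelRegistry("v1", {"index": "index-v1", "cache": "cache-v1"})
try:
    # A partial bundle is rejected, so v1 stays fully active.
    registry.activate("v2", {"index": "index-v2"}, required=["index", "cache"])
except ValueError:
    pass
version_after_reject = registry.snapshot()[0]

registry.activate(
    "v2", {"index": "index-v2", "cache": "cache-v2"}, required=["index", "cache"]
)
version, models = registry.snapshot()
```

Each request reads one snapshot and uses only the models in it, so it never mixes an index model from one generation with a cache model from another.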
Online prediction: In practice, rules, policies, and models usually need to work together, and it is basically impossible to depend on the model alone. Especially during cold start, rule-based strategies often play the greater role, while model prediction is often effective for scenarios not previously encountered.
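A minimal sketch of rules, policies, and a model working together might look like this. The function name, the statistics fields, and all thresholds are illustrative assumptions.

```python
def decide_cache(column, stats, model=None, min_samples=100, threshold=0.5):
    """Decide whether to cache a column; returns (decision, source)."""
    # Policy: hard constraints always win over any prediction.
    if stats["size_mb"] > 1024:
        return False, "policy:too_large"
    # Cold start: too little data for the model, so fall back to a rule.
    if model is None or stats["samples"] < min_samples:
        return stats["hits_last_hour"] >= 10, "rule:recent_hits"
    # Warm path: use the model's predicted reuse probability.
    return model(column) >= threshold, "model"


def fake_model(column):
    """Stand-in for a trained model; always predicts high reuse."""
    return 0.9


cold = decide_cache(
    "orders.date", {"size_mb": 10, "samples": 5, "hits_last_hour": 12}
)
warm = decide_cache(
    "orders.date",
    {"size_mb": 10, "samples": 500, "hits_last_hour": 0},
    model=fake_model,
)
blocked = decide_cache(
    "events.blob",
    {"size_mb": 4096, "samples": 500, "hits_last_hour": 50},
    model=fake_model,
)
```

Returning the source of each decision alongside the decision itself makes it easier to see how often the rule path, rather than the model, is actually in control.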
Feedback flow: The feedback subsystem is very important in an AI system. The next-generation model needs more labeled feedback to learn from so that each model iteration gets better and comes closer to the real problem. At the same time, with offline model training, an evaluation set can be used to confirm that a newly produced model is better than the previous generation before it is launched online.
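The evaluation gate described above can be sketched as a simple comparison on a held-out set built from labeled feedback. Accuracy as the metric, the `min_gain` margin, and the toy threshold models are assumptions made for illustration.

```python
def accuracy(model, eval_set):
    """Fraction of (features, label) pairs the model predicts correctly."""
    correct = sum(1 for features, label in eval_set if model(features) == label)
    return correct / len(eval_set)


def should_promote(candidate, current, eval_set, min_gain=0.0):
    """Promote the candidate only if it beats the current model offline."""
    return accuracy(candidate, eval_set) > accuracy(current, eval_set) + min_gain


def current_model(access_count):
    """Previous generation: predicts reuse when access_count >= 5."""
    return int(access_count >= 5)


def candidate_model(access_count):
    """New generation: predicts reuse when access_count >= 3."""
    return int(access_count >= 3)


# Evaluation set of (access_count, was_reused) pairs from labeled feedback.
eval_set = [(1, 0), (3, 1), (4, 1), (6, 1)]

current_acc = accuracy(current_model, eval_set)
candidate_acc = accuracy(candidate_model, eval_set)
promote = should_promote(candidate_model, current_model, eval_set)
```

With this evaluation set the candidate is promoted; swapping the two models would leave the current generation in place.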
Misc: Debugging and intervention mechanisms are also an important part. Early systems often behaved poorly when applying artificial intelligence technology and needed mechanisms for online testing, problem tracking, and intervention.
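As one hedged illustration of such a mechanism, the wrapper below lets an operator override individual decisions and keeps a trace for problem tracking. The `InterventionLayer` name and its interface are hypothetical.

```python
class InterventionLayer:
    """Wraps a decision function with manual overrides and a decision trace."""

    def __init__(self, decide):
        self._decide = decide
        self.overrides = {}   # key -> forced decision set by an operator
        self.trace = []       # (key, decision, source) for problem tracking

    def __call__(self, key):
        if key in self.overrides:
            decision, source = self.overrides[key], "override"
        else:
            decision, source = self._decide(key), "model"
        self.trace.append((key, decision, source))
        return decision


def always_build_index(column):
    """Stand-in for the brain's decision; always says to build an index."""
    return True


brain = InterventionLayer(always_build_index)
brain.overrides["logs.payload"] = False   # operator disables one bad decision

first = brain("orders.date")
second = brain("logs.payload")
```

The trace records which decisions came from the model and which were forced, so a bad prediction can be tracked down and corrected without redeploying the model.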