Feature request / 功能建议
Currently, in many scenarios, entering text into an input box is split into two actions: a "CLICK" first, then a "TYPE".
This mirrors how a human types.
However, a computer does not need the separate step. The coordinates output by "CLICK" and "TYPE" are the same, so if the click logic is folded into the TYPE action, the result should in theory be identical, and the time and cost of one model call are saved.
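To make the proposal concrete, here is a minimal Python sketch of the merge. The `Action` class, its fields, and `merge_click_type` are hypothetical names invented for illustration, not the project's actual action schema. It assumes a TYPE action can carry target coordinates, and it only folds a CLICK into the following TYPE when their coordinates agree (or the TYPE has none).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Action:
    # Hypothetical action record; the real schema may differ.
    kind: str                                # "CLICK" or "TYPE"
    point: Optional[Tuple[int, int]] = None  # target coordinates, if any
    text: Optional[str] = None               # text to enter (TYPE only)


def merge_click_type(actions: List[Action]) -> List[Action]:
    """Fold each CLICK that is immediately followed by a TYPE on the
    same coordinates into a single TYPE carrying those coordinates."""
    merged: List[Action] = []
    i = 0
    while i < len(actions):
        cur = actions[i]
        nxt = actions[i + 1] if i + 1 < len(actions) else None
        same_target = nxt is not None and nxt.point in (None, cur.point)
        if cur.kind == "CLICK" and nxt is not None and nxt.kind == "TYPE" and same_target:
            # One action (and one model call) instead of two.
            merged.append(Action("TYPE", cur.point, nxt.text))
            i += 2
        else:
            merged.append(cur)
            i += 1
    return merged
```

The same kind of pass could be applied offline to existing CLICK + TYPE trajectories to produce merged training samples, if the maintainers decide the TYPE action should carry coordinates.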
Motivation / 动机
Improve action execution efficiency: each text-input step would need one fewer model call.
Your contribution / 您的贡献
If needed, I can prepare the training dataset, given some basic guidance.